CRITICAL BUG FIX: Removed code that was appending asterisks to left-handed players' names and hash symbols to switch hitters' names in production. ## Changes ### Core Fix (retrosheet_data.py) - Removed name_suffix code from new_player_payload() (lines 1103-1108) - Players names now stored cleanly without visual indicators - Affected 20 left-handed batters in 2005 Live cardset ### New Utility Scripts - fix_player_names.py: PATCH player names to remove symbols (uses 'name' param) - check_player_names.py: Verify all players for asterisks/hashes - regenerate_lefty_cards.py: Update image URLs with cache-busting dates - upload_lefty_cards_to_s3.py: Fetch fresh cards and upload to S3 ### Documentation (CRITICAL - READ BEFORE WORKING WITH CARDS) - docs/LESSONS_LEARNED_ASTERISK_REGRESSION.md: Comprehensive guide * API parameter is 'name' NOT 'p_name' * Card generation caching requires timestamp cache-busting * S3 keys must not include query parameters * Player names only in 'players' table * Never append visual indicators to stored data - CLAUDE.md: Added critical warnings section at top ## Key Learnings 1. API param for player name is 'name', not 'p_name' 2. Cards are cached - use timestamp in ?d= parameter 3. S3 keys != S3 URLs (no query params in keys) 4. Fix data BEFORE generating/uploading cards 5. Visual indicators belong in UI, not database ## Impact - Fixed 20 player records in production - Regenerated and uploaded 20 clean cards to S3 - Documented to prevent future regressions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
178 lines
5.0 KiB
Markdown
178 lines
5.0 KiB
Markdown
# Lessons Learned: Asterisk Regression & Card Upload Issues
|
|
|
|
**Date**: 2025-11-24
|
|
**Issue**: Left-handed players had asterisks appended to their names in production
|
|
|
|
---
|
|
|
|
## Critical Learnings
|
|
|
|
### 1. API Parameter Names vs Database Field Names
|
|
|
|
**WRONG**: Using database field name for API calls
|
|
```python
|
|
await db_patch('players', object_id=player_id, params=[('p_name', clean_name)]) # ❌
|
|
```
|
|
|
|
**CORRECT**: Use API parameter name
|
|
```python
|
|
await db_patch('players', object_id=player_id, params=[('name', clean_name)]) # ✅
|
|
```
|
|
|
|
**Key Point**: The API parameter is `name`, NOT `p_name`. The database field may be `p_name`, but the API expects `name`.
|
|
|
|
**Example PATCH URL**: `/api/v2/players/:player_id?name=Luis Garcia Jr`
|
|
|
|
---
|
|
|
|
### 2. Card Generation Caching
|
|
|
|
**Problem**: Cards are cached by the API. Using the same `?d=` parameter returns cached cards even after database changes.
|
|
|
|
**Solution**: Always use a timestamp for cache-busting when regenerating cards:
|
|
```python
|
|
import time
|
|
timestamp = int(time.time())
|
|
release_date = f'2025-11-25-{timestamp}'
|
|
card_url = f'{API_URL}/players/{id}/battingcard?d={release_date}'
|
|
```
|
|
|
|
**Key Point**: Static dates (like `2025-11-24`) will return cached cards. Use timestamps to force fresh generation.
|
|
|
|
---
|
|
|
|
### 3. S3 Keys Must Not Include Query Parameters
|
|
|
|
**WRONG**: Including query parameter in S3 key
|
|
```python
|
|
s3_key = f'cards/cardset-027/player-{id}/battingcard.png?d={date}' # ❌
|
|
# This creates a file literally named "battingcard.png?d=2025-11-24"
|
|
```
|
|
|
|
**CORRECT**: Separate key from query parameter
|
|
```python
|
|
s3_key = f'cards/cardset-027/player-{id}/battingcard.png' # ✅
|
|
s3_url = f'{S3_BASE_URL}/{s3_key}?d={date}' # Query param in URL, not key
|
|
```
|
|
|
|
**Key Point**: S3 object keys should be clean paths. Query parameters are for URLs only.
|
|
|
|
---
|
|
|
|
### 4. Name Suffix Code Should Never Be in Production
|
|
|
|
**The Bug**: Code was appending asterisks to left-handed players
|
|
```python
|
|
# This was in new_player_payload() - retrosheet_data.py lines 1103-1108
|
|
name_suffix = ''
|
|
if row.get('bat_hand') == 'L':
|
|
name_suffix = '*'
|
|
elif row.get('bat_hand') == 'S':
|
|
name_suffix = '#'
|
|
|
|
'p_name': f'{row["use_name"]} {row["last_name"]}{name_suffix}' # ❌
|
|
```
|
|
|
|
**Why It Existed**: Likely added for visual identification during development/testing.
|
|
|
|
**Why It's Wrong**:
|
|
- Stores corrupted data in production database
|
|
- Card images display asterisks
|
|
- Breaks searching/filtering by name
|
|
|
|
**Prevention**:
|
|
- Never append visual indicators to stored data
|
|
- Use separate display fields if needed
|
|
- Always review diffs before committing
|
|
|
|
---
|
|
|
|
### 5. Workflow Order Matters
|
|
|
|
**WRONG ORDER**:
|
|
1. Generate cards (with asterisks)
|
|
2. Upload to S3 (with asterisks)
|
|
3. Fix names in database
|
|
4. Try to re-upload (but get cached cards)
|
|
|
|
**CORRECT ORDER**:
|
|
1. Fix data issues in database FIRST
|
|
2. Verify fixes with GET requests
|
|
3. Use cache-busting parameters
|
|
4. Fetch fresh cards
|
|
5. Upload to S3
|
|
6. Verify uploaded images
|
|
|
|
**Key Point**: Always verify database changes before triggering card generation/upload.
|
|
|
|
---
|
|
|
|
### 6. Card Name Source
|
|
|
|
**Fact**: Player names are ONLY stored in the `players` table.
|
|
|
|
**When hitting** `/api/v2/players/{id}/battingcard?d={date}`:
|
|
- API pulls name from `players.p_name` field in real-time
|
|
- `battingcards` and `pitchingcards` tables DO NOT store names
|
|
- Card generation is live, not pre-rendered
|
|
|
|
**Key Point**: To fix card names, only update the `players` table. No need to update card tables.
|
|
|
|
---
|
|
|
|
## Prevention Checklist
|
|
|
|
Before any card regeneration/upload:
|
|
|
|
- [ ] Verify player names in database (no asterisks, hashes, or special chars)
|
|
- [ ] Use timestamp-based cache-busting for fresh card generation
|
|
- [ ] Confirm S3 keys don't include query parameters
|
|
- [ ] Test with ONE card before batch processing
|
|
- [ ] Verify uploaded S3 image is correct (spot check)
|
|
|
|
---
|
|
|
|
## Quick Reference
|
|
|
|
### API Parameter Names
|
|
- **Player name**: `name` (not `p_name`)
|
|
- **Player image**: `image`
|
|
- **Player positions**: `pos_1`, `pos_2`, etc.
|
|
|
|
### Cache-Busting Pattern
|
|
```python
|
|
import time
|
|
timestamp = int(time.time())
|
|
url = f'{API_URL}/players/{id}/battingcard?d=2025-11-25-{timestamp}'
|
|
```
|
|
|
|
### S3 Upload Pattern
|
|
```python
|
|
s3_key = f'cards/cardset-{cardset:03d}/player-{id}/battingcard.png'
|
|
s3_client.put_object(Bucket=bucket, Key=s3_key, Body=image_bytes)
|
|
s3_url = f'{S3_BASE_URL}/{s3_key}?d={cache_bust_param}'
|
|
```
|
|
|
|
---
|
|
|
|
## Files to Review for Similar Issues
|
|
|
|
1. **retrosheet_data.py**: Check for name suffix code
|
|
2. **live_series_update.py**: Check for name suffix code
|
|
3. **check_cards_and_upload.py**: Verify S3 key handling
|
|
4. **Any script that does db_patch('players', ...)**: Verify parameter names
|
|
|
|
---
|
|
|
|
## Impact Summary
|
|
|
|
**Issue Duration**: One card generation cycle
|
|
**Players Affected**: 20 left-handed batters in 2005 Live cardset
|
|
**Data Corrupted**: Player names had asterisks
|
|
**Cards Affected**: 20 cards on S3 with asterisks
|
|
**Resolution Time**: ~1 hour (including troubleshooting)
|
|
|
|
**Root Cause**: Development code (name suffix) left in production
|
|
**Fix Complexity**: Simple code removal + database patches
|
|
**Prevention**: Code review + testing before deployment
|