CRITICAL BUG FIX: Removed code that was appending asterisks to left-handed players' names and hash symbols to switch hitters' names in production. ## Changes ### Core Fix (retrosheet_data.py) - Removed name_suffix code from new_player_payload() (lines 1103-1108) - Players names now stored cleanly without visual indicators - Affected 20 left-handed batters in 2005 Live cardset ### New Utility Scripts - fix_player_names.py: PATCH player names to remove symbols (uses 'name' param) - check_player_names.py: Verify all players for asterisks/hashes - regenerate_lefty_cards.py: Update image URLs with cache-busting dates - upload_lefty_cards_to_s3.py: Fetch fresh cards and upload to S3 ### Documentation (CRITICAL - READ BEFORE WORKING WITH CARDS) - docs/LESSONS_LEARNED_ASTERISK_REGRESSION.md: Comprehensive guide * API parameter is 'name' NOT 'p_name' * Card generation caching requires timestamp cache-busting * S3 keys must not include query parameters * Player names only in 'players' table * Never append visual indicators to stored data - CLAUDE.md: Added critical warnings section at top ## Key Learnings 1. API param for player name is 'name', not 'p_name' 2. Cards are cached - use timestamp in ?d= parameter 3. S3 keys != S3 URLs (no query params in keys) 4. Fix data BEFORE generating/uploading cards 5. Visual indicators belong in UI, not database ## Impact - Fixed 20 player records in production - Regenerated and uploaded 20 clean cards to S3 - Documented to prevent future regressions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
5.0 KiB
Lessons Learned: Asterisk Regression & Card Upload Issues
Date: 2025-11-24 Issue: Left-handed players had asterisks appended to their names in production
Critical Learnings
1. API Parameter Names vs Database Field Names
WRONG: Using database field name for API calls
await db_patch('players', object_id=player_id, params=[('p_name', clean_name)]) # ❌
CORRECT: Use API parameter name
await db_patch('players', object_id=player_id, params=[('name', clean_name)]) # ✅
Key Point: The API parameter is name, NOT p_name. The database field may be p_name, but the API expects name.
Example PATCH URL: /api/v2/players/:player_id?name=Luis Garcia Jr
2. Card Generation Caching
Problem: Cards are cached by the API. Using the same ?d= parameter returns cached cards even after database changes.
Solution: Always use a timestamp for cache-busting when regenerating cards:
import time
timestamp = int(time.time())
release_date = f'2025-11-25-{timestamp}'
card_url = f'{API_URL}/players/{id}/battingcard?d={release_date}'
Key Point: Static dates (like 2025-11-24) will return cached cards. Use timestamps to force fresh generation.
3. S3 Keys Must Not Include Query Parameters
WRONG: Including query parameter in S3 key
s3_key = f'cards/cardset-027/player-{id}/battingcard.png?d={date}' # ❌
# This creates a file literally named "battingcard.png?d=2025-11-24"
CORRECT: Separate key from query parameter
s3_key = f'cards/cardset-027/player-{id}/battingcard.png' # ✅
s3_url = f'{S3_BASE_URL}/{s3_key}?d={date}' # Query param in URL, not key
Key Point: S3 object keys should be clean paths. Query parameters are for URLs only.
4. Name Suffix Code Should Never Be in Production
The Bug: Code was appending asterisks to left-handed players
# This was in new_player_payload() - retrosheet_data.py lines 1103-1108
name_suffix = ''
if row.get('bat_hand') == 'L':
name_suffix = '*'
elif row.get('bat_hand') == 'S':
name_suffix = '#'
'p_name': f'{row["use_name"]} {row["last_name"]}{name_suffix}' # ❌
Why It Existed: Likely added for visual identification during development/testing.
Why It's Wrong:
- Stores corrupted data in production database
- Card images display asterisks
- Breaks searching/filtering by name
Prevention:
- Never append visual indicators to stored data
- Use separate display fields if needed
- Always review diffs before committing
5. Workflow Order Matters
WRONG ORDER:
- Generate cards (with asterisks)
- Upload to S3 (with asterisks)
- Fix names in database
- Try to re-upload (but get cached cards)
CORRECT ORDER:
- Fix data issues in database FIRST
- Verify fixes with GET requests
- Use cache-busting parameters
- Fetch fresh cards
- Upload to S3
- Verify uploaded images
Key Point: Always verify database changes before triggering card generation/upload.
6. Card Name Source
Fact: Player names are ONLY stored in the players table.
When hitting /api/v2/players/{id}/battingcard?d={date}:
- API pulls name from
players.p_namefield in real-time battingcardsandpitchingcardstables DO NOT store names- Card generation is live, not pre-rendered
Key Point: To fix card names, only update the players table. No need to update card tables.
Prevention Checklist
Before any card regeneration/upload:
- Verify player names in database (no asterisks, hashes, or special chars)
- Use timestamp-based cache-busting for fresh card generation
- Confirm S3 keys don't include query parameters
- Test with ONE card before batch processing
- Verify uploaded S3 image is correct (spot check)
Quick Reference
API Parameter Names
- Player name:
name(notp_name) - Player image:
image - Player positions:
pos_1,pos_2, etc.
Cache-Busting Pattern
import time
timestamp = int(time.time())
url = f'{API_URL}/players/{id}/battingcard?d=2025-11-25-{timestamp}'
S3 Upload Pattern
s3_key = f'cards/cardset-{cardset:03d}/player-{id}/battingcard.png'
s3_client.put_object(Bucket=bucket, Key=s3_key, Body=image_bytes)
s3_url = f'{S3_BASE_URL}/{s3_key}?d={cache_bust_param}'
Files to Review for Similar Issues
- retrosheet_data.py: Check for name suffix code
- live_series_update.py: Check for name suffix code
- check_cards_and_upload.py: Verify S3 key handling
- Any script that does db_patch('players', ...): Verify parameter names
Impact Summary
Issue Duration: One card generation cycle Players Affected: 20 left-handed batters in 2005 Live cardset Data Corrupted: Player names had asterisks Cards Affected: 20 cards on S3 with asterisks Resolution Time: ~1 hour (including troubleshooting)
Root Cause: Development code (name suffix) left in production Fix Complexity: Simple code removal + database patches Prevention: Code review + testing before deployment