# Lessons Learned: Asterisk Regression & Card Upload Issues **Date**: 2025-11-24 **Issue**: Left-handed players had asterisks appended to their names in production --- ## Critical Learnings ### 1. API Parameter Names vs Database Field Names **WRONG**: Using database field name for API calls ```python await db_patch('players', object_id=player_id, params=[('p_name', clean_name)]) # ❌ ``` **CORRECT**: Use API parameter name ```python await db_patch('players', object_id=player_id, params=[('name', clean_name)]) # ✅ ``` **Key Point**: The API parameter is `name`, NOT `p_name`. The database field may be `p_name`, but the API expects `name`. **Example PATCH URL**: `/api/v2/players/:player_id?name=Luis Garcia Jr` --- ### 2. Card Generation Caching **Problem**: Cards are cached by the API. Using the same `?d=` parameter returns cached cards even after database changes. **Solution**: Always use a timestamp for cache-busting when regenerating cards: ```python import time timestamp = int(time.time()) release_date = f'2025-11-25-{timestamp}' card_url = f'{API_URL}/players/{id}/battingcard?d={release_date}' ``` **Key Point**: Static dates (like `2025-11-24`) will return cached cards. Use timestamps to force fresh generation. --- ### 3. S3 Keys Must Not Include Query Parameters **WRONG**: Including query parameter in S3 key ```python s3_key = f'cards/cardset-027/player-{id}/battingcard.png?d={date}' # ❌ # This creates a file literally named "battingcard.png?d=2025-11-24" ``` **CORRECT**: Separate key from query parameter ```python s3_key = f'cards/cardset-027/player-{id}/battingcard.png' # ✅ s3_url = f'{S3_BASE_URL}/{s3_key}?d={date}' # Query param in URL, not key ``` **Key Point**: S3 object keys should be clean paths. Query parameters are for URLs only. --- ### 4. Name Suffix Code Should Never Be in Production **The Bug**: Code was appending asterisks to left-handed players ```python # This was in new_player_payload() - retrosheet_data.py lines 1103-1108 name_suffix = '' if row.get('bat_hand') == 'L': name_suffix = '*' elif row.get('bat_hand') == 'S': name_suffix = '#' 'p_name': f'{row["use_name"]} {row["last_name"]}{name_suffix}' # ❌ ``` **Why It Existed**: Likely added for visual identification during development/testing. **Why It's Wrong**: - Stores corrupted data in production database - Card images display asterisks - Breaks searching/filtering by name **Prevention**: - Never append visual indicators to stored data - Use separate display fields if needed - Always review diffs before committing --- ### 5. Workflow Order Matters **WRONG ORDER**: 1. Generate cards (with asterisks) 2. Upload to S3 (with asterisks) 3. Fix names in database 4. Try to re-upload (but get cached cards) **CORRECT ORDER**: 1. Fix data issues in database FIRST 2. Verify fixes with GET requests 3. Use cache-busting parameters 4. Fetch fresh cards 5. Upload to S3 6. Verify uploaded images **Key Point**: Always verify database changes before triggering card generation/upload. --- ### 6. Card Name Source **Fact**: Player names are ONLY stored in the `players` table. **When hitting** `/api/v2/players/{id}/battingcard?d={date}`: - API pulls name from `players.p_name` field in real-time - `battingcards` and `pitchingcards` tables DO NOT store names - Card generation is live, not pre-rendered **Key Point**: To fix card names, only update the `players` table. No need to update card tables. --- ## Prevention Checklist Before any card regeneration/upload: - [ ] Verify player names in database (no asterisks, hashes, or special chars) - [ ] Use timestamp-based cache-busting for fresh card generation - [ ] Confirm S3 keys don't include query parameters - [ ] Test with ONE card before batch processing - [ ] Verify uploaded S3 image is correct (spot check) --- ## Quick Reference ### API Parameter Names - **Player name**: `name` (not `p_name`) - **Player image**: `image` - **Player positions**: `pos_1`, `pos_2`, etc. ### Cache-Busting Pattern ```python import time timestamp = int(time.time()) url = f'{API_URL}/players/{id}/battingcard?d=2025-11-25-{timestamp}' ``` ### S3 Upload Pattern ```python s3_key = f'cards/cardset-{cardset:03d}/player-{id}/battingcard.png' s3_client.put_object(Bucket=bucket, Key=s3_key, Body=image_bytes) s3_url = f'{S3_BASE_URL}/{s3_key}?d={cache_bust_param}' ``` --- ## Files to Review for Similar Issues 1. **retrosheet_data.py**: Check for name suffix code 2. **live_series_update.py**: Check for name suffix code 3. **check_cards_and_upload.py**: Verify S3 key handling 4. **Any script that does db_patch('players', ...)**: Verify parameter names --- ## Impact Summary **Issue Duration**: One card generation cycle **Players Affected**: 20 left-handed batters in 2005 Live cardset **Data Corrupted**: Player names had asterisks **Cards Affected**: 20 cards on S3 with asterisks **Resolution Time**: ~1 hour (including troubleshooting) **Root Cause**: Development code (name suffix) left in production **Fix Complexity**: Simple code removal + database patches **Prevention**: Code review + testing before deployment