paper-dynasty-card-creation/docs/LESSONS_LEARNED_ASTERISK_REGRESSION.md
Cal Corum cc5f93eb66 Fix critical asterisk regression in player names
CRITICAL BUG FIX: Removed code that was appending asterisks to left-handed
players' names and hash symbols to switch hitters' names in production.

## Changes

### Core Fix (retrosheet_data.py)
- Removed name_suffix code from new_player_payload() (lines 1103-1108)
- Player names are now stored cleanly without visual indicators
- Affected 20 left-handed batters in 2005 Live cardset

### New Utility Scripts
- fix_player_names.py: PATCH player names to remove symbols (uses 'name' param)
- check_player_names.py: Verify all players for asterisks/hashes
- regenerate_lefty_cards.py: Update image URLs with cache-busting dates
- upload_lefty_cards_to_s3.py: Fetch fresh cards and upload to S3

### Documentation (CRITICAL - READ BEFORE WORKING WITH CARDS)
- docs/LESSONS_LEARNED_ASTERISK_REGRESSION.md: Comprehensive guide
  * API parameter is 'name' NOT 'p_name'
  * Card generation caching requires timestamp cache-busting
  * S3 keys must not include query parameters
  * Player names only in 'players' table
  * Never append visual indicators to stored data

- CLAUDE.md: Added critical warnings section at top

## Key Learnings
1. API param for player name is 'name', not 'p_name'
2. Cards are cached - use timestamp in ?d= parameter
3. S3 keys != S3 URLs (no query params in keys)
4. Fix data BEFORE generating/uploading cards
5. Visual indicators belong in UI, not database

## Impact
- Fixed 20 player records in production
- Regenerated and uploaded 20 clean cards to S3
- Documented to prevent future regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 14:38:04 -06:00


Lessons Learned: Asterisk Regression & Card Upload Issues

Date: 2025-11-24
Issue: Left-handed players had asterisks appended to their names in production


Critical Learnings

1. API Parameter Names vs Database Field Names

WRONG: Using database field name for API calls

await db_patch('players', object_id=player_id, params=[('p_name', clean_name)])  # ❌

CORRECT: Use API parameter name

await db_patch('players', object_id=player_id, params=[('name', clean_name)])  # ✅

Key Point: The API parameter is name, NOT p_name. The database field may be p_name, but the API expects name.

Example PATCH URL: /api/v2/players/:player_id?name=Luis Garcia Jr
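The example URL above contains literal spaces, which an HTTP client would normally percent-encode. As a minimal sketch, assuming the query string is assembled by hand, the encoding can be made explicit. `build_patch_url` is a hypothetical helper for illustration only; it is not part of the project's `db_patch`, which may already handle encoding itself.

```python
from urllib.parse import quote, urlencode


def build_patch_url(player_id: int, clean_name: str) -> str:
    """Build the PATCH path for a player name (hypothetical helper)."""
    # The API parameter is 'name'; 'p_name' is only the database column.
    query = urlencode({'name': clean_name}, quote_via=quote)
    return f'/api/v2/players/{player_id}?{query}'


print(build_patch_url(123, 'Luis Garcia Jr'))
# /api/v2/players/123?name=Luis%20Garcia%20Jr
```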


2. Card Generation Caching

Problem: Cards are cached by the API. Using the same ?d= parameter returns cached cards even after database changes.

Solution: Always use a timestamp for cache-busting when regenerating cards:

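import time
# API_URL and player_id are assumed to be defined by the calling script.
# Avoid the bare name `id`: it is a Python builtin and would format as a function repr.
timestamp = int(time.time())
release_date = f'2025-11-25-{timestamp}'
card_url = f'{API_URL}/players/{player_id}/battingcard?d={release_date}'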

Key Point: Static dates (like 2025-11-24) will return cached cards. Use timestamps to force fresh generation.
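The cache-busting idea can be wrapped in a small function so every regeneration script builds the URL the same way. This is a sketch under assumptions: `build_card_url` and its `now` argument are hypothetical names introduced here, and the base URL in the usage line is a placeholder, not the real API host.

```python
import time
from typing import Optional


def build_card_url(api_url: str, player_id: int, base_date: str,
                   now: Optional[float] = None) -> str:
    """Return a batting card URL whose ?d= value changes every second."""
    # `now` exists only so the function can be tested deterministically.
    timestamp = int(time.time() if now is None else now)
    release_date = f'{base_date}-{timestamp}'
    return f'{api_url}/players/{player_id}/battingcard?d={release_date}'


print(build_card_url('https://example.invalid/api/v2', 42, '2025-11-25',
                     now=1700000000))
# https://example.invalid/api/v2/players/42/battingcard?d=2025-11-25-1700000000
```

Because the timestamp has one-second resolution, two calls for the same player within the same second produce the same URL. That should be harmless for a single batch run, but it is worth keeping in mind when retrying quickly.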


3. S3 Keys Must Not Include Query Parameters

WRONG: Including query parameter in S3 key

s3_key = f'cards/cardset-027/player-{player_id}/battingcard.png?d={date}'  # ❌
# This creates an object literally named "battingcard.png?d=2025-11-24"

CORRECT: Separate key from query parameter

s3_key = f'cards/cardset-027/player-{player_id}/battingcard.png'  # ✅
s3_url = f'{S3_BASE_URL}/{s3_key}?d={date}'  # Query param in URL, not key

Key Point: S3 object keys should be clean paths. Query parameters are for URLs only.
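One way to make this mistake harder to repeat is to build the key and the URL in separate functions and reject any key that contains a query string. The helper names below are hypothetical, and the base URL in the usage lines is a placeholder; only the key layout is taken from this document.

```python
def build_s3_key(cardset_id: int, player_id: int) -> str:
    """Return the object key: a clean path with no query string."""
    return f'cards/cardset-{cardset_id:03d}/player-{player_id}/battingcard.png'


def build_s3_url(base_url: str, s3_key: str, cache_bust: str) -> str:
    """Return the public URL; the cache-busting value lives only here."""
    if '?' in s3_key:
        raise ValueError(f'S3 key must not contain a query string: {s3_key!r}')
    return f'{base_url}/{s3_key}?d={cache_bust}'


key = build_s3_key(27, 7)
print(key)
# cards/cardset-027/player-7/battingcard.png
print(build_s3_url('https://bucket.example.invalid', key, '2025-11-24'))
# https://bucket.example.invalid/cards/cardset-027/player-7/battingcard.png?d=2025-11-24
```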


4. Name Suffix Code Should Never Be in Production

The Bug: Code was appending asterisks to left-handed batters' names and hash symbols to switch hitters' names

# This was in new_player_payload() - retrosheet_data.py lines 1103-1108
name_suffix = ''
if row.get('bat_hand') == 'L':
    name_suffix = '*'
elif row.get('bat_hand') == 'S':
    name_suffix = '#'

'p_name': f'{row["use_name"]} {row["last_name"]}{name_suffix}'  # ❌

Why It Existed: Likely added for visual identification during development/testing.

Why It's Wrong:

  • Stores corrupted data in production database
  • Card images display asterisks
  • Breaks searching/filtering by name

Prevention:

  • Never append visual indicators to stored data
  • Use separate display fields if needed
  • Always review diffs before committing
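If a handedness marker is ever wanted for display, it can be derived at render time instead of being written into the stored name. The following is a minimal sketch, not the project's implementation: `stored_name`, `display_name`, and `HAND_MARKERS` are hypothetical names, and only the `use_name`, `last_name`, and `bat_hand` keys come from the removed code shown above.

```python
# Markers mirror the removed code: '*' for left-handed, '#' for switch hitters.
HAND_MARKERS = {'L': '*', 'S': '#'}


def stored_name(row: dict) -> str:
    """Name as it should be written to the database: no markers."""
    return f'{row["use_name"]} {row["last_name"]}'


def display_name(name: str, bat_hand: str) -> str:
    """Name decorated for on-screen display only; never persisted."""
    return name + HAND_MARKERS.get(bat_hand, '')


row = {'use_name': 'Luis', 'last_name': 'Garcia Jr', 'bat_hand': 'L'}
print(stored_name(row))
# Luis Garcia Jr
print(display_name(stored_name(row), row['bat_hand']))
# Luis Garcia Jr*
```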

5. Workflow Order Matters

WRONG ORDER:

  1. Generate cards (with asterisks)
  2. Upload to S3 (with asterisks)
  3. Fix names in database
  4. Try to re-upload (but get cached cards)

CORRECT ORDER:

  1. Fix data issues in database FIRST
  2. Verify fixes with GET requests
  3. Use cache-busting parameters
  4. Fetch fresh cards
  5. Upload to S3
  6. Verify uploaded images

Key Point: Always verify database changes before triggering card generation/upload.
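The correct order can be enforced in code by refusing to fetch or upload until the database read confirms the fix. This sketch takes the four operations as plain callables so it stays independent of the real API and S3 helpers; `fix_then_upload` and all of its parameter names are assumptions made for illustration.

```python
from typing import Callable


def fix_then_upload(
    player_id: int,
    clean_name: str,
    patch_name: Callable[[int, str], None],
    get_name: Callable[[int], str],
    fetch_fresh_card: Callable[[int], bytes],
    upload_card: Callable[[int, bytes], None],
) -> None:
    """Patch, verify, fetch, then upload, stopping if verification fails."""
    # Step 1: fix the data first.
    patch_name(player_id, clean_name)
    # Step 2: verify the fix with a read before generating anything.
    stored = get_name(player_id)
    if stored != clean_name:
        raise RuntimeError(
            f'player {player_id}: expected {clean_name!r}, got {stored!r}'
        )
    # Steps 3-4: the callable is expected to use a cache-busting URL.
    image_bytes = fetch_fresh_card(player_id)
    # Step 5: upload only once the name is known to be clean.
    upload_card(player_id, image_bytes)
```

Step 6, checking the uploaded image, is left to the caller because it usually means looking at the rendered card.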


6. Card Name Source

Fact: Player names are ONLY stored in the players table.

When hitting /api/v2/players/{id}/battingcard?d={date}:

  • API pulls name from players.p_name field in real-time
  • battingcards and pitchingcards tables DO NOT store names
  • Card generation is live, not pre-rendered

Key Point: To fix card names, only update the players table. No need to update card tables.


Prevention Checklist

Before any card regeneration/upload:

  • Verify player names in database (no asterisks, hashes, or special chars)
  • Use timestamp-based cache-busting for fresh card generation
  • Confirm S3 keys don't include query parameters
  • Test with ONE card before batch processing
  • Verify uploaded S3 image is correct (spot check)
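The first checklist item can be automated with a small scan over player names. This is a sketch only: it assumes names have already been fetched as `(player_id, name)` pairs, it checks just the two symbols involved in this incident, and both function names are hypothetical rather than taken from check_player_names.py.

```python
from typing import Iterable, List, Tuple

SUSPECT_CHARS = '*#'


def find_dirty_names(players: Iterable[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Return the (player_id, name) pairs whose name contains '*' or '#'."""
    return [
        (player_id, name)
        for player_id, name in players
        if any(ch in name for ch in SUSPECT_CHARS)
    ]


def strip_suffix(name: str) -> str:
    """Remove trailing '*' or '#' markers and any trailing whitespace."""
    return name.rstrip(SUSPECT_CHARS).rstrip()


players = [(1, 'Luis Garcia Jr*'), (2, 'Jane Doe'), (3, 'Sam Roe#')]
print(find_dirty_names(players))
# [(1, 'Luis Garcia Jr*'), (3, 'Sam Roe#')]
print(strip_suffix('Luis Garcia Jr*'))
# Luis Garcia Jr
```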

Quick Reference

API Parameter Names

  • Player name: name (not p_name)
  • Player image: image
  • Player positions: pos_1, pos_2, etc.

Cache-Busting Pattern

import time
# API_URL and player_id are assumed to be defined by the calling script.
timestamp = int(time.time())
url = f'{API_URL}/players/{player_id}/battingcard?d=2025-11-25-{timestamp}'

S3 Upload Pattern

# Key is a clean path; the cache-busting value goes only in the URL.
s3_key = f'cards/cardset-{cardset:03d}/player-{player_id}/battingcard.png'
s3_client.put_object(Bucket=bucket, Key=s3_key, Body=image_bytes)
s3_url = f'{S3_BASE_URL}/{s3_key}?d={cache_bust_param}'

Files to Review for Similar Issues

  1. retrosheet_data.py: Check for name suffix code
  2. live_series_update.py: Check for name suffix code
  3. check_cards_and_upload.py: Verify S3 key handling
  4. Any script that does db_patch('players', ...): Verify parameter names

Impact Summary

Issue Duration: One card generation cycle
Players Affected: 20 left-handed batters in 2005 Live cardset
Data Corrupted: Player names had asterisks
Cards Affected: 20 cards on S3 with asterisks
Resolution Time: ~1 hour (including troubleshooting)

Root Cause: Development code (name suffix) left in production
Fix Complexity: Simple code removal + database patches
Prevention: Code review + testing before deployment