claude-configs/skills/paper-dynasty/workflows/card-generation.md
Cal Corum 197848749d Add live series workflow and PotM documentation for Paper Dynasty
- New: live-series-update.md workflow (FanGraphs data sourcing, PotM variant)
- Updated: card-generation.md with retrosheet PotM variant section
- Updated: SKILL.md with live series workflow references and load table
- Updated: CLAUDE.md, claude-pulse submodule

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 20:24:30 -06:00


Card Generation Workflow

Pre-Flight

Ask the user before starting:

  1. Refresh or new date range? (refresh keeps existing config)
  2. Which environment? (prod or dev)
  3. Which cardset? (e.g., 27 for "2005 Live")

All commands run from /mnt/NV2/Development/paper-dynasty/card-creation/.

Steps

# 1. Verify config (dry-run shows settings without executing; pass the same options you will use in step 2)
pd-cards retrosheet process <year> -c <cardset_id> -d <description> --end <YYYYMMDD> --dry-run

# 2. Generate cards (POSTs player data to API)
pd-cards retrosheet process <year> -c <cardset_id> -d <description> --end <YYYYMMDD>

# 3. Validate positions (DH count MUST be <5; high DH = defense calc failure)
pd-cards retrosheet validate <cardset_id>

# 4. Generate images WITHOUT upload (triggers rendering; groundball_b bug can occur here)
pd-cards upload check -c "<cardset name>"

# 5. CRITICAL: Validate database for negative groundball_b — STOP if errors found
#    (see "Bug Prevention" section below)

# 6. Upload to S3 (fast — uses cached images from step 4)
pd-cards upload s3 -c "<cardset name>"

# 7. Generate scouting reports (once promo cardsets exist, omit -c so every cardset is covered;
#    see "PotM-Specific Gotchas")
pd-cards scouting all -c <cardset_id>

# 8. Upload scouting CSVs to production server
scp scouting/*.csv sba-db:container-data/pd-database/storage/

Verify scouting upload: ssh sba-db "ls -lh container-data/pd-database/storage/ | grep -E 'batting|pitching'"
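When scripting steps 1 and 2, it can help to build the argument list in one place so the dry run and the real run differ only by --dry-run. The following is a minimal Python sketch, not part of pd-cards: build_process_cmd is a hypothetical helper, and it uses only the flags shown in this document.

```python
from typing import List, Optional


def build_process_cmd(year: int, cardset_id: int, description: str,
                      start: Optional[str] = None, end: Optional[str] = None,
                      dry_run: bool = False) -> List[str]:
    """Hypothetical helper: assemble a `pd-cards retrosheet process` argument list."""
    cmd = ["pd-cards", "retrosheet", "process", str(year),
           "-c", str(cardset_id), "-d", description]
    if start:
        cmd += ["--start", start]
    if end:
        cmd += ["--end", end]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Same settings for both runs; only the trailing flag differs
dry = build_process_cmd(2005, 27, "Live", end="20050531", dry_run=True)
real = build_process_cmd(2005, 27, "Live", end="20050531")
print(" ".join(dry))
print(" ".join(real))
```

Either list can be passed to subprocess.run(cmd, check=True) from the card-creation directory.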


Bug Prevention: The Double-Run Pattern

Card image generation (step 4) can create negative groundball_b values that crash game simulation. The prevention strategy:

  1. Step 4: Run upload check (no S3 upload) — triggers image rendering and caches images
  2. Step 5: Query database for negative groundball_b — STOP if any found
  3. Step 6: Run upload s3, which uploads the images cached in step 4 and validated in step 5. This is fast because it reuses the cached images.

Never skip step 5. Broken cards uploaded to S3 affect all players immediately.
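If the three steps are ever automated, the gate can be made explicit so the upload cannot run after a failed validation. This is a sketch only: gated_upload and both callables are hypothetical stand-ins for the real step 5 query and step 6 command.

```python
from typing import Callable, List


def gated_upload(validate: Callable[[], List[str]],
                 upload: Callable[[], None]) -> bool:
    """Run `upload` only when `validate` returns no errors (both are placeholders)."""
    errors = validate()
    if errors:
        for err in errors:
            print(err)
        print("Validation failed; upload skipped")
        return False
    upload()
    return True


# Stand-ins for the step 5 validation and the step 6 upload
uploaded = []
ok = gated_upload(lambda: [], lambda: uploaded.append("s3"))
blocked = gated_upload(lambda: ["Player 123: groundball_b = -2"],
                       lambda: uploaded.append("s3"))
```

Here the second call never reaches the upload callable, which is the behavior step 5 is meant to guarantee.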

Step 5 Validation Query

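import asyncio

from db_calls import db_get

CARDSET_ID = 27  # set to the cardset being validated (use the promo cardset ID for PotM)
RANGE_FIELDS = ['gb_b', 'fb_b', 'ld_b']  # each must fall within 0-100


async def check_cards():
    cards = await db_get('battingcards', params=[('cardset', CARDSET_ID)])
    if not cards:
        print('No cards returned: check the cardset ID and database environment')
        return

    errors = []
    for card in cards:
        player = card.get('player_id')
        # Negative groundball_b crashes game simulation; treat missing/None as 0
        groundball_b = card.get('groundball_b') or 0
        if groundball_b < 0:
            errors.append(f"Player {player}: groundball_b = {groundball_b}")
        for field in RANGE_FIELDS:
            val = card.get(field) or 0
            if val < 0 or val > 100:
                errors.append(f"Player {player}: {field} = {val}")

    if errors:
        print('\n'.join(errors))
        print('\nDO NOT PROCEED: fix data and re-run step 2')
    else:
        print('Validation passed')


asyncio.run(check_cards())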

Architecture

  • retrosheet_data.py processes Retrosheet play-by-play data, calculates ratings, POSTs to API
  • API stores cards in production database; cards are rendered on-demand via URL
  • nginx caches rendered card images by date parameter (?d=YYYY-MM-DD)
  • All operations are idempotent and safe to re-run
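Because the cache is keyed on the d query parameter, a URL carrying a new date presumably maps to a different cache entry. The sketch below only shows how to set or replace that parameter on a URL. The host and path are placeholders (the real API URL is not given here), and with_date_param is a hypothetical helper.

```python
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_date_param(url: str, day: date) -> str:
    """Return `url` with its `d` query parameter set to `day` (YYYY-MM-DD)."""
    parts = urlsplit(url)
    # Drop any existing d= value, keep every other parameter
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "d"]
    query.append(("d", day.isoformat()))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute the real card endpoint
url = with_date_param("https://example.invalid/cards/123?d=2026-02-13",
                      date(2026, 2, 14))
print(url)  # https://example.invalid/cards/123?d=2026-02-14
```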

Data sources: Retrosheet events CSV, Baseball Reference defense CSVs (data-input/), FanGraphs splits (if needed)

Required input files:

  • data-input/retrosheet/retrosheets_events_*.csv
  • data-input/<cardset name>/defense_*.csv (defense_c.csv, defense_1b.csv, etc.)
  • data-input/<cardset name>/pitching.csv, running.csv
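A quick pre-flight check for these files can catch a missing input before step 2. This is a sketch under assumptions: missing_inputs is a hypothetical helper, and the cardset directory name ("2005 Live" below) is only an example.

```python
from pathlib import Path
from typing import List


def missing_inputs(root: Path, cardset_name: str) -> List[str]:
    """List the required input patterns that match no file under `root`."""
    cardset_dir = root / cardset_name
    required = [
        (root / "retrosheet", "retrosheets_events_*.csv"),
        (cardset_dir, "defense_*.csv"),
        (cardset_dir, "pitching.csv"),
        (cardset_dir, "running.csv"),
    ]
    missing = []
    for folder, pattern in required:
        # A missing folder counts the same as a folder with no matching file
        if not (folder.is_dir() and any(folder.glob(pattern))):
            missing.append(str(folder / pattern))
    return missing


for path in missing_inputs(Path("data-input"), "2005 Live"):
    print(f"Missing: {path}")
```

The defense check only confirms that at least one defense_*.csv exists; it does not verify that every position file is present.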

Scouting output: 4 CSVs in scouting/ (batting-basic.csv, batting-ratings.csv, pitching-basic.csv, pitching-ratings.csv)


Common Issues

"No players found" after successful run: Wrong database environment, wrong CARDSET_ID, or DATE mismatch. Check alt_database in db_calls.py. For promos, ensure PROMO_INCLUSION_RETRO_IDS is populated.

High DH count (50+ players): Defense calculation failed. Check that the defense CSVs exist and that their column names match what the script expects (tz_runs_total, not tz_runs_outfield). Re-run step 2 after fixing.
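The column-name check can be done without opening each file by hand. The sketch below reads only the header row of every defense_*.csv and reports files lacking tz_runs_total. defense_files_missing_column is a hypothetical helper, and the assumption that the first row is the header is mine.

```python
import csv
from pathlib import Path
from typing import List

REQUIRED_COLUMN = "tz_runs_total"  # column name cited in the issue above


def defense_files_missing_column(cardset_dir: Path) -> List[str]:
    """Return names of defense_*.csv files whose header lacks REQUIRED_COLUMN."""
    bad = []
    for csv_path in sorted(cardset_dir.glob("defense_*.csv")):
        # utf-8-sig strips a byte-order mark if the export added one
        with csv_path.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
        if REQUIRED_COLUMN not in header:
            bad.append(csv_path.name)
    return bad


print(defense_files_missing_column(Path("data-input") / "2005 Live"))
```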

S3 upload fails: Check ~/.aws/credentials, verify cards render at API URL manually, re-run (idempotent).

"surplus of X.XX chances" / "Adding X.XX results": Normal rounding adjustments in card generation — informational, not errors.


Players of the Month (PotM) Variant

PotM cards use the same retrosheet pipeline but with a narrower date range, a promo cardset, and a curated player list.

Key Differences from Full Cardset

| Setting | Full Cardset | PotM |
|---|---|---|
| --description | Live | <Month> PotM (e.g., April PotM) |
| --cardset-id | Live cardset (e.g., 27) | Promo cardset (e.g., 28) |
| --start / --end | Full season range | Single month (e.g., 20050401 - 20050430) |
| --min-pa-vl / --min-pa-vr | 20 / 40 (auto) | 1 / 1 (auto when description != "Live") |
| Player filtering | All qualifying players | Only PROMO_INCLUSION_RETRO_IDS |
| Position updates | Yes | Skipped (promo players keep existing positions) |
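The description-driven defaults above can be restated as a small lookup. This sketch only mirrors the table; it is not the actual retrosheet_data.py logic, and RunSettings and settings_for are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    min_pa_vl: int
    min_pa_vr: int
    promo_only: bool        # restrict to PROMO_INCLUSION_RETRO_IDS
    update_positions: bool


def settings_for(description: str) -> RunSettings:
    """Illustrative only: defaults keyed on whether the description is "Live"."""
    if description == "Live":
        return RunSettings(20, 40, promo_only=False, update_positions=True)
    return RunSettings(1, 1, promo_only=True, update_positions=False)


print(settings_for("Live"))
print(settings_for("April PotM"))
```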

PotM Pre-Flight Checklist

  1. Choose players — Typically 2 IF, 2 OF, 1 SP, 1 RP per league (AL/NL)
  2. Get Retro IDs — Look up each player's key_retro (e.g., rodra001 for A-Rod)
  3. Determine date range — First and last day of the month in YYYYMMDD format
  4. Confirm promo cardset ID — Usually a separate cardset from the live one
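The date range in item 3 can be computed rather than typed, which avoids off-by-one errors on month length. A minimal sketch using the standard library (month_range is a hypothetical helper name):

```python
import calendar
from typing import Tuple


def month_range(year: int, month: int) -> Tuple[str, str]:
    """Return (first_day, last_day) of the month as YYYYMMDD strings."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return f"{year:04d}{month:02d}01", f"{year:04d}{month:02d}{last_day:02d}"


start, end = month_range(2005, 5)
print(start, end)  # 20050501 20050531
```

The two values map directly onto --start and --end.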

PotM Steps

# 1. Dry-run to verify config
pd-cards retrosheet process <year> -c <promo_cardset_id> \
  -d "<Month> PotM" \
  --start <YYYYMMDD> --end <YYYYMMDD> \
  --dry-run

# 2. Generate promo cards
pd-cards retrosheet process <year> -c <promo_cardset_id> \
  -d "<Month> PotM" \
  --start <YYYYMMDD> --end <YYYYMMDD>

# 3. Validate (expect higher DH count — promo players may lack defense data for short windows)
pd-cards retrosheet validate <promo_cardset_id>

# 4-6. Image validation and S3 upload (same as full cardset)
pd-cards upload check -c "<promo cardset name>"
# Run groundball_b validation (step 5 from main workflow)
pd-cards upload s3 -c "<promo cardset name>"

# 7-8. Scouting reports — ALWAYS regenerate for ALL cardsets
pd-cards scouting all
pd-cards scouting upload

PotM-Specific Gotchas

  • PROMO_INCLUSION_RETRO_IDS must be populated — If description is not "Live", retrosheet_data.py filters to only these IDs. Empty list = 0 players generated.
  • Don't mix Live and PotM — If PROMO_INCLUSION_RETRO_IDS has entries but description is "Live", the script warns and exits.
  • Description protection — Once a player has a PotM description (e.g., "April PotM"), it is never overwritten by subsequent live series runs. Promo cardset descriptions are also protected: existing cards keep their original month.
  • Scouting must cover ALL cardsets — PotM players appear in scouting alongside live players. Always run pd-cards scouting all without --cardset-id to avoid overwriting the unified scouting data with partial results.
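The first two gotchas amount to a consistency check between the description and the promo ID list, which can be run before the dry-run. This is a sketch of that check as described here, not the script's actual code; check_promo_config is a hypothetical helper.

```python
from typing import List, Sequence


def check_promo_config(description: str, promo_ids: Sequence[str]) -> List[str]:
    """Return a list of configuration problems; an empty list means consistent."""
    problems = []
    if description == "Live" and promo_ids:
        problems.append(
            "PROMO_INCLUSION_RETRO_IDS has entries but description is 'Live'")
    if description != "Live" and not promo_ids:
        problems.append(
            "PROMO_INCLUSION_RETRO_IDS is empty; 0 players would be generated")
    return problems


print(check_promo_config("May PotM", ["rodra001"]))  # []
print(check_promo_config("May PotM", []))
```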

Example: May 2005 PotM

# Players: A-Rod (IF), Delgado (IF), Mench (OF), Abreu (OF), Colon (SP), Ryan (RP), Harang (SP), Hoffman (RP)
# Retro IDs configured in retrosheet_data.py PROMO_INCLUSION_RETRO_IDS

pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531 --dry-run
pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531
pd-cards retrosheet validate 28
pd-cards upload check -c "2005 Promos"
# Run groundball_b validation for cardset 28 (step 5 from main workflow); STOP if errors found
pd-cards upload s3 -c "2005 Promos"
pd-cards scouting all
pd-cards scouting upload

Last Updated: 2026-02-14
Version: 3.1 (Added Players of the Month variant)