Live Series Update Workflow
Used during the MLB regular season to generate cards from current-year FanGraphs split data and Baseball Reference fielding/running stats.
Pre-Flight
Ask the user before starting:
- Which cardset? (e.g., "2025 Season")
- How many games played? (determines season percentage for min PA thresholds)
- Which environment? (prod or dev; check alt_database in db_calls.py)
All commands run from /mnt/NV2/Development/paper-dynasty/card-creation/.
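The games-played answer is what drives the season percentage. A minimal sketch of that arithmetic, assuming a 162-game season (the upper bound of the --games range given later in this document) and assuming thresholds scale linearly with season percentage. Both function names are hypothetical, the linear scaling is an assumption, and the real calculation inside pd-cards may differ.

```python
FULL_SEASON_GAMES = 162  # upper bound of --games per this document


def season_pct(games_played: int) -> float:
    """Fraction of the regular season completed."""
    if not 1 <= games_played <= FULL_SEASON_GAMES:
        raise ValueError(f"games_played must be 1-{FULL_SEASON_GAMES}, got {games_played}")
    return games_played / FULL_SEASON_GAMES


def scaled_min_pa(full_season_min: int, games_played: int) -> int:
    """Hypothetical linear scaling of a full-season minimum PA threshold.

    ASSUMPTION: linear scaling; the real pd-cards rule is not shown in this document.
    """
    return max(1, round(full_season_min * season_pct(games_played)))
```

For example, at 81 games the season is half complete, so an illustrative full-season minimum of 100 PA would scale to 50 under this assumption.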
Data Sourcing
Live series uses FanGraphs splits for batting/pitching and Baseball Reference for defense/running.
FanGraphs Data (Manual Download)
FanGraphs split data is not fetched during card generation; download it beforehand, either with scripts/fangraphs_scrape.py or by hand from the FanGraphs web UI. The scraper uses Selenium to export 8 CSV files:
| File | Content |
|---|---|
| Batting_vLHP_Standard.csv | Batting vs LHP, standard stats |
| Batting_vLHP_BattedBalls.csv | Batting vs LHP, batted ball profile |
| Batting_vRHP_Standard.csv | Batting vs RHP, standard stats |
| Batting_vRHP_BattedBalls.csv | Batting vs RHP, batted ball profile |
| Pitching_vLHH_Standard.csv | Pitching vs LHH, standard stats |
| Pitching_vLHH_BattedBalls.csv | Pitching vs LHH, batted ball profile |
| Pitching_vRHH_Standard.csv | Pitching vs RHH, standard stats |
| Pitching_vRHH_BattedBalls.csv | Pitching vs RHH, batted ball profile |
These map to the expected input files in data-input/{cardset} Cardset/:
- vlhp-basic.csv / vlhp-rate.csv
- vrhp-basic.csv / vrhp-rate.csv
- vlhh-basic.csv / vlhh-rate.csv
- vrhh-basic.csv / vrhh-rate.csv
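A quick check that all eight expected input files are in place before running the update can be sketched as follows. This is not part of pd-cards; the function name and the default root path are assumptions, and the folder layout follows the data-input/{cardset} Cardset/ pattern described above.

```python
from pathlib import Path

# The eight input file names this workflow expects: four splits x two stat groups.
EXPECTED_INPUTS = [
    f"{split}-{kind}.csv"
    for split in ("vlhp", "vrhp", "vlhh", "vrhh")
    for kind in ("basic", "rate")
]


def missing_inputs(cardset: str, root: Path = Path("data-input")) -> list[str]:
    """Return the expected CSV names that are absent from the cardset folder.

    ASSUMPTION: `root` is relative to the card-creation directory.
    """
    folder = root / f"{cardset} Cardset"
    return [name for name in EXPECTED_INPUTS if not (folder / name).is_file()]
```

An empty list means every expected file is present; anything else names the files still to be downloaded or renamed.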
For PotM: Adjust the startDate and endDate in the scraper to cover only the target month.
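The month boundaries for a PotM run can be computed rather than typed by hand. A small sketch using only the standard library; the function name is hypothetical, and how fangraphs_scrape.py expects startDate and endDate to be formatted is not shown in this document, so the result is left as date objects.

```python
import calendar
from datetime import date


def month_bounds(year: int, month: int) -> tuple[date, date]:
    """Return the first and last calendar day of the given month."""
    # monthrange returns (weekday of day 1, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)
```

For June 2025 this yields June 1 and June 30, matching the example later in this document.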
Baseball Reference Data
Fielding stats are pulled automatically during card generation when --pull-fielding is enabled (default). Running and pitching stats come from CSVs in the data-input directory.
Steps
# 1. Download FanGraphs splits data
# Run the scraper or manually download from FanGraphs splits leaderboard
# Place CSVs in data-input/{cardset} Cardset/
# 2. Verify config (dry-run)
pd-cards live-series update --cardset "<cardset name>" --games <N> --dry-run
# 3. Generate cards (POSTs player data to API)
pd-cards live-series update --cardset "<cardset name>" --games <N>
# 4. Generate images WITHOUT upload (triggers rendering)
pd-cards upload check -c "<cardset name>"
# 5. CRITICAL: Validate database for negative groundball_b — STOP if errors found
# (see card-generation.md "Bug Prevention" section for validation query)
# 6. Upload to S3 (fast — uses cached images from step 4)
pd-cards upload s3 -c "<cardset name>"
# 7. Generate scouting reports (ALWAYS run for ALL cardsets)
pd-cards scouting all
# 8. Upload scouting CSVs to production server
pd-cards scouting upload
Verify scouting upload: ssh akamai "ls -lh container-data/paper-dynasty/storage/ | grep -E 'batting|pitching'"
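The command sequence above can be assembled programmatically, which helps keep the cardset name and game count consistent across steps. This is a sketch, not an existing tool: the function name is hypothetical, and it only builds the pd-cards invocations listed in the steps. The commands are split into two groups so that the manual groundball_b validation (step 5) stays a deliberate stop between them.

```python
import shlex


def live_series_commands(cardset: str, games: int) -> tuple[list[list[str]], list[list[str]]]:
    """Build the pd-cards commands from the steps, split around the manual gate."""
    update = ["pd-cards", "live-series", "update", "--cardset", cardset, "--games", str(games)]
    before_validation = [
        update + ["--dry-run"],                          # step 2: verify config
        update,                                          # step 3: generate cards
        ["pd-cards", "upload", "check", "-c", cardset],  # step 4: render images
    ]
    # Step 5 (groundball_b validation) is manual and happens between the two groups.
    after_validation = [
        ["pd-cards", "upload", "s3", "-c", cardset],     # step 6: upload to S3
        ["pd-cards", "scouting", "all"],                 # step 7: all cardsets
        ["pd-cards", "scouting", "upload"],              # step 8: push scouting CSVs
    ]
    return before_validation, after_validation


if __name__ == "__main__":
    before, after = live_series_commands("2025 Season", 50)
    for cmd in before + after:
        print(shlex.join(cmd))
```

Each inner list can be passed to subprocess.run(cmd, check=True) so that a failing step halts the sequence instead of continuing to the upload.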
Key Differences from Retrosheet Workflow
| Aspect | Live Series | Retrosheet |
|---|---|---|
| Data source | FanGraphs splits + BBRef | Retrosheet play-by-play events |
| CLI command | pd-cards live-series update | pd-cards retrosheet process |
| Season progress | --games N (1-162) | --season-pct + --start/--end dates |
| Defense data | Auto-pulled from BBRef (--pull-fielding) | Pre-downloaded defense CSVs |
| Position validation | Built-in (skips for promo cardsets) | Separate pd-cards retrosheet validate step |
| Arm ratings | Not applicable (BBRef has current data) | Generated from Retrosheet events |
| Recency bias | Not applicable | --last-twoweeks-ratio (auto-enabled after May 30) |
| Player ID lookup | FanGraphs/BBRef IDs in CSV | Retrosheet IDs → pybaseball reverse lookup |
Players of the Month (PotM) Variant
During the regular season, PotM cards are generated from the same FanGraphs pipeline but filtered to a single month's stats and posted to a promo cardset.
Key Differences from Full Update
| Setting | Full Update | PotM |
|---|---|---|
| Cardset | Season cardset (e.g., "2025 Season") | Promo cardset (e.g., "2025 Promos") |
| FanGraphs date range | Season start → current date | Month start → month end |
| --games | Cumulative games played | Games in that month (~27) |
| --ignore-limits | Usually no | Usually yes (short sample) |
| Position updates | Yes | Skipped (cardset name contains "promos") |
PotM Pre-Flight Checklist
- Choose players — Typically 2 IF, 2 OF, 1 SP, 1 RP per league
- Download month-specific FanGraphs data — Set date range in scraper to the target month only
- Confirm promo cardset exists in the database
- Place CSVs in the promo cardset's data-input directory
PotM Steps
# 1. Download FanGraphs splits for the target month only
# Adjust startDate/endDate in fangraphs_scrape.py or manual download
# Place in data-input/{promo cardset} Cardset/
# 2. Dry-run
pd-cards live-series update --cardset "<promo cardset>" --games <month_games> \
--description "<Month> PotM" --ignore-limits --dry-run
# 3. Generate cards
pd-cards live-series update --cardset "<promo cardset>" --games <month_games> \
--description "<Month> PotM" --ignore-limits
# 4-6. Image validation and S3 upload (same pattern)
pd-cards upload check -c "<promo cardset name>"
# Run groundball_b validation
pd-cards upload s3 -c "<promo cardset name>"
# 7-8. Scouting reports — ALWAYS regenerate for ALL cardsets
pd-cards scouting all
pd-cards scouting upload
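Both the full update and the PotM steps gate the S3 upload on the groundball_b validation. The actual validation query lives in card-generation.md and is not reproduced here; the sketch below only illustrates the check itself, assuming the rating rows have already been fetched as dictionaries with a groundball_b key. The function name and the row shape are hypothetical.

```python
def negative_groundball_rows(rows: list[dict]) -> list[dict]:
    """Return the rows whose groundball_b value is negative.

    ASSUMPTION: each row is a dict with a numeric "groundball_b" entry.
    """
    return [row for row in rows if row["groundball_b"] < 0]


def assert_safe_to_upload(rows: list[dict]) -> None:
    """Raise if any row fails the check, mirroring the 'STOP if errors found' rule."""
    bad = negative_groundball_rows(rows)
    if bad:
        raise ValueError(f"{len(bad)} row(s) have negative groundball_b; do not upload")
```

Raising an exception rather than printing a warning keeps an automated run from proceeding to the upload step.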
PotM-Specific Notes
- Position updates are skipped when the cardset name contains "promos" (both live_series_update.py and the CLI check for this).
- Description protection: PotM descriptions (e.g., "April PotM") are never overwritten by subsequent full-cardset runs. The should_update_player_description() helper checks for "potm" in the existing description.
- --ignore-limits is typically needed because a single month may not produce enough PA/TBF to meet normal thresholds (20 vL / 40 vR).
- Scouting must cover ALL cardsets: PotM players appear alongside live players. Always run pd-cards scouting all without --cardset-id to preserve the unified scouting view.
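The two name-based rules in these notes amount to substring checks. The sketch below illustrates them; it is not the project's code. The function names are hypothetical stand-ins (the real helper is should_update_player_description(), whose signature is not shown here), and case-insensitive matching is an assumption inferred from examples such as "2025 Promos" and "April PotM".

```python
from typing import Optional


def skips_position_updates(cardset_name: str) -> bool:
    """True when the cardset name marks a promo cardset.

    ASSUMPTION: the match is case-insensitive.
    """
    return "promos" in cardset_name.lower()


def description_is_protected(existing_description: Optional[str]) -> bool:
    """True when an existing description should survive a full-cardset run.

    ASSUMPTION: the match is case-insensitive and empty/None is unprotected.
    """
    return bool(existing_description) and "potm" in existing_description.lower()
```

Under these assumptions, "2025 Promos" skips position updates while "2025 Season" does not, and a player already described as "April PotM" keeps that description.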
Example: June 2025 PotM
# Download June-only FanGraphs splits (June 1 - June 30)
# Place CSVs in data-input/2025 Promos Cardset/
pd-cards live-series update --cardset "2025 Promos" --games 27 \
--description "June PotM" --ignore-limits --dry-run
pd-cards live-series update --cardset "2025 Promos" --games 27 \
--description "June PotM" --ignore-limits
pd-cards upload check -c "2025 Promos"
# Run groundball_b validation before uploading; STOP if errors found
pd-cards upload s3 -c "2025 Promos"
pd-cards scouting all
pd-cards scouting upload
Common Issues
"No players found": Wrong cardset name or database environment. Verify alt_database in db_calls.py.
Missing FanGraphs CSVs: The scraper requires Chrome/Selenium. If it fails, download manually from FanGraphs splits leaderboard with the correct date range and stat group settings.
High DH count: Defense pull failed or BBRef was rate-limited. Re-run with --pull-fielding or manually download defense CSVs.
Early-season runs: Use --ignore-limits when games played is low (< ~40) to avoid filtering out most players.
Last Updated: 2026-02-14
Version: 1.0 (Initial workflow documentation)