Add live series workflow and PotM documentation for Paper Dynasty

- New: live-series-update.md workflow (FanGraphs data sourcing, PotM variant)
- Updated: card-generation.md with retrosheet PotM variant section
- Updated: SKILL.md with live series workflow references and load table
- Updated: CLAUDE.md, claude-pulse submodule

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cal Corum 2026-02-14 20:24:30 -06:00
parent 08b74ec6d6
commit 197848749d
5 changed files with 269 additions and 9 deletions

@ -6,6 +6,7 @@ Automatic loads are NOT enough — Read loads required CLAUDE.md context along t
- User's name is Cal (he/him)
- If not confident in an answer, say so. Offer hypothesis + options to investigate.
- When writing tests, include detailed docstrings explaining "what" and "why"
- Launch sub-agents with Sonnet model unless another model is specified by the user
## Git Commits
- NEVER commit/add/push/tag without explicit user approval ("commit this", "go ahead")
@ -21,6 +22,7 @@ Automatic loads are NOT enough — Read loads required CLAUDE.md context along t
## Tech Preferences
- Python with uv for package/environment management
- Utilize dependency injection pattern whenever possible
- Never add lazy imports to middle of file
## Memory Protocol (Cognitive Memory)

@ -1 +1 @@
Subproject commit 398dc639a4a8ead9852843cf322e954bccdf1dbe
Subproject commit c4c2ee6a4ed494d30c60a6d26e6a604d1dfccee4

@ -85,7 +85,8 @@ Only fall back to the Python API (`api_client.py`) for complex multi-step operat
| **Gauntlet Cleanup** | "Clean up gauntlet team X" | `$PD gauntlet cleanup Gauntlet-X -e N -y` |
| **Pack Distribution** | "Give N packs to everyone" | `$PD pack distribute --num N` |
| **Scouting Update** | "Update scouting" | `pd-cards scouting all -c 27` |
| **Card Generation** | "Generate cards for 2005" | `pd-cards retrosheet process 2005 -c 27 -d Live` |
| **Card Generation (Retrosheet)** | "Generate cards for 2005" | `pd-cards retrosheet process 2005 -c 27 -d Live` |
| **Card Generation (Live Series)** | "Update live series cards" | `pd-cards live-series update -c "2025 Season" -g 81` |
| **Custom Cards** | "Create custom player" | `pd-cards custom preview name` |
| **S3 Upload** | "Upload cards to S3" | `pd-cards upload s3 -c "2005 Live"` |
| **Bot Troubleshooting** | "Check bot logs" | `ssh sba-bots "docker logs paper-dynasty_discord-app_1 --tail 100"` |
@ -240,10 +241,11 @@ $PD gauntlet list/teams/cleanup # Gauntlet operations
│ ├── api-reference.md # Endpoints, authentication, client examples
│ └── cli-reference.md # Full paperdomo & pd-cards commands
├── workflows/
│ ├── card-generation.md
│ ├── card_utilities.py # Card refresh pipeline (fetch → S3 → update)
│ ├── card-generation.md # Retrosheet card workflow + PotM variant
│ ├── live-series-update.md # Live series (FanGraphs) workflow + PotM variant
│ ├── card_utilities.py # Card refresh pipeline (fetch → S3 → update)
│ ├── custom-card-creation.md # Archetypes, manual creation, rating rules
│ └── TROUBLESHOOTING.md # Card rendering issues
│ └── TROUBLESHOOTING.md # Card rendering issues
└── scripts/
├── distribute_packs.py
├── gauntlet_cleanup.py
@ -264,9 +266,11 @@ $PD gauntlet list/teams/cleanup # Gauntlet operations
| Database model details | `reference/database-schema.md` |
| API endpoints & client usage | `reference/api-reference.md` |
| Full CLI command reference | `reference/cli-reference.md` |
| Retrosheet card workflow / PotM | `workflows/card-generation.md` |
| Live series workflow / PotM | `workflows/live-series-update.md` |
| Card rendering issues | `workflows/TROUBLESHOOTING.md` |
---
**Last Updated**: 2026-02-12
**Version**: 2.5 (CLI-first: rewrote Critical Rules, Common Patterns, and Workflows to prefer CLI over raw API calls)
**Last Updated**: 2026-02-14
**Version**: 2.6 (Added live series workflow, PotM documentation for both data sources)

@ -108,5 +108,78 @@ asyncio.run(check_cards())
---
**Last Updated**: 2026-02-12
**Version**: 3.0 (Trimmed to essentials; pd-cards CLI handles individual steps)
## Players of the Month (PotM) Variant
PotM cards use the same Retrosheet pipeline as a full cardset, but with a narrower date range, a promo cardset, and a curated player list.
### Key Differences from Full Cardset
| Setting | Full Cardset | PotM |
|---------|-------------|------|
| `--description` | `Live` | `<Month> PotM` (e.g., `April PotM`) |
| `--cardset-id` | Live cardset (e.g., 27) | Promo cardset (e.g., 28) |
| `--start` / `--end` | Full season range | Single month (e.g., `20050401` - `20050430`) |
| `--min-pa-vl` / `--min-pa-vr` | 20 / 40 (auto) | 1 / 1 (auto when description != "Live") |
| Player filtering | All qualifying players | Only `PROMO_INCLUSION_RETRO_IDS` |
| Position updates | Yes | Skipped (promo players keep existing positions) |
### PotM Pre-Flight Checklist
1. **Choose players** — Typically 2 IF, 2 OF, 1 SP, 1 RP per league (AL/NL)
2. **Get Retro IDs** — Look up each player's `key_retro` (e.g., `rodra001` for A-Rod)
3. **Determine date range** — First and last day of the month in `YYYYMMDD` format
4. **Confirm promo cardset ID** — Usually a separate cardset from the live one
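The date range in step 3 of the checklist can be derived rather than typed by hand. This is a minimal sketch using only the standard library; `month_date_range` is a hypothetical helper, not part of `pd-cards`.

```python
import calendar


def month_date_range(year: int, month: int) -> tuple:
    """Return (start, end) for a calendar month as YYYYMMDD strings."""
    # calendar.monthrange returns (weekday of the 1st, number of days in the month)
    _, last_day = calendar.monthrange(year, month)
    start = f"{year:04d}{month:02d}01"
    end = f"{year:04d}{month:02d}{last_day:02d}"
    return start, end


# Values to pass as --start / --end for a May 2005 PotM run
print(month_date_range(2005, 5))  # ('20050501', '20050531')
```

Using `calendar.monthrange` avoids hard-coding month lengths and handles leap-year Februaries.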
### PotM Steps
```bash
# 1. Dry-run to verify config
pd-cards retrosheet process <year> -c <promo_cardset_id> \
-d "<Month> PotM" \
--start <YYYYMMDD> --end <YYYYMMDD> \
--dry-run
# 2. Generate promo cards
pd-cards retrosheet process <year> -c <promo_cardset_id> \
-d "<Month> PotM" \
--start <YYYYMMDD> --end <YYYYMMDD>
# 3. Validate (expect higher DH count — promo players may lack defense data for short windows)
pd-cards retrosheet validate <promo_cardset_id>
# 4-6. Image validation and S3 upload (same as full cardset)
pd-cards upload check -c "<promo cardset name>"
# Run groundball_b validation (step 5 from main workflow)
pd-cards upload s3 -c "<promo cardset name>"
# 7-8. Scouting reports — ALWAYS regenerate for ALL cardsets
pd-cards scouting all
pd-cards scouting upload
```
### PotM-Specific Gotchas
- **`PROMO_INCLUSION_RETRO_IDS` must be populated** — If description is not "Live", retrosheet_data.py filters to only these IDs. Empty list = 0 players generated.
- **Don't mix Live and PotM** — If `PROMO_INCLUSION_RETRO_IDS` has entries but description is "Live", the script warns and exits.
- **Description protection** — Once a player has a PotM description (e.g., "April PotM"), it is never overwritten by subsequent live series runs. Promo cardset descriptions are also protected: existing cards keep their original month.
- **Scouting must cover ALL cardsets** — PotM players appear in scouting alongside live players. Always run `pd-cards scouting all` without `--cardset-id` to avoid overwriting the unified scouting data with partial results.
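The first two gotchas can be restated as a small filter. This is only an illustrative sketch of the rules as described above: the function name, the player dict shape, and the exit message are assumptions, and the real logic in `retrosheet_data.py` may differ.

```python
def select_players(description, promo_ids, players):
    """Hypothetical restatement of the Live vs. PotM filtering rules."""
    is_live = description == "Live"
    if is_live and promo_ids:
        # Mixing a Live run with configured promo IDs: warn and exit
        raise SystemExit(
            "PROMO_INCLUSION_RETRO_IDS has entries but description is 'Live'"
        )
    if is_live:
        # Full cardset: every qualifying player is kept
        return list(players)
    # Any non-Live description: keep only the listed Retro IDs.
    # An empty promo_ids list therefore yields zero players.
    allowed = set(promo_ids)
    return [p for p in players if p["key_retro"] in allowed]


# "rodra001" comes from the checklist above; "example01" is a placeholder ID
players = [{"key_retro": "rodra001"}, {"key_retro": "example01"}]
print(select_players("May PotM", ["rodra001"], players))
print(select_players("May PotM", [], players))
```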
### Example: May 2005 PotM
```bash
# Players: A-Rod (IF), Delgado (IF), Mench (OF), Abreu (OF), Colon (SP), Ryan (RP), Harang (SP), Hoffman (RP)
# Retro IDs configured in retrosheet_data.py PROMO_INCLUSION_RETRO_IDS
pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531 --dry-run
pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531
pd-cards retrosheet validate 28
pd-cards upload check -c "2005 Promos"
pd-cards upload s3 -c "2005 Promos"
pd-cards scouting all
pd-cards scouting upload
```
---
**Last Updated**: 2026-02-14
**Version**: 3.1 (Added Players of the Month variant)

@ -0,0 +1,181 @@
# Live Series Update Workflow
Use this workflow during the MLB regular season to generate cards from current-year FanGraphs split data and Baseball Reference fielding/running stats.
## Pre-Flight
Ask the user before starting:
1. **Which cardset?** (e.g., "2025 Season")
2. **How many games played?** (determines season percentage for min PA thresholds)
3. **Which environment?** (prod or dev — check `alt_database` in `db_calls.py`)
All commands run from `/mnt/NV2/Development/paper-dynasty/card-creation/`.
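As a rough illustration of why the games-played answer matters, the sketch below converts it to a season percentage. The helper name and the 162-game denominator are assumptions based on the `--games` range of 1-162; how `pd-cards` turns the percentage into minimum PA thresholds is not shown here.

```python
FULL_SEASON_GAMES = 162  # assumption: upper bound of the --games range


def season_pct(games_played: int) -> float:
    """Fraction of a full season represented by games_played (hypothetical helper)."""
    if not 1 <= games_played <= FULL_SEASON_GAMES:
        raise ValueError(f"games_played must be 1-{FULL_SEASON_GAMES}")
    return games_played / FULL_SEASON_GAMES


print(season_pct(81))  # 0.5
```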
## Data Sourcing
Live series uses **FanGraphs splits** for batting/pitching and **Baseball Reference** for defense/running.
### FanGraphs Data (Manual Download)
FanGraphs split data is not fetched during card generation; download it beforehand, either with `scripts/fangraphs_scrape.py` or from the FanGraphs web UI. The scraper uses Selenium to export 8 CSV files:
| File | Content |
|------|---------|
| `Batting_vLHP_Standard.csv` | Batting vs LHP — standard stats |
| `Batting_vLHP_BattedBalls.csv` | Batting vs LHP — batted ball profile |
| `Batting_vRHP_Standard.csv` | Batting vs RHP — standard stats |
| `Batting_vRHP_BattedBalls.csv` | Batting vs RHP — batted ball profile |
| `Pitching_vLHH_Standard.csv` | Pitching vs LHH — standard stats |
| `Pitching_vLHH_BattedBalls.csv` | Pitching vs LHH — batted ball profile |
| `Pitching_vRHH_Standard.csv` | Pitching vs RHH — standard stats |
| `Pitching_vRHH_BattedBalls.csv` | Pitching vs RHH — batted ball profile |
These map to the expected input files in `data-input/{cardset} Cardset/`:
- `vlhp-basic.csv` / `vlhp-rate.csv`
- `vrhp-basic.csv` / `vrhp-rate.csv`
- `vlhh-basic.csv` / `vlhh-rate.csv`
- `vrhh-basic.csv` / `vrhh-rate.csv`
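If the exports are downloaded by hand, a small staging script can rename them into place. This sketch assumes that each `Standard` export maps to the `-basic` file and each `BattedBalls` export maps to the `-rate` file, which the list order suggests but does not confirm; `stage_fangraphs_csvs` and its arguments are hypothetical.

```python
import shutil
from pathlib import Path

# Assumed pairing: Standard -> basic, BattedBalls -> rate
FANGRAPHS_TO_INPUT = {
    "Batting_vLHP_Standard.csv": "vlhp-basic.csv",
    "Batting_vLHP_BattedBalls.csv": "vlhp-rate.csv",
    "Batting_vRHP_Standard.csv": "vrhp-basic.csv",
    "Batting_vRHP_BattedBalls.csv": "vrhp-rate.csv",
    "Pitching_vLHH_Standard.csv": "vlhh-basic.csv",
    "Pitching_vLHH_BattedBalls.csv": "vlhh-rate.csv",
    "Pitching_vRHH_Standard.csv": "vrhh-basic.csv",
    "Pitching_vRHH_BattedBalls.csv": "vrhh-rate.csv",
}


def stage_fangraphs_csvs(download_dir: Path, input_root: Path, cardset: str) -> list:
    """Copy exports into '<input_root>/<cardset> Cardset/' under the expected names.

    Returns the export file names that were not found in download_dir.
    """
    target = input_root / f"{cardset} Cardset"
    target.mkdir(parents=True, exist_ok=True)
    missing = []
    for export_name, input_name in FANGRAPHS_TO_INPUT.items():
        src = download_dir / export_name
        if src.is_file():
            shutil.copyfile(src, target / input_name)
        else:
            missing.append(export_name)
    return missing


# Example call (paths are placeholders):
# missing = stage_fangraphs_csvs(Path("downloads"), Path("data-input"), "2025 Season")
```

A non-empty return value means the dry-run should wait until the remaining exports are downloaded.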
**For PotM**: Adjust the `startDate` and `endDate` in the scraper to cover only the target month.
### Baseball Reference Data
Fielding stats are pulled automatically during card generation when `--pull-fielding` is enabled (default). Running and pitching stats come from CSVs in the data-input directory.
---
## Steps
```bash
# 1. Download FanGraphs splits data
# Run the scraper or manually download from FanGraphs splits leaderboard
# Place CSVs in data-input/{cardset} Cardset/
# 2. Verify config (dry-run)
pd-cards live-series update --cardset "<cardset name>" --games <N> --dry-run
# 3. Generate cards (POSTs player data to API)
pd-cards live-series update --cardset "<cardset name>" --games <N>
# 4. Generate images WITHOUT upload (triggers rendering)
pd-cards upload check -c "<cardset name>"
# 5. CRITICAL: Validate database for negative groundball_b — STOP if errors found
# (see card-generation.md "Bug Prevention" section for validation query)
# 6. Upload to S3 (fast — uses cached images from step 4)
pd-cards upload s3 -c "<cardset name>"
# 7. Generate scouting reports (ALWAYS run for ALL cardsets)
pd-cards scouting all
# 8. Upload scouting CSVs to production server
pd-cards scouting upload
```
**Verify scouting upload**: `ssh sba-db "ls -lh container-data/pd-database/storage/ | grep -E 'batting|pitching'"`
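The stop condition in step 5 can be pictured as a filter over rating rows that have already been fetched. The real validation query is the one in the card-generation.md "Bug Prevention" section; the row shape and the `player_id` key below are assumptions made for illustration.

```python
def find_negative_groundball_b(ratings):
    """Return rows whose groundball_b value is below zero (hypothetical check)."""
    bad_rows = []
    for row in ratings:
        value = row.get("groundball_b")
        # Rows with no value are skipped; only true negatives are flagged
        if value is not None and value < 0:
            bad_rows.append(row)
    return bad_rows


# Made-up rows for illustration only
rows = [
    {"player_id": 1, "groundball_b": 4.5},
    {"player_id": 2, "groundball_b": -1.25},
]
bad = find_negative_groundball_b(rows)
if bad:
    print(f"STOP: {len(bad)} row(s) with negative groundball_b; do not upload")
```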
---
## Key Differences from Retrosheet Workflow
| Aspect | Live Series | Retrosheet |
|--------|-------------|------------|
| **Data source** | FanGraphs splits + BBRef | Retrosheet play-by-play events |
| **CLI command** | `pd-cards live-series update` | `pd-cards retrosheet process` |
| **Season progress** | `--games N` (1-162) | `--season-pct` + `--start`/`--end` dates |
| **Defense data** | Auto-pulled from BBRef (`--pull-fielding`) | Pre-downloaded defense CSVs |
| **Position validation** | Built-in (skips for promo cardsets) | Separate `pd-cards retrosheet validate` step |
| **Arm ratings** | Not applicable (BBRef has current data) | Generated from Retrosheet events |
| **Recency bias** | Not applicable | `--last-twoweeks-ratio` (auto-enabled after May 30) |
| **Player ID lookup** | FanGraphs/BBRef IDs in CSV | Retrosheet IDs → pybaseball reverse lookup |
---
## Players of the Month (PotM) Variant
During the regular season, PotM cards are generated from the same FanGraphs pipeline but filtered to a single month's stats and posted to a promo cardset.
### Key Differences from Full Update
| Setting | Full Update | PotM |
|---------|------------|------|
| Cardset | Season cardset (e.g., "2025 Season") | Promo cardset (e.g., "2025 Promos") |
| FanGraphs date range | Season start → current date | Month start → month end |
| `--games` | Cumulative games played | Games in that month (~27) |
| `--ignore-limits` | Usually no | Usually yes (short sample) |
| Position updates | Yes | Skipped (cardset name contains "promos") |
### PotM Pre-Flight Checklist
1. **Choose players** — Typically 2 IF, 2 OF, 1 SP, 1 RP per league
2. **Download month-specific FanGraphs data** — Set date range in scraper to the target month only
3. **Confirm promo cardset exists** in the database
4. **Place CSVs** in the promo cardset's data-input directory
### PotM Steps
```bash
# 1. Download FanGraphs splits for the target month only
# Adjust startDate/endDate in fangraphs_scrape.py or manual download
# Place in data-input/{promo cardset} Cardset/
# 2. Dry-run
pd-cards live-series update --cardset "<promo cardset>" --games <month_games> \
--description "<Month> PotM" --ignore-limits --dry-run
# 3. Generate cards
pd-cards live-series update --cardset "<promo cardset>" --games <month_games> \
--description "<Month> PotM" --ignore-limits
# 4-6. Image validation and S3 upload (same pattern)
pd-cards upload check -c "<promo cardset name>"
# Run groundball_b validation
pd-cards upload s3 -c "<promo cardset name>"
# 7-8. Scouting reports — ALWAYS regenerate for ALL cardsets
pd-cards scouting all
pd-cards scouting upload
```
### PotM-Specific Notes
- **Position updates are skipped** when the cardset name contains "promos" (both `live_series_update.py` and the CLI check for this).
- **Description protection** — PotM descriptions (e.g., "April PotM") are never overwritten by subsequent full-cardset runs. The `should_update_player_description()` helper checks for "potm" in the existing description.
- **`--ignore-limits`** is typically needed because a single month may not produce enough PA/TBF to meet normal thresholds (20 vL / 40 vR).
- **Scouting must cover ALL cardsets** — PotM players appear alongside live players. Always run `pd-cards scouting all` without `--cardset-id` to preserve the unified scouting view.
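The two protection rules above amount to substring checks. The sketch below is a guess at their shape, not the project's code: the signature given to `should_update_player_description()` is assumed, `should_update_positions()` is a hypothetical name, and the case-insensitive matching is inferred from the "potm" and "promos" wording.

```python
from typing import Optional


def should_update_player_description(existing: Optional[str]) -> bool:
    """Assumed behavior: never overwrite a description that mentions PotM."""
    if existing and "potm" in existing.lower():
        return False
    return True


def should_update_positions(cardset_name: str) -> bool:
    """Hypothetical helper: promo cardsets keep their existing positions."""
    return "promos" not in cardset_name.lower()


print(should_update_player_description("April PotM"))  # False
print(should_update_positions("2025 Promos"))  # False
```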
### Example: June 2025 PotM
```bash
# Download June-only FanGraphs splits (June 1 - June 30)
# Place CSVs in data-input/2025 Promos Cardset/
pd-cards live-series update --cardset "2025 Promos" --games 27 \
--description "June PotM" --ignore-limits --dry-run
pd-cards live-series update --cardset "2025 Promos" --games 27 \
--description "June PotM" --ignore-limits
pd-cards upload check -c "2025 Promos"
pd-cards upload s3 -c "2025 Promos"
pd-cards scouting all
pd-cards scouting upload
```
---
## Common Issues
**"No players found"**: Wrong cardset name or database environment. Verify `alt_database` in `db_calls.py`.
**Missing FanGraphs CSVs**: The scraper requires Chrome/Selenium. If it fails, download the files manually from the FanGraphs splits leaderboard, using the correct date range and stat group settings.
**High DH count**: Defense pull failed or BBRef was rate-limited. Re-run with `--pull-fielding` or manually download defense CSVs.
**Early-season runs**: Use `--ignore-limits` when games played is low (< ~40) to avoid filtering out most players.
---
**Last Updated**: 2026-02-14
**Version**: 1.0 (Initial workflow documentation)