# Card Generation Workflow

## Pre-Flight

Ask the user before starting:
1. **Refresh or new date range?** (refresh keeps existing config)
2. **Which environment?** (prod or dev)
3. **Which cardset?** (e.g., 27 for "2005 Live")
4. **Season progress?** (games played or date range for season-pct calculation)

All commands run from `/mnt/NV2/Development/paper-dynasty/card-creation/`.

## Steps

```bash
# 1. Verify config (dry-run shows settings without executing)
pd-cards retrosheet process <year> -c <cardset_id> -d <description> \
  --start <YYYYMMDD> --end <YYYYMMDD> --season-pct <0.0-1.0> --dry-run

# 2. Generate cards (POSTs player data to API)
pd-cards retrosheet process <year> -c <cardset_id> -d <description> \
  --start <YYYYMMDD> --end <YYYYMMDD> --season-pct <0.0-1.0>

# 3. Validate positions (DH count MUST be <5; high DH = defense calc failure)
pd-cards retrosheet validate <cardset_id>

# 4. Generate images WITHOUT upload (triggers rendering; groundball_b bug can occur here)
pd-cards upload check -c "<cardset name>"

# 5. CRITICAL: Validate database for negative groundball_b. STOP if errors found
# (see "Bug Prevention" section below)

# 6. Upload to S3
pd-cards upload s3 -c "<cardset name>"

# 7. Generate scouting reports (ALWAYS run without --cardset-id to cover all cardsets)
pd-cards scouting all

# 8. Upload scouting CSVs to production server
pd-cards scouting upload
```

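Steps 1 and 2 are the same command with and without `--dry-run`, so it can help to build both from one set of arguments. The following is a minimal sketch, not part of `pd-cards`: the helper name `process_command` is hypothetical, and it only uses the flags shown in the steps above.

```python
import shlex


def process_command(year, cardset_id, description, start, end,
                    season_pct=None, dry_run=False):
    """Build the argv list for `pd-cards retrosheet process` (hypothetical helper)."""
    argv = [
        "pd-cards", "retrosheet", "process", str(year),
        "-c", str(cardset_id), "-d", description,
        "--start", start, "--end", end,
    ]
    if season_pct is not None:
        argv += ["--season-pct", str(season_pct)]
    if dry_run:
        argv.append("--dry-run")
    return argv


# Same arguments for the preview (step 1) and the real run (step 2).
preview = process_command(2005, 27, "Live", "20050403", "20050815", 0.728, dry_run=True)
real = process_command(2005, 27, "Live", "20050403", "20050815", 0.728)
print(shlex.join(preview))
print(shlex.join(real))
```

Building both commands from one call site avoids the case where the dry-run is checked with one date range and the real run is typed with another.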
### CLI Parameter Reference

| Parameter | Description | Example |
|-----------|-------------|---------|
| `--start` | Season start date (YYYYMMDD) | `--start 20050403` |
| `--end` | Data cutoff date (YYYYMMDD) | `--end 20050815` |
| `--season-pct` | Fraction of season completed (0.0-1.0) | `--season-pct 0.728` |
| `--min-pa-vl` | Min plate appearances vs LHP (default: 20 Live, 1 PotM) | `--min-pa-vl 20` |
| `--min-pa-vr` | Min plate appearances vs RHP (default: 40 Live, 1 PotM) | `--min-pa-vr 40` |
| `--last-twoweeks-ratio` | Recency bias weight (auto-enabled at 0.2 after May 30) | `--last-twoweeks-ratio 0.2` |
| `--dry-run` / `-n` | Preview without saving to database | |

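One way to derive `--season-pct` from the "games played" answer in Pre-Flight is a simple fraction of the schedule. This sketch assumes a 162-game season and that the value is games played divided by total games; neither assumption is confirmed by the tool itself, and `season_pct` is a hypothetical helper name.

```python
def season_pct(games_played, season_games=162):
    """Fraction of the season completed, rounded to three decimals.

    Assumes a 162-game schedule unless told otherwise.
    """
    if not 0 <= games_played <= season_games:
        raise ValueError("games_played must be between 0 and season_games")
    return round(games_played / season_games, 3)


print(season_pct(118))  # 118 of 162 games
```

Under these assumptions, 118 games played gives 0.728, which matches the value used in the example in this document.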
### Example: 2005 Live Series Update (Mid-August)

```bash
pd-cards retrosheet process 2005 -c 27 -d Live --start 20050403 --end 20050815 --season-pct 0.728 --dry-run
pd-cards retrosheet process 2005 -c 27 -d Live --start 20050403 --end 20050815 --season-pct 0.728
pd-cards retrosheet validate 27
pd-cards upload check -c "2005 Live"
# Run groundball_b validation (step 5)
pd-cards upload s3 -c "2005 Live"
pd-cards scouting all
pd-cards scouting upload
```

---

## Bug Prevention: The Double-Run Pattern

Card image generation (step 4) can create **negative groundball_b values** that crash game simulation. The prevention strategy:

1. **Step 4**: Run `upload check` (no S3 upload). This triggers image rendering and caches the images.
2. **Step 5**: Query the database for negative groundball_b values. **STOP if any are found.**
3. **Step 6**: Run `upload s3`. This uploads the already-cached (validated) images and is fast because the images were cached in step 4.

**Never skip step 5.** Broken cards uploaded to S3 affect all players immediately.

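The ordering above can be enforced in a small wrapper so that the S3 upload is unreachable when validation reports problems. This is a sketch under assumptions: `run_gated_upload` is a hypothetical helper, `validate` stands in for whatever step 5 check is used and is assumed to return a list of error strings, and the runner is injectable so the example below does not execute any real commands.

```python
import subprocess


def run_gated_upload(cardset_name, validate, run=subprocess.run):
    """Run steps 4-6 in order; skip the S3 upload if validation finds errors."""
    # Step 4: render and cache images without uploading.
    run(["pd-cards", "upload", "check", "-c", cardset_name], check=True)

    # Step 5: validate() is assumed to return a list of error strings.
    errors = validate()
    if errors:
        for line in errors:
            print(line)
        return False

    # Step 6: upload the already-cached images.
    run(["pd-cards", "upload", "s3", "-c", cardset_name], check=True)
    return True


def echo(argv, check=True):
    """Stand-in runner that prints the command instead of executing it."""
    print("would run:", " ".join(argv))


run_gated_upload("2005 Live", validate=lambda: [], run=echo)
```

Passing `subprocess.run` (the default) instead of `echo` would execute the commands for real; `check=True` makes a failed step 4 raise before validation is attempted.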
### Step 5 Validation Script

There is no CLI command for this validation yet. Run the following shell command, which passes the Python script to `uv run python -c`:

```bash
uv run python -c "
from db_calls import db_get
import asyncio

async def check_cards():
    # Replace CARDSET_ID with the numeric cardset ID (e.g., 27)
    result = await db_get('battingcards', params=[('cardset', CARDSET_ID)])
    cards = result.get('cards', [])
    errors = []
    for card in cards:
        player = card.get('player', {})
        pid = player.get('player_id', card.get('id'))
        # Negative groundball_b crashes game simulation
        gb = card.get('groundball_b')
        if gb is not None and gb < 0:
            errors.append(f'Player {pid}: groundball_b = {gb}')
        # Percentage fields must stay within 0-100
        for field in ['gb_b', 'fb_b', 'ld_b']:
            val = card.get(field)
            if val is not None and (val < 0 or val > 100):
                errors.append(f'Player {pid}: {field} = {val}')
    if errors:
        print('ERRORS FOUND:')
        print('\n'.join(errors))
        print('\nDO NOT PROCEED: fix data and re-run step 2')
    else:
        print(f'Validation passed: {len(cards)} batting cards checked, no issues')

asyncio.run(check_cards())
"
```

**Note:** Replace `CARDSET_ID` with the actual cardset ID (e.g., 27). The API returns `{'count': N, 'cards': [...]}`, so always use `result.get('cards', [])` to extract the card list.

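The checks in the script can also be expressed as a pure function over the API payload, which makes them testable without a database. This is a sketch, not code from the repository: `find_card_errors` is a hypothetical name, and the sample payload is invented purely to show the `{'count': N, 'cards': [...]}` shape described in the note above.

```python
def find_card_errors(payload):
    """Return error strings for cards with invalid groundball values."""
    errors = []
    for card in payload.get("cards", []):
        player = card.get("player") or {}
        pid = player.get("player_id", card.get("id"))
        gb = card.get("groundball_b")
        if gb is not None and gb < 0:
            errors.append(f"Player {pid}: groundball_b = {gb}")
        for field in ("gb_b", "fb_b", "ld_b"):
            val = card.get(field)
            if val is not None and (val < 0 or val > 100):
                errors.append(f"Player {pid}: {field} = {val}")
    return errors


# Invented sample data, for illustration only.
sample = {
    "count": 3,
    "cards": [
        {"id": 1, "player": {"player_id": 101}, "groundball_b": 4.5},
        {"id": 2, "player": {"player_id": 102}, "groundball_b": -1.25},
        {"id": 3, "player": {"player_id": 103}, "fb_b": 104},
    ],
}
print(find_card_errors(sample))
```

An empty list means the cardset passed; any non-empty result should block step 6.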
---

## Architecture

- `retrosheet_data.py` processes Retrosheet play-by-play data, calculates ratings, POSTs to API
- API stores cards in production database; cards are rendered on-demand via URL
- nginx caches rendered card images by date parameter (`?d=YYYY-MM-DD`)
- All operations are idempotent and safe to re-run

**Data sources**: Retrosheet events CSV, Baseball Reference defense CSVs (`data-input/`), FanGraphs splits (if needed)

**Required input files**:
- `data-input/retrosheet/retrosheets_events_*.csv`
- `data-input/<cardset name>/defense_*.csv` (defense_c.csv, defense_1b.csv, etc.)
- `data-input/<cardset name>/pitching.csv`, `running.csv`

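A quick existence check for the required input files can catch a missing CSV before step 2 runs. This is a minimal sketch: `missing_inputs` is a hypothetical helper, and it assumes the `<cardset name>` directory is named exactly like the cardset (for example `2005 Live`).

```python
from pathlib import Path


def missing_inputs(data_input, cardset_name):
    """List required input files that are absent under data_input."""
    root = Path(data_input)
    cardset_dir = root / cardset_name
    missing = []
    if not list((root / "retrosheet").glob("retrosheets_events_*.csv")):
        missing.append("retrosheet/retrosheets_events_*.csv")
    if not list(cardset_dir.glob("defense_*.csv")):
        missing.append(f"{cardset_name}/defense_*.csv")
    for name in ("pitching.csv", "running.csv"):
        if not (cardset_dir / name).is_file():
            missing.append(f"{cardset_name}/{name}")
    return missing


# Run from the card-creation directory; an empty list means all inputs exist.
print(missing_inputs("data-input", "2005 Live"))
```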
**Scouting output**: 4 CSVs in `scouting/`: `batting-basic.csv`, `batting-ratings.csv`, `pitching-basic.csv`, `pitching-ratings.csv`

---

## Common Issues

**"No players found" after successful run**: Wrong database environment, wrong CARDSET_ID, or DATE mismatch. Check `alt_database` in `db_calls.py`. For promos, ensure PROMO_INCLUSION_RETRO_IDS is populated.

**High DH count (50+ players)**: Defense calculation failed. Check that the defense CSVs exist and that column names match (`tz_runs_total`, not `tz_runs_outfield`). Re-run step 2 after fixing.

**S3 upload fails**: Check `~/.aws/credentials`, verify that cards render at the API URL manually, then re-run (the upload is idempotent).

**"surplus of X.XX chances" / "Adding X.XX results"**: Normal rounding adjustments in card generation. These messages are informational, not errors.

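For the high-DH case, the column-name mismatch can be spotted by reading only the header row of each defense CSV. The sketch below is an assumption-laden illustration: `defense_files_missing_column` is a hypothetical helper, it assumes the first row of each file is the header, and it checks only for the `tz_runs_total` column named above.

```python
import csv
from pathlib import Path


def defense_files_missing_column(cardset_dir, column="tz_runs_total"):
    """Return names of defense_*.csv files whose header lacks the given column."""
    bad = []
    for path in sorted(Path(cardset_dir).glob("defense_*.csv")):
        with path.open(newline="") as fh:
            header = next(csv.reader(fh), [])  # empty file -> empty header
        if column not in header:
            bad.append(path.name)
    return bad


# Directory name is an assumption; an empty list means every file has the column.
print(defense_files_missing_column("data-input/2005 Live"))
```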
---

## Players of the Month (PotM) Variant

PotM cards use the same retrosheet pipeline but with a narrower date range, a promo cardset, and a curated player list.

### Key Differences from Full Cardset

| Setting | Full Cardset | PotM |
|---------|-------------|------|
| `--description` | `Live` | `<Month> PotM` (e.g., `April PotM`) |
| `--cardset-id` | Live cardset (e.g., 27) | Promo cardset (e.g., 28) |
| `--start` / `--end` | Full season range | Single month (e.g., `20050401` - `20050430`) |
| `--min-pa-vl` / `--min-pa-vr` | 20 / 40 (auto) | 1 / 1 (auto when description != "Live") |
| Player filtering | All qualifying players | Only `PROMO_INCLUSION_RETRO_IDS` |
| Position updates | Yes | Skipped (promo players keep existing positions) |

### PotM Pre-Flight Checklist

1. **Choose players**: Typically 2 IF, 2 OF, 1 SP, 1 RP per league (AL/NL)
2. **Get Retro IDs**: Look up each player's `key_retro` (e.g., `rodra001` for A-Rod)
3. **Determine date range**: First and last day of the month in `YYYYMMDD` format
4. **Confirm promo cardset ID**: Usually a separate cardset from the live one

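The date range in item 3 can be computed rather than typed by hand, which avoids off-by-one errors on month lengths. A minimal sketch using the standard library; `month_range` is a hypothetical helper name.

```python
import calendar


def month_range(year, month):
    """Return (start, end) for a calendar month as YYYYMMDD strings."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return f"{year}{month:02d}01", f"{year}{month:02d}{last_day:02d}"


start, end = month_range(2005, 5)
print(start, end)  # 20050501 20050531
```

The two strings map directly onto `--start` and `--end`.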
### PotM Steps

```bash
# 1. Dry-run to verify config
pd-cards retrosheet process <year> -c <promo_cardset_id> \
  -d "<Month> PotM" \
  --start <YYYYMMDD> --end <YYYYMMDD> \
  --dry-run

# 2. Generate promo cards
pd-cards retrosheet process <year> -c <promo_cardset_id> \
  -d "<Month> PotM" \
  --start <YYYYMMDD> --end <YYYYMMDD>

# 3. Validate (expect higher DH count; promo players may lack defense data for short windows)
pd-cards retrosheet validate <promo_cardset_id>

# 4-5. Image validation (same as full cardset: check, validate groundball_b, then upload)
pd-cards upload check -c "<promo cardset name>"
# Run groundball_b validation (step 5 from main workflow)
pd-cards upload s3 -c "<promo cardset name>"

# 6-7. Scouting reports: ALWAYS regenerate for ALL cardsets (no --cardset-id filter)
pd-cards scouting all
pd-cards scouting upload
```

### PotM-Specific Gotchas

- **`PROMO_INCLUSION_RETRO_IDS` must be populated**: If the description is not "Live", `retrosheet_data.py` filters to only these IDs. An empty list means 0 players are generated.
- **Don't mix Live and PotM**: If `PROMO_INCLUSION_RETRO_IDS` has entries but the description is "Live", the script warns and exits.
- **Description protection**: Once a player has a PotM description (e.g., "April PotM"), it is never overwritten by subsequent live series runs. Promo cardset descriptions are also protected: existing cards keep their original month.
- **Scouting must cover ALL cardsets**: PotM players appear in scouting alongside live players. Always run `pd-cards scouting all` without `--cardset-id` to avoid overwriting the unified scouting data with partial results.

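The first two gotchas amount to a consistency rule between the description and the promo ID list, which can be checked before running the pipeline. This sketch only restates the behavior described above; `check_promo_config` is a hypothetical helper and is not how `retrosheet_data.py` is known to implement it.

```python
def check_promo_config(description, promo_ids):
    """Return 'ok' or a message describing a Live/PotM configuration mismatch."""
    is_live = description == "Live"
    if is_live and promo_ids:
        return "mismatch: description is Live but PROMO_INCLUSION_RETRO_IDS has entries"
    if not is_live and not promo_ids:
        return "mismatch: promo run with empty PROMO_INCLUSION_RETRO_IDS (0 players)"
    return "ok"


print(check_promo_config("May PotM", ["rodra001"]))
print(check_promo_config("Live", ["rodra001"]))
print(check_promo_config("May PotM", []))
```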
### Example: May 2005 PotM

```bash
# Players: A-Rod (IF), Delgado (IF), Mench (OF), Abreu (OF), Colon (SP), Ryan (RP), Harang (SP), Hoffman (RP)
# Retro IDs configured in retrosheet_data.py PROMO_INCLUSION_RETRO_IDS

pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531 --dry-run
pd-cards retrosheet process 2005 -c 28 -d "May PotM" --start 20050501 --end 20050531
pd-cards retrosheet validate 28
pd-cards upload check -c "2005 Promos"
# Run groundball_b validation
pd-cards upload s3 -c "2005 Promos"
pd-cards scouting all
pd-cards scouting upload
```

---

**Last Updated**: 2026-02-15
**Version**: 3.2 (Fixed scouting commands to use CLI, fixed groundball_b validation script, added CLI parameter reference and example)