Commit Graph

10 Commits

Author SHA1 Message Date
Cal Corum
a72abc01a3 Add FullCard/CardColumn/CardResult models and card builder pipeline
- card_layout.py: Port PlayResult, PLAY_RESULTS, EXACT_CHANCES, get_chances(),
  CardResult, CardColumn, FullCard, FullBattingCard, FullPitchingCard from
  database/app/card_creation.py. card_output() uses col_* key names.
  get_chances() always returns Decimal to avoid float/Decimal type errors.

- batters/card_builder.py: Port get_batter_card_data() algorithm as
  build_batter_full_cards(ratings_vl, ratings_vr, offense_col, player_id, hand).
  assign_bchances() returns float tuples for compatibility with float-based
  BattingCardRatingsModel fields.

- pitchers/card_builder.py: Port get_pitcher_card_data() algorithm as
  build_pitcher_full_cards(). assign_pchances() returns float tuples.
  Includes card.add_fatigue() at end of each card iteration.

- batters/calcs_batter.py: Integrate card builder in get_batter_ratings().
  After computing raw ratings, call build_batter_full_cards() and merge
  9 col_* rendered column fields into each ratings dict. Lazy import to
  avoid circular dependency.

- pitchers/calcs_pitcher.py: Same integration for get_pitcher_ratings().
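
The float/Decimal type errors noted for get_chances() arise because Python refuses to mix the two types in arithmetic. A minimal sketch of the idea follows. `to_decimal_chances()` is a hypothetical stand-in, not the actual get_chances() implementation, and it assumes inputs are ints, floats, or Decimals.

```python
from decimal import Decimal


def to_decimal_chances(value):
    """Hypothetical stand-in for get_chances(): always return a Decimal."""
    # Going through str() avoids carrying binary float noise into the Decimal.
    return Decimal(str(value))


# Mixing Decimal and float directly raises TypeError.
try:
    Decimal("1.5") + 0.25
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Normalizing both operands first keeps the arithmetic in Decimal.
total = to_decimal_chances(1.5) + to_decimal_chances(Decimal("0.25"))
```

Returning one numeric type from a single choke point means callers never have to check which type they received.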

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-25 16:21:26 -06:00
Cal Corum
4e9e8d351d CLAUDE: Add Retrosheet CSV transformer and fix data processing issues
This commit adds support for the new Retrosheet CSV format and resolves
multiple data processing issues in retrosheet_data.py.

New Features:
- Created retrosheet_transformer.py with smart caching system
  - Transforms new Retrosheet CSV format to legacy format
  - Checks file timestamps to avoid redundant transformations
  - Caches normalized data for instant subsequent loads (~5s → <1s)
  - Handles column mapping: gid→game_id, bathand→batter_hand, etc.
  - Derives event_type from multiple boolean columns
  - Converts handedness values R/L → r/l
  - Explicitly sets string dtypes for hit_val, hit_location, batted_ball_type
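
The timestamp check and column mapping described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in retrosheet_transformer.py: `needs_transform()`, `load_events()`, `transform()`, and the file names are hypothetical, and only the gid and bathand mappings come from the commit message.

```python
import os
import tempfile

# Only these two mappings are named in the commit; the real list is longer.
COLUMN_MAP = {"gid": "game_id", "bathand": "batter_hand"}


def transform(text):
    """Rename header columns and lowercase the handedness values (R/L -> r/l)."""
    lines = text.splitlines()
    header = [COLUMN_MAP.get(col, col) for col in lines[0].split(",")]
    hand_idx = header.index("batter_hand")
    out_lines = [",".join(header)]
    for line in lines[1:]:
        fields = line.split(",")
        fields[hand_idx] = fields[hand_idx].lower()
        out_lines.append(",".join(fields))
    return "\n".join(out_lines) + "\n"


def needs_transform(source_path, cache_path):
    """Return True when the cache is missing or older than the source file."""
    if not os.path.exists(cache_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(cache_path)


def load_events(source_path, cache_path):
    """Re-run the transform only when the source changed; else read the cache."""
    if needs_transform(source_path, cache_path):
        with open(source_path) as src:
            normalized = transform(src.read())
        with open(cache_path, "w") as out:
            out.write(normalized)
        return normalized, "transformed"
    with open(cache_path) as cached:
        return cached.read(), "cached"


# Demo with throwaway files: the first load transforms, the second hits the cache.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "events_new.csv")
    cache_path = os.path.join(tmp, "events_legacy.csv")
    with open(src_path, "w") as f:
        f.write("gid,bathand\nG1,R\n")
    first = load_events(src_path, cache_path)
    second = load_events(src_path, cache_path)
```

Comparing modification times is cheap, but it assumes the source file is only ever replaced by a newer copy.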

Configuration Updates:
- Updated retrosheet_data.py for 2005 season data
  - START_DATE: 19980301 → 20050403 (2005 Opening Day)
  - END_DATE: 19980430 → 20051002 (2005 Regular Season End)
  - SEASON_PCT: 28/162 → 162/162 (full season)
  - MIN_PA_VL/VR: 20/40 → 50/75 (full season minimums)
  - CARDSET_ID: Updated for 2005 cardsets
  - EVENTS_FILENAME: Updated to use retrosheets_events_2005.csv

Bug Fixes:
1. Multi-team player duplicates
   - Players traded during the season had duplicate rows (one per team, plus a combined row)
   - Added filtering to keep only combined totals (2TM, 3TM, etc.)
   - Prevents duplicate key_bbref values in ratings dataframes

2. Column name conflicts
   - Fixed Tm column conflict when merging periph_stats and defense_p
   - Drop duplicate Tm from defense data before merge

3. Pitcher rating calculations (pitchers/calcs_pitcher.py)
   - Fixed "truth value is ambiguous" error in min() comparisons
   - Explicitly convert pandas values to float before min() operations

4. Dictionary column corruption in ratings
   - Fixed ratings_vL and ratings_vR corruption during DataFrame merges
   - Only merge specific columns (key_bbref, player_id, card_id) instead of full DataFrame
   - Removed unnecessary .set_index() calls from post_batting_cards() and post_pitching_cards()
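
Fixes 1 and 3 above can be illustrated with a small pandas sketch. The data, the `PA` and `so_rate` column names, and the `as_float()` helper are made up for illustration; the actual filtering and conversion code in the repository may differ.

```python
import pandas as pd

# Made-up rows: player_a was traded, so there is one row per team plus a 2TM total.
stats = pd.DataFrame({
    "key_bbref": ["player_a", "player_a", "player_a", "player_b"],
    "Tm": ["AAA", "BBB", "2TM", "CCC"],
    "PA": [200, 150, 350, 600],
})

# Fix 1: keep only the combined row (2TM, 3TM, ...) for multi-team players.
is_combined = stats["Tm"].str.fullmatch(r"\d+TM")
multi_team_ids = set(stats.loc[is_combined, "key_bbref"])
keep = is_combined | ~stats["key_bbref"].isin(multi_team_ids)
deduped = stats[keep].reset_index(drop=True)


# Fix 3: convert pandas values to plain floats before calling min().
def as_float(value):
    """Hypothetical helper: unwrap a one-element Series, then cast to float."""
    if isinstance(value, pd.Series):
        return float(value.iloc[0])
    return float(value)


selected = pd.DataFrame({"so_rate": [7.5]})["so_rate"]  # a Series, not a scalar

try:
    min(selected, 5.0)  # the comparison yields a Series, whose truth value is ambiguous
    raised = False
except ValueError:
    raised = True

capped = min(as_float(selected), 5.0)
```

After the filter, each key_bbref appears once, which is what prevents duplicate keys in the ratings dataframes.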

Documentation:
- Updated CLAUDE.md with comprehensive troubleshooting section
- Added Retrosheet transformation documentation
- Documented defense CSV requirements and column naming
- Added configuration checklist for retrosheet_data.py
- Documented common issues: multi-team players, dictionary corruption, string types

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 16:11:52 -06:00
Cal Corum
c89e1eb507 Claude introduction & Live Series Update
2025-07-22 09:24:34 -05:00
Cal Corum
25d4d9a63c Migrate to rotating file logger
2024-11-10 14:42:12 -06:00
Cal Corum
cdb5820dbc Pitchers are complete
2024-11-01 08:50:29 -05:00
Cal Corum
93b8a230db All pitcher data is built, ready to post data
2024-10-27 23:41:44 -05:00
Cal Corum
3388c4e0c5 Pitching peripherals done
2024-10-26 20:18:54 -05:00
Cal Corum
72968f5e5d Add MLB Player support
2024-05-26 10:53:15 -05:00
Cal Corum
0c77f3971d S7 cleanup and SSS bug fixes
2024-04-28 15:32:05 -05:00
Cal Corum
92e5240e65 Refactor pit/bat/def to modules
2023-11-05 12:18:42 -06:00