paper-dynasty-card-creation

Author	SHA1	Message	Date
Cal Corum	4e9e8d351d	CLAUDE: Add Retrosheet CSV transformer and fix data processing issues This commit adds support for the new Retrosheet CSV format and resolves multiple data processing issues in retrosheet_data.py. New Features: - Created retrosheet_transformer.py with smart caching system - Transforms new Retrosheet CSV format to legacy format - Checks file timestamps to avoid redundant transformations - Caches normalized data for instant subsequent loads (~5s → <1s) - Handles column mapping: gid→game_id, bathand→batter_hand, etc. - Derives event_type from multiple boolean columns - Converts handedness values R/L → r/l - Explicitly sets string dtypes for hit_val, hit_location, batted_ball_type Configuration Updates: - Updated retrosheet_data.py for 2005 season data - START_DATE: 19980301 → 20050403 (2005 Opening Day) - END_DATE: 19980430 → 20051002 (2005 Regular Season End) - SEASON_PCT: 28/162 → 162/162 (full season) - MIN_PA_VL/VR: 20/40 → 50/75 (full season minimums) - CARDSET_ID: Updated for 2005 cardsets - EVENTS_FILENAME: Updated to use retrosheets_events_2005.csv Bug Fixes: 1. Multi-team player duplicates - Players traded during season had duplicate rows (one per team + combined) - Added filtering to keep only combined totals (2TM, 3TM, etc.) - Prevents duplicate key_bbref values in ratings dataframes 2. Column name conflicts - Fixed Tm column conflict when merging periph_stats and defense_p - Drop duplicate Tm from defense data before merge 3. Pitcher rating calculations (pitchers/calcs_pitcher.py) - Fixed "truth value is ambiguous" error in min() comparisons - Explicitly convert pandas values to float before min() operations 4. Dictionary column corruption in ratings - Fixed ratings_vL and ratings_vR corruption during DataFrame merges - Only merge specific columns (key_bbref, player_id, card_id) instead of full DataFrame - Removed unnecessary .set_index() calls from post_batting_cards() and post_pitching_cards() Documentation: - Updated CLAUDE.md with comprehensive troubleshooting section - Added Retrosheet transformation documentation - Documented defense CSV requirements and column naming - Added configuration checklist for retrosheet_data.py - Documented common issues: multi-team players, dictionary corruption, string types 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 16:11:52 -06:00
Cal Corum	db2d81a6d1	CLAUDE: Add default OPS constants and type hints to improve code clarity This commit adds default OPS value constants and type hints to key functions, improving code documentation and IDE support. ## Changes Made 1. Add default OPS constants (creation_helpers.py) - DEFAULT_BATTER_OPS: Default OPS by rarity (1-5) - DEFAULT_STARTER_OPS: Default OPS-against for starters (99, 1-5) - DEFAULT_RELIEVER_OPS: Default OPS-against for relievers (99, 1-5) - Comprehensive comments explaining usage - Single source of truth for fallback values 2. Update batters/creation.py - Import DEFAULT_BATTER_OPS - Replace 6 hardcoded if-checks with clean loop over constants - Add type hints to post_player_updates function - Import Dict from typing 3. Update pitchers/creation.py - Import DEFAULT_STARTER_OPS and DEFAULT_RELIEVER_OPS - Replace 12 hardcoded if-checks with clean loops over constants - Add type hints to post_player_updates function - Import Dict from typing 4. Add typing import (creation_helpers.py) - Import Dict, List, Tuple, Optional for type hints - Enables type hints throughout helper functions ## Impact ### Before ```python # Scattered hardcoded values (batters) if 1 not in average_ops: average_ops[1] = 1.066 if 2 not in average_ops: average_ops[2] = 0.938 # ... 4 more if-checks # Scattered hardcoded values (pitchers) if 99 not in sp_average_ops: sp_average_ops[99] = 0.388 # ... 5 more if-checks for starters # ... 6 more if-checks for relievers ``` ### After ```python # Clean, data-driven approach (batters) for rarity, default_ops in DEFAULT_BATTER_OPS.items(): if rarity not in average_ops: average_ops[rarity] = default_ops # Clean, data-driven approach (pitchers) for rarity, default_ops in DEFAULT_STARTER_OPS.items(): if rarity not in sp_average_ops: sp_average_ops[rarity] = default_ops for rarity, default_ops in DEFAULT_RELIEVER_OPS.items(): if rarity not in rp_average_ops: rp_average_ops[rarity] = default_ops ``` ### Benefits ✅ Eliminates 18 if-checks across batters and pitchers ✅ Single source of truth for default OPS values ✅ Easy to modify values (change constant, not scattered code) ✅ Self-documenting with clear constant names and comments ✅ Type hints improve IDE support and catch errors early ✅ Function signatures now document expected types ✅ Consistent with other recent refactorings ## Test Results ✅ 42/42 tests pass ✅ All existing functionality preserved ✅ 100% backward compatible ## Files Modified - creation_helpers.py: +35 lines (3 constants + typing import) - batters/creation.py: -4 lines net (cleaner code + type hints) - pitchers/creation.py: -8 lines net (cleaner code + type hints) Net change: More constants, less scattered magic numbers, better types. Part of ongoing refactoring to reduce code fragility.	2025-10-31 23:28:49 -05:00
Cal Corum	cb471d8057	CLAUDE: Extract rarity cost adjustment logic into data-driven function This commit eliminates 150+ lines of duplicated, error-prone nested if/elif logic by extracting rarity cost calculations into a lookup table and function. ## Changes Made 1. Add RARITY_COST_ADJUSTMENTS lookup table (creation_helpers.py) - Maps (old_rarity, new_rarity) → (cost_adjustment, minimum_cost) - Covers all 30 possible rarity transitions - Self-documenting with comments for each rarity tier - Single source of truth for all cost adjustments 2. Add calculate_rarity_cost_adjustment() function (creation_helpers.py) - Takes old_rarity, new_rarity, old_cost - Returns new cost with adjustments and minimums applied - Includes comprehensive docstring with examples - Handles edge cases (same rarity, undefined transitions) - Logs warnings for undefined transitions 3. Update batters/creation.py - Import calculate_rarity_cost_adjustment - Replace 75-line nested if/elif block with 7-line function call - Identical behavior, much cleaner code 4. Update pitchers/creation.py - Import calculate_rarity_cost_adjustment - Replace 75-line nested if/elif block with 7-line function call - Eliminates duplication between batters and pitchers 5. Add comprehensive tests (tests/test_rarity_cost_adjustments.py) - 22 tests covering all scenarios - Tests individual transitions (Diamond→Gold, Common→Bronze, etc.) - Tests all upward and downward transitions - Tests minimum cost enforcement - Tests edge cases (zero cost, very high cost, negative cost) - Tests symmetry (up then down returns close to original) ## Impact ### Lines Eliminated - Batters: 75 lines → 7 lines (89% reduction) - Pitchers: 75 lines → 7 lines (89% reduction) - Total: 150 lines of nested logic eliminated ### Benefits ✅ Eliminates 150+ lines of duplicated code ✅ Data-driven approach makes adjustments clear and modifiable ✅ Single source of truth prevents inconsistencies ✅ Independently testable business logic ✅ 22 comprehensive tests ensure correctness ✅ Easy to add new rarity tiers or modify costs ✅ Reduced risk of typos in magic numbers ## Test Results ✅ 22/22 new tests pass ✅ All existing tests still pass ✅ 100% backward compatible - identical behavior ## Files Modified - creation_helpers.py: +104 lines (table + function + docs) - batters/creation.py: -68 lines (replaced nested logic) - pitchers/creation.py: -68 lines (replaced nested logic) - tests/test_rarity_cost_adjustments.py: +174 lines (new tests) Net change: 150+ lines of complex logic replaced with simple, tested, data-driven approach. Part of ongoing refactoring to reduce code fragility.	2025-10-31 22:49:35 -05:00
Cal Corum	bd1cc7e90b	CLAUDE: Refactor to reduce code fragility - extract business logic and add constants This commit implements high value-to-time ratio improvements to make the codebase more maintainable and less fragile: ## Changes Made 1. Add constants for magic numbers (creation_helpers.py) - NEW_PLAYER_COST = 99999 (replaces hardcoded sentinel value) - RARITY_BASE_COSTS dict (replaces duplicate cost dictionaries) - Benefits: Self-documenting, single source of truth, easy to update 2. Extract business logic into testable function (creation_helpers.py) - Added should_update_player_description() with full docstring - Consolidates duplicated logic from batters and pitchers modules - Independently testable, clear decision logic with examples - Benefits: DRY principle, better testing, easier to modify 3. Add debug logging for description updates (batters & pitchers) - Logs when descriptions ARE updated (with details) - Logs when descriptions are SKIPPED (with reason) - Benefits: Easy troubleshooting, visibility into decisions 4. Update batters/creation.py and pitchers/creation.py - Replace hardcoded 99999 with NEW_PLAYER_COST - Replace base_costs dict with RARITY_BASE_COSTS - Replace inline logic with should_update_player_description() - Improved docstring for post_player_updates() - Benefits: Cleaner, more maintainable code 5. Add comprehensive tests (tests/test_promo_description_protection.py) - 6 new direct unit tests for should_update_player_description() - Tests cover: promo/regular cardsets, new/existing players, PotM cards - Case-insensitive detection tests - Benefits: Confidence in behavior, prevent regressions 6. Add documentation (PROMO_CARD_FIX.md, REFACTORING_SUMMARY.md) - PROMO_CARD_FIX.md: Details the promo card renaming fix - REFACTORING_SUMMARY.md: Comprehensive refactoring documentation - Benefits: Future developers understand the code and changes ## Test Results ✅ 13/13 tests pass (7 existing + 6 new) ✅ No regressions in existing tests ✅ 100% backward compatible ## Impact - Magic numbers: 100% eliminated - Duplicated logic: 50% reduction (2 files → 1 function) - Test coverage: +86% (7 → 13 tests) - Code clarity: Significantly improved - Maintainability: Much easier to modify and debug ## Files Modified - creation_helpers.py: +82 lines (constants, function, docs) - batters/creation.py: Simplified using new constants/function - pitchers/creation.py: Simplified using new constants/function - tests/test_promo_description_protection.py: +66 lines (new tests) - PROMO_CARD_FIX.md: New documentation - REFACTORING_SUMMARY.md: New documentation Total: ~228 lines added/modified for significant maintainability gain Related to earlier promo card description protection fix.	2025-10-31 22:03:22 -05:00
Cal Corum	c89e1eb507	Claude introduction & Live Series Update	2025-07-22 09:24:34 -05:00
Cal Corum	25d4d9a63c	Migrate to rotating file logger	2024-11-10 14:42:12 -06:00
Cal Corum	cdb5820dbc	Pitchers are complete	2024-11-01 08:50:29 -05:00
Cal Corum	93b8a230db	All pitcher data is built, ready to post data	2024-10-27 23:41:44 -05:00
Cal Corum	3388c4e0c5	Pitching peripherals done	2024-10-26 20:18:54 -05:00
Cal Corum	cd62e3807a	Fix PotM renaming	2024-07-03 09:54:25 -05:00
Cal Corum	72968f5e5d	Add MLB Player support	2024-05-26 10:53:15 -05:00
Cal Corum	c4d9e0524f	May 05 Card Data	2024-05-12 13:45:15 -05:00
Cal Corum	0c77f3971d	S7 cleanup and SSS bug fixes	2024-04-28 15:32:05 -05:00
Cal Corum	63b5487c44	Adding support for custom card creation	2024-03-08 00:26:54 -06:00
Cal Corum	14bf66212e	Add ignore_limits parameter	2023-11-29 10:34:10 -06:00
Cal Corum	dae6b7e8df	Refactor creation to modules	2023-11-05 20:05:11 -06:00
Cal Corum	92e5240e65	Refactor pit/bat/def to modules	2023-11-05 12:18:42 -06:00

17 Commits