sba-scouting/rust/PHASE3_PROJECT_PLAN.json

{
  "meta": {
    "version": "1.0.0",
    "created": "2026-02-27",
    "lastUpdated": "2026-02-27",
    "planType": "migration",
    "phase": "Phase 3: Calc Layer (League Stats, Matchup Scoring, Score Cache)",
    "description": "Port the matchup calculation pipeline \u2014 league stat distributions, batter/pitcher component scoring, matchup result assembly, and score cache \u2014 from Python to Rust. Builds on the existing standardize_value/weights stubs from Phase 1.",
    "totalEstimatedHours": 14,
    "totalTasks": 9,
    "completedTasks": 9,
    "existingCode": "weights.rs is complete (StatWeight, BATTER_WEIGHTS, PITCHER_WEIGHTS, max scores). matchup.rs has standardize_value and calculate_weighted_score with tests. league_stats.rs has only the StatDistribution struct."
  },
  "categories": {
    "critical": "Foundation \u2014 league stats computation that everything else depends on",
    "high": "Core matchup calculation logic",
    "medium": "Score cache for performance",
    "low": "Cache validation and convenience functions"
  },
  "tasks": [
    {
      "id": "CRIT-001",
      "name": "Implement league stats structs and distribution calculation",
      "description": "Add BatterLeagueStats and PitcherLeagueStats structs to league_stats.rs, plus the _calc_distribution function. This is the mathematical foundation for all scoring.\n\nBatterLeagueStats has 18 StatDistribution fields (9 stats x 2 splits: vlhp, vrhp). PitcherLeagueStats has 18 fields (9 stats x 2 splits: vlhb, vrhb). Stats: so, bb, hit, ob, tb, hr, dp, bphr, bp1b.\n\nCRITICAL MATH: _calc_distribution(values: &[f64]) -> StatDistribution\n- avg: mean of NON-ZERO values only (filter out 0.0, then mean). If fewer than 2 non-zero values, avg = 0.0.\n- stdev: sample standard deviation of ALL values including zeros (Bessel's correction, divide by N-1). If fewer than 2 values, stdev = 1.0. If stdev == 0.0, use 1.0 to prevent division by zero.\n- This asymmetry (avg excludes zeros, stdev includes zeros) is intentional and matches the Python statistics.mean/statistics.stdev behavior.\n\nImplement sample stdev manually: sqrt(sum((x - mean_all)^2) / (n - 1)). Do NOT use the population stdev formula. The mean used in the stdev formula should be the mean of ALL values (including zeros), not the non-zero mean used for the returned avg field.",
      "category": "critical",
      "priority": 1,
      "completed": true,
      "tested": true,
      "dependencies": [],
      "files": [
        {
          "path": "rust/src/calc/league_stats.rs",
          "lines": [
            1,
            6
          ],
          "issue": "Only has StatDistribution struct, no calculation functions"
        },
        {
          "path": "src/sba_scout/calc/league_stats.py",
          "lines": [
            1,
            80
          ],
          "issue": "Python reference \u2014 _calc_distribution and stats structs"
        }
      ],
      "suggestedFix": "1. Add structs:\n```rust\npub struct BatterLeagueStats {\n    pub so_vlhp: StatDistribution, pub bb_vlhp: StatDistribution, ...\n    pub so_vrhp: StatDistribution, pub bb_vrhp: StatDistribution, ...\n}\npub struct PitcherLeagueStats {\n    pub so_vlhb: StatDistribution, pub bb_vlhb: StatDistribution, ...\n    pub so_vrhb: StatDistribution, pub bb_vrhb: StatDistribution, ...\n}\n```\n\n2. Add calc_distribution:\n```rust\nfn calc_distribution(values: &[f64]) -> StatDistribution {\n    let non_zero: Vec<f64> = values.iter().copied().filter(|&v| v > 0.0).collect();\n    let avg = if non_zero.len() >= 2 {\n        non_zero.iter().sum::<f64>() / non_zero.len() as f64\n    } else { 0.0 };\n    \n    let n = values.len();\n    if n < 2 { return StatDistribution { avg, stdev: 1.0 }; }\n    let mean_all = values.iter().sum::<f64>() / n as f64;\n    let variance = values.iter().map(|&x| (x - mean_all).powi(2)).sum::<f64>() / (n - 1) as f64;\n    let stdev = variance.sqrt();\n    let stdev = if stdev == 0.0 { 1.0 } else { stdev };\n    StatDistribution { avg, stdev }\n}\n```\n\n3. Add Default impls that return all StatDistribution { avg: 0.0, stdev: 1.0 }.\n\n4. Write unit tests covering: empty input, single value, all zeros, mixed values with zeros, normal distribution.",
      "estimatedHours": 2,
      "notes": "The stdev formula is the trickiest part. Python's statistics.stdev uses N-1 (sample stdev). The mean used inside the stdev calculation must be the mean of ALL values (including zeros) \u2014 NOT the non-zero mean that's returned as the avg field. These are two different means for two different purposes."
    },
    {
      "id": "CRIT-002",
      "name": "Implement league stats DB queries (calculate + cache)",
      "description": "Add async functions to compute BatterLeagueStats and PitcherLeagueStats from the database, plus an in-memory cache with invalidation.\n\ncalculate_batter_league_stats(pool) fetches ALL batter_cards, extracts each stat column into a Vec<f64>, and calls calc_distribution for each of the 18 stat+split combinations.\n\ncalculate_pitcher_league_stats(pool) does the same for pitcher_cards.\n\nThe cache uses module-level state (OnceLock or tokio::sync::OnceCell) to avoid recomputing on every matchup. clear_league_stats_cache() resets the cache (called after card imports).",
      "category": "critical",
      "priority": 2,
      "completed": true,
      "tested": true,
      "dependencies": [
        "CRIT-001"
      ],
      "files": [
        {
          "path": "rust/src/calc/league_stats.rs",
          "lines": [
            1,
            6
          ],
          "issue": "No DB functions"
        },
        {
          "path": "src/sba_scout/calc/league_stats.py",
          "lines": [
            80,
            160
          ],
          "issue": "Python reference \u2014 calculate_*_league_stats, get_*_league_stats, clear cache"
        }
      ],
      "suggestedFix": "1. Add DB query functions:\n```rust\npub async fn calculate_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats>\npub async fn calculate_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats>\n```\n\nFor batter stats: `SELECT so_vlhp, bb_vlhp, hit_vlhp, ... FROM batter_cards`. Then for each column, collect into Vec<f64> and call calc_distribution.\n\nFor the in-memory cache, use `tokio::sync::OnceCell` or `std::sync::OnceLock` with a `Mutex` wrapper:\n```rust\nstatic BATTER_STATS: OnceLock<Mutex<Option<BatterLeagueStats>>> = OnceLock::new();\nstatic PITCHER_STATS: OnceLock<Mutex<Option<PitcherLeagueStats>>> = OnceLock::new();\n\npub async fn get_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats>\npub async fn get_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats>\npub fn clear_league_stats_cache()\n```\n\n2. Since BatterLeagueStats/PitcherLeagueStats will be shared across threads, they need Clone. The cache getter should return a cloned value.\n\n3. Wire clear_league_stats_cache() into the importer: add a call at the end of import_all_cards in api/importer.rs (replace the Phase 3 TODO comment).",
      "estimatedHours": 2,
      "notes": "The Python cache is never invalidated automatically \u2014 only by explicit clear_league_stats_cache(). Match this behavior. The DB query fetches ALL cards (no season filter) \u2014 this is intentional since league stats should represent the full card pool."
    },
    {
      "id": "HIGH-001",
      "name": "Implement MatchupResult and get_tier",
      "description": "Add the MatchupResult struct and tier assignment function to matchup.rs.\n\nMatchupResult holds the complete output of a matchup calculation: the player, their rating, tier grade, effective batting hand (after switch-hitter resolution), split labels, and individual batter/pitcher component scores.\n\nget_tier maps a rating to A/B/C/D/F tier grades.",
      "category": "high",
      "priority": 3,
      "completed": true,
      "tested": true,
      "dependencies": [],
      "files": [
        {
          "path": "rust/src/calc/matchup.rs",
          "lines": [
            1,
            43
          ],
          "issue": "Has standardize_value and calculate_weighted_score, but no MatchupResult or get_tier"
        },
        {
          "path": "src/sba_scout/calc/matchup.py",
          "lines": [
            1,
            60
          ],
          "issue": "Python reference \u2014 MatchupResult dataclass and get_tier"
        }
      ],
      "suggestedFix": "Add to matchup.rs:\n\n```rust\nuse crate::db::models::Player;\n\n#[derive(Debug, Clone)]\npub struct MatchupResult {\n    pub player: Player,\n    pub rating: Option<f64>,\n    pub tier: String,              // \"A\", \"B\", \"C\", \"D\", \"F\", or \"--\"\n    pub batter_hand: String,       // \"L\" or \"R\" (effective, after switch-hitter)\n    pub batter_split: String,      // \"vLHP\" or \"vRHP\"\n    pub pitcher_split: String,     // \"vLHB\" or \"vRHB\"\n    pub batter_component: Option<f64>,\n    pub pitcher_component: Option<f64>,\n}\n\npub fn get_tier(rating: Option<f64>) -> &'static str {\n    match rating {\n        None => \"--\",\n        Some(r) if r >= 40.0 => \"A\",\n        Some(r) if r >= 20.0 => \"B\",\n        Some(r) if r >= -19.0 => \"C\",\n        Some(r) if r >= -39.0 => \"D\",\n        Some(_) => \"F\",\n    }\n}\n```\n\nAdd display helpers:\n- `rating_display(&self) -> String` \u2014 formats as \"+15\", \"-3\", \"N/A\"\n- `split_display(&self) -> String` \u2014 formats as \"vL/vR\"\n\nWrite tests for get_tier covering all boundaries: 40, 20, 0, -19, -20, -39, -40, None.",
      "estimatedHours": 1,
      "notes": "The tier boundaries are asymmetric: C spans -19 to +19 (39 units), A is >= 40, F is < -39. Player struct needs Clone derive if it doesn't have it already (check models.rs)."
    },
    {
      "id": "HIGH-002",
      "name": "Implement batter and pitcher component calculation",
      "description": "Add _calculate_batter_component and _calculate_pitcher_component functions to matchup.rs. These compute the individual batter and pitcher scores by applying weighted standardization across all 9 stats for the appropriate handedness split.\n\nThe batter component iterates BATTER_WEIGHTS, looks up the stat value on the BatterCard for the correct split (vlhp or vrhp based on pitcher hand), looks up the corresponding distribution from BatterLeagueStats, and sums the weighted scores.\n\nThe pitcher component does the same with PITCHER_WEIGHTS, PitcherCard, and PitcherLeagueStats.",
      "category": "high",
      "priority": 4,
      "completed": true,
      "tested": true,
      "dependencies": [
        "CRIT-001"
      ],
      "files": [
        {
          "path": "rust/src/calc/matchup.rs",
          "lines": [
            36,
            43
          ],
          "issue": "Has calculate_weighted_score but no component functions"
        },
        {
          "path": "src/sba_scout/calc/matchup.py",
          "lines": [
            60,
            130
          ],
          "issue": "Python reference \u2014 _calculate_batter_component and _calculate_pitcher_component"
        }
      ],
      "suggestedFix": "The tricky part is mapping stat name strings to struct fields. Since Rust doesn't have runtime attribute access like Python's getattr(), use a helper that maps stat name + split to the card field value and the league stats distribution.\n\nOption A (recommended): Write explicit match arms:\n```rust\nfn get_batter_stat(card: &BatterCard, stat: &str, vs_hand: &str) -> f64 {\n    match (stat, vs_hand) {\n        (\"so\", \"L\") => card.so_vlhp.unwrap_or(0.0),\n        (\"so\", \"R\") => card.so_vrhp.unwrap_or(0.0),\n        (\"bb\", \"L\") => card.bb_vlhp.unwrap_or(0.0),\n        // ... all 18 combinations\n    }\n}\nfn get_batter_dist<'a>(stats: &'a BatterLeagueStats, stat: &str, vs_hand: &str) -> &'a StatDistribution {\n    match (stat, vs_hand) {\n        (\"so\", \"L\") => &stats.so_vlhp,\n        // ... all 18\n    }\n}\n```\n\nThen the component functions are clean:\n```rust\nfn calculate_batter_component(card: &BatterCard, pitcher_hand: &str, league_stats: &BatterLeagueStats) -> f64 {\n    BATTER_WEIGHTS.iter().map(|(stat, weight)| {\n        let value = get_batter_stat(card, stat, pitcher_hand);\n        let dist = get_batter_dist(league_stats, stat, pitcher_hand);\n        calculate_weighted_score(value, dist, weight)\n    }).sum()\n}\n```\n\nSame pattern for pitcher. Result range: batter [-66, +66], pitcher [-69, +69].\n\nWrite tests with known distributions and card values to verify correct summation.",
      "estimatedHours": 2.5,
      "notes": "vs_hand for batter component is the PITCHER's throwing hand (L or R). vs_hand for pitcher component is the BATTER's effective batting hand. The caller resolves switch hitters before calling these functions. All BatterCard stat fields are Option<f64> \u2014 unwrap_or(0.0) matches the Python behavior where None \u2192 standardize_value returns 3."
    },
    {
      "id": "HIGH-003",
      "name": "Implement calculate_matchup and calculate_team_matchups",
      "description": "Add the main matchup orchestration functions to matchup.rs. These handle switch-hitter resolution, call the component functions, combine scores with pitcher inversion, and assign tiers.\n\ncalculate_matchup: single batter vs single pitcher. Resolves effective batting hand for switch hitters (S bats left vs RHP, right vs LHP). Combines batter_component + (-pitcher_component) for total rating.\n\ncalculate_team_matchups: runs calculate_matchup for a list of batters, sorts by rating descending (None-rated last).",
      "category": "high",
      "priority": 5,
      "completed": true,
      "tested": true,
      "dependencies": [
        "HIGH-001",
        "HIGH-002"
      ],
      "files": [
        {
          "path": "rust/src/calc/matchup.rs",
          "lines": [],
          "issue": "No matchup orchestration functions"
        },
        {
          "path": "src/sba_scout/calc/matchup.py",
          "lines": [
            130,
            230
          ],
          "issue": "Python reference \u2014 calculate_matchup, calculate_team_matchups"
        }
      ],
      "suggestedFix": "```rust\npub fn calculate_matchup(\n    player: &Player,\n    batter_card: Option<&BatterCard>,\n    pitcher: &Player,\n    pitcher_card: Option<&PitcherCard>,\n    batter_league_stats: &BatterLeagueStats,\n    pitcher_league_stats: &PitcherLeagueStats,\n) -> MatchupResult\n```\n\nSwitch-hitter resolution:\n```rust\nlet batter_hand = player.hand.as_deref().unwrap_or(\"R\");\nlet pitcher_hand = pitcher.hand.as_deref().unwrap_or(\"R\");\nlet effective_batting_hand = if batter_hand == \"S\" {\n    if pitcher_hand == \"R\" { \"L\" } else { \"R\" }\n} else { batter_hand };\nlet batter_split = if pitcher_hand == \"L\" { \"vLHP\" } else { \"vRHP\" };\nlet pitcher_split = if effective_batting_hand == \"L\" { \"vLHB\" } else { \"vRHB\" };\n```\n\nScore combination:\n```rust\nlet batter_component = calculate_batter_component(card, pitcher_hand, batter_stats);\nlet pitcher_component = pitcher_card.map(|pc| calculate_pitcher_component(pc, effective_batting_hand, pitcher_stats));\nlet total = match pitcher_component {\n    Some(pc) => batter_component + (-pc),   // INVERT pitcher score\n    None => batter_component,\n};\n```\n\ncalculate_team_matchups: iterate batters, call calculate_matchup for each, sort by (has_rating desc, rating desc).\n\nTests: switch-hitter resolution (S vs R = bats L, S vs L = bats R), batter-only matchup (no pitcher card), full matchup with inversion.",
      "estimatedHours": 2,
      "notes": "The pitcher component is INVERTED (negated) when combining. A high pitcher score means the pitcher is good, which is BAD for the batter \u2014 hence the negation. This is the most important sign convention to get right."
    },
    {
      "id": "MED-001",
      "name": "Add StandardizedScoreCache DB queries",
      "description": "Add query functions for the StandardizedScoreCache table to db/queries.rs. These are used by both the cache rebuild and the cached matchup path.",
      "category": "medium",
      "priority": 6,
      "completed": true,
      "tested": true,
      "dependencies": [],
      "files": [
        {
          "path": "rust/src/db/queries.rs",
          "lines": [],
          "issue": "Has MatchupCache queries but no StandardizedScoreCache queries"
        },
        {
          "path": "src/sba_scout/calc/score_cache.py",
          "lines": [
            120,
            160
          ],
          "issue": "Python reference \u2014 get_cached_batter_score, get_cached_pitcher_score"
        }
      ],
      "suggestedFix": "Add to queries.rs:\n\n```rust\npub async fn get_cached_batter_score(\n    pool: &SqlitePool,\n    batter_card_id: i64,\n    split: &str,\n) -> Result<Option<StandardizedScoreCache>>\n```\n`SELECT * FROM standardized_score_cache WHERE batter_card_id = ? AND split = ?`\n\n```rust\npub async fn get_cached_pitcher_score(\n    pool: &SqlitePool,\n    pitcher_card_id: i64,\n    split: &str,\n) -> Result<Option<StandardizedScoreCache>>\n```\n`SELECT * FROM standardized_score_cache WHERE pitcher_card_id = ? AND split = ?`\n\n```rust\npub async fn clear_score_cache(pool: &SqlitePool) -> Result<u64>\n```\n`DELETE FROM standardized_score_cache` \u2014 returns rows_affected.\n\n```rust\npub async fn insert_score_cache(\n    pool: &SqlitePool,\n    batter_card_id: Option<i64>,\n    pitcher_card_id: Option<i64>,\n    split: &str,\n    total_score: f64,\n    stat_scores: &str,\n    weights_hash: &str,\n    league_stats_hash: &str,\n) -> Result<()>\n```\n`INSERT INTO standardized_score_cache (...) VALUES (...)`",
      "estimatedHours": 1,
      "notes": "The StandardizedScoreCache model already exists in models.rs with correct fields. These are straightforward single-table queries."
    },
    {
      "id": "MED-002",
      "name": "Implement score cache rebuild (score_cache.rs)",
      "description": "Create calc/score_cache.rs with the cache rebuild logic: compute per-stat scores for every card\u00d7split combination, serialize as JSON, and insert into the StandardizedScoreCache table.\n\nAlso implement the hash functions for cache validity checking.",
      "category": "medium",
      "priority": 7,
      "completed": true,
      "tested": true,
      "dependencies": [
        "CRIT-002",
        "MED-001"
      ],
      "files": [
        {
          "path": "rust/src/calc/mod.rs",
          "lines": [
            1,
            3
          ],
          "issue": "No score_cache module"
        },
        {
          "path": "src/sba_scout/calc/score_cache.py",
          "lines": [
            1,
            120
          ],
          "issue": "Python reference \u2014 full score cache implementation"
        }
      ],
      "suggestedFix": "Create `rust/src/calc/score_cache.rs` and add `pub mod score_cache;` to calc/mod.rs.\n\n1. Define StatScore:\n```rust\n#[derive(Debug, Serialize, Deserialize)]\npub struct StatScore {\n    pub raw: f64,\n    pub std: i32,\n    pub weighted: f64,\n}\n```\n\n2. Hash functions:\n```rust\nfn compute_weights_hash() -> String\n```\nSerialize BATTER_WEIGHTS + PITCHER_WEIGHTS as JSON, SHA-256, take first 16 hex chars. Use the `sha2` crate (already in Cargo.toml).\n\n```rust\nfn compute_league_stats_hash(batter: &BatterLeagueStats, pitcher: &PitcherLeagueStats) -> String\n```\nHash 5 representative values: batter hit_vrhp avg/stdev, batter so_vrhp avg, pitcher hit_vrhb avg, pitcher so_vrhb avg.\n\n3. Split score calculation:\n```rust\nfn calculate_batter_split_scores(card: &BatterCard, split: &str, stats: &BatterLeagueStats) -> (f64, HashMap<String, StatScore>)\nfn calculate_pitcher_split_scores(card: &PitcherCard, split: &str, stats: &PitcherLeagueStats) -> (f64, HashMap<String, StatScore>)\n```\nIterate BATTER/PITCHER_WEIGHTS, compute raw/std/weighted for each stat, collect into HashMap, sum total.\n\n4. Main rebuild:\n```rust\npub async fn rebuild_score_cache(pool: &SqlitePool) -> Result<CacheRebuildResult>\n```\n- Compute league stats\n- Clear existing cache\n- For each batter card: compute vlhp + vrhp splits, insert both\n- For each pitcher card: compute vlhb + vrhb splits, insert both\n- Use transaction for atomicity\n- Return counts\n\n5. Validity check:\n```rust\npub async fn is_cache_valid(pool: &SqlitePool) -> Result<bool>\npub async fn ensure_cache_exists(pool: &SqlitePool) -> Result<()>\n```",
      "estimatedHours": 2.5,
      "notes": "The stat_scores HashMap is serialized to JSON string via serde_json::to_string for DB storage. The Python version uses SQLAlchemy's JSON column type which auto-serializes. In Rust, serialize explicitly before passing to insert_score_cache."
    },
    {
      "id": "MED-003",
      "name": "Implement cached matchup functions",
      "description": "Add calculate_matchup_cached and calculate_team_matchups_cached async functions to matchup.rs. These use the StandardizedScoreCache table instead of computing from raw card data, for faster UI rendering.",
      "category": "medium",
      "priority": 8,
      "completed": true,
      "tested": true,
      "dependencies": [
        "HIGH-003",
        "MED-001"
      ],
      "files": [
        {
          "path": "rust/src/calc/matchup.rs",
          "lines": [],
          "issue": "No cached matchup path"
        },
        {
          "path": "src/sba_scout/calc/matchup.py",
          "lines": [
            230,
            310
          ],
          "issue": "Python reference \u2014 calculate_matchup_cached, calculate_team_matchups_cached"
        }
      ],
      "suggestedFix": "```rust\npub async fn calculate_matchup_cached(\n    pool: &SqlitePool,\n    player: &Player,\n    batter_card: Option<&BatterCard>,\n    pitcher: &Player,\n    pitcher_card: Option<&PitcherCard>,\n) -> Result<MatchupResult>\n```\n\n1. Same switch-hitter resolution as calculate_matchup\n2. Convert split labels to DB keys: \"vLHP\" \u2192 \"vlhp\", \"vRHP\" \u2192 \"vrhp\", \"vLHB\" \u2192 \"vlhb\", \"vRHB\" \u2192 \"vrhb\"\n3. DB lookup: get_cached_batter_score(pool, card.id, batter_split_key)\n4. If batter cache miss: return MatchupResult with rating=None, tier=\"--\"\n5. DB lookup: get_cached_pitcher_score (only if pitcher_card exists)\n6. Combine: batter_score + (-pitcher_score), same inversion as real-time\n\n```rust\npub async fn calculate_team_matchups_cached(\n    pool: &SqlitePool,\n    batters: &[(Player, Option<BatterCard>)],\n    pitcher: &Player,\n    pitcher_card: Option<&PitcherCard>,\n) -> Result<Vec<MatchupResult>>\n```\nIterate, call cached version, sort same as real-time.",
      "estimatedHours": 1.5,
      "notes": "The cached path avoids recomputing league stats and standardization on every request. The cache should be rebuilt after card imports (ensure_cache_exists) and when weights change (is_cache_valid check)."
    },
    {
      "id": "LOW-001",
      "name": "Wire cache rebuild into importer and add Phase 3 integration test",
      "description": "Replace the TODO comment in api/importer.rs import_all_cards with an actual call to rebuild_score_cache. Add an integration test that imports test CSV data, rebuilds the cache, and verifies cached matchup scores match real-time scores.",
      "category": "low",
      "priority": 9,
      "completed": true,
      "tested": true,
      "dependencies": [
        "MED-002"
      ],
      "files": [
        {
          "path": "rust/src/api/importer.rs",
          "lines": [
            513
          ],
          "issue": "TODO comment for Phase 3 cache rebuild"
        }
      ],
      "suggestedFix": "1. In import_all_cards, replace the TODO with:\n```rust\nif batters.imported > 0 || pitchers.imported > 0 {\n    crate::calc::league_stats::clear_league_stats_cache();\n    crate::calc::score_cache::rebuild_score_cache(pool).await?;\n}\n```\n\n2. Create tests/calc_integration.rs:\n- Create in-memory SQLite DB\n- Insert a few test players, batter cards, pitcher cards with known stat values\n- Call rebuild_score_cache\n- Call calculate_matchup (real-time) and calculate_matchup_cached\n- Assert both produce the same rating and tier\n- Test switch-hitter resolution\n- Test batter-only matchup (no pitcher card)",
      "estimatedHours": 1.5,
      "notes": "This is the integration point that ties Phase 2 (import) to Phase 3 (calc). The test doesn't need real CSV files \u2014 just insert data directly into the DB."
    }
  ],
  "quickWins": [
    {
      "taskId": "HIGH-001",
      "estimatedMinutes": 30,
      "impact": "MatchupResult and get_tier are self-contained, no dependencies"
    },
    {
      "taskId": "MED-001",
      "estimatedMinutes": 30,
      "impact": "Simple CRUD queries, no complex logic"
    }
  ],
  "productionBlockers": [
    {
      "taskId": "CRIT-001",
      "reason": "League stats are the input to all scoring. Get the math wrong here and everything downstream is wrong."
    },
    {
      "taskId": "CRIT-002",
      "reason": "Without computed league stats from the DB, no matchup can be scored."
    }
  ],
  "weeklyRoadmap": {
    "session1": {
      "theme": "League Stats + Matchup Foundation",
      "tasks": [
        "CRIT-001",
        "HIGH-001",
        "MED-001"
      ],
      "estimatedHours": 4,
      "notes": "calc_distribution math, MatchupResult/get_tier, and cache DB queries. All independent \u2014 can run in parallel."
    },
    "session2": {
      "theme": "Component Scoring + DB Integration",
      "tasks": [
        "CRIT-002",
        "HIGH-002"
      ],
      "estimatedHours": 4.5,
      "notes": "League stats from DB + batter/pitcher component scoring. CRIT-002 depends on CRIT-001."
    },
    "session3": {
      "theme": "Matchup Assembly + Cache",
      "tasks": [
        "HIGH-003",
        "MED-002",
        "MED-003",
        "LOW-001"
      ],
      "estimatedHours": 7.5,
      "notes": "Full matchup pipeline, score cache rebuild, cached matchup path, and integration test. This completes Phase 3."
    }
  },
  "architecturalDecisions": {
    "stat_field_access_via_match": "Use explicit match arms to map stat name + split to struct fields, since Rust doesn't have runtime getattr(). Verbose but compile-time safe. A macro could reduce boilerplate but adds complexity.",
    "sample_stdev": "Use sample standard deviation (N-1 denominator / Bessel's correction) to match Python's statistics.stdev. Implement manually \u2014 no external stats crate needed for just mean + stdev.",
    "avg_excludes_zeros_stdev_includes": "The avg field excludes zero values (AVERAGEIF semantics) while stdev includes all values including zeros. This asymmetry is intentional and critical to preserve.",
    "league_stats_cache": "Use OnceLock<Mutex<Option<T>>> for thread-safe in-memory cache with explicit invalidation. OnceLock for lazy initialization, Mutex for interior mutability, Option for nullable cache state.",
    "score_cache_json_serialization": "Serialize stat_scores HashMap as JSON string via serde_json before DB insert. Deserialize on read when needed. Matches Python's SQLAlchemy JSON column behavior.",
    "no_external_stats_crate": "Implement mean and sample stdev manually (< 10 lines) rather than adding a dependency like statrs. The math is simple enough that a crate adds more weight than value."
  },
  "testingStrategy": {
    "calc_distribution": "Test with known data sets where avg and stdev can be hand-computed. Include edge cases: empty, single value, all same, all zeros, mix of zeros and non-zeros.",
    "component_scoring": "Create a BatterCard/PitcherCard with known values and a BatterLeagueStats/PitcherLeagueStats with known distributions. Verify component scores match hand-calculated expectations.",
    "matchup_assembly": "Test switch-hitter resolution (3 cases: L, R, S), pitcher inversion sign, batter-only fallback, None card handling.",
    "cache_round_trip": "Insert cards \u2192 rebuild cache \u2192 verify cached scores match real-time computed scores for the same inputs. This is the ultimate correctness check.",
    "tier_boundaries": "Test get_tier at exact boundary values: 40, 39, 20, 19, 0, -19, -20, -39, -40, None."
  }
}