Implement Phase 3: calc layer with matchup scoring, league stats, and score cache

Port the Python calc layer to Rust: league stat distributions (avg excludes zeros,
stdev N-1 includes zeros), weighted standardized matchup scoring with switch-hitter
resolution and pitcher inversion, and SHA-256-validated score cache with automatic
rebuild after card imports. 105 tests passing (76 unit + 5 integration + 24 DB).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cal Corum 2026-02-27 23:35:01 -06:00
parent 3c70ecc71a
commit ebe4196bfc
9 changed files with 2011 additions and 8 deletions
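
The distribution rule named in the commit message (avg over non-zero values, sample stdev over all values) can be checked by hand. The sketch below restates it as a standalone function, on the assumption that it mirrors `calc_distribution` from this commit. The data set `[0, 0, 4, 6]` in the note that follows is illustrative and not taken from the repository.

```rust
/// Mean and spread of one stat across the league.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct StatDistribution {
    pub avg: f64,
    pub stdev: f64,
}

/// Sketch of the asymmetric rule: `avg` ignores zeros, `stdev` does not.
pub fn calc_distribution(values: &[f64]) -> StatDistribution {
    // avg: mean of strictly positive values; 0.0 when fewer than two exist.
    let non_zero: Vec<f64> = values.iter().copied().filter(|&v| v > 0.0).collect();
    let avg = if non_zero.len() >= 2 {
        non_zero.iter().sum::<f64>() / non_zero.len() as f64
    } else {
        0.0
    };

    // stdev: sample standard deviation (N-1 denominator) of ALL values.
    let n = values.len();
    if n < 2 {
        return StatDistribution { avg, stdev: 1.0 };
    }
    // This mean includes zeros and is deliberately different from `avg`.
    let mean_all = values.iter().sum::<f64>() / n as f64;
    let variance =
        values.iter().map(|&x| (x - mean_all).powi(2)).sum::<f64>() / (n - 1) as f64;
    let stdev = variance.sqrt();

    // A zero spread would divide by zero downstream, so fall back to 1.0.
    StatDistribution { avg, stdev: if stdev == 0.0 { 1.0 } else { stdev } }
}
```

For `[0, 0, 4, 6]` the non-zero mean is (4 + 6) / 2 = 5, while the all-values mean is 2.5. The squared deviations from 2.5 sum to 27, so the sample variance is 27 / 3 = 9 and the stdev is 3. Note that the `v > 0.0` filter also drops negative values, which only matters if a stat column can go below zero.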


@@ -0,0 +1,299 @@
{
"meta": {
"version": "1.0.0",
"created": "2026-02-27",
"lastUpdated": "2026-02-27",
"planType": "migration",
"phase": "Phase 3: Calc Layer (League Stats, Matchup Scoring, Score Cache)",
"description": "Port the matchup calculation pipeline (league stat distributions, batter/pitcher component scoring, matchup result assembly, and score cache) from Python to Rust. Builds on the existing standardize_value/weights stubs from Phase 1.",
"totalEstimatedHours": 16,
"totalTasks": 9,
"completedTasks": 0,
"existingCode": "weights.rs is complete (StatWeight, BATTER_WEIGHTS, PITCHER_WEIGHTS, max scores). matchup.rs has standardize_value and calculate_weighted_score with tests. league_stats.rs has only the StatDistribution struct."
},
"categories": {
"critical": "Foundation — league stats computation that everything else depends on",
"high": "Core matchup calculation logic",
"medium": "Score cache for performance",
"low": "Cache validation and convenience functions"
},
"tasks": [
{
"id": "CRIT-001",
"name": "Implement league stats structs and distribution calculation",
"description": "Add BatterLeagueStats and PitcherLeagueStats structs to league_stats.rs, plus the _calc_distribution function. This is the mathematical foundation for all scoring.\n\nBatterLeagueStats has 18 StatDistribution fields (9 stats x 2 splits: vlhp, vrhp). PitcherLeagueStats has 18 fields (9 stats x 2 splits: vlhb, vrhb). Stats: so, bb, hit, ob, tb, hr, dp, bphr, bp1b.\n\nCRITICAL MATH: _calc_distribution(values: &[f64]) -> StatDistribution\n- avg: mean of NON-ZERO values only (filter out 0.0, then mean). If fewer than 2 non-zero values, avg = 0.0.\n- stdev: sample standard deviation of ALL values including zeros (Bessel's correction, divide by N-1). If fewer than 2 values, stdev = 1.0. If stdev == 0.0, use 1.0 to prevent division by zero.\n- This asymmetry (avg excludes zeros, stdev includes zeros) is intentional and matches how the Python reference uses statistics.mean (on the non-zero values) and statistics.stdev (on all values).\n\nImplement sample stdev manually: sqrt(sum((x - mean_all)^2) / (n - 1)). Do NOT use the population stdev formula. The mean used in the stdev formula should be the mean of ALL values (including zeros), not the non-zero mean used for the returned avg field.",
"category": "critical",
"priority": 1,
"completed": false,
"tested": false,
"dependencies": [],
"files": [
{
"path": "rust/src/calc/league_stats.rs",
"lines": [1, 6],
"issue": "Only has StatDistribution struct, no calculation functions"
},
{
"path": "src/sba_scout/calc/league_stats.py",
"lines": [1, 80],
"issue": "Python reference — _calc_distribution and stats structs"
}
],
"suggestedFix": "1. Add structs:\n```rust\npub struct BatterLeagueStats {\n pub so_vlhp: StatDistribution, pub bb_vlhp: StatDistribution, ...\n pub so_vrhp: StatDistribution, pub bb_vrhp: StatDistribution, ...\n}\npub struct PitcherLeagueStats {\n pub so_vlhb: StatDistribution, pub bb_vlhb: StatDistribution, ...\n pub so_vrhb: StatDistribution, pub bb_vrhb: StatDistribution, ...\n}\n```\n\n2. Add calc_distribution:\n```rust\nfn calc_distribution(values: &[f64]) -> StatDistribution {\n let non_zero: Vec<f64> = values.iter().copied().filter(|&v| v > 0.0).collect();\n let avg = if non_zero.len() >= 2 {\n non_zero.iter().sum::<f64>() / non_zero.len() as f64\n } else { 0.0 };\n \n let n = values.len();\n if n < 2 { return StatDistribution { avg, stdev: 1.0 }; }\n let mean_all = values.iter().sum::<f64>() / n as f64;\n let variance = values.iter().map(|&x| (x - mean_all).powi(2)).sum::<f64>() / (n - 1) as f64;\n let stdev = variance.sqrt();\n let stdev = if stdev == 0.0 { 1.0 } else { stdev };\n StatDistribution { avg, stdev }\n}\n```\n\n3. Add Default impls that return all StatDistribution { avg: 0.0, stdev: 1.0 }.\n\n4. Write unit tests covering: empty input, single value, all zeros, mixed values with zeros, normal distribution.",
"estimatedHours": 2,
"notes": "The stdev formula is the trickiest part. Python's statistics.stdev uses N-1 (sample stdev). The mean used inside the stdev calculation must be the mean of ALL values (including zeros) — NOT the non-zero mean that's returned as the avg field. These are two different means for two different purposes."
},
{
"id": "CRIT-002",
"name": "Implement league stats DB queries (calculate + cache)",
"description": "Add async functions to compute BatterLeagueStats and PitcherLeagueStats from the database, plus an in-memory cache with invalidation.\n\ncalculate_batter_league_stats(pool) fetches ALL batter_cards, extracts each stat column into a Vec<f64>, and calls calc_distribution for each of the 18 stat+split combinations.\n\ncalculate_pitcher_league_stats(pool) does the same for pitcher_cards.\n\nThe cache uses module-level state (OnceLock or tokio::sync::OnceCell) to avoid recomputing on every matchup. clear_league_stats_cache() resets the cache (called after card imports).",
"category": "critical",
"priority": 2,
"completed": false,
"tested": false,
"dependencies": ["CRIT-001"],
"files": [
{
"path": "rust/src/calc/league_stats.rs",
"lines": [1, 6],
"issue": "No DB functions"
},
{
"path": "src/sba_scout/calc/league_stats.py",
"lines": [80, 160],
"issue": "Python reference — calculate_*_league_stats, get_*_league_stats, clear cache"
}
],
"suggestedFix": "1. Add DB query functions:\n```rust\npub async fn calculate_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats>\npub async fn calculate_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats>\n```\n\nFor batter stats: `SELECT so_vlhp, bb_vlhp, hit_vlhp, ... FROM batter_cards`. Then for each column, collect into Vec<f64> and call calc_distribution.\n\nFor the in-memory cache, use `tokio::sync::OnceCell` or `std::sync::OnceLock` with a `Mutex` wrapper:\n```rust\nstatic BATTER_STATS: OnceLock<Mutex<Option<BatterLeagueStats>>> = OnceLock::new();\nstatic PITCHER_STATS: OnceLock<Mutex<Option<PitcherLeagueStats>>> = OnceLock::new();\n\npub async fn get_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats>\npub async fn get_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats>\npub fn clear_league_stats_cache()\n```\n\n2. The cache getter cannot hand out a reference that outlives the Mutex guard, so it should return a cloned value. BatterLeagueStats/PitcherLeagueStats therefore need to derive Clone.\n\n3. Wire clear_league_stats_cache() into the importer: add a call at the end of import_all_cards in api/importer.rs (replace the Phase 3 TODO comment).",
"estimatedHours": 2,
"notes": "The Python cache is never invalidated automatically — only by explicit clear_league_stats_cache(). Match this behavior. The DB query fetches ALL cards (no season filter) — this is intentional since league stats should represent the full card pool."
},
{
"id": "HIGH-001",
"name": "Implement MatchupResult and get_tier",
"description": "Add the MatchupResult struct and tier assignment function to matchup.rs.\n\nMatchupResult holds the complete output of a matchup calculation: the player, their rating, tier grade, effective batting hand (after switch-hitter resolution), split labels, and individual batter/pitcher component scores.\n\nget_tier maps a rating to A/B/C/D/F tier grades.",
"category": "high",
"priority": 3,
"completed": false,
"tested": false,
"dependencies": [],
"files": [
{
"path": "rust/src/calc/matchup.rs",
"lines": [1, 43],
"issue": "Has standardize_value and calculate_weighted_score, but no MatchupResult or get_tier"
},
{
"path": "src/sba_scout/calc/matchup.py",
"lines": [1, 60],
"issue": "Python reference — MatchupResult dataclass and get_tier"
}
],
"suggestedFix": "Add to matchup.rs:\n\n```rust\nuse crate::db::models::Player;\n\n#[derive(Debug, Clone)]\npub struct MatchupResult {\n pub player: Player,\n pub rating: Option<f64>,\n pub tier: String, // \"A\", \"B\", \"C\", \"D\", \"F\", or \"--\"\n pub batter_hand: String, // \"L\" or \"R\" (effective, after switch-hitter)\n pub batter_split: String, // \"vLHP\" or \"vRHP\"\n pub pitcher_split: String, // \"vLHB\" or \"vRHB\"\n pub batter_component: Option<f64>,\n pub pitcher_component: Option<f64>,\n}\n\npub fn get_tier(rating: Option<f64>) -> &'static str {\n match rating {\n None => \"--\",\n Some(r) if r >= 40.0 => \"A\",\n Some(r) if r >= 20.0 => \"B\",\n Some(r) if r >= -19.0 => \"C\",\n Some(r) if r >= -39.0 => \"D\",\n Some(_) => \"F\",\n }\n}\n```\n\nAdd display helpers:\n- `rating_display(&self) -> String` — formats as \"+15\", \"-3\", \"N/A\"\n- `split_display(&self) -> String` — formats as \"vL/vR\"\n\nWrite tests for get_tier covering all boundaries: 40, 20, 0, -19, -20, -39, -40, None.",
"estimatedHours": 1,
"notes": "The tier boundaries are asymmetric: C spans -19 to +19 (39 units), A is >= 40, F is < -39. Player struct needs Clone derive if it doesn't have it already (check models.rs)."
},
{
"id": "HIGH-002",
"name": "Implement batter and pitcher component calculation",
"description": "Add _calculate_batter_component and _calculate_pitcher_component functions to matchup.rs. These compute the individual batter and pitcher scores by applying weighted standardization across all 9 stats for the appropriate handedness split.\n\nThe batter component iterates BATTER_WEIGHTS, looks up the stat value on the BatterCard for the correct split (vlhp or vrhp based on pitcher hand), looks up the corresponding distribution from BatterLeagueStats, and sums the weighted scores.\n\nThe pitcher component does the same with PITCHER_WEIGHTS, PitcherCard, and PitcherLeagueStats.",
"category": "high",
"priority": 4,
"completed": false,
"tested": false,
"dependencies": ["CRIT-001"],
"files": [
{
"path": "rust/src/calc/matchup.rs",
"lines": [36, 43],
"issue": "Has calculate_weighted_score but no component functions"
},
{
"path": "src/sba_scout/calc/matchup.py",
"lines": [60, 130],
"issue": "Python reference — _calculate_batter_component and _calculate_pitcher_component"
}
],
"suggestedFix": "The tricky part is mapping stat name strings to struct fields. Since Rust doesn't have runtime attribute access like Python's getattr(), use a helper that maps stat name + split to the card field value and the league stats distribution.\n\nOption A (recommended): Write explicit match arms:\n```rust\nfn get_batter_stat(card: &BatterCard, stat: &str, vs_hand: &str) -> f64 {\n match (stat, vs_hand) {\n (\"so\", \"L\") => card.so_vlhp.unwrap_or(0.0),\n (\"so\", \"R\") => card.so_vrhp.unwrap_or(0.0),\n (\"bb\", \"L\") => card.bb_vlhp.unwrap_or(0.0),\n // ... all 18 combinations\n }\n}\nfn get_batter_dist<'a>(stats: &'a BatterLeagueStats, stat: &str, vs_hand: &str) -> &'a StatDistribution {\n match (stat, vs_hand) {\n (\"so\", \"L\") => &stats.so_vlhp,\n // ... all 18\n }\n}\n```\n\nThen the component functions are clean:\n```rust\nfn calculate_batter_component(card: &BatterCard, pitcher_hand: &str, league_stats: &BatterLeagueStats) -> f64 {\n BATTER_WEIGHTS.iter().map(|(stat, weight)| {\n let value = get_batter_stat(card, stat, pitcher_hand);\n let dist = get_batter_dist(league_stats, stat, pitcher_hand);\n calculate_weighted_score(value, dist, weight)\n }).sum()\n}\n```\n\nSame pattern for pitcher. Result range: batter [-66, +66], pitcher [-69, +69].\n\nWrite tests with known distributions and card values to verify correct summation.",
"estimatedHours": 2.5,
"notes": "vs_hand for batter component is the PITCHER's throwing hand (L or R). vs_hand for pitcher component is the BATTER's effective batting hand. The caller resolves switch hitters before calling these functions. All BatterCard stat fields are Option<f64> — unwrap_or(0.0) matches the Python behavior where None → standardize_value returns 3."
},
{
"id": "HIGH-003",
"name": "Implement calculate_matchup and calculate_team_matchups",
"description": "Add the main matchup orchestration functions to matchup.rs. These handle switch-hitter resolution, call the component functions, combine scores with pitcher inversion, and assign tiers.\n\ncalculate_matchup: single batter vs single pitcher. Resolves effective batting hand for switch hitters (S bats left vs RHP, right vs LHP). Combines batter_component + (-pitcher_component) for total rating.\n\ncalculate_team_matchups: runs calculate_matchup for a list of batters, sorts by rating descending (None-rated last).",
"category": "high",
"priority": 5,
"completed": false,
"tested": false,
"dependencies": ["HIGH-001", "HIGH-002"],
"files": [
{
"path": "rust/src/calc/matchup.rs",
"lines": [],
"issue": "No matchup orchestration functions"
},
{
"path": "src/sba_scout/calc/matchup.py",
"lines": [130, 230],
"issue": "Python reference — calculate_matchup, calculate_team_matchups"
}
],
"suggestedFix": "```rust\npub fn calculate_matchup(\n player: &Player,\n batter_card: Option<&BatterCard>,\n pitcher: &Player,\n pitcher_card: Option<&PitcherCard>,\n batter_league_stats: &BatterLeagueStats,\n pitcher_league_stats: &PitcherLeagueStats,\n) -> MatchupResult\n```\n\nSwitch-hitter resolution:\n```rust\nlet batter_hand = player.hand.as_deref().unwrap_or(\"R\");\nlet pitcher_hand = pitcher.hand.as_deref().unwrap_or(\"R\");\nlet effective_batting_hand = if batter_hand == \"S\" {\n if pitcher_hand == \"R\" { \"L\" } else { \"R\" }\n} else { batter_hand };\nlet batter_split = if pitcher_hand == \"L\" { \"vLHP\" } else { \"vRHP\" };\nlet pitcher_split = if effective_batting_hand == \"L\" { \"vLHB\" } else { \"vRHB\" };\n```\n\nScore combination:\n```rust\nlet batter_component = calculate_batter_component(card, pitcher_hand, batter_stats);\nlet pitcher_component = pitcher_card.map(|pc| calculate_pitcher_component(pc, effective_batting_hand, pitcher_stats));\nlet total = match pitcher_component {\n Some(pc) => batter_component + (-pc), // INVERT pitcher score\n None => batter_component,\n};\n```\n\ncalculate_team_matchups: iterate batters, call calculate_matchup for each, sort by (has_rating desc, rating desc).\n\nTests: switch-hitter resolution (S vs R = bats L, S vs L = bats R), batter-only matchup (no pitcher card), full matchup with inversion.",
"estimatedHours": 2,
"notes": "The pitcher component is INVERTED (negated) when combining. A high pitcher score means the pitcher is good, which is BAD for the batter — hence the negation. This is the most important sign convention to get right."
},
{
"id": "MED-001",
"name": "Add StandardizedScoreCache DB queries",
"description": "Add query functions for the StandardizedScoreCache table to db/queries.rs. These are used by both the cache rebuild and the cached matchup path.",
"category": "medium",
"priority": 6,
"completed": false,
"tested": false,
"dependencies": [],
"files": [
{
"path": "rust/src/db/queries.rs",
"lines": [],
"issue": "Has MatchupCache queries but no StandardizedScoreCache queries"
},
{
"path": "src/sba_scout/calc/score_cache.py",
"lines": [120, 160],
"issue": "Python reference — get_cached_batter_score, get_cached_pitcher_score"
}
],
"suggestedFix": "Add to queries.rs:\n\n```rust\npub async fn get_cached_batter_score(\n pool: &SqlitePool,\n batter_card_id: i64,\n split: &str,\n) -> Result<Option<StandardizedScoreCache>>\n```\n`SELECT * FROM standardized_score_cache WHERE batter_card_id = ? AND split = ?`\n\n```rust\npub async fn get_cached_pitcher_score(\n pool: &SqlitePool,\n pitcher_card_id: i64,\n split: &str,\n) -> Result<Option<StandardizedScoreCache>>\n```\n`SELECT * FROM standardized_score_cache WHERE pitcher_card_id = ? AND split = ?`\n\n```rust\npub async fn clear_score_cache(pool: &SqlitePool) -> Result<u64>\n```\n`DELETE FROM standardized_score_cache` — returns rows_affected.\n\n```rust\npub async fn insert_score_cache(\n pool: &SqlitePool,\n batter_card_id: Option<i64>,\n pitcher_card_id: Option<i64>,\n split: &str,\n total_score: f64,\n stat_scores: &str,\n weights_hash: &str,\n league_stats_hash: &str,\n) -> Result<()>\n```\n`INSERT INTO standardized_score_cache (...) VALUES (...)`",
"estimatedHours": 1,
"notes": "The StandardizedScoreCache model already exists in models.rs with correct fields. These are straightforward single-table queries."
},
{
"id": "MED-002",
"name": "Implement score cache rebuild (score_cache.rs)",
"description": "Create calc/score_cache.rs with the cache rebuild logic: compute per-stat scores for every card×split combination, serialize as JSON, and insert into the StandardizedScoreCache table.\n\nAlso implement the hash functions for cache validity checking.",
"category": "medium",
"priority": 7,
"completed": false,
"tested": false,
"dependencies": ["CRIT-002", "MED-001"],
"files": [
{
"path": "rust/src/calc/mod.rs",
"lines": [1, 3],
"issue": "No score_cache module"
},
{
"path": "src/sba_scout/calc/score_cache.py",
"lines": [1, 120],
"issue": "Python reference — full score cache implementation"
}
],
"suggestedFix": "Create `rust/src/calc/score_cache.rs` and add `pub mod score_cache;` to calc/mod.rs.\n\n1. Define StatScore:\n```rust\n#[derive(Debug, Serialize, Deserialize)]\npub struct StatScore {\n pub raw: f64,\n pub std: i32,\n pub weighted: f64,\n}\n```\n\n2. Hash functions:\n```rust\nfn compute_weights_hash() -> String\n```\nSerialize BATTER_WEIGHTS + PITCHER_WEIGHTS as JSON, SHA-256, take first 16 hex chars. Use the `sha2` crate (already in Cargo.toml).\n\n```rust\nfn compute_league_stats_hash(batter: &BatterLeagueStats, pitcher: &PitcherLeagueStats) -> String\n```\nHash 5 representative values: batter hit_vrhp avg/stdev, batter so_vrhp avg, pitcher hit_vrhb avg, pitcher so_vrhb avg.\n\n3. Split score calculation:\n```rust\nfn calculate_batter_split_scores(card: &BatterCard, split: &str, stats: &BatterLeagueStats) -> (f64, HashMap<String, StatScore>)\nfn calculate_pitcher_split_scores(card: &PitcherCard, split: &str, stats: &PitcherLeagueStats) -> (f64, HashMap<String, StatScore>)\n```\nIterate BATTER/PITCHER_WEIGHTS, compute raw/std/weighted for each stat, collect into HashMap, sum total.\n\n4. Main rebuild:\n```rust\npub async fn rebuild_score_cache(pool: &SqlitePool) -> Result<CacheRebuildResult>\n```\n- Compute league stats\n- Clear existing cache\n- For each batter card: compute vlhp + vrhp splits, insert both\n- For each pitcher card: compute vlhb + vrhb splits, insert both\n- Use transaction for atomicity\n- Return counts\n\n5. Validity check:\n```rust\npub async fn is_cache_valid(pool: &SqlitePool) -> Result<bool>\npub async fn ensure_cache_exists(pool: &SqlitePool) -> Result<()>\n```",
"estimatedHours": 2.5,
"notes": "The stat_scores HashMap is serialized to JSON string via serde_json::to_string for DB storage. The Python version uses SQLAlchemy's JSON column type which auto-serializes. In Rust, serialize explicitly before passing to insert_score_cache."
},
{
"id": "MED-003",
"name": "Implement cached matchup functions",
"description": "Add calculate_matchup_cached and calculate_team_matchups_cached async functions to matchup.rs. These use the StandardizedScoreCache table instead of computing from raw card data, for faster UI rendering.",
"category": "medium",
"priority": 8,
"completed": false,
"tested": false,
"dependencies": ["HIGH-003", "MED-001"],
"files": [
{
"path": "rust/src/calc/matchup.rs",
"lines": [],
"issue": "No cached matchup path"
},
{
"path": "src/sba_scout/calc/matchup.py",
"lines": [230, 310],
"issue": "Python reference — calculate_matchup_cached, calculate_team_matchups_cached"
}
],
"suggestedFix": "```rust\npub async fn calculate_matchup_cached(\n pool: &SqlitePool,\n player: &Player,\n batter_card: Option<&BatterCard>,\n pitcher: &Player,\n pitcher_card: Option<&PitcherCard>,\n) -> Result<MatchupResult>\n```\n\n1. Same switch-hitter resolution as calculate_matchup\n2. Convert split labels to DB keys: \"vLHP\" → \"vlhp\", \"vRHP\" → \"vrhp\", \"vLHB\" → \"vlhb\", \"vRHB\" → \"vrhb\"\n3. DB lookup: get_cached_batter_score(pool, card.id, batter_split_key)\n4. If batter cache miss: return MatchupResult with rating=None, tier=\"--\"\n5. DB lookup: get_cached_pitcher_score (only if pitcher_card exists)\n6. Combine: batter_score + (-pitcher_score), same inversion as real-time\n\n```rust\npub async fn calculate_team_matchups_cached(\n pool: &SqlitePool,\n batters: &[(Player, Option<BatterCard>)],\n pitcher: &Player,\n pitcher_card: Option<&PitcherCard>,\n) -> Result<Vec<MatchupResult>>\n```\nIterate, call cached version, sort same as real-time.",
"estimatedHours": 1.5,
"notes": "The cached path avoids recomputing league stats and standardization on every request. The cache should be rebuilt after card imports (ensure_cache_exists) and when weights change (is_cache_valid check)."
},
{
"id": "LOW-001",
"name": "Wire cache rebuild into importer and add Phase 3 integration test",
"description": "Replace the TODO comment in api/importer.rs import_all_cards with an actual call to rebuild_score_cache. Add an integration test that imports test CSV data, rebuilds the cache, and verifies cached matchup scores match real-time scores.",
"category": "low",
"priority": 9,
"completed": false,
"tested": false,
"dependencies": ["MED-002"],
"files": [
{
"path": "rust/src/api/importer.rs",
"lines": [513],
"issue": "TODO comment for Phase 3 cache rebuild"
}
],
"suggestedFix": "1. In import_all_cards, replace the TODO with:\n```rust\nif batters.imported > 0 || pitchers.imported > 0 {\n crate::calc::league_stats::clear_league_stats_cache();\n crate::calc::score_cache::rebuild_score_cache(pool).await?;\n}\n```\n\n2. Create tests/calc_integration.rs:\n- Create in-memory SQLite DB\n- Insert a few test players, batter cards, pitcher cards with known stat values\n- Call rebuild_score_cache\n- Call calculate_matchup (real-time) and calculate_matchup_cached\n- Assert both produce the same rating and tier\n- Test switch-hitter resolution\n- Test batter-only matchup (no pitcher card)",
"estimatedHours": 1.5,
"notes": "This is the integration point that ties Phase 2 (import) to Phase 3 (calc). The test doesn't need real CSV files — just insert data directly into the DB."
}
],
"quickWins": [
{
"taskId": "HIGH-001",
"estimatedMinutes": 30,
"impact": "MatchupResult and get_tier are self-contained, no dependencies"
},
{
"taskId": "MED-001",
"estimatedMinutes": 30,
"impact": "Simple CRUD queries, no complex logic"
}
],
"productionBlockers": [
{
"taskId": "CRIT-001",
"reason": "League stats are the input to all scoring. Get the math wrong here and everything downstream is wrong."
},
{
"taskId": "CRIT-002",
"reason": "Without computed league stats from the DB, no matchup can be scored."
}
],
"weeklyRoadmap": {
"session1": {
"theme": "League Stats + Matchup Foundation",
"tasks": ["CRIT-001", "HIGH-001", "MED-001"],
"estimatedHours": 4,
"notes": "calc_distribution math, MatchupResult/get_tier, and cache DB queries. All independent — can run in parallel."
},
"session2": {
"theme": "Component Scoring + DB Integration",
"tasks": ["CRIT-002", "HIGH-002"],
"estimatedHours": 4.5,
"notes": "League stats from DB + batter/pitcher component scoring. CRIT-002 depends on CRIT-001."
},
"session3": {
"theme": "Matchup Assembly + Cache",
"tasks": ["HIGH-003", "MED-002", "MED-003", "LOW-001"],
"estimatedHours": 7.5,
"notes": "Full matchup pipeline, score cache rebuild, cached matchup path, and integration test. This completes Phase 3."
}
},
"architecturalDecisions": {
"stat_field_access_via_match": "Use explicit match arms to map stat name + split to struct fields, since Rust doesn't have runtime getattr(). Verbose but compile-time safe. A macro could reduce boilerplate but adds complexity.",
"sample_stdev": "Use sample standard deviation (N-1 denominator / Bessel's correction) to match Python's statistics.stdev. Implement manually — no external stats crate needed for just mean + stdev.",
"avg_excludes_zeros_stdev_includes": "The avg field excludes zero values (AVERAGEIF semantics) while stdev includes all values including zeros. This asymmetry is intentional and critical to preserve.",
"league_stats_cache": "Use OnceLock<Mutex<Option<T>>> for thread-safe in-memory cache with explicit invalidation. OnceLock for lazy initialization, Mutex for interior mutability, Option for nullable cache state.",
"score_cache_json_serialization": "Serialize stat_scores HashMap as JSON string via serde_json before DB insert. Deserialize on read when needed. Matches Python's SQLAlchemy JSON column behavior.",
"no_external_stats_crate": "Implement mean and sample stdev manually (< 10 lines) rather than adding a dependency like statrs. The math is simple enough that a crate adds more weight than value."
},
"testingStrategy": {
"calc_distribution": "Test with known data sets where avg and stdev can be hand-computed. Include edge cases: empty, single value, all same, all zeros, mix of zeros and non-zeros.",
"component_scoring": "Create a BatterCard/PitcherCard with known values and a BatterLeagueStats/PitcherLeagueStats with known distributions. Verify component scores match hand-calculated expectations.",
"matchup_assembly": "Test switch-hitter resolution (3 cases: L, R, S), pitcher inversion sign, batter-only fallback, None card handling.",
"cache_round_trip": "Insert cards → rebuild cache → verify cached scores match real-time computed scores for the same inputs. This is the ultimate correctness check.",
"tier_boundaries": "Test get_tier at exact boundary values: 40, 39, 20, 19, 0, -19, -20, -39, -40, None."
}
}
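
The matchup rules in the plan above (switch-hitter resolution, pitcher inversion, tier boundaries) fit in a few small functions. This is a minimal sketch, not the committed `matchup.rs`: `get_tier` copies the thresholds from HIGH-001, while `effective_batting_hand` and `combine` are hypothetical helper names introduced here for illustration.

```rust
/// Tier grade for a combined rating; thresholds follow the plan's `get_tier`.
pub fn get_tier(rating: Option<f64>) -> &'static str {
    match rating {
        None => "--",
        Some(r) if r >= 40.0 => "A",
        Some(r) if r >= 20.0 => "B",
        Some(r) if r >= -19.0 => "C",
        Some(r) if r >= -39.0 => "D",
        Some(_) => "F",
    }
}

/// Hypothetical helper: switch hitters ("S") bat from the side opposite
/// the pitcher's throwing hand; everyone else keeps their listed hand.
pub fn effective_batting_hand<'a>(batter_hand: &'a str, pitcher_hand: &str) -> &'a str {
    if batter_hand == "S" {
        if pitcher_hand == "R" { "L" } else { "R" }
    } else {
        batter_hand
    }
}

/// Hypothetical helper: the pitcher component is negated before it is added,
/// because a high pitcher score is bad news for the batter.
pub fn combine(batter_component: f64, pitcher_component: Option<f64>) -> f64 {
    match pitcher_component {
        Some(pc) => batter_component - pc,
        None => batter_component, // batter-only fallback when no pitcher card exists
    }
}
```

With these definitions, a batter component of 30 against a pitcher component of -15 combines to 45 (tier A), while the same batter against a pitcher component of +15 combines to 15 (tier C). Because the thresholds are compared as floats, a rating of -19.5 lands in D rather than C.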


@@ -519,7 +519,14 @@ pub async fn import_all_cards(
}
};
// Invalidate league stats cache and rebuild score cache after card import.
// Only rebuild if at least one card was actually imported.
if batters.imported > 0 || pitchers.imported > 0 {
crate::calc::league_stats::clear_league_stats_cache();
if let Err(e) = crate::calc::score_cache::rebuild_score_cache(pool).await {
eprintln!("Score cache rebuild failed after import: {e}");
}
}
Ok(AllImportResult { batters, pitchers })
}
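
The `clear_league_stats_cache()` call in this hunk relies on the `OnceLock<Mutex<Option<T>>>` pattern described in the plan. The sketch below shows that pattern in isolation, using only the standard library. `LeagueStats`, `get_or_compute`, and `clear_cache` are hypothetical stand-ins, and the getter is synchronous and takes a closure where the committed functions are async and query the database.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for BatterLeagueStats / PitcherLeagueStats.
#[derive(Debug, Clone, PartialEq)]
pub struct LeagueStats {
    pub hit_avg: f64,
}

// OnceLock creates the Mutex lazily; Option models "nothing cached yet".
static CACHE: OnceLock<Mutex<Option<LeagueStats>>> = OnceLock::new();

fn cache() -> &'static Mutex<Option<LeagueStats>> {
    CACHE.get_or_init(|| Mutex::new(None))
}

/// Return the cached stats, computing and storing them on a miss.
/// A clone is returned because a reference cannot outlive the lock guard.
pub fn get_or_compute(compute: impl FnOnce() -> LeagueStats) -> LeagueStats {
    let mut guard = cache().lock().unwrap();
    if let Some(stats) = guard.as_ref() {
        return stats.clone();
    }
    let stats = compute();
    *guard = Some(stats.clone());
    stats
}

/// Explicit invalidation, e.g. after new cards are imported.
pub fn clear_cache() {
    *cache().lock().unwrap() = None;
}
```

In an async getter, a `std::sync::Mutex` guard should be released before any `.await`: check the cache, drop the guard, run the query, then lock again to store the result. Otherwise the future holds a non-`Send` guard across a suspension point.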


@@ -1,6 +1,370 @@
use anyhow::Result;
use sqlx::SqlitePool;
use std::sync::{Mutex, OnceLock};
use crate::db::models::{BatterCard, PitcherCard};
// =============================================================================
// In-memory cache
// =============================================================================
static BATTER_CACHE: OnceLock<Mutex<Option<BatterLeagueStats>>> = OnceLock::new();
static PITCHER_CACHE: OnceLock<Mutex<Option<PitcherLeagueStats>>> = OnceLock::new();
/// Distribution statistics for a single stat across the league.
#[derive(Debug, Clone, Copy)]
pub struct StatDistribution {
pub avg: f64,
pub stdev: f64,
}
/// League-wide averages and standard deviations for batter card stats.
#[derive(Debug, Clone)]
pub struct BatterLeagueStats {
// vs Left-Handed Pitchers
pub so_vlhp: StatDistribution,
pub bb_vlhp: StatDistribution,
pub hit_vlhp: StatDistribution,
pub ob_vlhp: StatDistribution,
pub tb_vlhp: StatDistribution,
pub hr_vlhp: StatDistribution,
pub dp_vlhp: StatDistribution,
pub bphr_vlhp: StatDistribution,
pub bp1b_vlhp: StatDistribution,
// vs Right-Handed Pitchers
pub so_vrhp: StatDistribution,
pub bb_vrhp: StatDistribution,
pub hit_vrhp: StatDistribution,
pub ob_vrhp: StatDistribution,
pub tb_vrhp: StatDistribution,
pub hr_vrhp: StatDistribution,
pub dp_vrhp: StatDistribution,
pub bphr_vrhp: StatDistribution,
pub bp1b_vrhp: StatDistribution,
}
impl Default for BatterLeagueStats {
fn default() -> Self {
// StatDistribution is Copy, so the neutral value can be reused for every field.
let d = StatDistribution { avg: 0.0, stdev: 1.0 };
Self {
so_vlhp: d,
bb_vlhp: d,
hit_vlhp: d,
ob_vlhp: d,
tb_vlhp: d,
hr_vlhp: d,
dp_vlhp: d,
bphr_vlhp: d,
bp1b_vlhp: d,
so_vrhp: d,
bb_vrhp: d,
hit_vrhp: d,
ob_vrhp: d,
tb_vrhp: d,
hr_vrhp: d,
dp_vrhp: d,
bphr_vrhp: d,
bp1b_vrhp: d,
}
}
}
/// League-wide averages and standard deviations for pitcher card stats.
#[derive(Debug, Clone)]
pub struct PitcherLeagueStats {
// vs Left-Handed Batters
pub so_vlhb: StatDistribution,
pub bb_vlhb: StatDistribution,
pub hit_vlhb: StatDistribution,
pub ob_vlhb: StatDistribution,
pub tb_vlhb: StatDistribution,
pub hr_vlhb: StatDistribution,
pub dp_vlhb: StatDistribution,
pub bphr_vlhb: StatDistribution,
pub bp1b_vlhb: StatDistribution,
// vs Right-Handed Batters
pub so_vrhb: StatDistribution,
pub bb_vrhb: StatDistribution,
pub hit_vrhb: StatDistribution,
pub ob_vrhb: StatDistribution,
pub tb_vrhb: StatDistribution,
pub hr_vrhb: StatDistribution,
pub dp_vrhb: StatDistribution,
pub bphr_vrhb: StatDistribution,
pub bp1b_vrhb: StatDistribution,
}
impl Default for PitcherLeagueStats {
fn default() -> Self {
// StatDistribution is Copy, so the neutral value can be reused for every field.
let d = StatDistribution { avg: 0.0, stdev: 1.0 };
Self {
so_vlhb: d,
bb_vlhb: d,
hit_vlhb: d,
ob_vlhb: d,
tb_vlhb: d,
hr_vlhb: d,
dp_vlhb: d,
bphr_vlhb: d,
bp1b_vlhb: d,
so_vrhb: d,
bb_vrhb: d,
hit_vrhb: d,
ob_vrhb: d,
tb_vrhb: d,
hr_vrhb: d,
dp_vrhb: d,
bphr_vrhb: d,
bp1b_vrhb: d,
}
}
}
/// Calculate average and sample standard deviation for a slice of values.
///
/// CRITICAL MATH — two different means are used:
/// - `avg` = mean of NON-ZERO values only (matches spreadsheet AVERAGEIF behavior).
/// Falls back to 0.0 if fewer than 2 non-zero values exist.
/// - `stdev` = sample standard deviation (Bessel's correction, N-1 denominator) of
/// ALL values INCLUDING zeros (even when avg excludes them).
/// Falls back to 1.0 if fewer than 2 total values, or if computed stdev is 0.0.
///
/// The stdev variance sum uses mean_all (mean of all values), NOT avg.
pub fn calc_distribution(values: &[f64]) -> StatDistribution {
let non_zero: Vec<f64> = values.iter().copied().filter(|&v| v > 0.0).collect();
let avg = if non_zero.len() >= 2 {
non_zero.iter().sum::<f64>() / non_zero.len() as f64
} else {
0.0
};
let n = values.len();
if n < 2 {
return StatDistribution { avg, stdev: 1.0 };
}
// stdev uses mean of ALL values (including zeros), per Python reference
let mean_all = values.iter().sum::<f64>() / n as f64;
let variance = values.iter().map(|&x| (x - mean_all).powi(2)).sum::<f64>() / (n - 1) as f64;
let stdev_raw = variance.sqrt();
let stdev = if stdev_raw == 0.0 { 1.0 } else { stdev_raw };
StatDistribution { avg, stdev }
}
// =============================================================================
// DB computation functions
// =============================================================================
/// Compute league-wide batter stat distributions from all rows in `batter_cards`.
/// Returns `BatterLeagueStats::default()` if the table is empty.
pub async fn calculate_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats> {
let cards: Vec<BatterCard> = sqlx::query_as("SELECT * FROM batter_cards")
.fetch_all(pool)
.await?;
if cards.is_empty() {
return Ok(BatterLeagueStats::default());
}
Ok(BatterLeagueStats {
so_vlhp: calc_distribution(&cards.iter().map(|c| c.so_vlhp).collect::<Vec<_>>()),
bb_vlhp: calc_distribution(&cards.iter().map(|c| c.bb_vlhp).collect::<Vec<_>>()),
hit_vlhp: calc_distribution(&cards.iter().map(|c| c.hit_vlhp).collect::<Vec<_>>()),
ob_vlhp: calc_distribution(&cards.iter().map(|c| c.ob_vlhp).collect::<Vec<_>>()),
tb_vlhp: calc_distribution(&cards.iter().map(|c| c.tb_vlhp).collect::<Vec<_>>()),
hr_vlhp: calc_distribution(&cards.iter().map(|c| c.hr_vlhp).collect::<Vec<_>>()),
dp_vlhp: calc_distribution(&cards.iter().map(|c| c.dp_vlhp).collect::<Vec<_>>()),
bphr_vlhp: calc_distribution(&cards.iter().map(|c| c.bphr_vlhp).collect::<Vec<_>>()),
bp1b_vlhp: calc_distribution(&cards.iter().map(|c| c.bp1b_vlhp).collect::<Vec<_>>()),
so_vrhp: calc_distribution(&cards.iter().map(|c| c.so_vrhp).collect::<Vec<_>>()),
bb_vrhp: calc_distribution(&cards.iter().map(|c| c.bb_vrhp).collect::<Vec<_>>()),
hit_vrhp: calc_distribution(&cards.iter().map(|c| c.hit_vrhp).collect::<Vec<_>>()),
ob_vrhp: calc_distribution(&cards.iter().map(|c| c.ob_vrhp).collect::<Vec<_>>()),
tb_vrhp: calc_distribution(&cards.iter().map(|c| c.tb_vrhp).collect::<Vec<_>>()),
hr_vrhp: calc_distribution(&cards.iter().map(|c| c.hr_vrhp).collect::<Vec<_>>()),
dp_vrhp: calc_distribution(&cards.iter().map(|c| c.dp_vrhp).collect::<Vec<_>>()),
bphr_vrhp: calc_distribution(&cards.iter().map(|c| c.bphr_vrhp).collect::<Vec<_>>()),
bp1b_vrhp: calc_distribution(&cards.iter().map(|c| c.bp1b_vrhp).collect::<Vec<_>>()),
})
}
/// Compute league-wide pitcher stat distributions from all rows in `pitcher_cards`.
/// Returns `PitcherLeagueStats::default()` if the table is empty.
pub async fn calculate_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats> {
let cards: Vec<PitcherCard> = sqlx::query_as("SELECT * FROM pitcher_cards")
.fetch_all(pool)
.await?;
if cards.is_empty() {
return Ok(PitcherLeagueStats::default());
}
Ok(PitcherLeagueStats {
so_vlhb: calc_distribution(&cards.iter().map(|c| c.so_vlhb).collect::<Vec<_>>()),
bb_vlhb: calc_distribution(&cards.iter().map(|c| c.bb_vlhb).collect::<Vec<_>>()),
hit_vlhb: calc_distribution(&cards.iter().map(|c| c.hit_vlhb).collect::<Vec<_>>()),
ob_vlhb: calc_distribution(&cards.iter().map(|c| c.ob_vlhb).collect::<Vec<_>>()),
tb_vlhb: calc_distribution(&cards.iter().map(|c| c.tb_vlhb).collect::<Vec<_>>()),
hr_vlhb: calc_distribution(&cards.iter().map(|c| c.hr_vlhb).collect::<Vec<_>>()),
dp_vlhb: calc_distribution(&cards.iter().map(|c| c.dp_vlhb).collect::<Vec<_>>()),
bphr_vlhb: calc_distribution(&cards.iter().map(|c| c.bphr_vlhb).collect::<Vec<_>>()),
bp1b_vlhb: calc_distribution(&cards.iter().map(|c| c.bp1b_vlhb).collect::<Vec<_>>()),
so_vrhb: calc_distribution(&cards.iter().map(|c| c.so_vrhb).collect::<Vec<_>>()),
bb_vrhb: calc_distribution(&cards.iter().map(|c| c.bb_vrhb).collect::<Vec<_>>()),
hit_vrhb: calc_distribution(&cards.iter().map(|c| c.hit_vrhb).collect::<Vec<_>>()),
ob_vrhb: calc_distribution(&cards.iter().map(|c| c.ob_vrhb).collect::<Vec<_>>()),
tb_vrhb: calc_distribution(&cards.iter().map(|c| c.tb_vrhb).collect::<Vec<_>>()),
hr_vrhb: calc_distribution(&cards.iter().map(|c| c.hr_vrhb).collect::<Vec<_>>()),
dp_vrhb: calc_distribution(&cards.iter().map(|c| c.dp_vrhb).collect::<Vec<_>>()),
bphr_vrhb: calc_distribution(&cards.iter().map(|c| c.bphr_vrhb).collect::<Vec<_>>()),
bp1b_vrhb: calc_distribution(&cards.iter().map(|c| c.bp1b_vrhb).collect::<Vec<_>>()),
})
}
// =============================================================================
// Cached accessors
// =============================================================================
/// Return cached batter league stats, computing and caching on first call.
pub async fn get_batter_league_stats(pool: &SqlitePool) -> Result<BatterLeagueStats> {
let cache = BATTER_CACHE.get_or_init(|| Mutex::new(None));
{
let guard = cache.lock().unwrap();
if let Some(ref stats) = *guard {
return Ok(stats.clone());
}
}
let stats = calculate_batter_league_stats(pool).await?;
*cache.lock().unwrap() = Some(stats.clone());
Ok(stats)
}
/// Return cached pitcher league stats, computing and caching on first call.
pub async fn get_pitcher_league_stats(pool: &SqlitePool) -> Result<PitcherLeagueStats> {
let cache = PITCHER_CACHE.get_or_init(|| Mutex::new(None));
{
let guard = cache.lock().unwrap();
if let Some(ref stats) = *guard {
return Ok(stats.clone());
}
}
let stats = calculate_pitcher_league_stats(pool).await?;
*cache.lock().unwrap() = Some(stats.clone());
Ok(stats)
}
/// Invalidate both in-memory league stats caches (e.g. after a card import).
pub fn clear_league_stats_cache() {
if let Some(cache) = BATTER_CACHE.get() {
*cache.lock().unwrap() = None;
}
if let Some(cache) = PITCHER_CACHE.get() {
*cache.lock().unwrap() = None;
}
}
#[cfg(test)]
mod tests {
use super::*;
/// Helper: assert two f64s are approximately equal within a tolerance.
fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
(a - b).abs() < eps
}
/// Empty slice should return the safe default: avg=0.0, stdev=1.0.
/// This ensures callers that produce no data don't panic or divide by zero.
#[test]
fn test_empty_slice() {
let dist = calc_distribution(&[]);
assert_eq!(dist.avg, 0.0);
assert_eq!(dist.stdev, 1.0);
}
/// Single value has < 2 non-zero entries AND < 2 total entries, so both
/// avg and stdev fall back to their safe defaults.
#[test]
fn test_single_value() {
let dist = calc_distribution(&[5.0]);
assert_eq!(dist.avg, 0.0, "< 2 non-zero values → avg=0.0");
assert_eq!(dist.stdev, 1.0, "< 2 total values → stdev=1.0");
}
/// All-zeros slice: no non-zero values → avg=0.0. The stdev of [0,0,...] is
/// also 0.0, which triggers the zero-stdev fallback to 1.0.
#[test]
fn test_all_zeros() {
let dist = calc_distribution(&[0.0, 0.0, 0.0]);
assert_eq!(dist.avg, 0.0);
assert_eq!(dist.stdev, 1.0);
}
/// [4.0, 6.0]: avg of non-zeros = 5.0; sample stdev = sqrt(((4-5)²+(6-5)²)/1) = sqrt(2).
#[test]
fn test_two_nonzero_values() {
let dist = calc_distribution(&[4.0, 6.0]);
assert!(approx_eq(dist.avg, 5.0, 1e-10), "avg={}", dist.avg);
let expected_stdev = 2.0_f64.sqrt();
assert!(
approx_eq(dist.stdev, expected_stdev, 1e-10),
"stdev={} expected={}",
dist.stdev,
expected_stdev
);
}
/// [0.0, 4.0, 6.0]:
/// - avg uses only non-zeros [4.0, 6.0] → avg = 5.0
/// - stdev uses ALL values [0.0, 4.0, 6.0] with mean_all = 10/3 ≈ 3.333...
/// variance = ((0-3.333)²+(4-3.333)²+(6-3.333)²) / 2
/// = (11.111 + 0.444 + 7.111) / 2 = 18.667 / 2 = 9.333...
/// stdev = sqrt(9.333...) ≈ 3.0551
#[test]
fn test_mixed_zeros_and_nonzeros() {
let dist = calc_distribution(&[0.0, 4.0, 6.0]);
assert!(approx_eq(dist.avg, 5.0, 1e-10), "avg={}", dist.avg);
// Compute expected stdev manually
let mean_all = 10.0 / 3.0;
let variance =
((0.0_f64 - mean_all).powi(2) + (4.0_f64 - mean_all).powi(2) + (6.0_f64 - mean_all).powi(2))
/ 2.0;
let expected_stdev = variance.sqrt();
assert!(
approx_eq(dist.stdev, expected_stdev, 1e-10),
"stdev={} expected={}",
dist.stdev,
expected_stdev
);
}
/// [3.0, 3.0, 3.0]: all values equal → sample stdev = 0.0 → fallback to 1.0.
/// avg = 3.0 (non-zero mean).
#[test]
fn test_zero_stdev_fallback() {
let dist = calc_distribution(&[3.0, 3.0, 3.0]);
assert_eq!(dist.avg, 3.0);
assert_eq!(dist.stdev, 1.0, "zero stdev must fall back to 1.0");
}
/// BatterLeagueStats::default() should have all fields set to avg=0.0, stdev=1.0.
#[test]
fn test_batter_league_stats_default() {
let stats = BatterLeagueStats::default();
assert_eq!(stats.so_vlhp.avg, 0.0);
assert_eq!(stats.so_vlhp.stdev, 1.0);
assert_eq!(stats.bp1b_vrhp.avg, 0.0);
assert_eq!(stats.bp1b_vrhp.stdev, 1.0);
}
/// PitcherLeagueStats::default() should have all fields set to avg=0.0, stdev=1.0.
#[test]
fn test_pitcher_league_stats_default() {
let stats = PitcherLeagueStats::default();
assert_eq!(stats.so_vlhb.avg, 0.0);
assert_eq!(stats.so_vlhb.stdev, 1.0);
assert_eq!(stats.bp1b_vrhb.avg, 0.0);
assert_eq!(stats.bp1b_vrhb.stdev, 1.0);
}
}
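
The avg/stdev asymmetry exercised by the tests above can be checked in isolation. What follows is a minimal standalone sketch, not the code from this commit: `positive_mean` and `sample_stdev` are hypothetical helper names introduced only for illustration, and the sketch assumes stat values are never negative.

```rust
/// Mean of the strictly positive values; 0.0 when fewer than two qualify.
/// (Hypothetical helper mirroring the avg rule described for `calc_distribution`.)
fn positive_mean(values: &[f64]) -> f64 {
    let positive: Vec<f64> = values.iter().copied().filter(|&v| v > 0.0).collect();
    if positive.len() < 2 {
        return 0.0;
    }
    positive.iter().sum::<f64>() / positive.len() as f64
}

/// Sample standard deviation (divide by N-1) over ALL values, zeros included.
/// Falls back to 1.0 for fewer than two values or when the spread is zero.
/// (Hypothetical helper mirroring the stdev rule described for `calc_distribution`.)
fn sample_stdev(values: &[f64]) -> f64 {
    let n = values.len();
    if n < 2 {
        return 1.0;
    }
    // The variance is centered on the mean of all values, not on positive_mean.
    let mean_all = values.iter().sum::<f64>() / n as f64;
    let variance = values
        .iter()
        .map(|&x| (x - mean_all).powi(2))
        .sum::<f64>()
        / (n - 1) as f64;
    let stdev = variance.sqrt();
    if stdev == 0.0 { 1.0 } else { stdev }
}
```

For `[0.0, 4.0, 6.0]` the two helpers give a mean of 5.0 (zeros excluded) and a stdev of sqrt(28/3) ≈ 3.0551 (zeros included), the same numbers worked out in `test_mixed_zeros_and_nonzeros`.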

View File

@ -1,5 +1,9 @@
use anyhow::Result;
use sqlx::SqlitePool;
use super::league_stats::{BatterLeagueStats, PitcherLeagueStats, StatDistribution};
use super::weights::{StatWeight, BATTER_WEIGHTS, PITCHER_WEIGHTS};
use crate::db::models::{BatterCard, PitcherCard, Player};
use crate::db::queries::{get_cached_batter_score, get_cached_pitcher_score};
/// Convert a raw stat value to a standardized score (-3 to +3).
///
@ -42,6 +46,369 @@ pub fn calculate_weighted_score(
std_score as f64 * stat_weight.weight as f64
}
static DEFAULT_DIST: StatDistribution = StatDistribution { avg: 0.0, stdev: 1.0 };
/// Map (stat_name, pitcher_hand) to the batter card field value.
/// pitcher_hand: "L" → vs left-handed pitchers, "R" → vs right-handed pitchers.
pub(crate) fn get_batter_stat(card: &BatterCard, stat: &str, pitcher_hand: &str) -> f64 {
match (stat, pitcher_hand) {
("so", "L") => card.so_vlhp,
("so", "R") => card.so_vrhp,
("bb", "L") => card.bb_vlhp,
("bb", "R") => card.bb_vrhp,
("hit", "L") => card.hit_vlhp,
("hit", "R") => card.hit_vrhp,
("ob", "L") => card.ob_vlhp,
("ob", "R") => card.ob_vrhp,
("tb", "L") => card.tb_vlhp,
("tb", "R") => card.tb_vrhp,
("hr", "L") => card.hr_vlhp,
("hr", "R") => card.hr_vrhp,
("bphr", "L") => card.bphr_vlhp,
("bphr", "R") => card.bphr_vrhp,
("bp1b", "L") => card.bp1b_vlhp,
("bp1b", "R") => card.bp1b_vrhp,
("dp", "L") => card.dp_vlhp,
("dp", "R") => card.dp_vrhp,
_ => 0.0,
}
}
/// Map (stat_name, pitcher_hand) to the batter league distribution for that stat.
pub(crate) fn get_batter_dist<'a>(stats: &'a BatterLeagueStats, stat: &str, pitcher_hand: &str) -> &'a StatDistribution {
match (stat, pitcher_hand) {
("so", "L") => &stats.so_vlhp,
("so", "R") => &stats.so_vrhp,
("bb", "L") => &stats.bb_vlhp,
("bb", "R") => &stats.bb_vrhp,
("hit", "L") => &stats.hit_vlhp,
("hit", "R") => &stats.hit_vrhp,
("ob", "L") => &stats.ob_vlhp,
("ob", "R") => &stats.ob_vrhp,
("tb", "L") => &stats.tb_vlhp,
("tb", "R") => &stats.tb_vrhp,
("hr", "L") => &stats.hr_vlhp,
("hr", "R") => &stats.hr_vrhp,
("bphr", "L") => &stats.bphr_vlhp,
("bphr", "R") => &stats.bphr_vrhp,
("bp1b", "L") => &stats.bp1b_vlhp,
("bp1b", "R") => &stats.bp1b_vrhp,
("dp", "L") => &stats.dp_vlhp,
("dp", "R") => &stats.dp_vrhp,
_ => &DEFAULT_DIST,
}
}
/// Map (stat_name, batter_hand) to the pitcher card field value.
/// batter_hand: "L" → vs left-handed batters, "R" → vs right-handed batters.
pub(crate) fn get_pitcher_stat(card: &PitcherCard, stat: &str, batter_hand: &str) -> f64 {
match (stat, batter_hand) {
("so", "L") => card.so_vlhb,
("so", "R") => card.so_vrhb,
("bb", "L") => card.bb_vlhb,
("bb", "R") => card.bb_vrhb,
("hit", "L") => card.hit_vlhb,
("hit", "R") => card.hit_vrhb,
("ob", "L") => card.ob_vlhb,
("ob", "R") => card.ob_vrhb,
("tb", "L") => card.tb_vlhb,
("tb", "R") => card.tb_vrhb,
("hr", "L") => card.hr_vlhb,
("hr", "R") => card.hr_vrhb,
("bphr", "L") => card.bphr_vlhb,
("bphr", "R") => card.bphr_vrhb,
("bp1b", "L") => card.bp1b_vlhb,
("bp1b", "R") => card.bp1b_vrhb,
("dp", "L") => card.dp_vlhb,
("dp", "R") => card.dp_vrhb,
_ => 0.0,
}
}
/// Map (stat_name, batter_hand) to the pitcher league distribution for that stat.
pub(crate) fn get_pitcher_dist<'a>(stats: &'a PitcherLeagueStats, stat: &str, batter_hand: &str) -> &'a StatDistribution {
match (stat, batter_hand) {
("so", "L") => &stats.so_vlhb,
("so", "R") => &stats.so_vrhb,
("bb", "L") => &stats.bb_vlhb,
("bb", "R") => &stats.bb_vrhb,
("hit", "L") => &stats.hit_vlhb,
("hit", "R") => &stats.hit_vrhb,
("ob", "L") => &stats.ob_vlhb,
("ob", "R") => &stats.ob_vrhb,
("tb", "L") => &stats.tb_vlhb,
("tb", "R") => &stats.tb_vrhb,
("hr", "L") => &stats.hr_vlhb,
("hr", "R") => &stats.hr_vrhb,
("bphr", "L") => &stats.bphr_vlhb,
("bphr", "R") => &stats.bphr_vrhb,
("bp1b", "L") => &stats.bp1b_vlhb,
("bp1b", "R") => &stats.bp1b_vrhb,
("dp", "L") => &stats.dp_vlhb,
("dp", "R") => &stats.dp_vrhb,
_ => &DEFAULT_DIST,
}
}
/// Compute the batter component score for a given card, pitcher handedness, and league stats.
///
/// Sums weighted standardized scores across all batter stats.
/// pitcher_hand: "L" or "R" selects the appropriate split from the card.
pub fn calculate_batter_component(card: &BatterCard, pitcher_hand: &str, league_stats: &BatterLeagueStats) -> f64 {
BATTER_WEIGHTS
.iter()
.map(|(stat, weight)| {
let value = get_batter_stat(card, stat, pitcher_hand);
let dist = get_batter_dist(league_stats, stat, pitcher_hand);
calculate_weighted_score(value, dist, weight)
})
.sum()
}
/// Compute the pitcher component score for a given card, batter handedness, and league stats.
///
/// Sums weighted standardized scores across all pitcher stats.
/// batter_hand: "L" or "R" selects the appropriate split from the card.
pub fn calculate_pitcher_component(card: &PitcherCard, batter_hand: &str, league_stats: &PitcherLeagueStats) -> f64 {
PITCHER_WEIGHTS
.iter()
.map(|(stat, weight)| {
let value = get_pitcher_stat(card, stat, batter_hand);
let dist = get_pitcher_dist(league_stats, stat, batter_hand);
calculate_weighted_score(value, dist, weight)
})
.sum()
}
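/// Outcome of scoring one batter against one pitcher.
///
/// `rating` is `None` when the batter has no card (or no cached score);
/// `pitcher_component` is `None` when only the batter side could be scored.
#[derive(Debug, Clone)]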
pub struct MatchupResult {
pub player: Player,
pub rating: Option<f64>,
pub tier: String,
pub batter_hand: String,
pub batter_split: String,
pub pitcher_split: String,
pub batter_component: Option<f64>,
pub pitcher_component: Option<f64>,
}
impl MatchupResult {
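/// Format the rating with an explicit sign and no decimals
/// (e.g. "+35", "-25", "+0"), or "N/A" when the player is unrated.
pub fn rating_display(&self) -> String {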
match self.rating {
Some(r) if r >= 0.0 => format!("+{:.0}", r),
Some(r) => format!("{:.0}", r),
None => "N/A".to_string(),
}
}
}
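/// Map a rating to a letter tier.
///
/// Thresholds: A at 40 and above, B from 20, C from -19, D from -39,
/// F for anything lower. An unrated matchup (`None`) maps to "--".
pub fn get_tier(rating: Option<f64>) -> &'static str {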
match rating {
None => "--",
Some(r) if r >= 40.0 => "A",
Some(r) if r >= 20.0 => "B",
Some(r) if r >= -19.0 => "C",
Some(r) if r >= -39.0 => "D",
Some(_) => "F",
}
}
/// Calculate batter vs pitcher matchup result.
///
/// Resolves switch hitter handedness, computes batter and pitcher components,
/// and combines them with pitcher inversion (good pitcher hurts batter rating).
pub fn calculate_matchup(
player: &Player,
batter_card: Option<&BatterCard>,
pitcher: &Player,
pitcher_card: Option<&PitcherCard>,
batter_league_stats: &BatterLeagueStats,
pitcher_league_stats: &PitcherLeagueStats,
) -> MatchupResult {
let batter_hand = player.hand.as_deref().unwrap_or("R");
let pitcher_hand = pitcher.hand.as_deref().unwrap_or("R");
// Switch hitter bats opposite of pitcher's hand
let effective_hand = if batter_hand == "S" {
if pitcher_hand == "R" { "L" } else { "R" }
} else {
batter_hand
};
let batter_split = if pitcher_hand == "L" { "vLHP" } else { "vRHP" };
let pitcher_split = if effective_hand == "L" { "vLHB" } else { "vRHB" };
let Some(batter_card) = batter_card else {
return MatchupResult {
player: player.clone(),
rating: None,
tier: "--".to_string(),
batter_hand: effective_hand.to_string(),
batter_split: batter_split.to_string(),
pitcher_split: pitcher_split.to_string(),
batter_component: None,
pitcher_component: None,
};
};
let batter_component = calculate_batter_component(batter_card, pitcher_hand, batter_league_stats);
let (pitcher_component, total) = if let Some(pc) = pitcher_card {
let pc_score = calculate_pitcher_component(pc, effective_hand, pitcher_league_stats);
(Some(pc_score), batter_component + (-pc_score))
} else {
(None, batter_component)
};
MatchupResult {
player: player.clone(),
rating: Some(total),
tier: get_tier(Some(total)).to_string(),
batter_hand: effective_hand.to_string(),
batter_split: batter_split.to_string(),
pitcher_split: pitcher_split.to_string(),
batter_component: Some(batter_component),
pitcher_component,
}
}
/// Calculate matchups for a full roster against one pitcher, sorted best-first.
///
/// Results are sorted descending by rating (rated players first, then unrated).
pub fn calculate_team_matchups(
batters: &[(Player, Option<BatterCard>)],
pitcher: &Player,
pitcher_card: Option<&PitcherCard>,
batter_league_stats: &BatterLeagueStats,
pitcher_league_stats: &PitcherLeagueStats,
) -> Vec<MatchupResult> {
let mut results: Vec<MatchupResult> = batters
.iter()
.map(|(player, batter_card)| {
calculate_matchup(
player,
batter_card.as_ref(),
pitcher,
pitcher_card,
batter_league_stats,
pitcher_league_stats,
)
})
.collect();
results.sort_by(|a, b| match (&b.rating, &a.rating) {
(Some(br), Some(ar)) => br.partial_cmp(ar).unwrap_or(std::cmp::Ordering::Equal),
(Some(_), None) => std::cmp::Ordering::Greater, // b rated, a not → b first
(None, Some(_)) => std::cmp::Ordering::Less, // a rated, b not → a first
(None, None) => std::cmp::Ordering::Equal,
});
results
}
/// Calculate batter vs pitcher matchup using pre-computed scores from the DB cache.
///
/// Same switch-hitter resolution and pitcher inversion as `calculate_matchup`.
/// Returns a result with `rating=None` if the batter card is missing or its
/// cache entry is absent. Pitcher cache misses fall back to batter-only rating.
pub async fn calculate_matchup_cached(
pool: &SqlitePool,
player: &Player,
batter_card: Option<&BatterCard>,
pitcher: &Player,
pitcher_card: Option<&PitcherCard>,
) -> Result<MatchupResult> {
let batter_hand = player.hand.as_deref().unwrap_or("R");
let pitcher_hand = pitcher.hand.as_deref().unwrap_or("R");
// Switch hitter bats opposite of pitcher's hand
let effective_hand = if batter_hand == "S" {
if pitcher_hand == "R" { "L" } else { "R" }
} else {
batter_hand
};
let batter_split_label = if pitcher_hand == "L" { "vLHP" } else { "vRHP" };
let batter_split_key = if pitcher_hand == "L" { "vlhp" } else { "vrhp" };
let pitcher_split_label = if effective_hand == "L" { "vLHB" } else { "vRHB" };
let pitcher_split_key = if effective_hand == "L" { "vlhb" } else { "vrhb" };
let no_rating = || MatchupResult {
player: player.clone(),
rating: None,
tier: "--".to_string(),
batter_hand: effective_hand.to_string(),
batter_split: batter_split_label.to_string(),
pitcher_split: pitcher_split_label.to_string(),
batter_component: None,
pitcher_component: None,
};
let Some(batter_card) = batter_card else {
return Ok(no_rating());
};
let Some(batter_cache) =
get_cached_batter_score(pool, batter_card.id, batter_split_key).await?
else {
return Ok(no_rating());
};
let batter_score = batter_cache.total_score;
let pitcher_score = if let Some(pc) = pitcher_card {
get_cached_pitcher_score(pool, pc.id, pitcher_split_key)
.await?
.map(|c| c.total_score)
} else {
None
};
let total = match pitcher_score {
Some(ps) => batter_score + (-ps),
None => batter_score,
};
Ok(MatchupResult {
player: player.clone(),
rating: Some(total),
tier: get_tier(Some(total)).to_string(),
batter_hand: effective_hand.to_string(),
batter_split: batter_split_label.to_string(),
pitcher_split: pitcher_split_label.to_string(),
batter_component: Some(batter_score),
pitcher_component: pitcher_score,
})
}
/// Calculate cached matchups for a full roster against one pitcher, sorted best-first.
///
/// Iterates all batters, calling `calculate_matchup_cached` for each, then sorts
/// descending by rating (rated players first, unrated last).
pub async fn calculate_team_matchups_cached(
pool: &SqlitePool,
batters: &[(Player, Option<BatterCard>)],
pitcher: &Player,
pitcher_card: Option<&PitcherCard>,
) -> Result<Vec<MatchupResult>> {
let mut results = Vec::with_capacity(batters.len());
for (player, batter_card) in batters {
let result =
calculate_matchup_cached(pool, player, batter_card.as_ref(), pitcher, pitcher_card)
.await?;
results.push(result);
}
results.sort_by(|a, b| match (&b.rating, &a.rating) {
(Some(br), Some(ar)) => br.partial_cmp(ar).unwrap_or(std::cmp::Ordering::Equal),
(Some(_), None) => std::cmp::Ordering::Greater, // b rated, a not → b first
(None, Some(_)) => std::cmp::Ordering::Less, // a rated, b not → a first
(None, None) => std::cmp::Ordering::Equal,
});
Ok(results)
}
#[cfg(test)]
mod tests {
use super::*;
@ -128,4 +495,391 @@ mod tests {
// value=0.0 => always 3 => 3*3=9.0
assert!((calculate_weighted_score(0.0, &d, &w) - 9.0).abs() < f64::EPSILON);
}
// --- get_tier tests ---
#[test]
fn tier_none_returns_dash() {
assert_eq!(get_tier(None), "--");
}
#[test]
fn tier_a_at_40_and_above() {
assert_eq!(get_tier(Some(40.0)), "A");
assert_eq!(get_tier(Some(50.0)), "A");
}
#[test]
fn tier_b_between_20_and_39() {
assert_eq!(get_tier(Some(39.0)), "B");
assert_eq!(get_tier(Some(20.0)), "B");
}
#[test]
fn tier_c_between_neg19_and_19() {
assert_eq!(get_tier(Some(19.0)), "C");
assert_eq!(get_tier(Some(0.0)), "C");
assert_eq!(get_tier(Some(-19.0)), "C");
}
#[test]
fn tier_d_between_neg39_and_neg20() {
assert_eq!(get_tier(Some(-20.0)), "D");
assert_eq!(get_tier(Some(-39.0)), "D");
}
#[test]
fn tier_f_at_neg40_and_below() {
assert_eq!(get_tier(Some(-40.0)), "F");
}
// --- MatchupResult::rating_display tests ---
/// Build a MatchupResult around a placeholder player (see `make_player`).
fn make_result(rating: Option<f64>) -> MatchupResult {
MatchupResult {
player: make_player(None),
rating,
tier: get_tier(rating).to_string(),
batter_hand: "R".to_string(),
batter_split: "vRHP".to_string(),
pitcher_split: "vRHB".to_string(),
batter_component: None,
pitcher_component: None,
}
}
#[test]
fn rating_display_positive() {
assert_eq!(make_result(Some(35.0)).rating_display(), "+35");
}
#[test]
fn rating_display_negative() {
assert_eq!(make_result(Some(-25.0)).rating_display(), "-25");
}
#[test]
fn rating_display_zero() {
assert_eq!(make_result(Some(0.0)).rating_display(), "+0");
}
#[test]
fn rating_display_none() {
assert_eq!(make_result(None).rating_display(), "N/A");
}
// --- calculate_batter_component / calculate_pitcher_component tests ---
/// Build a zeroed BatterCard. All stat fields default to 0.0.
/// When every stat value is 0.0, standardize_value always returns 3 (the best possible).
/// Total batter component = 3 * sum(weights) = 3 * 22 = 66.
fn make_batter_card_zeroed() -> BatterCard {
BatterCard {
id: 1,
player_id: 1,
so_vlhp: 0.0, bb_vlhp: 0.0, hit_vlhp: 0.0, ob_vlhp: 0.0,
tb_vlhp: 0.0, hr_vlhp: 0.0, dp_vlhp: 0.0,
bphr_vlhp: 0.0, bp1b_vlhp: 0.0,
so_vrhp: 0.0, bb_vrhp: 0.0, hit_vrhp: 0.0, ob_vrhp: 0.0,
tb_vrhp: 0.0, hr_vrhp: 0.0, dp_vrhp: 0.0,
bphr_vrhp: 0.0, bp1b_vrhp: 0.0,
stealing: None, steal_rating: None, speed: None,
bunt: None, hit_run: None, fielding: None,
catcher_arm: None, catcher_pb: None, catcher_t: None,
rating_vl: None, rating_vr: None, rating_overall: None,
imported_at: None, source: None,
}
}
/// Build a zeroed PitcherCard. All stat fields default to 0.0.
/// Total pitcher component = 3 * sum(weights) = 3 * 23 = 69.
fn make_pitcher_card_zeroed() -> PitcherCard {
PitcherCard {
id: 1,
player_id: 1,
so_vlhb: 0.0, bb_vlhb: 0.0, hit_vlhb: 0.0, ob_vlhb: 0.0,
tb_vlhb: 0.0, hr_vlhb: 0.0, dp_vlhb: 0.0,
bphr_vlhb: 0.0, bp1b_vlhb: 0.0,
so_vrhb: 0.0, bb_vrhb: 0.0, hit_vrhb: 0.0, ob_vrhb: 0.0,
tb_vrhb: 0.0, hr_vrhb: 0.0, dp_vrhb: 0.0,
bphr_vrhb: 0.0, bp1b_vrhb: 0.0,
hold_rating: None, endurance_start: None, endurance_relief: None,
endurance_close: None, fielding_range: None, fielding_error: None,
wild_pitch: None, balk: None, batting_rating: None,
rating_vlhb: None, rating_vrhb: None, rating_overall: None,
imported_at: None, source: None,
}
}
#[test]
fn batter_component_all_zeros_gives_max_score() {
// All card stats = 0.0 → standardize_value always returns 3 for every stat.
// Total = 3 * sum(batter weights) = 3 * 22 = 66.
let card = make_batter_card_zeroed();
let stats = BatterLeagueStats::default();
let score = calculate_batter_component(&card, "R", &stats);
assert!((score - 66.0).abs() < f64::EPSILON, "expected 66.0, got {}", score);
}
#[test]
fn pitcher_component_all_zeros_gives_max_score() {
// All card stats = 0.0 → standardize_value always returns 3 for every stat.
// Total = 3 * sum(pitcher weights) = 3 * 23 = 69.
let card = make_pitcher_card_zeroed();
let stats = PitcherLeagueStats::default();
let score = calculate_pitcher_component(&card, "R", &stats);
assert!((score - 69.0).abs() < f64::EPSILON, "expected 69.0, got {}", score);
}
#[test]
fn batter_component_uses_correct_split() {
// Verifies that pitcher_hand="L" reads vlhp fields and "R" reads vrhp fields.
//
// Set ob_vlhp=1.0 (all other stats remain 0.0).
// With default league stats (avg=0, stdev=1):
// standardize(1.0, {avg=0, stdev=1}, high=true):
// 1.0 > 0+0.33*1=0.33 → base_score=-1, high=true → +1
// ob weight = 5 → contribution = 1*5 = 5
// All other stats=0 → 3 * weight each.
// Total for vL: (3*1)+(3*1)+(3*2)+(1*5)+(3*5)+(3*2)+(3*3)+(3*1)+(3*2) = 3+3+6+5+15+6+9+3+6 = 56
let mut card = make_batter_card_zeroed();
card.ob_vlhp = 1.0;
let stats = BatterLeagueStats::default();
let score_vl = calculate_batter_component(&card, "L", &stats);
let score_vr = calculate_batter_component(&card, "R", &stats);
assert!((score_vl - 56.0).abs() < f64::EPSILON, "vL score expected 56.0, got {}", score_vl);
assert!((score_vr - 66.0).abs() < f64::EPSILON, "vR score expected 66.0, got {}", score_vr);
}
#[test]
fn pitcher_component_uses_correct_split() {
// Verifies that batter_hand="L" reads vlhb fields and "R" reads vrhb fields.
//
// Set so_vlhb=1.0 (all other stats remain 0.0).
// With default league stats (avg=0, stdev=1):
// standardize(1.0, {avg=0, stdev=1}, high=true):
// 1.0 > 0.33 → base_score=-1, high=true → +1
// so weight = 3 → contribution = 1*3 = 3
// All other vlhb stats=0 → 3 * weight each.
// Total vL: so(1*3)+bb(3*1)+hit(3*2)+ob(3*5)+tb(3*2)+hr(3*5)+bphr(3*2)+bp1b(3*1)+dp(3*2) = 3+3+6+15+6+15+6+3+6 = 63
let mut card = make_pitcher_card_zeroed();
card.so_vlhb = 1.0;
let stats = PitcherLeagueStats::default();
let score_vl = calculate_pitcher_component(&card, "L", &stats);
let score_vr = calculate_pitcher_component(&card, "R", &stats);
assert!((score_vl - 63.0).abs() < f64::EPSILON, "vL score expected 63.0, got {}", score_vl);
assert!((score_vr - 69.0).abs() < f64::EPSILON, "vR score expected 69.0, got {}", score_vr);
}
// --- calculate_matchup tests ---
fn make_player(hand: Option<&str>) -> Player {
Player {
id: 1,
name: "Test".to_string(),
season: 13,
team_id: None,
swar: None,
card_image: None,
card_image_alt: None,
headshot: None,
vanity_card: None,
pos_1: None, pos_2: None, pos_3: None, pos_4: None,
pos_5: None, pos_6: None, pos_7: None, pos_8: None,
hand: hand.map(|s| s.to_string()),
injury_rating: None,
il_return: None,
demotion_week: None,
strat_code: None,
bbref_id: None,
sbaplayer_id: None,
last_game: None,
last_game2: None,
synced_at: None,
}
}
#[test]
fn switch_hitter_vs_right_bats_left() {
// Switch hitter (S) vs right-handed pitcher → bats left → split vRHP / vLHB
let batter = make_player(Some("S"));
let pitcher = make_player(Some("R"));
let batter_card = make_batter_card_zeroed();
let pitcher_card = make_pitcher_card_zeroed();
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(
&batter, Some(&batter_card), &pitcher, Some(&pitcher_card), &b_stats, &p_stats,
);
assert_eq!(result.batter_hand, "L");
assert_eq!(result.batter_split, "vRHP");
assert_eq!(result.pitcher_split, "vLHB");
}
#[test]
fn switch_hitter_vs_left_bats_right() {
// Switch hitter (S) vs left-handed pitcher → bats right → split vLHP / vRHB
let batter = make_player(Some("S"));
let pitcher = make_player(Some("L"));
let batter_card = make_batter_card_zeroed();
let pitcher_card = make_pitcher_card_zeroed();
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(
&batter, Some(&batter_card), &pitcher, Some(&pitcher_card), &b_stats, &p_stats,
);
assert_eq!(result.batter_hand, "R");
assert_eq!(result.batter_split, "vLHP");
assert_eq!(result.pitcher_split, "vRHB");
}
#[test]
fn left_hitter_vs_right_pitcher_uses_vrhp_split() {
// Regular L batter vs R pitcher → split vRHP / vLHB
let batter = make_player(Some("L"));
let pitcher = make_player(Some("R"));
let batter_card = make_batter_card_zeroed();
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(
&batter, Some(&batter_card), &pitcher, None, &b_stats, &p_stats,
);
assert_eq!(result.batter_hand, "L");
assert_eq!(result.batter_split, "vRHP");
assert_eq!(result.pitcher_split, "vLHB");
assert!(result.rating.is_some());
}
#[test]
fn no_batter_card_gives_none_rating_and_dash_tier() {
// Missing batter card → rating=None, tier="--"
let batter = make_player(Some("R"));
let pitcher = make_player(Some("R"));
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(&batter, None, &pitcher, None, &b_stats, &p_stats);
assert!(result.rating.is_none());
assert_eq!(result.tier, "--");
assert!(result.batter_component.is_none());
assert!(result.pitcher_component.is_none());
}
#[test]
fn no_pitcher_card_uses_batter_only_rating() {
// No pitcher card → total = batter_component only (all zeros → 66.0)
let batter = make_player(Some("R"));
let pitcher = make_player(Some("R"));
let batter_card = make_batter_card_zeroed();
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(
&batter, Some(&batter_card), &pitcher, None, &b_stats, &p_stats,
);
let rating = result.rating.expect("should have a rating");
assert!((rating - 66.0).abs() < f64::EPSILON, "expected 66.0, got {}", rating);
assert!(result.pitcher_component.is_none());
}
#[test]
fn pitcher_component_is_inverted_in_total() {
// Both cards zeroed: batter=66, pitcher=69 (vRHB).
// total = 66 + (-69) = -3
let batter = make_player(Some("R"));
let pitcher = make_player(Some("R"));
let batter_card = make_batter_card_zeroed();
let pitcher_card = make_pitcher_card_zeroed();
let b_stats = BatterLeagueStats::default();
let p_stats = PitcherLeagueStats::default();
let result = calculate_matchup(
&batter, Some(&batter_card), &pitcher, Some(&pitcher_card), &b_stats, &p_stats,
);
let rating = result.rating.expect("should have a rating");
assert!((rating - (-3.0)).abs() < f64::EPSILON, "expected -3.0, got {}", rating);
}
// --- calculate_team_matchups tests ---
#[test]
fn team_matchups_sorted_rated_first_descending() {
// Build 3 batters: two with cards (different ratings), one without.
// Sorting: rated descending, then unrated last.
let pitcher = make_player(Some("R"));
let p_stats = PitcherLeagueStats::default();
// batter A: no card → None rating
let batter_a = make_player(Some("R"));
// batter B: zeroed card → 66.0 (batter-only, no pitcher card)
let mut batter_b = make_player(Some("R"));
batter_b.id = 2;
let card_b = make_batter_card_zeroed();
// batter C: card with ob_vrhp=1.0 → score 56.0
let mut batter_c = make_player(Some("R"));
batter_c.id = 3;
let mut card_c = make_batter_card_zeroed();
card_c.ob_vrhp = 1.0;
let batters: Vec<(Player, Option<BatterCard>)> = vec![
(batter_a, None),
(batter_b, Some(card_b)),
(batter_c, Some(card_c)),
];
let b_stats = BatterLeagueStats::default();
let results = calculate_team_matchups(&batters, &pitcher, None, &b_stats, &p_stats);
assert_eq!(results.len(), 3);
// First: batter_b (66.0)
assert!((results[0].rating.unwrap() - 66.0).abs() < f64::EPSILON);
// Second: batter_c (56.0)
assert!((results[1].rating.unwrap() - 56.0).abs() < f64::EPSILON);
// Third: batter_a (None)
assert!(results[2].rating.is_none());
}
}
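
The three rules the matchup tests above rely on (switch-hitter resolution, pitcher inversion, and rated-before-unrated sorting) can be restated as a small standalone sketch. This is an illustration under stated assumptions, not the commit's implementation: `effective_hand`, `combine`, and `sort_best_first` are hypothetical names, and plain `Option<f64>` ratings stand in for `MatchupResult`.

```rust
use std::cmp::Ordering;

/// A switch hitter ("S") bats from the side opposite the pitcher's hand;
/// any other batter keeps their own hand. (Hypothetical helper.)
fn effective_hand<'a>(batter_hand: &'a str, pitcher_hand: &str) -> &'a str {
    if batter_hand == "S" {
        if pitcher_hand == "R" { "L" } else { "R" }
    } else {
        batter_hand
    }
}

/// Pitcher inversion: a strong pitcher score lowers the batter's rating.
/// Without a pitcher score the rating is the batter component alone.
fn combine(batter_component: f64, pitcher_component: Option<f64>) -> f64 {
    match pitcher_component {
        Some(p) => batter_component - p,
        None => batter_component,
    }
}

/// Sort ratings descending, keeping unrated entries (`None`) at the end.
fn sort_best_first(ratings: &mut [Option<f64>]) {
    ratings.sort_by(|a, b| match (a, b) {
        // Both rated: higher rating first; treat incomparable (NaN) as equal.
        (Some(x), Some(y)) => y.partial_cmp(x).unwrap_or(Ordering::Equal),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    });
}
```

With the zeroed test cards, `combine(66.0, Some(69.0))` reproduces the -3 rating asserted in `pitcher_component_is_inverted_in_total`.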

View File

@ -1,3 +1,4 @@
pub mod league_stats;
pub mod matchup;
pub mod score_cache;
pub mod weights;

View File

@ -0,0 +1,227 @@
use anyhow::Result;
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
use sqlx::SqlitePool;
use std::collections::HashMap;
use crate::calc::league_stats::{
calculate_batter_league_stats, calculate_pitcher_league_stats, BatterLeagueStats,
PitcherLeagueStats,
};
use crate::calc::matchup::{
get_batter_dist, get_batter_stat, get_pitcher_dist, get_pitcher_stat, standardize_value,
};
use crate::calc::weights::{BATTER_WEIGHTS, PITCHER_WEIGHTS};
use crate::db::models::{BatterCard, PitcherCard, StandardizedScoreCache};
use crate::db::queries::clear_score_cache;
/// Score details for a single stat.
#[derive(Debug, Serialize, Deserialize)]
pub struct StatScore {
pub raw: f64,
pub std: i32,
pub weighted: f64,
}
/// Result of a full cache rebuild.
#[derive(Debug)]
pub struct CacheRebuildResult {
pub batter_splits: i64,
pub pitcher_splits: i64,
}
/// Generate a stable hash of the current weight configuration.
///
/// Uses SHA-256 of a sorted JSON representation of BATTER_WEIGHTS and PITCHER_WEIGHTS.
/// Returns the first 16 hex characters.
fn compute_weights_hash() -> String {
let batter: std::collections::BTreeMap<&str, (i32, bool)> = BATTER_WEIGHTS
.iter()
.map(|(name, w)| (*name, (w.weight, w.high_is_better)))
.collect();
let pitcher: std::collections::BTreeMap<&str, (i32, bool)> = PITCHER_WEIGHTS
.iter()
.map(|(name, w)| (*name, (w.weight, w.high_is_better)))
.collect();
let data = serde_json::json!({
"batter": batter,
"pitcher": pitcher,
});
let mut hasher = Sha256::new();
hasher.update(data.to_string().as_bytes());
let result = hasher.finalize();
let hex: String = result.iter().map(|b| format!("{:02x}", b)).collect();
hex[..16].to_string()
}
/// Generate a hash of representative league stat values to detect significant changes.
///
/// Hashes 5 key distribution values; returns first 16 hex chars of SHA-256.
fn compute_league_stats_hash(batter: &BatterLeagueStats, pitcher: &PitcherLeagueStats) -> String {
let key_values = [
batter.hit_vrhp.avg,
batter.hit_vrhp.stdev,
batter.so_vrhp.avg,
pitcher.hit_vrhb.avg,
pitcher.so_vrhb.avg,
];
let repr = format!("{:?}", key_values);
let mut hasher = Sha256::new();
hasher.update(repr.as_bytes());
let result = hasher.finalize();
let hex: String = result.iter().map(|b| format!("{:02x}", b)).collect();
hex[..16].to_string()
}
/// Calculate standardized scores for all stats on a batter card split.
///
/// `split` is "vlhp" (vs left-handed pitchers) or "vrhp" (vs right-handed pitchers).
/// Returns (total_weighted_score, per-stat StatScore map).
fn calculate_batter_split_scores(
card: &BatterCard,
split: &str,
stats: &BatterLeagueStats,
) -> (f64, HashMap<String, StatScore>) {
let pitcher_hand = if split == "vlhp" { "L" } else { "R" };
let mut total = 0.0;
let mut stat_scores = HashMap::new();
for (stat_name, weight) in BATTER_WEIGHTS.iter() {
let raw = get_batter_stat(card, stat_name, pitcher_hand);
let dist = get_batter_dist(stats, stat_name, pitcher_hand);
let std = standardize_value(raw, dist, weight.high_is_better);
let weighted = std as f64 * weight.weight as f64;
total += weighted;
stat_scores.insert(stat_name.to_string(), StatScore { raw, std, weighted });
}
(total, stat_scores)
}
/// Calculate standardized scores for all stats on a pitcher card split.
///
/// `split` is "vlhb" (vs left-handed batters) or "vrhb" (vs right-handed batters).
/// Returns (total_weighted_score, per-stat StatScore map).
fn calculate_pitcher_split_scores(
card: &PitcherCard,
split: &str,
stats: &PitcherLeagueStats,
) -> (f64, HashMap<String, StatScore>) {
let batter_hand = if split == "vlhb" { "L" } else { "R" };
let mut total = 0.0;
let mut stat_scores = HashMap::new();
for (stat_name, weight) in PITCHER_WEIGHTS.iter() {
let raw = get_pitcher_stat(card, stat_name, batter_hand);
let dist = get_pitcher_dist(stats, stat_name, batter_hand);
let std = standardize_value(raw, dist, weight.high_is_better);
let weighted = std as f64 * weight.weight as f64;
total += weighted;
stat_scores.insert(stat_name.to_string(), StatScore { raw, std, weighted });
}
(total, stat_scores)
}
/// Rebuild the entire standardized score cache.
///
/// Clears all existing entries and recalculates scores for every batter and pitcher
/// card using current league statistics and weight configuration.
pub async fn rebuild_score_cache(pool: &SqlitePool) -> Result<CacheRebuildResult> {
let batter_stats = calculate_batter_league_stats(pool).await?;
let pitcher_stats = calculate_pitcher_league_stats(pool).await?;
let weights_hash = compute_weights_hash();
let league_hash = compute_league_stats_hash(&batter_stats, &pitcher_stats);
clear_score_cache(pool).await?;
let batter_cards: Vec<BatterCard> =
sqlx::query_as("SELECT * FROM batter_cards").fetch_all(pool).await?;
let pitcher_cards: Vec<PitcherCard> =
sqlx::query_as("SELECT * FROM pitcher_cards").fetch_all(pool).await?;
let computed_at = chrono::Utc::now().naive_utc();
let mut batter_count: i64 = 0;
let mut pitcher_count: i64 = 0;
for card in &batter_cards {
for split in ["vlhp", "vrhp"] {
let (total, stat_scores) = calculate_batter_split_scores(card, split, &batter_stats);
let stat_scores_json = serde_json::to_string(&stat_scores)?;
sqlx::query(
"INSERT INTO standardized_score_cache \
(batter_card_id, pitcher_card_id, split, total_score, stat_scores, computed_at, \
weights_hash, league_stats_hash) \
VALUES (?, NULL, ?, ?, ?, ?, ?, ?)",
)
.bind(card.id)
.bind(split)
.bind(total)
.bind(&stat_scores_json)
.bind(computed_at)
.bind(&weights_hash)
.bind(&league_hash)
.execute(pool)
.await?;
batter_count += 1;
}
}
for card in &pitcher_cards {
for split in ["vlhb", "vrhb"] {
let (total, stat_scores) =
calculate_pitcher_split_scores(card, split, &pitcher_stats);
let stat_scores_json = serde_json::to_string(&stat_scores)?;
sqlx::query(
"INSERT INTO standardized_score_cache \
(batter_card_id, pitcher_card_id, split, total_score, stat_scores, computed_at, \
weights_hash, league_stats_hash) \
VALUES (NULL, ?, ?, ?, ?, ?, ?, ?)",
)
.bind(card.id)
.bind(split)
.bind(total)
.bind(&stat_scores_json)
.bind(computed_at)
.bind(&weights_hash)
.bind(&league_hash)
.execute(pool)
.await?;
pitcher_count += 1;
}
}
Ok(CacheRebuildResult { batter_splits: batter_count, pitcher_splits: pitcher_count })
}
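/// Check if the score cache is valid (non-empty and matches current weight config).
///
/// Returns false if the cache is empty or if the weights hash of the sampled row
/// differs from the current one. Only one row is inspected, and `league_stats_hash`
/// is stored but not compared here.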
pub async fn is_cache_valid(pool: &SqlitePool) -> Result<bool> {
let entry: Option<StandardizedScoreCache> =
sqlx::query_as("SELECT * FROM standardized_score_cache LIMIT 1")
.fetch_optional(pool)
.await?;
let entry = match entry {
Some(e) => e,
None => return Ok(false),
};
let current_weights_hash = compute_weights_hash();
Ok(entry.weights_hash.as_deref() == Some(current_weights_hash.as_str()))
}
/// Ensure the score cache exists, rebuilding if necessary.
pub async fn ensure_cache_exists(pool: &SqlitePool) -> Result<()> {
if !is_cache_valid(pool).await? {
rebuild_score_cache(pool).await?;
}
Ok(())
}
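The weights-hash check in `is_cache_valid` depends on the hash being independent of iteration order, which is why `compute_weights_hash` collects into a `BTreeMap` first. A minimal std-only sketch of that idea follows. It assumes `DefaultHasher` as a stand-in for SHA-256 (so it is not the commit's hash and may differ between Rust versions), and the weight values in any usage are placeholders, not the real `BATTER_WEIGHTS`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash a weight table into 16 hex characters.
/// Assumption: DefaultHasher replaces SHA-256 so the sketch needs only std.
fn config_hash(weights: &[(&str, (i32, bool))]) -> String {
    // BTreeMap iterates in key order, so input order does not affect the hash.
    let sorted: BTreeMap<&str, (i32, bool)> = weights.iter().copied().collect();
    let mut hasher = DefaultHasher::new();
    for (name, (weight, high_is_better)) in &sorted {
        name.hash(&mut hasher);
        weight.hash(&mut hasher);
        high_is_better.hash(&mut hasher);
    }
    // A u64 formatted as zero-padded hex is always 16 characters long.
    format!("{:016x}", hasher.finish())
}

/// Mirror of the validity rule: an empty cache (None) or a stale hash is invalid.
fn cache_is_valid(stored_hash: Option<&str>, current_hash: &str) -> bool {
    stored_hash == Some(current_hash)
}
```

With this shape, reordering the entries leaves the hash unchanged, while editing any weight or `high_is_better` flag changes it and so triggers a rebuild.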


@@ -28,7 +28,7 @@ pub struct Team {
pub synced_at: Option<NaiveDateTime>,
}
-#[derive(Debug, FromRow, Serialize, Deserialize)]
+#[derive(Debug, Clone, FromRow, Serialize, Deserialize)]
pub struct Player {
pub id: i64,
pub name: String,
@@ -93,7 +93,7 @@ impl Player {
// Card Data (imported from Strat-o-Matic)
// =============================================================================
-#[derive(Debug, FromRow, Serialize, Deserialize)]
+#[derive(Debug, Clone, FromRow, Serialize, Deserialize)]
pub struct BatterCard {
pub id: i64,
pub player_id: i64,
@@ -140,7 +140,7 @@ pub struct BatterCard {
pub source: Option<String>,
}
-#[derive(Debug, FromRow, Serialize, Deserialize)]
+#[derive(Debug, Clone, FromRow, Serialize, Deserialize)]
pub struct PitcherCard {
pub id: i64,
pub player_id: i64,


@@ -2,7 +2,10 @@ use anyhow::Result;
use sqlx::SqlitePool;
use std::collections::HashMap;
-use super::models::{BatterCard, Lineup, MatchupCache, PitcherCard, Player, Roster, SyncStatus, Team};
+use super::models::{
+BatterCard, Lineup, MatchupCache, PitcherCard, Player, Roster, StandardizedScoreCache,
+SyncStatus, Team,
+};
// =============================================================================
// Team Queries
@@ -284,6 +287,77 @@ pub async fn invalidate_matchup_cache(pool: &SqlitePool) -> Result<u64> {
Ok(result.rows_affected())
}
// =============================================================================
// Standardized Score Cache Queries
// =============================================================================
pub async fn get_cached_batter_score(
pool: &SqlitePool,
batter_card_id: i64,
split: &str,
) -> Result<Option<StandardizedScoreCache>> {
let cache = sqlx::query_as::<_, StandardizedScoreCache>(
"SELECT * FROM standardized_score_cache WHERE batter_card_id = ? AND split = ?",
)
.bind(batter_card_id)
.bind(split)
.fetch_optional(pool)
.await?;
Ok(cache)
}
pub async fn get_cached_pitcher_score(
pool: &SqlitePool,
pitcher_card_id: i64,
split: &str,
) -> Result<Option<StandardizedScoreCache>> {
let cache = sqlx::query_as::<_, StandardizedScoreCache>(
"SELECT * FROM standardized_score_cache WHERE pitcher_card_id = ? AND split = ?",
)
.bind(pitcher_card_id)
.bind(split)
.fetch_optional(pool)
.await?;
Ok(cache)
}
pub async fn clear_score_cache(pool: &SqlitePool) -> Result<u64> {
let result = sqlx::query("DELETE FROM standardized_score_cache")
.execute(pool)
.await?;
Ok(result.rows_affected())
}
pub async fn insert_score_cache(
pool: &SqlitePool,
batter_card_id: Option<i64>,
pitcher_card_id: Option<i64>,
split: &str,
total_score: f64,
stat_scores: &str,
weights_hash: &str,
league_stats_hash: &str,
) -> Result<()> {
let computed_at = chrono::Utc::now().naive_utc();
sqlx::query(
"INSERT INTO standardized_score_cache \
(batter_card_id, pitcher_card_id, split, total_score, stat_scores, computed_at, \
weights_hash, league_stats_hash) \
VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
)
.bind(batter_card_id)
.bind(pitcher_card_id)
.bind(split)
.bind(total_score)
.bind(stat_scores)
.bind(computed_at)
.bind(weights_hash)
.bind(league_stats_hash)
.execute(pool)
.await?;
Ok(())
}
// =============================================================================
// Lineup Queries
// =============================================================================


@@ -0,0 +1,277 @@
//! Integration tests for the full calc pipeline:
//! league stats → score cache → calculate_matchup (real-time vs cached).
//!
//! These tests verify that:
//! - `rebuild_score_cache` populates the DB correctly
//! - `calculate_matchup` (real-time) and `calculate_matchup_cached` agree
//! - Switch-hitter resolution works end-to-end
//! - Batter-only rating (no pitcher card) works end-to-end
use sba_scout::{
calc::{
league_stats::BatterLeagueStats,
matchup::{calculate_matchup, calculate_matchup_cached},
score_cache::rebuild_score_cache,
},
db::{models::Player, queries, schema},
};
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
async fn test_pool() -> sqlx::SqlitePool {
let pool = schema::init_pool(std::path::Path::new(":memory:"))
.await
.expect("failed to create in-memory pool");
schema::create_tables(&pool).await.expect("failed to create tables");
pool
}
async fn insert_team(pool: &sqlx::SqlitePool, id: i64, abbrev: &str) {
sqlx::query(
"INSERT INTO teams (id, abbrev, short_name, long_name, season) VALUES (?, ?, ?, ?, 13)",
)
.bind(id)
.bind(abbrev)
.bind(abbrev)
.bind(format!("{abbrev} Long Name"))
.execute(pool)
.await
.expect("failed to insert team");
}
async fn insert_player(
pool: &sqlx::SqlitePool,
id: i64,
name: &str,
team_id: i64,
pos_1: &str,
hand: &str,
) {
sqlx::query(
"INSERT INTO players (id, name, season, team_id, pos_1, hand) VALUES (?, ?, 13, ?, ?, ?)",
)
.bind(id)
.bind(name)
.bind(team_id)
.bind(pos_1)
.bind(hand)
.execute(pool)
.await
.expect("failed to insert player");
}
/// Fetch a Player struct from the DB by id.
async fn fetch_player(pool: &sqlx::SqlitePool, id: i64) -> Player {
sqlx::query_as("SELECT * FROM players WHERE id = ?")
.bind(id)
.fetch_one(pool)
.await
.expect("player not found")
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
/// Verify rebuild_score_cache runs without error and populates cache entries.
///
/// With one batter card and one pitcher card, we expect 4 cache rows total:
/// 2 splits for the batter (vlhp, vrhp) and 2 for the pitcher (vlhb, vrhb).
#[tokio::test]
async fn rebuild_score_cache_populates_entries() {
let pool = test_pool().await;
insert_team(&pool, 1, "TST").await;
insert_player(&pool, 10, "Test Batter", 1, "CF", "R").await;
insert_player(&pool, 11, "Test Pitcher", 1, "SP", "R").await;
sqlx::query("INSERT INTO batter_cards (player_id) VALUES (?)")
.bind(10)
.execute(&pool)
.await
.unwrap();
sqlx::query("INSERT INTO pitcher_cards (player_id) VALUES (?)")
.bind(11)
.execute(&pool)
.await
.unwrap();
let result = rebuild_score_cache(&pool).await.expect("rebuild failed");
assert_eq!(result.batter_splits, 2, "expected 2 batter cache rows (vlhp, vrhp)");
assert_eq!(result.pitcher_splits, 2, "expected 2 pitcher cache rows (vlhb, vrhb)");
// Verify rows exist in DB
let (count,): (i64,) = sqlx::query_as("SELECT COUNT(*) FROM standardized_score_cache")
.fetch_one(&pool)
.await
.unwrap();
assert_eq!(count, 4);
}
/// Verify that calculate_matchup (real-time) and calculate_matchup_cached agree
/// on rating and tier for a standard R batter vs R pitcher matchup.
///
/// All card stats default to 0.0 (zeroed). With zeroed cards and default league
/// stats, both functions must return identical ratings.
#[tokio::test]
async fn realtime_and_cached_matchup_agree() {
let pool = test_pool().await;
insert_team(&pool, 1, "TST").await;
insert_player(&pool, 10, "Test Batter", 1, "CF", "R").await;
insert_player(&pool, 11, "Test Pitcher", 1, "SP", "R").await;
sqlx::query("INSERT INTO batter_cards (player_id) VALUES (?)")
.bind(10)
.execute(&pool)
.await
.unwrap();
sqlx::query("INSERT INTO pitcher_cards (player_id) VALUES (?)")
.bind(11)
.execute(&pool)
.await
.unwrap();
rebuild_score_cache(&pool).await.expect("rebuild failed");
let batter = fetch_player(&pool, 10).await;
let pitcher = fetch_player(&pool, 11).await;
let batter_card = queries::get_batter_card(&pool, 10).await.unwrap().unwrap();
let pitcher_card = queries::get_pitcher_card(&pool, 11).await.unwrap().unwrap();
let b_stats = BatterLeagueStats::default();
let p_stats = sba_scout::calc::league_stats::PitcherLeagueStats::default();
let realtime = calculate_matchup(
&batter,
Some(&batter_card),
&pitcher,
Some(&pitcher_card),
&b_stats,
&p_stats,
);
let cached = calculate_matchup_cached(&pool, &batter, Some(&batter_card), &pitcher, Some(&pitcher_card))
.await
.expect("cached matchup failed");
let rt_rating = realtime.rating.expect("real-time should have rating");
let c_rating = cached.rating.expect("cached should have rating");
assert!(
(rt_rating - c_rating).abs() < 0.01,
"ratings differ: real-time={rt_rating}, cached={c_rating}"
);
assert_eq!(realtime.tier, cached.tier, "tiers should match");
assert_eq!(realtime.batter_split, cached.batter_split);
assert_eq!(realtime.pitcher_split, cached.pitcher_split);
}
/// Verify switch-hitter resolution: a player with hand="S" facing an R pitcher
/// should bat left (batter_hand="L", batter_split="vRHP", pitcher_split="vLHB").
#[tokio::test]
async fn switch_hitter_resolution_vs_right_pitcher() {
let pool = test_pool().await;
insert_team(&pool, 1, "TST").await;
insert_player(&pool, 10, "Switch Batter", 1, "CF", "S").await;
insert_player(&pool, 11, "Right Pitcher", 1, "SP", "R").await;
sqlx::query("INSERT INTO batter_cards (player_id) VALUES (?)")
.bind(10)
.execute(&pool)
.await
.unwrap();
rebuild_score_cache(&pool).await.expect("rebuild failed");
let batter = fetch_player(&pool, 10).await;
let pitcher = fetch_player(&pool, 11).await;
let batter_card = queries::get_batter_card(&pool, 10).await.unwrap().unwrap();
let b_stats = BatterLeagueStats::default();
let p_stats = sba_scout::calc::league_stats::PitcherLeagueStats::default();
let result = calculate_matchup(&batter, Some(&batter_card), &pitcher, None, &b_stats, &p_stats);
assert_eq!(result.batter_hand, "L", "switch hitter vs R pitcher should bat left");
assert_eq!(result.batter_split, "vRHP");
assert_eq!(result.pitcher_split, "vLHB");
assert!(result.rating.is_some());
}
/// Verify switch-hitter resolution: hand="S" facing an L pitcher → bats right.
#[tokio::test]
async fn switch_hitter_resolution_vs_left_pitcher() {
let pool = test_pool().await;
insert_team(&pool, 1, "TST").await;
insert_player(&pool, 10, "Switch Batter", 1, "CF", "S").await;
insert_player(&pool, 11, "Left Pitcher", 1, "SP", "L").await;
sqlx::query("INSERT INTO batter_cards (player_id) VALUES (?)")
.bind(10)
.execute(&pool)
.await
.unwrap();
rebuild_score_cache(&pool).await.expect("rebuild failed");
let batter = fetch_player(&pool, 10).await;
let pitcher = fetch_player(&pool, 11).await;
let batter_card = queries::get_batter_card(&pool, 10).await.unwrap().unwrap();
let b_stats = BatterLeagueStats::default();
let p_stats = sba_scout::calc::league_stats::PitcherLeagueStats::default();
let result = calculate_matchup(&batter, Some(&batter_card), &pitcher, None, &b_stats, &p_stats);
assert_eq!(result.batter_hand, "R", "switch hitter vs L pitcher should bat right");
assert_eq!(result.batter_split, "vLHP");
assert_eq!(result.pitcher_split, "vRHB");
}
/// Verify batter-only rating: when pitcher card is None, total equals batter component.
///
/// With a zeroed batter card and default league stats, all stats standardize to 3,
/// so the batter component = 3 * 22 = 66, where 22 is the sum of the batter stat weights.
#[tokio::test]
async fn batter_only_rating_when_no_pitcher_card() {
let pool = test_pool().await;
insert_team(&pool, 1, "TST").await;
insert_player(&pool, 10, "Test Batter", 1, "CF", "R").await;
insert_player(&pool, 11, "Test Pitcher", 1, "SP", "R").await;
sqlx::query("INSERT INTO batter_cards (player_id) VALUES (?)")
.bind(10)
.execute(&pool)
.await
.unwrap();
rebuild_score_cache(&pool).await.expect("rebuild failed");
let batter = fetch_player(&pool, 10).await;
let pitcher = fetch_player(&pool, 11).await;
let batter_card = queries::get_batter_card(&pool, 10).await.unwrap().unwrap();
let b_stats = BatterLeagueStats::default();
let p_stats = sba_scout::calc::league_stats::PitcherLeagueStats::default();
// Real-time: no pitcher card passed
let rt = calculate_matchup(&batter, Some(&batter_card), &pitcher, None, &b_stats, &p_stats);
let rating = rt.rating.expect("should have rating");
assert!(
(rating - 66.0).abs() < f64::EPSILON,
"batter-only zeroed rating should be 66.0, got {rating}"
);
assert!(rt.pitcher_component.is_none());
// Cached: no pitcher card passed (pitcher has no card in DB)
let cached = calculate_matchup_cached(&pool, &batter, Some(&batter_card), &pitcher, None)
.await
.expect("cached matchup failed");
let c_rating = cached.rating.expect("cached should have rating");
assert!(
(c_rating - 66.0).abs() < f64::EPSILON,
"cached batter-only zeroed rating should be 66.0, got {c_rating}"
);
}
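The switch-hitter tests in this file pin down the resolution rule: a batter with hand "S" takes the side opposite the pitcher's throwing hand, the batter split is named after the pitcher's hand, and the pitcher split is named after the batter's effective hand. A minimal sketch of that rule, assuming plain string hands, is below. `effective_batter_hand` and `split_labels` are hypothetical helpers, and treating any unrecognized hand as "R" is an assumption, not something the diff shows.

```rust
/// Resolve the side a batter hits from against a given pitcher.
/// "S" (switch) takes the side opposite the pitcher's throwing hand.
/// Assumption: any hand other than "L" or "S" is treated as "R".
fn effective_batter_hand(batter_hand: &str, pitcher_hand: &str) -> &'static str {
    match (batter_hand, pitcher_hand) {
        ("S", "L") => "R",
        ("S", _) => "L",
        ("L", _) => "L",
        _ => "R",
    }
}

/// Split labels: the batter split is keyed by the pitcher's hand,
/// the pitcher split by the batter's effective hand.
fn split_labels(batter_hand: &str, pitcher_hand: &str) -> (&'static str, &'static str) {
    let batter_split = if pitcher_hand == "L" { "vLHP" } else { "vRHP" };
    let pitcher_split = if effective_batter_hand(batter_hand, pitcher_hand) == "L" {
        "vLHB"
    } else {
        "vRHB"
    };
    (batter_split, pitcher_split)
}
```

For example, `split_labels("S", "R")` yields `("vRHP", "vLHB")`, matching the assertions in `switch_hitter_resolution_vs_right_pitcher`.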