refactor: full recalculation for season stats + fix evaluator bugs #111
No reviewers
Labels
No Label
ai-changes-requested
ai-failed
ai-merged
ai-pr-opened
ai-reviewed
ai-reviewing
ai-reviewing
ai-working
bug
enhancement
evolution
performance
phase-0
phase-1a
phase-1b
phase-1c
phase-1d
security
tech-debt
todo
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/paper-dynasty-database#111
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "feature/season-stats-full-recalc"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
update_season_stats()— each call now recomputes season totals from all StratPlay rows for affected players, eliminating double-counting and enabling self-healingforce=Trueparameter for reprocessing previously handled games (useful for correcting stats after data fixes)evolution_evaluator.py: brokenPlayerSeasonStatsimport → queriesBattingSeasonStatsorPitchingSeasonStatsbased oncard_type; fixedr.k→r.strikeoutstest_evolution_models.py,test_postgame_evolution.py): replaced references to nonexistentPlayerSeasonStatsmodel withBattingSeasonStatsKey Changes
app/services/season_stats.py_get_player_pairs(),_recalc_batting(),_recalc_pitching(),_recalc_decisions()replace all delta logicapp/services/evolution_evaluator.pyapp/routers_v2/season_stats.pyforcequery param passthroughtests/test_season_stats_update.pytests/test_evolution_evaluator.pytests/test_evolution_models.pytests/test_postgame_evolution.pyTest plan
.cast("INTEGER")forDecision.is_startSUM)🤖 Generated with Claude Code
AI Code Review (automated)
Files Reviewed
app/routers_v2/season_stats.py(modified)app/services/season_stats.py(major rewrite)app/services/evolution_evaluator.py(modified)tests/test_season_stats_update.py(modified + 3 new tests)tests/test_evolution_evaluator.py(modified)tests/test_evolution_models.py(modified)tests/test_postgame_evolution.py(modified)Findings
Correctness
1. [HIGH]
fn.COUNT(fn.DISTINCT(...))— incorrect Peewee idiom, likely SQL error on PostgreSQL._recalc_batting()(line 106) uses:_recalc_pitching()(line 193) uses:fn.DISTINCTis not a SQL function — it is a Peewee modifier. Wrapping it insidefn.COUNT()may emitCOUNT(DISTINCT(...))with extra parentheses that is a syntax error in PostgreSQL (SQLite is more permissive). The PR description already flags "Verify on PostgreSQL CI" as an open item — this is the likely failure.The correct Peewee idiom for
COUNT(DISTINCT expr)is.distinct()on the column/expression:2. [MEDIUM] Decision-only pitchers silently excluded — behavioral regression.
The old
_apply_decisions()explicitly handled pitchers who received aDecisionrow but had zeroStratPlayrows. The new_get_player_pairs()only scansStratPlay, so a pitcher who received a win/loss/save Decision but had no plays will not appear inpitching_pairs. TheirPitchingSeasonStatsrow will not be written for that game, silently dropping decision stats.Fix: also scan
Decisionrows when buildingpitching_pairs:File:
app/services/season_stats.py, lines 61–803. [MEDIUM]
ProcessedGame.get_or_create(game_id=game_id)— verify PK FK semantics.ProcessedGamehasgame = ForeignKeyField(StratGame, primary_key=True). Usinggame_id=as the keyword works via Peewee's FK accessor convention, butget_or_createon a PK FK field had edge cases in Peewee versions before 3.17 wherecreatedcould be wrong. The entire force=True path depends oncreatedbeing reliable. Worth confirming the deployed Peewee version.File:
app/services/season_stats.py, line 3314. [LOW] Stale docstrings in
evolution_evaluator.py.Depends on WP-07 (PlayerSeasonStats)— should referenceBattingSeasonStats/PitchingSeasonStats._CareerTotalsdocstring:sp/rp: outs, k—kwas renamed tostrikeouts.evaluate_card()docstring:_stats_model: Override for PlayerSeasonStats— outdated.5. [LOW] Silent fallthrough for unknown
card_typeinevolution_evaluator.py.Any unrecognized
card_type(future types,None, typos) returns zeroed pitcher stats silently. An explicitelif/raise ValueErrorwould surface data problems early.Security
No issues found. All queries use parameterized Peewee ORM calls; no raw SQL constructed from user input.
Style & Conventions
Code follows project patterns. Docstrings use the "What/Why" style. No convention violations.
Test Coverage
The three new tests are well-designed and cover the self-healing scenario clearly.
Gap 1: No test for Decision-only pitchers (issue 2 above). The existing
test_decision_integrationalways creates a StratPlay row before the Decision, keeping the pitcher inpitching_pairs.Gap 2:
test_idempotent_reprocessingcallsupdate_season_stats(game.id, force=True)as the first call, socreated=Trueis returned byget_or_createand thenot created and force=Truebranch is never exercised. The test still passes because recalc is idempotent, but it doesn't cover the force re-entry path. Add a non-force call first to make the second force call actually exercise that branch.Summary
season_stats.py:106, 193fn.COUNT(fn.DISTINCT(...))invalid Peewee idiom — use.distinct()season_stats.py:61–80season_stats.py:331ProcessedGame.get_or_createPK FKcreatedsemanticsevolution_evaluator.py:7,33,67evolution_evaluator.py:148card_typeVerdict: REQUEST_CHANGES
The full-recalculation architecture is the right approach. The evaluator import fix and
r.k→r.strikeoutsrename are correct. Issues 1 and 2 should be resolved before merge — issue 1 will fail on PostgreSQL in production, and issue 2 is a silent behavioral regression.Automated review by Claude PR Reviewer
AI Code Review (Round 2)
Files Reviewed
app/services/season_stats.py(major rewrite)app/routers_v2/season_stats.py(modified)app/services/evolution_evaluator.py(modified)tests/test_season_stats_update.py(modified + new tests)tests/test_evolution_evaluator.py(modified)tests/test_evolution_models.py(modified)tests/test_postgame_evolution.py(modified)Prior Issues — Verification
Issue 1:
fn.COUNT(fn.DISTINCT(...))— FIXEDThe previous review flagged incorrect Peewee syntax. The fix is correctly applied in two different ways depending on the context:
_recalc_batting(line 118–120): UsesCase(...).distinct()— theDISTINCTmodifier is placed on theCaseexpression before it is passed toCOUNT. This is the correct Peewee idiom forCOUNT(DISTINCT CASE WHEN pa > 0 THEN game_id END). The approach is sound and the docstring explains the NULL-ignoring behaviour correctly._recalc_pitching(line 205): Usesfn.COUNT(StratPlay.game.distinct())— the.distinct()call on a column field is valid Peewee syntax forCOUNT(DISTINCT game_id). This is correct.Both fixes are verified correct.
Issue 2: Decision-only pitchers dropped — FIXED
_get_player_pairs()now performs a second query against theDecisiontable (lines 84–90) and unions those(pitcher_id, pitcher_team_id)pairs intopitching_pairs. Pitchers credited with a Decision but having no StratPlay rows will now correctly receive aPitchingSeasonStatsrow populated entirely from_recalc_decisions()(with zeroed play stats from_recalc_pitching()returning an empty row).Both prior issues are properly resolved.
Findings
Correctness
_recalc_decisionsseason filter vs. Decision model schema.The
Decisionmodel has aseason = IntegerField()column (confirmed indb_engine.pyline 1026), so filteringDecision.season == seasonin_recalc_decisionsis correct. The query is not joining throughStratGameto derive season, which is intentional — Decision rows carry their own denormalizedseasonfield.ProcessedGame.get_or_create(game_id=game_id).ProcessedGamedefinesgame = ForeignKeyField(StratGame, primary_key=True). Peewee exposes the raw integer column asgame_idon FK fields, soget_or_create(game_id=game_id)correctly targets the primary key column. This is consistent with how the old code worked and is not a regression.last_game_idvs.last_game.The service assigns
obj.last_game_id = game_id(lines 401, 434).BattingSeasonStatsandPitchingSeasonStatsdefinelast_game = ForeignKeyField(StratGame, null=True), so Peewee exposes this aslast_game_id. The assignment is correct.Zero-plays edge case.
When a decision-only pitcher has no StratPlay rows,
_recalc_pitchingwill return an aggregate row where all SUM columns are NULL (since there are no matching rows, the aggregate query returns one row with all NULLs, not zero rows). Therow.get("field") or 0pattern correctly converts bothNoneand0to0. All fields will be zero, and the Decision-sourced fields will be populated correctly.Stale docstring in
evolution_evaluator.py.The
_stats_modelparameter description inevaluate_card()'s docstring (line 68) still reads "Override for PlayerSeasonStats" — this should now reference the split-model approach. Documentation only; no runtime impact.Security
No issues. All queries use Peewee parameterized ORM calls. The
forceparameter is a bool, validated by FastAPI before reaching the service.Style & Conventions
The code follows the project's established Peewee patterns throughout. The removal of the
DATABASE_TYPEbranch (replacing the SQLite/PostgreSQL split with a singleget_or_create + save()path) is a clear net improvement in maintainability. Module-level docstring accurately describes the new strategy.Concurrent Access
The docstring claims "no risk of concurrent-write skew between games" because writes are replacements rather than increments. This is correct within a single player's recalculation. However, if two game-completion events for the same player arrive simultaneously and both pass the
ProcessedGameledger check (both games are new), they will compute at overlapping times: whicheversave()runs last wins. Under the old incremental approach the same race would produce double-counted deltas; under the new approach the worst case is one game's stats missing from the row until aforce=Truereprocess. The new approach is strictly safer, and the self-healing mechanism exists to recover. This is acceptable but worth noting operationally.Test Coverage
The three new tests are well-constructed and cover the critical new behaviours:
test_force_recalc: Confirmsforce=Truedoes not double stats when data is unchanged.test_idempotent_reprocessing: Confirms two consecutiveforce=Truecalls produce identical output.test_partial_reprocessing_heals: Confirms corrupted rows are corrected byforce=True. This is the most valuable self-healing test.One minor gap: there is no test for the decision-only pitcher path added in
_get_player_pairs(). A test that creates aDecisionrow for a pitcher with zeroStratPlayrows and verifies thePitchingSeasonStatsrow is created with correct decision counts would round out coverage for that edge case.Suggestions (non-blocking)
_stats_modelarg description inevaluate_card()from "Override for PlayerSeasonStats" to reflect the split-model approach.Decisionwith noStratPlayrows — the docstring calls this out as an edge case but it is not currently exercised in the test suite.Decision.is_start.cast("INTEGER"). The cast approach is correct — SQLite treats booleans as 0/1 and SUM works without the cast; PostgreSQL requires it. The code handles both correctly.Verdict: APPROVED
Both prior issues are correctly fixed. The
Case(...).distinct()syntax for batting games count is valid Peewee;fn.COUNT(field.distinct())for pitching games count is also valid. Decision-only pitchers are now included via theDecisiontable scan in_get_player_pairs. The full-recalculation approach is logically sound, the upsert strategy is correct for both databases, and the three new tests adequately cover the force/idempotency/self-healing behaviours. The stale docstring and missing decision-only pitcher test are minor and do not block merging.Automated review by Claude PR Reviewer
AI Code Review (Round 3)
Files Reviewed
app/services/season_stats.py(major rewrite)app/services/evolution_evaluator.py(modified)app/routers_v2/season_stats.py(modified)tests/test_season_stats_update.py(modified + 4 new tests)tests/test_evolution_evaluator.py(modified)tests/test_evolution_models.py(modified)tests/test_postgame_evolution.py(modified)Findings
Correctness
No issues found. The full-recalculation approach is correctly implemented:
_get_player_pairs()properly covers the decision-only pitcher edge case by unioning StratPlay and Decision sources, directly addressing the Round 1 concern._recalc_batting()usesfn.COUNT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None).distinct())— the.distinct()call is on theCaseexpression result, which is the correct pattern for counting distinct non-NULL game IDs where pa > 0._recalc_decisions()usesDecision.is_start.cast("INTEGER")before summing, correctly handling the boolean-to-integer arithmetic difference between SQLite and PostgreSQL.force=Truepath skips the early return without touching the ProcessedGame ledger entry — correct, since the ledger only needs to exist once per game_id.evaluate_card()now branches oncard_state.track.card_typeand queries the appropriate model (BattingSeasonStatsvsPitchingSeasonStats), fully resolving the broken import from Round 2.Security
No issues found. The
forceparameter is a boolean query param behind the existing bearer-token auth guard.Style & Conventions
Two stale docstring fragments remain. Neither is a correctness issue, but they contradict the actual behavior:
app/routers_v2/season_stats.py, module docstring (lines 7–8, 11) and endpoint docstring (lines 32–35) — Still reads "performs an additive upsert into player_season_stats", "Guards against double-counting via the last_game FK check". Carried over from the old delta approach.app/services/evolution_evaluator.py,evaluate_card()function summary (line 56–57) — Still opens with "Sums all player_season_stats rows for (player_id, team_id) across all seasons". The module-level docstring and parameter docstring were updated correctly; this one sentence was missed.Neither warrants blocking the merge.
Test Coverage
All four new tests are well-structured and cover the claimed behavior:
test_force_recalc— proves replacement upsert does not double statstest_idempotent_reprocessing— proves two consecutiveforce=Truecalls are stabletest_partial_reprocessing_heals— proves self-healing from corrupted row statetest_decision_only_pitcher— covers the rare edge case addressed in_get_player_pairs(); correctly assertsstats.games == 0(no StratPlay rows means COUNT(DISTINCT game) = 0), documenting the intentional designBoth Round 2 non-blocking notes are addressed: the stale "PlayerSeasonStats" reference in
evaluate_card()is updated at the module and parameter level, andtest_decision_only_pitcheris present and correct.Verdict: APPROVED
The implementation is correct, the force-recalc / self-healing path is well-exercised, and there are no blocking issues. The two remaining stale docstring fragments are cosmetic follow-up cleanup only. Ready to merge.
Automated review by Claude PR Reviewer