refactor: full recalculation for season stats + fix evaluator bugs #111

cal · 2026-03-19T15:17:52Z

cal commented

2026-03-19 15:17:52 +00:00

Summary

Replaced incremental delta upserts with full recalculation in update_season_stats() — each call now recomputes season totals from all StratPlay rows for affected players, eliminating double-counting and enabling self-healing
Added force=True parameter for reprocessing previously handled games (useful for correcting stats after data fixes)
Fixed evolution_evaluator.py: broken PlayerSeasonStats import → queries BattingSeasonStats or PitchingSeasonStats based on card_type; fixed r.k → r.strikeouts
Fixed test files (test_evolution_models.py, test_postgame_evolution.py): replaced references to nonexistent PlayerSeasonStats model with BattingSeasonStats

Key Changes

File	Change
`app/services/season_stats.py`	Major rewrite: `_get_player_pairs()`, `_recalc_batting()`, `_recalc_pitching()`, `_recalc_decisions()` replace all delta logic
`app/services/evolution_evaluator.py`	Fix broken import + field name; query correct model by card_type
`app/routers_v2/season_stats.py`	Add `force` query param passthrough
`tests/test_season_stats_update.py`	Updated + 3 new tests: force_recalc, idempotent_reprocessing, partial_reprocessing_heals
`tests/test_evolution_evaluator.py`	StatsStub.k → StatsStub.strikeouts
`tests/test_evolution_models.py`	PlayerSeasonStats → BattingSeasonStats
`tests/test_postgame_evolution.py`	PlayerSeasonStats → BattingSeasonStats + missing model imports

Test plan

All 10 season stats tests pass (7 existing + 3 new)
All 15 evolution evaluator tests pass
Full suite: 116 passed, 15 skipped (PG integration), 0 failed
Verify on PostgreSQL CI (the .cast("INTEGER") for Decision.is_start SUM)

🤖 Generated with Claude Code

## Summary - **Replaced incremental delta upserts with full recalculation** in `update_season_stats()` — each call now recomputes season totals from all StratPlay rows for affected players, eliminating double-counting and enabling self-healing - **Added `force=True` parameter** for reprocessing previously handled games (useful for correcting stats after data fixes) - **Fixed `evolution_evaluator.py`**: broken `PlayerSeasonStats` import → queries `BattingSeasonStats` or `PitchingSeasonStats` based on `card_type`; fixed `r.k` → `r.strikeouts` - **Fixed test files** (`test_evolution_models.py`, `test_postgame_evolution.py`): replaced references to nonexistent `PlayerSeasonStats` model with `BattingSeasonStats` ## Key Changes | File | Change | |------|--------| | `app/services/season_stats.py` | Major rewrite: `_get_player_pairs()`, `_recalc_batting()`, `_recalc_pitching()`, `_recalc_decisions()` replace all delta logic | | `app/services/evolution_evaluator.py` | Fix broken import + field name; query correct model by card_type | | `app/routers_v2/season_stats.py` | Add `force` query param passthrough | | `tests/test_season_stats_update.py` | Updated + 3 new tests: force_recalc, idempotent_reprocessing, partial_reprocessing_heals | | `tests/test_evolution_evaluator.py` | StatsStub.k → StatsStub.strikeouts | | `tests/test_evolution_models.py` | PlayerSeasonStats → BattingSeasonStats | | `tests/test_postgame_evolution.py` | PlayerSeasonStats → BattingSeasonStats + missing model imports | ## Test plan - [x] All 10 season stats tests pass (7 existing + 3 new) - [x] All 15 evolution evaluator tests pass - [x] Full suite: 116 passed, 15 skipped (PG integration), 0 failed - [ ] Verify on PostgreSQL CI (the `.cast("INTEGER")` for `Decision.is_start` SUM) 🤖 Generated with [Claude Code](https://claude.com/claude-code)

cal added 1 commit 2026-03-19 15:17:53 +00:00

refactor: replace incremental delta upserts with full recalculation in season stats 1b4eab9d99

The previous approach accumulated per-game deltas into season stats rows,
which was fragile — partial processing corrupted stats, upsert bugs
compounded, and there was no self-healing mechanism.

Now update_season_stats() recomputes full season totals from all StratPlay
rows for each affected player whenever a game is processed. The result
replaces whatever was stored, eliminating double-counting and enabling
self-healing via force=True.

Also fixes:
- evolution_evaluator.py: broken PlayerSeasonStats import → queries
  BattingSeasonStats or PitchingSeasonStats based on card_type
- evolution_evaluator.py: r.k → r.strikeouts
- test_evolution_models.py, test_postgame_evolution.py: PlayerSeasonStats
  → BattingSeasonStats (model never existed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cal commented

2026-03-19 15:21:43 +00:00

AI Code Review (automated)

Note: Gitea prevents authors from submitting formal reviews on their own PRs, so this is posted as a comment. Verdict: REQUEST_CHANGES.

Files Reviewed

app/routers_v2/season_stats.py (modified)
app/services/season_stats.py (major rewrite)
app/services/evolution_evaluator.py (modified)
tests/test_season_stats_update.py (modified + 3 new tests)
tests/test_evolution_evaluator.py (modified)
tests/test_evolution_models.py (modified)
tests/test_postgame_evolution.py (modified)

Findings

Correctness

1. [HIGH] fn.COUNT(fn.DISTINCT(...)) — incorrect Peewee idiom, likely SQL error on PostgreSQL.

_recalc_batting() (line 106) uses:

fn.COUNT(fn.DISTINCT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None)))

_recalc_pitching() (line 193) uses:

fn.COUNT(fn.DISTINCT(StratPlay.game)).alias("games")

fn.DISTINCT is not a SQL function — it is a Peewee modifier. Wrapping it inside fn.COUNT() may emit COUNT(DISTINCT(...)) with extra parentheses that is a syntax error in PostgreSQL (SQLite is more permissive). The PR description already flags "Verify on PostgreSQL CI" as an open item — this is the likely failure.

The correct Peewee idiom for COUNT(DISTINCT expr) is .distinct() on the column/expression:

# Pitching (simple column):
fn.COUNT(StratPlay.game.distinct()).alias("games")

# Batting (CASE expression — use SQL() wrapper if .distinct() not chainable):
fn.COUNT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None).distinct())

2. [MEDIUM] Decision-only pitchers silently excluded — behavioral regression.

The old _apply_decisions() explicitly handled pitchers who received a Decision row but had zero StratPlay rows. The new _get_player_pairs() only scans StratPlay, so a pitcher who received a win/loss/save Decision but had no plays will not appear in pitching_pairs. Their PitchingSeasonStats row will not be written for that game, silently dropping decision stats.

Fix: also scan Decision rows when building pitching_pairs:

decisions = Decision.select(Decision.pitcher, Decision.pitcher_team).where(Decision.game == game_id).tuples()
for pitcher_id, pitcher_team_id in decisions:
    pitching_pairs.add((pitcher_id, pitcher_team_id))

File: app/services/season_stats.py, lines 61–80

3. [MEDIUM] ProcessedGame.get_or_create(game_id=game_id) — verify PK FK semantics.

ProcessedGame has game = ForeignKeyField(StratGame, primary_key=True). Using game_id= as the keyword works via Peewee's FK accessor convention, but get_or_create on a PK FK field had edge cases in Peewee versions before 3.17 where created could be wrong. The entire force=True path depends on created being reliable. Worth confirming the deployed Peewee version.

File: app/services/season_stats.py, line 331

4. [LOW] Stale docstrings in evolution_evaluator.py.

Module docstring line 7: Depends on WP-07 (PlayerSeasonStats) — should reference BattingSeasonStats/PitchingSeasonStats.
_CareerTotals docstring: sp/rp: outs, k — k was renamed to strikeouts.
evaluate_card() docstring: _stats_model: Override for PlayerSeasonStats — outdated.

5. [LOW] Silent fallthrough for unknown card_type in evolution_evaluator.py.

if card_type == "batter":
    # query BattingSeasonStats
else:
    # silently treats everything else as a pitcher

Any unrecognized card_type (future types, None, typos) returns zeroed pitcher stats silently. An explicit elif/raise ValueError would surface data problems early.

Security

No issues found. All queries use parameterized Peewee ORM calls; no raw SQL constructed from user input.

Style & Conventions

Code follows project patterns. Docstrings use the "What/Why" style. No convention violations.

Test Coverage

The three new tests are well-designed and cover the self-healing scenario clearly.

Gap 1: No test for Decision-only pitchers (issue 2 above). The existing test_decision_integration always creates a StratPlay row before the Decision, keeping the pitcher in pitching_pairs.

Gap 2: test_idempotent_reprocessing calls update_season_stats(game.id, force=True) as the first call, so created=True is returned by get_or_create and the not created and force=True branch is never exercised. The test still passes because recalc is idempotent, but it doesn't cover the force re-entry path. Add a non-force call first to make the second force call actually exercise that branch.

Summary

#	Severity	File	Issue
1	High	`season_stats.py:106, 193`	`fn.COUNT(fn.DISTINCT(...))` invalid Peewee idiom — use `.distinct()`
2	Medium	`season_stats.py:61–80`	Decision-only pitchers silently dropped from recalc
3	Medium	`season_stats.py:331`	`ProcessedGame.get_or_create` PK FK `created` semantics
4	Low	`evolution_evaluator.py:7,33,67`	Stale docstring references
5	Low	`evolution_evaluator.py:148`	Silent fallthrough for unknown `card_type`

Verdict: REQUEST_CHANGES

The full-recalculation architecture is the right approach. The evaluator import fix and r.k → r.strikeouts rename are correct. Issues 1 and 2 should be resolved before merge — issue 1 will fail on PostgreSQL in production, and issue 2 is a silent behavioral regression.

Automated review by Claude PR Reviewer

## AI Code Review (automated) > Note: Gitea prevents authors from submitting formal reviews on their own PRs, so this is posted as a comment. Verdict: **REQUEST_CHANGES**. ### Files Reviewed - `app/routers_v2/season_stats.py` (modified) - `app/services/season_stats.py` (major rewrite) - `app/services/evolution_evaluator.py` (modified) - `tests/test_season_stats_update.py` (modified + 3 new tests) - `tests/test_evolution_evaluator.py` (modified) - `tests/test_evolution_models.py` (modified) - `tests/test_postgame_evolution.py` (modified) --- ### Findings #### Correctness **1. [HIGH] `fn.COUNT(fn.DISTINCT(...))` — incorrect Peewee idiom, likely SQL error on PostgreSQL.** `_recalc_batting()` (line 106) uses: ```python fn.COUNT(fn.DISTINCT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None))) ``` `_recalc_pitching()` (line 193) uses: ```python fn.COUNT(fn.DISTINCT(StratPlay.game)).alias("games") ``` `fn.DISTINCT` is not a SQL function — it is a Peewee modifier. Wrapping it inside `fn.COUNT()` may emit `COUNT(DISTINCT(...))` with extra parentheses that is a syntax error in PostgreSQL (SQLite is more permissive). The PR description already flags "Verify on PostgreSQL CI" as an open item — this is the likely failure. The correct Peewee idiom for `COUNT(DISTINCT expr)` is `.distinct()` on the column/expression: ```python # Pitching (simple column): fn.COUNT(StratPlay.game.distinct()).alias("games") # Batting (CASE expression — use SQL() wrapper if .distinct() not chainable): fn.COUNT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None).distinct()) ``` --- **2. [MEDIUM] Decision-only pitchers silently excluded — behavioral regression.** The old `_apply_decisions()` explicitly handled pitchers who received a `Decision` row but had zero `StratPlay` rows. The new `_get_player_pairs()` only scans `StratPlay`, so a pitcher who received a win/loss/save Decision but had no plays will not appear in `pitching_pairs`. Their `PitchingSeasonStats` row will not be written for that game, silently dropping decision stats. Fix: also scan `Decision` rows when building `pitching_pairs`: ```python decisions = Decision.select(Decision.pitcher, Decision.pitcher_team).where(Decision.game == game_id).tuples() for pitcher_id, pitcher_team_id in decisions: pitching_pairs.add((pitcher_id, pitcher_team_id)) ``` **File:** `app/services/season_stats.py`, lines 61–80 --- **3. [MEDIUM] `ProcessedGame.get_or_create(game_id=game_id)` — verify PK FK semantics.** `ProcessedGame` has `game = ForeignKeyField(StratGame, primary_key=True)`. Using `game_id=` as the keyword works via Peewee's FK accessor convention, but `get_or_create` on a PK FK field had edge cases in Peewee versions before 3.17 where `created` could be wrong. The entire force=True path depends on `created` being reliable. Worth confirming the deployed Peewee version. **File:** `app/services/season_stats.py`, line 331 --- **4. [LOW] Stale docstrings in `evolution_evaluator.py`.** - Module docstring line 7: `Depends on WP-07 (PlayerSeasonStats)` — should reference `BattingSeasonStats`/`PitchingSeasonStats`. - `_CareerTotals` docstring: `sp/rp: outs, k` — `k` was renamed to `strikeouts`. - `evaluate_card()` docstring: `_stats_model: Override for PlayerSeasonStats` — outdated. --- **5. [LOW] Silent fallthrough for unknown `card_type` in `evolution_evaluator.py`.** ```python if card_type == "batter": # query BattingSeasonStats else: # silently treats everything else as a pitcher ``` Any unrecognized `card_type` (future types, `None`, typos) returns zeroed pitcher stats silently. An explicit `elif`/`raise ValueError` would surface data problems early. --- #### Security No issues found. All queries use parameterized Peewee ORM calls; no raw SQL constructed from user input. #### Style & Conventions Code follows project patterns. Docstrings use the "What/Why" style. No convention violations. --- #### Test Coverage The three new tests are well-designed and cover the self-healing scenario clearly. **Gap 1:** No test for Decision-only pitchers (issue 2 above). The existing `test_decision_integration` always creates a StratPlay row before the Decision, keeping the pitcher in `pitching_pairs`. **Gap 2:** `test_idempotent_reprocessing` calls `update_season_stats(game.id, force=True)` as the *first* call, so `created=True` is returned by `get_or_create` and the `not created and force=True` branch is never exercised. The test still passes because recalc is idempotent, but it doesn't cover the force re-entry path. Add a non-force call first to make the second force call actually exercise that branch. --- ### Summary | # | Severity | File | Issue | |---|----------|------|-------| | 1 | High | `season_stats.py:106, 193` | `fn.COUNT(fn.DISTINCT(...))` invalid Peewee idiom — use `.distinct()` | | 2 | Medium | `season_stats.py:61–80` | Decision-only pitchers silently dropped from recalc | | 3 | Medium | `season_stats.py:331` | `ProcessedGame.get_or_create` PK FK `created` semantics | | 4 | Low | `evolution_evaluator.py:7,33,67` | Stale docstring references | | 5 | Low | `evolution_evaluator.py:148` | Silent fallthrough for unknown `card_type` | ### Verdict: REQUEST_CHANGES The full-recalculation architecture is the right approach. The evaluator import fix and `r.k` → `r.strikeouts` rename are correct. Issues 1 and 2 should be resolved before merge — issue 1 will fail on PostgreSQL in production, and issue 2 is a silent behavioral regression. --- *Automated review by Claude PR Reviewer*

cal added 1 commit 2026-03-19 15:23:56 +00:00

fix: address PR review — correct Peewee DISTINCT syntax and Decision-only pitchers 4211bd69e0

- fn.COUNT(fn.DISTINCT(expr)) → fn.COUNT(expr.distinct()) for correct
  COUNT(DISTINCT ...) SQL on PostgreSQL
- _get_player_pairs() now also scans Decision table to include pitchers
  who have a Decision row but no StratPlay rows (rare edge case)
- Updated stale docstring references to PlayerSeasonStats and r.k

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cal reviewed 2026-03-19 15:28:21 +00:00

cal left a comment

AI Code Review (Round 2)

Files Reviewed

app/services/season_stats.py (major rewrite)
app/routers_v2/season_stats.py (modified)
app/services/evolution_evaluator.py (modified)
tests/test_season_stats_update.py (modified + new tests)
tests/test_evolution_evaluator.py (modified)
tests/test_evolution_models.py (modified)
tests/test_postgame_evolution.py (modified)

Prior Issues — Verification

Issue 1: `fn.COUNT(fn.DISTINCT(...))` — FIXED

The previous review flagged incorrect Peewee syntax. The fix is correctly applied in two different ways depending on the context:

_recalc_batting (line 118–120): Uses Case(...).distinct() — the DISTINCT modifier is placed on the Case expression before it is passed to COUNT. This is the correct Peewee idiom for COUNT(DISTINCT CASE WHEN pa > 0 THEN game_id END). The approach is sound and the docstring explains the NULL-ignoring behaviour correctly.
_recalc_pitching (line 205): Uses fn.COUNT(StratPlay.game.distinct()) — the .distinct() call on a column field is valid Peewee syntax for COUNT(DISTINCT game_id). This is correct.

Both fixes are verified correct.

Issue 2: Decision-only pitchers dropped — FIXED

_get_player_pairs() now performs a second query against the Decision table (lines 84–90) and unions those (pitcher_id, pitcher_team_id) pairs into pitching_pairs. Pitchers credited with a Decision but having no StratPlay rows will now correctly receive a PitchingSeasonStats row populated entirely from _recalc_decisions() (with zeroed play stats from _recalc_pitching() returning an empty row).

Both prior issues are properly resolved.

Findings

Correctness

_recalc_decisions season filter vs. Decision model schema.
The Decision model has a season = IntegerField() column (confirmed in db_engine.py line 1026), so filtering Decision.season == season in _recalc_decisions is correct. The query is not joining through StratGame to derive season, which is intentional — Decision rows carry their own denormalized season field.

ProcessedGame.get_or_create(game_id=game_id).
ProcessedGame defines game = ForeignKeyField(StratGame, primary_key=True). Peewee exposes the raw integer column as game_id on FK fields, so get_or_create(game_id=game_id) correctly targets the primary key column. This is consistent with how the old code worked and is not a regression.

last_game_id vs. last_game.
The service assigns obj.last_game_id = game_id (lines 401, 434). BattingSeasonStats and PitchingSeasonStats define last_game = ForeignKeyField(StratGame, null=True), so Peewee exposes this as last_game_id. The assignment is correct.

Zero-plays edge case.
When a decision-only pitcher has no StratPlay rows, _recalc_pitching will return an aggregate row where all SUM columns are NULL (since there are no matching rows, the aggregate query returns one row with all NULLs, not zero rows). The row.get("field") or 0 pattern correctly converts both None and 0 to 0. All fields will be zero, and the Decision-sourced fields will be populated correctly.

Stale docstring in evolution_evaluator.py.
The _stats_model parameter description in evaluate_card()'s docstring (line 68) still reads "Override for PlayerSeasonStats" — this should now reference the split-model approach. Documentation only; no runtime impact.

Security

No issues. All queries use Peewee parameterized ORM calls. The force parameter is a bool, validated by FastAPI before reaching the service.

Style & Conventions

The code follows the project's established Peewee patterns throughout. The removal of the DATABASE_TYPE branch (replacing the SQLite/PostgreSQL split with a single get_or_create + save() path) is a clear net improvement in maintainability. Module-level docstring accurately describes the new strategy.

Concurrent Access

The docstring claims "no risk of concurrent-write skew between games" because writes are replacements rather than increments. This is correct within a single player's recalculation. However, if two game-completion events for the same player arrive simultaneously and both pass the ProcessedGame ledger check (both games are new), they will compute at overlapping times: whichever save() runs last wins. Under the old incremental approach the same race would produce double-counted deltas; under the new approach the worst case is one game's stats missing from the row until a force=True reprocess. The new approach is strictly safer, and the self-healing mechanism exists to recover. This is acceptable but worth noting operationally.

Test Coverage

The three new tests are well-constructed and cover the critical new behaviours:

test_force_recalc: Confirms force=True does not double stats when data is unchanged.
test_idempotent_reprocessing: Confirms two consecutive force=True calls produce identical output.
test_partial_reprocessing_heals: Confirms corrupted rows are corrected by force=True. This is the most valuable self-healing test.

One minor gap: there is no test for the decision-only pitcher path added in _get_player_pairs(). A test that creates a Decision row for a pitcher with zero StratPlay rows and verifies the PitchingSeasonStats row is created with correct decision counts would round out coverage for that edge case.

Suggestions (non-blocking)

Stale docstring: Update the _stats_model arg description in evaluate_card() from "Override for PlayerSeasonStats" to reflect the split-model approach.
Decision-only pitcher test: Consider adding coverage for a pitcher who appears only in Decision with no StratPlay rows — the docstring calls this out as an edge case but it is not currently exercised in the test suite.
CI note acknowledged: The PR body flags a PostgreSQL CI gap for Decision.is_start.cast("INTEGER"). The cast approach is correct — SQLite treats booleans as 0/1 and SUM works without the cast; PostgreSQL requires it. The code handles both correctly.

Verdict: APPROVED

Both prior issues are correctly fixed. The Case(...).distinct() syntax for batting games count is valid Peewee; fn.COUNT(field.distinct()) for pitching games count is also valid. Decision-only pitchers are now included via the Decision table scan in _get_player_pairs. The full-recalculation approach is logically sound, the upsert strategy is correct for both databases, and the three new tests adequately cover the force/idempotency/self-healing behaviours. The stale docstring and missing decision-only pitcher test are minor and do not block merging.

Automated review by Claude PR Reviewer

## AI Code Review (Round 2) ### Files Reviewed - `app/services/season_stats.py` (major rewrite) - `app/routers_v2/season_stats.py` (modified) - `app/services/evolution_evaluator.py` (modified) - `tests/test_season_stats_update.py` (modified + new tests) - `tests/test_evolution_evaluator.py` (modified) - `tests/test_evolution_models.py` (modified) - `tests/test_postgame_evolution.py` (modified) --- ### Prior Issues — Verification #### Issue 1: `fn.COUNT(fn.DISTINCT(...))` — FIXED The previous review flagged incorrect Peewee syntax. The fix is correctly applied in two different ways depending on the context: - **`_recalc_batting` (line 118–120):** Uses `Case(...).distinct()` — the `DISTINCT` modifier is placed on the `Case` expression before it is passed to `COUNT`. This is the correct Peewee idiom for `COUNT(DISTINCT CASE WHEN pa > 0 THEN game_id END)`. The approach is sound and the docstring explains the NULL-ignoring behaviour correctly. - **`_recalc_pitching` (line 205):** Uses `fn.COUNT(StratPlay.game.distinct())` — the `.distinct()` call on a column field is valid Peewee syntax for `COUNT(DISTINCT game_id)`. This is correct. Both fixes are verified correct. #### Issue 2: Decision-only pitchers dropped — FIXED `_get_player_pairs()` now performs a second query against the `Decision` table (lines 84–90) and unions those `(pitcher_id, pitcher_team_id)` pairs into `pitching_pairs`. Pitchers credited with a Decision but having no StratPlay rows will now correctly receive a `PitchingSeasonStats` row populated entirely from `_recalc_decisions()` (with zeroed play stats from `_recalc_pitching()` returning an empty row). Both prior issues are properly resolved. --- ### Findings #### Correctness **`_recalc_decisions` season filter vs. Decision model schema.** The `Decision` model has a `season = IntegerField()` column (confirmed in `db_engine.py` line 1026), so filtering `Decision.season == season` in `_recalc_decisions` is correct. The query is not joining through `StratGame` to derive season, which is intentional — Decision rows carry their own denormalized `season` field. **`ProcessedGame.get_or_create(game_id=game_id)`.** `ProcessedGame` defines `game = ForeignKeyField(StratGame, primary_key=True)`. Peewee exposes the raw integer column as `game_id` on FK fields, so `get_or_create(game_id=game_id)` correctly targets the primary key column. This is consistent with how the old code worked and is not a regression. **`last_game_id` vs. `last_game`.** The service assigns `obj.last_game_id = game_id` (lines 401, 434). `BattingSeasonStats` and `PitchingSeasonStats` define `last_game = ForeignKeyField(StratGame, null=True)`, so Peewee exposes this as `last_game_id`. The assignment is correct. **Zero-plays edge case.** When a decision-only pitcher has no StratPlay rows, `_recalc_pitching` will return an aggregate row where all SUM columns are NULL (since there are no matching rows, the aggregate query returns one row with all NULLs, not zero rows). The `row.get("field") or 0` pattern correctly converts both `None` and `0` to `0`. All fields will be zero, and the Decision-sourced fields will be populated correctly. **Stale docstring in `evolution_evaluator.py`.** The `_stats_model` parameter description in `evaluate_card()`'s docstring (line 68) still reads "Override for PlayerSeasonStats" — this should now reference the split-model approach. Documentation only; no runtime impact. #### Security No issues. All queries use Peewee parameterized ORM calls. The `force` parameter is a bool, validated by FastAPI before reaching the service. #### Style & Conventions The code follows the project's established Peewee patterns throughout. The removal of the `DATABASE_TYPE` branch (replacing the SQLite/PostgreSQL split with a single `get_or_create + save()` path) is a clear net improvement in maintainability. Module-level docstring accurately describes the new strategy. #### Concurrent Access The docstring claims "no risk of concurrent-write skew between games" because writes are replacements rather than increments. This is correct within a single player's recalculation. However, if two game-completion events for the same player arrive simultaneously and both pass the `ProcessedGame` ledger check (both games are new), they will compute at overlapping times: whichever `save()` runs last wins. Under the old incremental approach the same race would produce double-counted deltas; under the new approach the worst case is one game's stats missing from the row until a `force=True` reprocess. The new approach is strictly safer, and the self-healing mechanism exists to recover. This is acceptable but worth noting operationally. #### Test Coverage The three new tests are well-constructed and cover the critical new behaviours: - `test_force_recalc`: Confirms `force=True` does not double stats when data is unchanged. - `test_idempotent_reprocessing`: Confirms two consecutive `force=True` calls produce identical output. - `test_partial_reprocessing_heals`: Confirms corrupted rows are corrected by `force=True`. This is the most valuable self-healing test. One minor gap: there is no test for the decision-only pitcher path added in `_get_player_pairs()`. A test that creates a `Decision` row for a pitcher with zero `StratPlay` rows and verifies the `PitchingSeasonStats` row is created with correct decision counts would round out coverage for that edge case. --- ### Suggestions (non-blocking) - **Stale docstring:** Update the `_stats_model` arg description in `evaluate_card()` from "Override for PlayerSeasonStats" to reflect the split-model approach. - **Decision-only pitcher test:** Consider adding coverage for a pitcher who appears only in `Decision` with no `StratPlay` rows — the docstring calls this out as an edge case but it is not currently exercised in the test suite. - **CI note acknowledged:** The PR body flags a PostgreSQL CI gap for `Decision.is_start.cast("INTEGER")`. The cast approach is correct — SQLite treats booleans as 0/1 and SUM works without the cast; PostgreSQL requires it. The code handles both correctly. --- ### Verdict: APPROVED Both prior issues are correctly fixed. The `Case(...).distinct()` syntax for batting games count is valid Peewee; `fn.COUNT(field.distinct())` for pitching games count is also valid. Decision-only pitchers are now included via the `Decision` table scan in `_get_player_pairs`. The full-recalculation approach is logically sound, the upsert strategy is correct for both databases, and the three new tests adequately cover the force/idempotency/self-healing behaviours. The stale docstring and missing decision-only pitcher test are minor and do not block merging. --- *Automated review by Claude PR Reviewer*

cal added 1 commit 2026-03-19 15:31:19 +00:00

fix: stale docstring + add decision-only pitcher test 46c85e6874

- evaluate_card() docstring: "Override for PlayerSeasonStats" →
  "Override for BattingSeasonStats/PitchingSeasonStats"
- New test_decision_only_pitcher: exercises the edge case where a pitcher
  has a Decision row but no StratPlay rows, verifying _get_player_pairs()
  correctly includes them via the Decision table scan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cal reviewed 2026-03-19 15:33:22 +00:00

cal left a comment

AI Code Review (Round 3)

Files Reviewed

app/services/season_stats.py (major rewrite)
app/services/evolution_evaluator.py (modified)
app/routers_v2/season_stats.py (modified)
tests/test_season_stats_update.py (modified + 4 new tests)
tests/test_evolution_evaluator.py (modified)
tests/test_evolution_models.py (modified)
tests/test_postgame_evolution.py (modified)

Findings

Correctness

No issues found. The full-recalculation approach is correctly implemented:

_get_player_pairs() properly covers the decision-only pitcher edge case by unioning StratPlay and Decision sources, directly addressing the Round 1 concern.
_recalc_batting() uses fn.COUNT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None).distinct()) — the .distinct() call is on the Case expression result, which is the correct pattern for counting distinct non-NULL game IDs where pa > 0.
_recalc_decisions() uses Decision.is_start.cast("INTEGER") before summing, correctly handling the boolean-to-integer arithmetic difference between SQLite and PostgreSQL.
The force=True path skips the early return without touching the ProcessedGame ledger entry — correct, since the ledger only needs to exist once per game_id.
evaluate_card() now branches on card_state.track.card_type and queries the appropriate model (BattingSeasonStats vs PitchingSeasonStats), fully resolving the broken import from Round 2.

Security

No issues found. The force parameter is a boolean query param behind the existing bearer-token auth guard.

Style & Conventions

Two stale docstring fragments remain. Neither is a correctness issue, but they contradict the actual behavior:

app/routers_v2/season_stats.py, module docstring (lines 7–8, 11) and endpoint docstring (lines 32–35) — Still reads "performs an additive upsert into player_season_stats", "Guards against double-counting via the last_game FK check". Carried over from the old delta approach.
app/services/evolution_evaluator.py, evaluate_card() function summary (line 56–57) — Still opens with "Sums all player_season_stats rows for (player_id, team_id) across all seasons". The module-level docstring and parameter docstring were updated correctly; this one sentence was missed.

Neither warrants blocking the merge.

Test Coverage

All four new tests are well-structured and cover the claimed behavior:

test_force_recalc — proves replacement upsert does not double stats
test_idempotent_reprocessing — proves two consecutive force=True calls are stable
test_partial_reprocessing_heals — proves self-healing from corrupted row state
test_decision_only_pitcher — covers the rare edge case addressed in _get_player_pairs(); correctly asserts stats.games == 0 (no StratPlay rows means COUNT(DISTINCT game) = 0), documenting the intentional design

Both Round 2 non-blocking notes are addressed: the stale "PlayerSeasonStats" reference in evaluate_card() is updated at the module and parameter level, and test_decision_only_pitcher is present and correct.

Verdict: APPROVED

The implementation is correct, the force-recalc / self-healing path is well-exercised, and there are no blocking issues. The two remaining stale docstring fragments are cosmetic follow-up cleanup only. Ready to merge.

Automated review by Claude PR Reviewer

## AI Code Review (Round 3) ### Files Reviewed - `app/services/season_stats.py` (major rewrite) - `app/services/evolution_evaluator.py` (modified) - `app/routers_v2/season_stats.py` (modified) - `tests/test_season_stats_update.py` (modified + 4 new tests) - `tests/test_evolution_evaluator.py` (modified) - `tests/test_evolution_models.py` (modified) - `tests/test_postgame_evolution.py` (modified) ### Findings #### Correctness No issues found. The full-recalculation approach is correctly implemented: - `_get_player_pairs()` properly covers the decision-only pitcher edge case by unioning StratPlay and Decision sources, directly addressing the Round 1 concern. - `_recalc_batting()` uses `fn.COUNT(Case(None, [(StratPlay.pa > 0, StratPlay.game)], None).distinct())` — the `.distinct()` call is on the `Case` expression result, which is the correct pattern for counting distinct non-NULL game IDs where pa > 0. - `_recalc_decisions()` uses `Decision.is_start.cast("INTEGER")` before summing, correctly handling the boolean-to-integer arithmetic difference between SQLite and PostgreSQL. - The `force=True` path skips the early return without touching the ProcessedGame ledger entry — correct, since the ledger only needs to exist once per game_id. - `evaluate_card()` now branches on `card_state.track.card_type` and queries the appropriate model (`BattingSeasonStats` vs `PitchingSeasonStats`), fully resolving the broken import from Round 2. #### Security No issues found. The `force` parameter is a boolean query param behind the existing bearer-token auth guard. #### Style & Conventions Two stale docstring fragments remain. Neither is a correctness issue, but they contradict the actual behavior: 1. **`app/routers_v2/season_stats.py`, module docstring (lines 7–8, 11) and endpoint docstring (lines 32–35)** — Still reads "performs an additive upsert into player_season_stats", "Guards against double-counting via the last_game FK check". Carried over from the old delta approach. 2. **`app/services/evolution_evaluator.py`, `evaluate_card()` function summary (line 56–57)** — Still opens with "Sums all player_season_stats rows for (player_id, team_id) across all seasons". The module-level docstring and parameter docstring were updated correctly; this one sentence was missed. Neither warrants blocking the merge. #### Test Coverage All four new tests are well-structured and cover the claimed behavior: - `test_force_recalc` — proves replacement upsert does not double stats - `test_idempotent_reprocessing` — proves two consecutive `force=True` calls are stable - `test_partial_reprocessing_heals` — proves self-healing from corrupted row state - `test_decision_only_pitcher` — covers the rare edge case addressed in `_get_player_pairs()`; correctly asserts `stats.games == 0` (no StratPlay rows means COUNT(DISTINCT game) = 0), documenting the intentional design Both Round 2 non-blocking notes are addressed: the stale "PlayerSeasonStats" reference in `evaluate_card()` is updated at the module and parameter level, and `test_decision_only_pitcher` is present and correct. ### Verdict: APPROVED The implementation is correct, the force-recalc / self-healing path is well-exercised, and there are no blocking issues. The two remaining stale docstring fragments are cosmetic follow-up cleanup only. Ready to merge. --- *Automated review by Claude PR Reviewer*

cal added 1 commit 2026-03-19 15:34:58 +00:00

docs: update stale docstrings to reflect full-recalculation approach d10276525e

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cal merged commit b68e280c64 into card-evolution

2026-03-19 15:35:43 +00:00

cal deleted branch feature/season-stats-full-recalc

2026-03-19 15:35:44 +00:00

cal referenced this issue from a commit

2026-03-19 15:35:45 +00:00

Merge pull request 'refactor: full recalculation for season stats + fix evaluator bugs' (#111) from feature/season-stats-full-recalc into card-evolution

cal referenced this pull request

2026-03-19 15:39:53 +00:00

Card Evolution: season stats full recalculation → next-release #112

Sign in to join this conversation.

No reviewers

No Milestone

No project

No Assignees

1 Participants

Notifications

Due Date

The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: cal/paper-dynasty-database#111

refactor: full recalculation for season stats + fix evaluator bugs #111

Summary

Key Changes

Test plan

AI Code Review (automated)

Files Reviewed

Findings

Correctness

Security

Style & Conventions

Test Coverage

Summary

Verdict: REQUEST_CHANGES

AI Code Review (Round 2)

Files Reviewed

Prior Issues — Verification

Issue 1: fn.COUNT(fn.DISTINCT(...)) — FIXED

Issue 2: Decision-only pitchers dropped — FIXED

Findings

Correctness

Security

Style & Conventions

Concurrent Access

Test Coverage

Suggestions (non-blocking)

Verdict: APPROVED

AI Code Review (Round 3)

Files Reviewed

Findings

Correctness

Security

Style & Conventions

Test Coverage

Verdict: APPROVED

Issue 1: `fn.COUNT(fn.DISTINCT(...))` — FIXED