feat(WP-13): post-game callback endpoints for season stats and evolution #109

cal · 2026-03-18T20:51:03Z

cal commented

2026-03-18 20:51:03 +00:00

Closes #78

Summary

Implements the two post-game callback API endpoints described in WP-13.

POST /api/v2/season-stats/update-game/{game_id} — delegates to the existing update_season_stats() service (WP-05) and returns {"updated": N, "skipped": bool}. The service's ProcessedGame ledger enforces full idempotency against re-delivery.
POST /api/v2/evolution/evaluate-game/{game_id} — collects all unique (player_id, team_id) pairs from the game's StratPlay rows, calls evaluate_card() for each pair that has an EvolutionCardState, and returns {"evaluated": N, "tier_ups": [...]} with player name, old/new tier, current value, and track name for any tier advancements.

New files

File	Purpose
`app/services/evolution_evaluator.py`	`evaluate_card()` service (WP-08) — sums career PlayerSeasonStats, delegates formula computation to the formula engine, updates EvolutionCardState with no-regression guarantee
`tests/test_postgame_evolution.py`	10 integration tests covering the full end-to-end path via HTTP TestClient

Changed files

File	Change
`app/routers_v2/season_stats.py`	Rewritten to delegate to `update_season_stats()` service instead of raw SQL
`app/routers_v2/evolution.py`	Added `POST /evaluate-game/{game_id}` endpoint
`app/main.py`	Registered `season_stats` router

Test plan

test_update_game_creates_season_stats_rows — POST creates player_season_stats rows
test_update_game_response_shape — response shape {"updated": N, "skipped": false}
test_update_game_idempotent — second call returns skipped: true, stats unchanged
test_evaluate_game_increases_current_value — current_value increases after stats + eval
test_evaluate_game_tier_advancement — tier advances when threshold crossed
test_evaluate_game_no_tier_advancement — tier stays at 0 when insufficient stats
test_evaluate_game_tier_ups_in_response — tier_ups list contains correct fields
test_evaluate_game_skips_players_without_state — missing states silently skipped
test_auth_required_update_game — 401 without bearer token
test_auth_required_evaluate_game — 401 without bearer token

All 81 tests pass, 5 PostgreSQL integration tests skipped (no POSTGRES_HOST).

Design notes

Thread-safe SQLite for tests: The TestClient runs route handlers in a separate thread. SQLite :memory: databases are per-connection, so a new thread would get an empty DB. The test suite uses a named shared-cache URI (file:wp13test?mode=memory&cache=shared) with a db_middleware that calls connect(reuse_if_open=True) so the handler thread joins the same in-memory database as the fixture thread.

Non-fatal batch evaluation: Individual evaluate_card() failures are logged but do not abort the batch. The endpoint always returns a summary even if some players fail.

tier_ups shape:

{
  "evaluated": 24,
  "tier_ups": [
    {
      "player_id": 42,
      "team_id": 7,
      "player_name": "Mike Trout",
      "old_tier": 1,
      "new_tier": 2,
      "current_value": 160.5,
      "track_name": "Batter Track"
    }
  ]
}

Closes #78 ## Summary Implements the two post-game callback API endpoints described in WP-13. - `POST /api/v2/season-stats/update-game/{game_id}` — delegates to the existing `update_season_stats()` service (WP-05) and returns `{"updated": N, "skipped": bool}`. The service's ProcessedGame ledger enforces full idempotency against re-delivery. - `POST /api/v2/evolution/evaluate-game/{game_id}` — collects all unique `(player_id, team_id)` pairs from the game's StratPlay rows, calls `evaluate_card()` for each pair that has an EvolutionCardState, and returns `{"evaluated": N, "tier_ups": [...]}` with player name, old/new tier, current value, and track name for any tier advancements. ## New files | File | Purpose | |---|---| | `app/services/evolution_evaluator.py` | `evaluate_card()` service (WP-08) — sums career PlayerSeasonStats, delegates formula computation to the formula engine, updates EvolutionCardState with no-regression guarantee | | `tests/test_postgame_evolution.py` | 10 integration tests covering the full end-to-end path via HTTP TestClient | ## Changed files | File | Change | |---|---| | `app/routers_v2/season_stats.py` | Rewritten to delegate to `update_season_stats()` service instead of raw SQL | | `app/routers_v2/evolution.py` | Added `POST /evaluate-game/{game_id}` endpoint | | `app/main.py` | Registered `season_stats` router | ## Test plan - [x] `test_update_game_creates_season_stats_rows` — POST creates player_season_stats rows - [x] `test_update_game_response_shape` — response shape `{"updated": N, "skipped": false}` - [x] `test_update_game_idempotent` — second call returns `skipped: true`, stats unchanged - [x] `test_evaluate_game_increases_current_value` — current_value increases after stats + eval - [x] `test_evaluate_game_tier_advancement` — tier advances when threshold crossed - [x] `test_evaluate_game_no_tier_advancement` — tier stays at 0 when insufficient stats - [x] `test_evaluate_game_tier_ups_in_response` — tier_ups list contains correct fields - [x] `test_evaluate_game_skips_players_without_state` — missing states silently skipped - [x] `test_auth_required_update_game` — 401 without bearer token - [x] `test_auth_required_evaluate_game` — 401 without bearer token All 81 tests pass, 5 PostgreSQL integration tests skipped (no POSTGRES_HOST). ## Design notes **Thread-safe SQLite for tests:** The TestClient runs route handlers in a separate thread. SQLite `:memory:` databases are per-connection, so a new thread would get an empty DB. The test suite uses a named shared-cache URI (`file:wp13test?mode=memory&cache=shared`) with a db_middleware that calls `connect(reuse_if_open=True)` so the handler thread joins the same in-memory database as the fixture thread. **Non-fatal batch evaluation:** Individual evaluate_card() failures are logged but do not abort the batch. The endpoint always returns a summary even if some players fail. **tier_ups shape:** ```json { "evaluated": 24, "tier_ups": [ { "player_id": 42, "team_id": 7, "player_name": "Mike Trout", "old_tier": 1, "new_tier": 2, "current_value": 160.5, "track_name": "Batter Track" } ] } ```

cal added 1 commit 2026-03-18 20:51:04 +00:00

feat(WP-13): post-game callback endpoints for season stats and evolution b151923480

Implements two new API endpoints the bot calls after a game completes:

  POST /api/v2/season-stats/update-game/{game_id}
    Delegates to update_season_stats() service (WP-05). Returns
    {"updated": N, "skipped": bool} with idempotency via ProcessedGame ledger.

  POST /api/v2/evolution/evaluate-game/{game_id}
    Finds all (player_id, team_id) pairs from the game's StratPlay rows,
    calls evaluate_card() for each pair that has an EvolutionCardState,
    and returns {"evaluated": N, "tier_ups": [...]} with full tier-up detail.

New files:
  app/services/evolution_evaluator.py — evaluate_card() service (WP-08)
  tests/test_postgame_evolution.py    — 10 integration tests (all pass)

Modified files:
  app/routers_v2/season_stats.py — rewritten to delegate to the service
  app/routers_v2/evolution.py    — evaluate-game endpoint added
  app/main.py                    — season_stats router registered

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cal reviewed 2026-03-18 20:55:14 +00:00

cal left a comment

AI Code Review

Files Reviewed

app/main.py (modified)
app/routers_v2/season_stats.py (modified)
app/routers_v2/evolution.py (modified)
app/services/evolution_evaluator.py (added)
tests/test_postgame_evolution.py (added)

Also reviewed as context: app/services/season_stats.py, app/services/formula_engine.py, app/db_engine.py, app/dependencies.py.

Findings

Correctness

season-stats endpoint (season_stats.py): Correctly delegates to update_season_stats() rather than raw SQL. The response assembly correctly sums batters_updated and pitchers_updated from the service result dict and passes through skipped. The service's last_game FK guard handles idempotency. The documented limitation (out-of-order re-delivery bypasses the guard) is honestly disclosed in both the service docstring and the issue reference (#105).

evaluate-game endpoint (evolution.py): Correctly collects unique (player_id, team_id) pairs from StratPlay rows by checking both batter_id and pitcher_id. The pre-fetch of EvolutionCardState before calling evaluate_card() is the correct skip gate — pairs without a state row are skipped without error. old_tier is captured before the call, new_tier comes from the returned dict, and the comparison new_tier > old_tier correctly identifies tier advancements.

evaluate_card service (evolution_evaluator.py): Career totals are correctly summed across all seasons for the (player_id, team_id) pair. The no-regression guarantee (max(current_tier, new_tier)) is correctly implemented. fully_evolved is set from the post-max tier value (card_state.current_tier >= 4), not from new_tier, which is correct. last_evaluated_at is set unconditionally on every call.

Field naming in evaluate_card(): The service uses _state_model.player_id == player_id and _state_model.team_id == team_id in where-clauses. In Peewee, FK fields expose both .player (the FK accessor) and .player_id (the raw integer column). Both forms are valid and this usage is consistent with the rest of the codebase (e.g. _upsert_sqlite in season_stats.py uses player_id= and team_id= in get_or_create). No bug here.

_CareerTotals.strikeouts mapping: PlayerSeasonStats stores pitcher strikeouts as k, and _CareerTotals maps r.k to self.strikeouts. The formula engine's PitcherStats Protocol expects strikeouts, so the translation is correct and well-commented.

main.py router registration: season_stats is added to both the import tuple and the app.include_router() call. Correct.

Response shapes: Both endpoints return the shapes specified in the PR description and verified against the spec.

Security

Both endpoints guard with valid_token(token) before any DB access, consistent with every other authenticated endpoint in the codebase. valid_token uses hmac.compare_digest (timing-safe). No injection risk — all queries use Peewee ORM, no raw SQL is introduced by this PR. No secrets or credentials in new code.

The error detail on 500 responses (f"Season stats update failed for game {game_id}: {exc}") leaks the exception message to the caller. This is a pre-existing pattern in this codebase and is acceptable for an internal bot API.

Style & Conventions

All new code follows the established patterns: lazy imports inside route handlers, logging.getLogger(__name__) at module level, auth check at the top of each handler, and structured log messages with %s format strings rather than f-strings. The noqa: PLC0415 suppression for lazy imports inside functions is used correctly and consistently with the rest of the codebase.

The # noqa: E402 additions to main.py imports are cosmetic cleanup and harmless.

Edge Cases

Game with no plays: evaluate-game with a game_id that has zero StratPlay rows returns {"evaluated": 0, "tier_ups": []} with 200. Correct.
Game with no EvolutionCardState for any player: Same result. Correct.
Individual evaluate_card() failure: Logged at ERROR level, skipped, batch continues. The failed player is not counted in evaluated. Correct non-fatal behavior.
Player name lookup failure in tier-up path: Caught separately; falls back to "player_{player_id}" string. This prevents a tier-up from being silently dropped if the Player row is missing.

Suggestions (non-blocking)

Missing test for evaluate-game with a non-existent game_id: The endpoint would return {"evaluated": 0, "tier_ups": []} because StratPlay.select().where(StratPlay.game == game_id) returns an empty queryset for an unknown game. This is reasonable behavior, but an explicit test would lock in the contract.
fully_evolved threshold hardcoded to 4: card_state.current_tier >= 4 assumes T4 is always the max tier. If the track system ever gains a T5, this will silently mis-classify. Consider deriving it from the track definition, or at minimum adding a comment explaining the invariant.
Test db patch is permanent for the module: _season_stats_module.db = _wp13_db is set once at import time and never restored. This is fine for an isolated test module, but a monkeypatch fixture would be safer if this module ever runs alongside others that depend on the real db reference.

Verdict: APPROVED

The implementation is correct and complete. Both endpoints delegate properly to their respective service layers, auth is enforced on both routes, idempotency is preserved through the existing last_game guard, tier-up detection is logically sound, and non-fatal batch evaluation is correctly implemented. The 10-test suite covers the critical paths including idempotency, tier advancement, no-advancement, skip-without-state, and auth. The suggestions above are minor and do not block merging.

Automated review by Claude PR Reviewer

## AI Code Review ### Files Reviewed - `app/main.py` (modified) - `app/routers_v2/season_stats.py` (modified) - `app/routers_v2/evolution.py` (modified) - `app/services/evolution_evaluator.py` (added) - `tests/test_postgame_evolution.py` (added) Also reviewed as context: `app/services/season_stats.py`, `app/services/formula_engine.py`, `app/db_engine.py`, `app/dependencies.py`. --- ### Findings #### Correctness **season-stats endpoint** (`season_stats.py`): Correctly delegates to `update_season_stats()` rather than raw SQL. The response assembly correctly sums `batters_updated` and `pitchers_updated` from the service result dict and passes through `skipped`. The service's `last_game` FK guard handles idempotency. The documented limitation (out-of-order re-delivery bypasses the guard) is honestly disclosed in both the service docstring and the issue reference (#105). **evaluate-game endpoint** (`evolution.py`): Correctly collects unique `(player_id, team_id)` pairs from `StratPlay` rows by checking both `batter_id` and `pitcher_id`. The pre-fetch of `EvolutionCardState` before calling `evaluate_card()` is the correct skip gate — pairs without a state row are skipped without error. `old_tier` is captured before the call, `new_tier` comes from the returned dict, and the comparison `new_tier > old_tier` correctly identifies tier advancements. **evaluate_card service** (`evolution_evaluator.py`): Career totals are correctly summed across all seasons for the `(player_id, team_id)` pair. The no-regression guarantee (`max(current_tier, new_tier)`) is correctly implemented. `fully_evolved` is set from the post-max tier value (`card_state.current_tier >= 4`), not from `new_tier`, which is correct. `last_evaluated_at` is set unconditionally on every call. **Field naming in `evaluate_card()`**: The service uses `_state_model.player_id == player_id` and `_state_model.team_id == team_id` in where-clauses. In Peewee, FK fields expose both `.player` (the FK accessor) and `.player_id` (the raw integer column). Both forms are valid and this usage is consistent with the rest of the codebase (e.g. `_upsert_sqlite` in `season_stats.py` uses `player_id=` and `team_id=` in `get_or_create`). No bug here. **`_CareerTotals.strikeouts` mapping**: `PlayerSeasonStats` stores pitcher strikeouts as `k`, and `_CareerTotals` maps `r.k` to `self.strikeouts`. The formula engine's `PitcherStats` Protocol expects `strikeouts`, so the translation is correct and well-commented. **main.py router registration**: `season_stats` is added to both the import tuple and the `app.include_router()` call. Correct. **Response shapes**: Both endpoints return the shapes specified in the PR description and verified against the spec. #### Security Both endpoints guard with `valid_token(token)` before any DB access, consistent with every other authenticated endpoint in the codebase. `valid_token` uses `hmac.compare_digest` (timing-safe). No injection risk — all queries use Peewee ORM, no raw SQL is introduced by this PR. No secrets or credentials in new code. The error detail on 500 responses (`f"Season stats update failed for game {game_id}: {exc}"`) leaks the exception message to the caller. This is a pre-existing pattern in this codebase and is acceptable for an internal bot API. #### Style & Conventions All new code follows the established patterns: lazy imports inside route handlers, `logging.getLogger(__name__)` at module level, auth check at the top of each handler, and structured log messages with `%s` format strings rather than f-strings. The `noqa: PLC0415` suppression for lazy imports inside functions is used correctly and consistently with the rest of the codebase. The `# noqa: E402` additions to `main.py` imports are cosmetic cleanup and harmless. #### Edge Cases - **Game with no plays**: `evaluate-game` with a `game_id` that has zero `StratPlay` rows returns `{"evaluated": 0, "tier_ups": []}` with 200. Correct. - **Game with no EvolutionCardState for any player**: Same result. Correct. - **Individual `evaluate_card()` failure**: Logged at ERROR level, skipped, batch continues. The failed player is not counted in `evaluated`. Correct non-fatal behavior. - **Player name lookup failure in tier-up path**: Caught separately; falls back to `"player_{player_id}"` string. This prevents a tier-up from being silently dropped if the Player row is missing. #### Suggestions (non-blocking) 1. **Missing test for evaluate-game with a non-existent game_id**: The endpoint would return `{"evaluated": 0, "tier_ups": []}` because `StratPlay.select().where(StratPlay.game == game_id)` returns an empty queryset for an unknown game. This is reasonable behavior, but an explicit test would lock in the contract. 2. **`fully_evolved` threshold hardcoded to 4**: `card_state.current_tier >= 4` assumes T4 is always the max tier. If the track system ever gains a T5, this will silently mis-classify. Consider deriving it from the track definition, or at minimum adding a comment explaining the invariant. 3. **Test db patch is permanent for the module**: `_season_stats_module.db = _wp13_db` is set once at import time and never restored. This is fine for an isolated test module, but a `monkeypatch` fixture would be safer if this module ever runs alongside others that depend on the real `db` reference. --- ### Verdict: APPROVED The implementation is correct and complete. Both endpoints delegate properly to their respective service layers, auth is enforced on both routes, idempotency is preserved through the existing `last_game` guard, tier-up detection is logically sound, and non-fatal batch evaluation is correctly implemented. The 10-test suite covers the critical paths including idempotency, tier advancement, no-advancement, skip-without-state, and auth. The suggestions above are minor and do not block merging. --- *Automated review by Claude PR Reviewer*

cal force-pushed feature/wp-13-postgame-callbacks from b151923480 to a2d2aa3d31

2026-03-18 21:05:32 +00:00

Compare

cal merged commit 6b6359c6bb into card-evolution

2026-03-18 21:05:38 +00:00

cal deleted branch feature/wp-13-postgame-callbacks

2026-03-18 21:05:38 +00:00

cal referenced this issue from a commit

2026-03-18 21:05:39 +00:00

Merge pull request 'feat(WP-13): post-game callback endpoints for season stats and evolution' (#109) from feature/wp-13-postgame-callbacks into card-evolution

Sign in to join this conversation.

No reviewers

No Milestone

No project

No Assignees

1 Participants

Notifications

Due Date

The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: cal/paper-dynasty-database#109