feat(dev): add rollback endpoint for synthetic test games #215
No reviewers
Labels
No Label
ai-changes-requested
ai-failed
ai-merged
ai-pr-opened
ai-reviewed
ai-reviewing
ai-working
autonomous
bug
enhancement
evolution
performance
phase-0
phase-1a
phase-1b
phase-1c
phase-1d
security
size:M
size:S
tech-debt
todo
type:feature
type:stability
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/paper-dynasty-database#215
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "feat/dev-cleanup-endpoint"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Adds a dev-only
DELETE /api/v2/dev/test-games/{game_id}endpoint that fully reverses all side effects of a synthetic test game created by the Discord/dev refractor-testcommand.Design spec:
docs/superpowers/specs/2026-04-11-dev-cleanup-endpoint-design.mdImplementation plan:
docs/superpowers/plans/2026-04-11-dev-cleanup-endpoint.mdWhy
The
/dev refractor-testcommand (discord#161) exercises the full refractor evaluation pipeline by creating synthetic games and plays. ItsCleanupViewbutton only deletes the game/plays/decisions — it leaves bloated season stats, orphaned variant cards, and permanently tier-upped refractor state (becauseevaluate_cardenforcescurrent_tier = max(current_tier, new_tier)). Manual SQL cleanup is currently required to re-run a test on the same card. This endpoint automates the full rollback.What's included
app/routers_v2/dev.py— router +require_dev_env()dependency that gates onPOSTGRES_DB=paperdynasty_devat request time (not import time, so prod containers still refuse)app/services/dev_rollback.py— 14-step rollback service, transactionaltests/test_dev_rollback.py— 9 unit tests (service)tests/test_dev_router.py— 6 router-level tests (HTTP contract)app/main.py— registerdev.routertests/conftest.py— addBattingCardRatings/PitchingCardRatingsto_TEST_MODELS(missing; variant card deletion requires them)No schema changes, no migrations.
Safety
POSTGRES_DB != paperdynasty_dev(prod refuses)stratgamerow missingprocessed_atis >10 minutes old (safety bail for stale games)Test summary
Follow-up
Discord cog consumer (cleanup button calls this endpoint instead of manual SQL) is a separate PR in the discord-app repo.
DO NOT DEPLOY — Cal will deploy separately after merge while manual testing is active on the dev API.
Adds DELETE /api/v2/dev/test-games/{game_id} that fully reverses all side effects of a game created by the Discord /dev refractor-test command: stratplay, decision, processed_game, stratgame deletion; refractor_card_state reset or deletion; variant card and audit row cleanup; season stats recompute. Dev-gated via require_dev_env() dependency (returns 403 when POSTGRES_DB is not 'paperdynasty_dev', checked at request time). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>AI Code Review
Files Reviewed
app/main.py(modified)app/routers_v2/dev.py(added)app/services/dev_rollback.py(added)tests/conftest.py(modified)tests/test_dev_rollback.py(added)tests/test_dev_router.py(added)Findings
Correctness
[BLOCKING] FK ordering bug — step 6b deletes variant cards before their referencing audit rows are gone (test-created states)
In
app/services/dev_rollback.py, the step ordering causes a PostgreSQL foreign key constraint violation for the most common scenario: rolling back a test game that created a new refractor state (player had no prior refractor history).Trace:
audit_rows_to_deleteusing onlyreset_state_ids(pre-existing states). Test-created states (state_ids_to_delete) are explicitly excluded.BattingCard/PitchingCardrows for both reset and test-created states.RefractorCardStaterows, which cascade-deletes their audit rows.The problem: step 6b attempts to delete a
BattingCardrow that is still referenced by aRefractorBoostAuditrow (from the test-created state).RefractorBoostAudit.battingcardis aForeignKeyFieldwith noon_deleteclause, which means PostgreSQL usesRESTRICT(immediate enforcement). The deletion at step 6b will raise a FK violation before step 9 ever runs.The SQLite test database does not enforce FK constraints by default, so all 9 unit tests pass despite the broken ordering on PostgreSQL.
Affected code path: any rollback where the test game created a new
RefractorCardState(i.e.state_ids_to_deleteis non-empty and one of those states has an audit row with a non-NULLbattingcard_idorpitchingcard_id). This is the primary use-case for the endpoint.Fix: In step 5, also collect audit rows for
state_ids_to_deleteand add them toaudit_rows_to_delete. Step 7 then deletes all audit rows (for both reset and test-created states) before step 6b deletes the card rows. The cascade in step 9 becomes a no-op for those states (no rows left to cascade), which is safe.Security
No issues. The
require_dev_env()gate checksPOSTGRES_DBat request time, correctly. Thedev.routeris unconditionally registered (visible in/api/docson prod) but every handler is gated. No injection vectors, no secrets exposure.Style & Conventions
[BLOCKING] Lazy import inside function body — violates CLAUDE.md
dev_rollback.pyline 474:This is inside the body of
rollback_test_game(). CLAUDE.md: "Never add lazy imports to middle of file." The# noqa: PLC0415comment suppresses the linter but doesn't resolve the violation.No circular import exists.
refractor_evaluator.pyimports onlydatetimeandloggingat module level — addingfrom app.services.refractor_evaluator import evaluate_cardto the top ofdev_rollback.pyis safe. Previous review (PR #187) blocked for the same pattern.Suggestions
_AUDIT_WINDOW_SECONDS = 6— tight window that could miss audit rows written slightly afterprocessed_aton a slow DB. Consider bumping to 10–15s or making it configurable alongside_MAX_AUDIT_AGE_SECONDS.test_pre_existing_state_reset_to_pre_test_tier): "Step 8 sets last_evaluated_at=NULL" — the code actually sets it todatetime.now(), not NULL. Harmless test comment, the assertion is correct.audit_rows_deletedcount excludes cascade-deleted audit rows for test-created states. If the FK bug is fixed and audit rows for test-created states are explicitly deleted in step 7, the count in the return dict will correctly reflect all deleted audit rows. Document this in the log line if it changes the count semantics.Verdict: REQUEST_CHANGES
Two blocking issues:
FK ordering bug: step 6b deletes variant
BattingCard/PitchingCardrows before theRefractorBoostAuditrows that reference them are removed. On PostgreSQL this is a FK constraint violation for test-created states with variant cards. SQLite tests pass because FK enforcement is off by default. Fix: add test-created state audit rows toaudit_rows_to_deletein step 5 so step 7 clears them before step 6b.Lazy import:
from app.services.refractor_evaluator import evaluate_cardinside the function body — move to top of file.Automated review by Claude PR Reviewer
Addressed both blocking issues from review (commit
42d6f8b):1. FK ordering bug (step 5/7)
Added a new block in step 5 that also collects
RefractorBoostAuditrows for test-created states (state_ids_to_delete) intoaudit_rows_to_delete. Step 7 now deletes all audit rows — for both reset and test-created states — before step 6b deletes theBattingCard/PitchingCardrows they reference. Previously, test-created state audit rows were expected to cascade-delete in step 9, but step 6b ran first and triggered PostgreSQL FK RESTRICT violations. Theaudit_rows_deletedcount in the response now correctly reflects all explicitly deleted audit rows.2. Lazy import
Moved
from app.services.refractor_evaluator import evaluate_cardfrom inside therollback_test_game()function body (line 474) to the module-level imports at the top of the file. No circular import exists.AI Code Review
Files Reviewed
app/main.py(modified)app/routers_v2/dev.py(added)app/services/dev_rollback.py(added)tests/test_dev_rollback.py(added)tests/test_dev_router.py(added)tests/conftest.py(modified)Prior Review Blockers — Status
Both issues flagged in the previous review pass in this revision:
FIXED — FK ordering bug (step 6b): Step 5 now explicitly collects audit rows for test-created states (the new
if state_ids_to_delete:block), and step 7 deletes all of them before step 6b runs.BattingCard/PitchingCardrows are deleted only after theirRefractorBoostAuditFK children are gone. PostgreSQL FK RESTRICT violation is no longer possible.FIXED — Lazy import:
from app.services.refractor_evaluator import evaluate_cardis now at the file top (line ~32), not nested insiderollback_test_game(). No# noqa: PLC0415needed.Findings
Correctness
No issues found.
Logic trace for the FK-ordering fix:
state_ids_to_delete(test-created) orstates_to_reset(pre-existing) — mutually exclusive.processed_at). Both categories land inaudit_rows_to_delete.variant_auditsforall_state_ids_for_variants = reset_state_ids + state_ids_to_deleteusing the same window conditions — consistent with step 5.Security
require_dev_env()readsPOSTGRES_DBat request time; defaults to""(closed), not open. Prod containers refuse even if the module is imported. ✅game_idis FastAPI-typedint— no injection surface. ✅Style & Conventions
Suggestions
variant_cards_deletedcount: If two audit rows for different states reference the samebattingcard_id(unlikely but theoretically possible), the count would be inflated relative to the actualDELETErows affected. Non-blocking — the count is informational only and the DELETE is correct either way.evaluate_cardexception swallowing:ValueErroris caught and logged as warning.Exceptionis not caught — it would propagate and roll back the transaction. This is intentional (fail-safe) but worth a comment if not already clear. Not a blocker.Verdict: COMMENT
Both prior blocking issues are resolved. Code is correct, conventions followed, security appropriate for a dev-only endpoint. Self-review restriction applies (author = repo owner) — no functional objections, ready to merge.
Automated review by Claude PR Reviewer
Phase 1 smoke test from claude-reviewer PAT — safe to dismiss
Checkout
From your project repository, check out a new branch and test the changes.