test: refractor system Tier 1+2 test coverage #170
No reviewers
Labels
No Label
ai-changes-requested
ai-failed
ai-merged
ai-pr-opened
ai-reviewed
ai-reviewing
ai-reviewing
ai-working
bug
enhancement
evolution
performance
phase-0
phase-1a
phase-1b
phase-1c
phase-1d
security
tech-debt
todo
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/paper-dynasty-database#170
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "test/refractor-tier1-tier2"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Test plan
101 passed, 15 skipped— skips are existing PostgreSQL-only tests)Notable findings
compute_batter_valueis not guarded — behavior documentedevaluate-gameand team refractors return 200+empty for invalid IDs (not 404) — documented as contract🤖 Generated with Claude Code
Implements all gap tests identified in the PO review for the refractor card progression system (Phase 1 foundation). TIER 1 (critical): - T1-1: Negative singles guard in compute_batter_value — documents that hits=1, doubles=1, triples=1 produces singles=-1 and flows through unclamped (value=8.0, not 10.0) - T1-2: SP tier boundary precision with floats — outs=29 (IP=9.666) stays T0, outs=30 (IP=10.0) promotes to T1; also covers T2 float boundary - T1-3: evaluate-game with non-existent game_id returns 200 with empty results - T1-4: Seed threshold ordering + positivity invariant (t1<t2<t3<t4, all >0) TIER 2 (high): - T2-1: fully_evolved=True persists when stats are zeroed or drop below previous tier — no-regression applies to both tier and fully_evolved flag - T2-2: Parametrized edge cases for _determine_card_type: DH, C, 2B, empty string, None, and compound "SP/RP" (resolves to "sp", SP checked first) - T2-3: evaluate-game with zero StratPlay rows returns empty batch result - T2-4: GET /teams/{id}/refractors with valid team and zero states is empty - T2-5: GET /teams/99999/refractors documents 200+empty (no team existence check) - T2-6: POST /cards/{id}/evaluate with zero season stats stays at T0 value=0.0 - T2-9: Per-player error isolation — patches source module so router's local from-import picks up the patched version; one failure, one success = evaluated=1 - T2-10: Each card_type has exactly one RefractorTrack after seeding All 101 tests pass (15 PostgreSQL-only tests skip without POSTGRES_HOST). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>AI Code Review
Files Reviewed
tests/test_formula_engine.py(modified — 7 new tests added)tests/test_refractor_evaluator.py(modified —TestFullyEvolvedPersistenceclass, 2 new tests)tests/test_refractor_init.py(modified —TestDetermineCardTypeEdgeCasesclass, 1 parametrized test covering 6 cases)tests/test_refractor_seed.py(modified — 2 new tests added)tests/test_postgame_refractor.py(modified — 3 new tests added)tests/test_refractor_state_api.py(modified — new SQLite-backed section with 3 new tests)Source files read for assertion verification:
app/services/formula_engine.pyapp/services/refractor_evaluator.pyapp/services/refractor_init.pyapp/routers_v2/refractor.pyFindings
Correctness
All 13 test cases are mathematically and logically sound against the actual source implementations.
T1-1 (negative singles):
singles = 1 - 1 - 1 - 0 = -1,tb = -1 + 2 + 3 + 0 = 4,value = 0 + 4*2 = 8.0. Matchesformula_engine.compute_batter_valueexactly — no clamping guard exists in the source. The companion test assertingunclamped_value != 10.0is a clean behavior-change guard.T1-2 (SP float precision): All four boundary tests verify correctly:
29/3 = 9.666... < 10.0 → T0,30/3 = 10.0 >= 10.0 → T1,120/3 = 40.0 >= 40.0 → T2,119/3 = 39.666... < 40.0 → T1. Use ofpytest.approxfor non-integer cases is correct.T1-3 (non-existent game_id): The router does
StratPlay.select().where(StratPlay.game == 99999)→ 0 rows → emptypairsset →{"evaluated": 0, "tier_ups": []}with 200. Assertion is accurate.T1-4 (seed threshold invariant): Correctly asserts positivity and strict ascending order per track. Overlaps intentionally with the existing
test_seed_thresholds_ascendingandtest_seed_thresholds_positive— acknowledged in the PR body. The combined form is more expressive and the overlap is harmless.T2-1 (fully_evolved persistence): Verified against
refractor_evaluator.pyline 176:card_state.fully_evolved = card_state.current_tier >= 4. Withcurrent_tier = max(4, 0) = 4(no-regression),fully_evolved = True. Both sub-tests (zero-stats and partial-stats) are correct.T2-2 (position edge cases): All 6 parametrized cases match
_determine_card_type:None→(None or "").upper() = ""→ neither SP/RP/CP →"batter"."SP/RP"→"SP" in "SP/RP"fires first →"sp". Correct.T2-3 (zero plays): No
StratPlayrows →pairs = set()→ loop body never executes →{"evaluated": 0, "tier_ups": []}. Correct.T2-4/T2-5 (team refractors empty/missing): The new SQLite fixture correctly replicates the WP-13 pattern. Zero state rows for a valid or non-existent team_id returns
{"count": 0, "items": []}— the teams router queriesRefractorCardState WHERE team_id=Nwithout a team-existence check, so 200+empty is the accurate contract.T2-6 (zero season stats): Zero
BattingSeasonStatsrows → all-zero_CareerTotals→compute_batter_value(zeros) = 0.0→tier_from_value(0.0) = 0. Correct.T2-9 (per-player error isolation): The monkeypatch targets
app.services.refractor_evaluator.evaluate_card. The router performsfrom ..services.refractor_evaluator import evaluate_cardinside the async function body on each invocation (confirmed atrefractor.pyline 179), so patching the module attribute causes the local import to pick up the patched version. The counter-and-first-call-fail pattern cleanly isolates the failure.evaluated == 1andfail_player_id not in [tu["player_id"] for tu in data["tier_ups"]]correctly verifies isolation behavior.T2-10 (duplicate card_type guard): The Peewee GROUP BY + COUNT query is correct. Asserting
count == 1percard_typeis the right contract forRefractorTrack.get(card_type=...)usage throughout the system.Security
No issues. Test-only files, no hardcoded credentials, no dynamic query injection risk.
AUTH_HEADERpattern is consistent with existing tests.Style & Conventions
Pattern consistency: All new tests follow established conventions precisely — shared-memory SQLite URI,
_sa_make_*factory prefix to disambiguate from WP-13 factories,autouse=Falseon the new fixture to avoid interfering with PostgreSQL-backed module fixtures, and the "What / Why" docstring format throughout.One convention violation:
from peewee import fn as peewee_fnis placed insidetest_seed_each_card_type_has_exactly_one_trackrather than at the top oftest_refractor_seed.py. The project convention (CLAUDE.md) is "Never add lazy imports to middle of file." This import should be at module level with the other top-level imports.Suggestions
from peewee import fn as peewee_fnto the top-level imports intest_refractor_seed.py. It is the only convention violation in this PR and is a one-line fix.test_seed_all_thresholds_strictly_ascending_after_seednoting it is intentionally more expressive than the oldertest_seed_thresholds_ascending+test_seed_thresholds_positivepair, so future maintainers do not remove it as a duplicate. Not blocking.Verdict: APPROVED
All 13 test cases correctly exercise the scenarios described in the PR body. Assertions are mathematically verified against the source implementations. The new SQLite fixture structure in
test_refractor_state_api.pyfollows the established WP-13 pattern cleanly. The error isolation test (T2-9) correctly handles the lazy-import monkeypatching challenge documented in the PR notes. The only actionable item is the lazyfrom peewee import fninside a test function body, which is a minor convention violation that does not affect correctness or test isolation.Automated review by Claude PR Reviewer
Approved — 13 tests verified by pr-reviewer agent. Safe to merge.