feat: implement run-decision algorithm in gb_decide_run (#18) #72

Merged
cal merged 1 commit from ai/paper-dynasty-discord-18 into next-release 2026-03-10 14:44:43 +00:00
Owner

Summary

Replaces the placeholder formula this_resp.min_safe = 15 - aggression in gb_decide_run with a real tier-based algorithm consistent with tag_from_second and tag_from_third.

Algorithm

aggression_mod = abs(self.ahead_aggression - 5 if ai_rd > 0 else self.behind_aggression - 5)
adjusted_running = self.running + aggression_mod

if adjusted_running >= 8:
    this_resp.min_safe = 4
elif adjusted_running >= 5:
    this_resp.min_safe = 6
else:
    this_resp.min_safe = 8

if this_play.starting_outs == 2:
    this_resp.min_safe -= 2   # run on contact with 2 outs
elif this_play.starting_outs == 0:
    this_resp.min_safe += 2   # more conservative with nobody out

Design rationale:

  • Uses self.running + aggression_mod — identical pattern to tag_from_second / tag_from_third
  • Three tiers (4/6/8) give meaningful separation between Yolo (min_safe=4), Balanced (min_safe=6), and Safe (min_safe=8) managers
  • 2-out adjustment: AI should nearly always run on a grounder with 2 outs (run on contact rule)
  • 0-out adjustment: AI is more conservative to avoid giving up baserunners early in a threat
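
For readers without the repository at hand, the tier logic above can be sketched as a standalone function. This is a minimal illustration, not the merged code: `gb_min_safe` and its flat argument list are hypothetical stand-ins for the `self`, `this_play`, and `this_resp` objects, and `ai_rd` is assumed to be the AI's run differential (positive when ahead).

```python
def gb_min_safe(
    running: int,
    ahead_aggression: int,
    behind_aggression: int,
    ai_rd: int,
    starting_outs: int,
) -> int:
    """Hypothetical standalone version of the tier logic in gb_decide_run."""
    # Pick the aggression rating for the game state, then take its
    # absolute deviation from the neutral value of 5.
    aggression = ahead_aggression if ai_rd > 0 else behind_aggression
    aggression_mod = abs(aggression - 5)
    adjusted_running = running + aggression_mod

    # Bracket into the three min_safe tiers.
    if adjusted_running >= 8:
        min_safe = 4
    elif adjusted_running >= 5:
        min_safe = 6
    else:
        min_safe = 8

    # Outs adjustment: 1 out is the neutral baseline.
    if starting_outs == 2:
        min_safe -= 2  # run on contact with 2 outs
    elif starting_outs == 0:
        min_safe += 2  # more conservative with nobody out
    return min_safe


# Same manager (running=10, behind_aggression=10, trailing) at 0, 1, and 2 outs.
print([gb_min_safe(10, 10, 10, -1, outs) for outs in (0, 1, 2)])  # [6, 4, 2]
```

Because `aggression_mod` is an absolute deviation, a rating of 0 and a rating of 10 both add 5 to `running` in this sketch, which mirrors the formula as written above.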

Files Changed

  • in_game/gameplay_models.py — replaced TODO placeholder in gb_decide_run
  • tests/gameplay_models/test_managerai_model.py — added test_gb_decide_run

Notes

A ruff pre-commit hook auto-reformatted in_game/gameplay_models.py, so the diff also contains cosmetic changes. The functional change is confined to gb_decide_run (lines ~682–706).

Tests require testcontainers + Docker (pre-existing dev environment constraint); test infrastructure failure is not caused by this change.

cal added 1 commit 2026-03-07 23:35:15 +00:00
feat: implement run-decision algorithm in gb_decide_run (#18)
All checks were successful
Build Docker Image / build (pull_request) Successful in 1m19s
6c4ff3bd27
Replace placeholder formula with tier-based algorithm modeled after
tag_from_second and tag_from_third. Uses self.running + aggression_mod
(abs deviation from neutral) for adjusted_running, then brackets into
three min_safe tiers (4/6/8), with a ±2 adjustment for 2-out and 0-out
situations respectively.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cal added the ai-reviewing label 2026-03-07 23:45:33 +00:00
cal reviewed 2026-03-07 23:47:41 +00:00
cal left a comment
Author
Owner

AI Code Review

Files Reviewed

  • in_game/gameplay_models.py (modified — gb_decide_run algorithm + ruff reformat)
  • tests/gameplay_models/test_managerai_model.py (modified — test_gb_decide_run added)

Findings

Correctness

  • Algorithm is correct and consistent with tag_from_second / tag_from_third — uses the same aggression_mod = abs(self.ahead_aggression - 5 if ai_rd > 0 else self.behind_aggression - 5) + adjusted_running = self.running + aggression_mod pattern.
  • Three-tier thresholds (4/6/8) are appropriate for a groundball run decision and give meaningful separation between Yolo/Balanced/Safe managers.
  • Outs adjustments are baseball-correct: the 2-out adjustment (−2) makes the AI more aggressive (run on contact), and the 0-out adjustment (+2) makes it more conservative. This is an improvement over tag_from_second, which applies the same +2 to both 0 and 2 outs.
  • Min/max reachable values are 2 (Yolo + 2 outs) and 10 (Safe + 0 outs) — both reasonable.
  • running, ahead_aggression, and behind_aggression are typed int | None but used without null guards — same pattern as the existing tag_from_second/tag_from_third, so this is a pre-existing concern and out of scope.
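
The reachable range quoted above can be checked by enumerating every tier and outs combination. This is a small sketch; `TIER_VALUES` and `OUTS_ADJUSTMENT` are illustrative names that do not appear in the codebase.

```python
from itertools import product

# The three base tiers and the per-outs adjustment described in the PR.
TIER_VALUES = (4, 6, 8)
OUTS_ADJUSTMENT = {0: 2, 1: 0, 2: -2}

# Every min_safe value the algorithm can produce.
reachable = sorted(
    {tier + adj for tier, adj in product(TIER_VALUES, OUTS_ADJUSTMENT.values())}
)
print(reachable)  # [2, 4, 6, 8, 10]
```

The extremes are 2 (lowest tier with 2 outs) and 10 (highest tier with 0 outs), matching the bounds stated in the review.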

Security

  • No issues. Pure game logic with no external input, queries, or credentials.

Style & Conventions

  • Ruff auto-formatting is cosmetic (quote style, trailing whitespace, blank lines). No behavioral changes in the surrounding code.
  • Test docstring follows CLAUDE.md convention (explains "what" and "why"). ✓
  • Test uses factory data and asserts starting_outs == 1 before running to confirm the outs-adjustment branch is correctly skipped. ✓
  • Test assertions verified manually: Balanced (running=5, behind_aggression=5 → adjusted=5 → min_safe=6 ✓), Yolo (running=10, behind_aggression=10 → adjusted=15 → min_safe=4 ✓).
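
The manual verification above amounts to the following arithmetic. This sketch assumes the AI is not ahead (so behind_aggression applies) and starting_outs == 1 (so no outs adjustment); `tier_for` is a hypothetical helper, not part of the test file.

```python
def tier_for(running: int, aggression: int) -> tuple[int, int]:
    """Return (adjusted_running, min_safe) for the 1-out baseline."""
    adjusted = running + abs(aggression - 5)
    if adjusted >= 8:
        return adjusted, 4
    if adjusted >= 5:
        return adjusted, 6
    return adjusted, 8


# (running, behind_aggression) pairs taken from the review's manual check.
cases = {"Balanced": (5, 5), "Yolo": (10, 10)}
for name, (running, aggression) in cases.items():
    print(name, tier_for(running, aggression))
# Balanced (5, 6)
# Yolo (15, 4)
```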

Suggestions

  • None that block merging. The 1-out case (no adjustment) is intentional and correct — 1 out is the neutral baseline.

Verdict: APPROVED

Clean replacement of the TODO placeholder. The algorithm matches the established pattern, the outs logic is sound baseball reasoning, and the tests are well-structured and correct. The ruff reformat is cosmetic.


Automated review by Claude PR Reviewer

cal added ai-reviewed and removed ai-reviewing labels 2026-03-07 23:48:04 +00:00
cal changed target branch from main to next-release 2026-03-10 14:44:30 +00:00
cal merged commit 7c76d1e2c6 into next-release 2026-03-10 14:44:43 +00:00
cal deleted branch ai/paper-dynasty-discord-18 2026-03-10 14:44:43 +00:00
Reference: cal/paper-dynasty-discord#72