docs: reconcile PRD boost spec with shipped implementation

Section 5.3 of 05-rating-boosts.md described profile-based boost
distribution (power/contact/patient hitter profiles) that was never
built. Updated to document the actual shipped algorithms: fixed column
deltas for batters and TB-budget priority drain for pitchers, both in
database/app/services/refractor_boost.py.

Corrected the truncation behavior — the implementation scales positive
deltas proportionally rather than discarding them, preserving the
108-sum exactly in all cases.

Updated REFRACTOR_PHASE2_VALIDATION_SPEC.md T4-1 to reflect shipped
function signatures and marked the case as complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cal Corum 2026-04-07 20:25:02 -05:00
parent f868e5a329
commit 3ad893c949
2 changed files with 305 additions and 30 deletions


@@ -61,40 +61,48 @@ the upgrade logic must use an ordered rarity ladder, not arithmetic.
---
### T4-1: 108-sum preservation under profile-based boosts
### T4-1: 108-sum preservation under batter and pitcher boosts
**Status:** Pending — Phase 2
**Status:** Shipped — Phase 2 complete
> **Updated 2026-04-08:** Profile-based boost distribution was not implemented. The shipped
> implementation uses `apply_batter_boost()` (fixed column deltas) and `apply_pitcher_boost()`
> (TB-budget priority algorithm) in `database/app/services/refractor_boost.py`. There is no
> `apply_evolution_boosts(card_ratings, boost_tier, player_profile)` function and no
> `pd_cards/evo/boost_profiles.py` module. See `docs/prd-evolution/05-rating-boosts.md`
> section 5.3 for the shipped algorithm details.
**Scenario:**
`apply_evolution_boosts(card_ratings, boost_tier, player_profile)` redistributes 1.0 chance per
tier across outcome columns according to the player's detected profile (power hitter, contact
hitter, patient hitter, starting pitcher, relief pitcher). Every combination of profile and tier
must leave the 22-column sum exactly equal to 108 after the boost is applied. This must hold for
all four tier applications, cumulative as well as individual.
`apply_batter_boost(ratings_dict)` applies fixed deltas (+0.50 to `homerun`, `double_pull`,
`single_one`, `walk`; -1.50 from `strikeout`, -0.50 from `groundout_a`) per tier. The 22-column
sum must equal exactly 108 after every application.
The edge case: a batter card where `flyout_a = 0`. The power and contact hitter profiles draw
reductions from out columns including `flyout_a`. If the preferred reduction column is at zero,
the implementation must not produce a negative value and must not silently drop the remainder of
the budget. The 0-floor cap is enforced per column (see `05-rating-boosts.md` section 5.1:
"Truncated points are lost, not redistributed").
`apply_pitcher_boost(ratings_dict, tb_budget=1.5)` drains a 1.5 TB-unit budget by converting
hit-allowed chances into strikeouts in priority order. The 18 variable outcome columns must sum
to their pre-boost total (the conversion is chance-for-chance; only column identity changes,
not the total).
The edge case: a batter card where `strikeout = 0` and `groundout_a = 0`. The negative funding
columns are both at zero, so no reduction can occur. The shipped implementation handles this by
scaling the positive deltas to zero (`scale = 0`), leaving all columns unchanged. The 108-sum
is preserved exactly. A warning is logged.
Verify:
- After each of T1, T2, T3, T4 boost applications, `sum(all outcome columns) == 108` exactly.
- A card with `flyout_a = 0` does not raise an error and does not produce a column below 0.
- When truncation occurs (column already at 0), the lost budget is discarded, not moved
elsewhere — the post-boost sum will be less than 108 + budget_added only in the case of
truncation, but must never exceed 108.
- After each of T1, T2, T3, T4 boost applications, `sum(all batter outcome columns) == 108`.
- After each pitcher boost, `sum(pitcher outcome columns) + sum(xcheck columns) == 108`.
- A batter card with `strikeout = 0` and `groundout_a = 0` does not raise an error, does not
produce any column below 0, and leaves the sum at exactly 108.
- No column value falls below 0 under any input.
**Expected Outcome:**
Sum remains 108 after every boost under non-truncation conditions. Under truncation conditions
(a column hits 0), the final column sum must equal exactly `108 - truncated_amount` — where
`truncated_amount` is the portion of the 1.0-chance budget that was dropped due to the 0-floor
cap. This is a single combined assertion: `sum(columns) == 108 - truncated_amount`. Checking
"sum <= 108" and "truncated amount was discarded" as two independent conditions is insufficient
— a test can pass both checks while the sum is wrong for an unrelated reason (e.g., a positive
column also lost value due to a bug). No column value falls below 0.
(funding columns already at or near zero), the positive deltas are scaled proportionally to the
amount actually reduced — the 108-sum is preserved exactly (not approximately). The original
spec's statement that "truncated points are lost, not redistributed" does not reflect the
shipped behavior: positive deltas ARE scaled down to match what was taken, ensuring the sum
invariant holds in all cases. No column value falls below 0.
**Risk If Failed:**
@@ -105,10 +113,9 @@ corrupts game results without any visible failure.
**Files Involved:**
- `docs/prd-evolution/05-rating-boosts.md` — boost budget, profile definitions, cap behavior
- Phase 2: `pd_cards/evo/boost_profiles.py` (to be created) — `apply_evolution_boosts`
- `batters/creation.py` — `battingcardratings` model column set (22 columns)
- `pitchers/creation.py` — `pitchingcardratings` model column set (18 columns + 9 x-checks)
- `docs/prd-evolution/05-rating-boosts.md` — section 5.3 (shipped algorithm), section 5.1 (cap behavior)
- `database/app/services/refractor_boost.py` — `apply_batter_boost`, `apply_pitcher_boost` (shipped)
- `database/tests/test_refractor_boost.py` — existing test coverage for these functions
---
@@ -153,10 +160,10 @@ undermines the core design intent.
**Files Involved:**
- `docs/prd-evolution/05-rating-boosts.md` — section 5.2 (boost budgets), section 5.3 (profiles)
- `docs/prd-evolution/05-rating-boosts.md` — section 5.2 (boost budgets), section 5.3 (shipped algorithm)
- `rarity_thresholds.py` — OPS boundary values used to assess whether evolution crosses a rarity
threshold as a side effect (it should not for mid-range cards)
- Phase 2: `pd_cards/evo/boost_profiles.py` — boost distribution logic
- `database/app/services/refractor_boost.py` — `apply_batter_boost`, `apply_pitcher_boost` (shipped)
---
@@ -455,7 +462,7 @@ heavily in T3 at season end will be angry when their progress disappears.
| ID | Title | Status |
|---|---|---|
| T4-1 | 108-sum preservation under profile-based boosts | Pending — Phase 2 |
| T4-1 | 108-sum preservation under batter and pitcher boosts | Shipped — Phase 2 complete |
| T4-2 | D20 probability shift at T4 | Pending — Phase 2 |
| T4-3 | T4 rarity upgrade — pipeline collision risk | Pending — Phase 2 |
| T4-4 | T4 rarity cap for HoF cards | Pending — Phase 2 |


@@ -0,0 +1,268 @@
# 5. Rating Boost Mechanics
[< Back to Index](README.md) | [Next: Database Schema >](06-database.md)
---
## 5.1 Rating Model Overview
The card rating system is built on the `battingcardratings` and `pitchingcardratings` models.
Each model defines outcome columns whose values represent chances out of a **108-chance total**
(derived from the D20 probability system: 2d6 × 3 columns × 6 rows = 108 total chances).
**Batter ratings** have **22 outcome columns** summing to 108:
| Category | Columns |
|---|---|
| Hits | `homerun`, `bp_homerun`, `triple`, `double_three`, `double_two`, `double_pull`, `single_two`, `single_one`, `single_center`, `bp_single` |
| On-base | `hbp`, `walk` |
| Outs | `strikeout`, `lineout`, `popout`, `flyout_a`, `flyout_bq`, `flyout_lf_b`, `flyout_rf_b`, `groundout_a`, `groundout_b`, `groundout_c` |
**Pitcher ratings** have **18 outcome columns + 9 x-check fields** summing to 108:
| Category | Columns |
|---|---|
| Hits allowed | `homerun`, `bp_homerun`, `triple`, `double_three`, `double_two`, `double_cf`, `single_two`, `single_one`, `single_center`, `bp_single` |
| On-base | `hbp`, `walk` |
| Outs | `strikeout`, `flyout_lf_b`, `flyout_cf_b`, `flyout_rf_b`, `groundout_a`, `groundout_b` |
| X-checks | `xcheck_p` (1), `xcheck_c` (3), `xcheck_1b` (2), `xcheck_2b` (6), `xcheck_3b` (3), `xcheck_ss` (7), `xcheck_lf` (2), `xcheck_cf` (3), `xcheck_rf` (2) — always sum to 29 |
**Key differences:** Batters have `double_pull`, pitchers have `double_cf`. Batters have
`lineout`, `popout`, `flyout_a`, `flyout_bq`, `groundout_c` — pitchers do not. Pitchers have
`flyout_cf_b` and x-check fields — batters do not.
Evolution boosts apply **flat deltas to individual result columns** within these models. The
108-sum constraint must be maintained: any increase to a positive outcome column requires an
equal decrease to a negative outcome column.
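
The 108-sum constraint on the batter model can be checked mechanically. The following is a minimal sketch, not code from the repository: the column names are copied from the table above, while `batter_sum_ok` and its float tolerance are hypothetical and assume ratings are held in a plain dict.

```python
import math

# Column names copied from the batter table in section 5.1 (10 hits, 2 on-base, 10 outs).
BATTER_COLUMNS = (
    "homerun", "bp_homerun", "triple", "double_three", "double_two",
    "double_pull", "single_two", "single_one", "single_center", "bp_single",
    "hbp", "walk",
    "strikeout", "lineout", "popout", "flyout_a", "flyout_bq",
    "flyout_lf_b", "flyout_rf_b", "groundout_a", "groundout_b", "groundout_c",
)
TOTAL_CHANCES = 108.0


def batter_sum_ok(ratings: dict, tol: float = 1e-9) -> bool:
    """Hypothetical helper: True when no column is negative and the 22 columns sum to 108.

    The tolerance is an assumption for float storage; the spec itself asks for an exact sum.
    """
    values = [ratings[col] for col in BATTER_COLUMNS]
    return min(values) >= 0 and math.isclose(sum(values), TOTAL_CHANCES, abs_tol=tol)
```

A pitcher equivalent would sum the 18 outcome columns plus the 9 x-check fields, since the x-checks contribute a constant 29 of the 108 chances.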
### Rating Cap Enforcement
All boosts are subject to the existing hard caps on individual stat columns. If applying a delta
would push a value past its cap, the delta is **truncated** to the cap value.
**Key caps (from existing card creation system):**
| Stat | Cap | Direction | Example |
|---|---|---|---|
| Hold rating (pitcher) | -5 | Lower is better | A pitcher at -4 hold can only receive -1 more |
| Result columns | 0 floor | Cannot go negative | A 0.1 strikeout column can only lose 0.1 |
**Truncated points are lost, not redistributed.** If a boost would push a stat past its cap, the
delta is truncated and the excess is simply discarded. This is an intentional soft penalty for
cards that are already near their ceiling — they're being penalized because they're already that
good. Lower-rated cards have more headroom and benefit more from the same flat delta.
## 5.2 Boost Budgets Per Tier
Rating boosts are defined as **flat deltas to specific result columns** within the 108-sum model.
The budget per tier is the total number of chances that can be shifted from negative outcomes
(outs) to positive outcomes (hits, on-base).
| Tier | Batter Budget | Pitcher TB Budget | Approx Impact |
|------|--------------|-------------------|---------------|
| T1 | 2.0 chances net (+2.0 pos, -2.0 neg) | 1.5 TB units | Fixed deltas / priority drain |
| T2 | 2.0 chances net | 1.5 TB units | Same — consistent per-tier reward |
| T3 | 2.0 chances net | 1.5 TB units | Same — consistent per-tier reward |
| T4 | 2.0 chances net | 1.5 TB units | Same — plus rarity upgrade |
| **Total** | **8.0 chances net** | **6.0 TB units** | **~7.4% of chances shifted (batter)** |
Every tier provides the same fixed boost. T4 is distinguished not by a larger delta but by the
rarity upgrade, which is the real capstone reward.
**Flat delta design rationale:** All cards receive the same absolute boost regardless of rarity.
A Replacement card (where `homerun` might be 0.3) gains much more relative value from a fixed
+0.50 HR boost than a Hall of Fame card (where `homerun` might be 5.0). This intentionally
incentivizes using lower-rated cards and prevents elite cards from becoming god-tier. Cards
already near column caps receive even less due to truncation.
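
The arithmetic behind the rationale can be worked through directly. This sketch only restates the figures quoted above (a 0.3 versus a 5.0 `homerun` column, four tiers of 2.0 chances); `relative_gain` is a hypothetical helper and assumes a non-zero starting value.

```python
def relative_gain(base: float, delta: float = 0.50) -> float:
    """Fractional increase that a flat delta gives a column starting at `base` (base > 0)."""
    return delta / base


replacement = relative_gain(0.3)   # homerun 0.3 -> 0.8
hall_of_fame = relative_gain(5.0)  # homerun 5.0 -> 5.5
print(f"Replacement: +{replacement:.0%}, Hall of Fame: +{hall_of_fame:.0%}")

# Cumulative batter budget across T1-T4 as a share of the 108-chance total.
total_shift = 4 * 2.0 / 108
print(f"Total shifted: {total_shift:.1%}")
```

The same +0.50 is roughly a 167% increase for the low-rated card and a 10% increase for the elite one, and the four tiers together move about 7.4% of a batter's chances.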
**Example — T1 batter boost:**
```
homerun: +0.50 (from 2.0 → 2.50)
double_pull: +0.50 (from 3.5 → 4.00)
single_one: +0.50 (from 4.0 → 4.50)
walk: +0.50 (from 3.0 → 3.50)
strikeout: -1.50 (from 15.0 → 13.50)
groundout_a: -0.50 (from 8.0 → 7.50)
Net: +2.0 / -2.0 = 0, sum stays at 108
```
## 5.3 Shipped Boost Distribution
> **Updated 2026-04-08 to reflect shipped implementation.**
> The original spec described profile-based boost distribution (power hitter, contact hitter,
> patient hitter profiles). The implementation uses a simpler, more predictable approach:
> fixed deltas for batters and a TB-budget priority algorithm for pitchers. Profile detection
> was not implemented.
### 5.3.1 Batter Boost — Fixed Column Deltas
Every batter receives identical fixed deltas per tier regardless of their profile. There is no
player-style detection. The implementation is in `apply_batter_boost()` in
`database/app/services/refractor_boost.py`.
**Positive deltas (applied each tier):**
| Column | Delta |
|---|---|
| `homerun` | +0.50 |
| `double_pull` | +0.50 |
| `single_one` | +0.50 |
| `walk` | +0.50 |
**Negative deltas (funding source):**
| Column | Delta |
|---|---|
| `strikeout` | -1.50 |
| `groundout_a` | -0.50 |
**0-floor truncation behavior:** If `strikeout` or `groundout_a` cannot supply their full
requested reduction (because the column is already near zero), the positive deltas are scaled
proportionally so the 108-sum invariant is always preserved. Specifically:
1. Negative deltas are applied first, each capped at the column's current value (0 floor).
2. The total amount actually reduced is computed.
3. Positive deltas are scaled by `actually_reduced / total_requested_reduction` so that
additions always equal reductions.
4. A warning is logged when truncation occurs.
This differs from the original spec's statement that "truncated points are lost, not
redistributed." In the shipped implementation, positive deltas are scaled down to match what
was actually taken — the 108-sum is always exactly preserved.
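
The four steps above can be sketched as follows. This is an illustrative re-implementation written from the description in this section, not the shipped `apply_batter_boost()`; the name `sketch_batter_boost`, the constant names, and the log message are assumptions, and inputs are assumed to be non-negative.

```python
import logging

logger = logging.getLogger(__name__)

# Deltas taken from the tables in section 5.3.1.
POSITIVE_DELTAS = {"homerun": 0.50, "double_pull": 0.50, "single_one": 0.50, "walk": 0.50}
NEGATIVE_DELTAS = {"strikeout": 1.50, "groundout_a": 0.50}  # amounts to remove


def sketch_batter_boost(ratings: dict) -> dict:
    """Illustrative only: fixed deltas with proportional scaling under 0-floor truncation."""
    boosted = dict(ratings)  # copy so the caller's dict is left untouched

    # Step 1: apply reductions, each capped at the column's current value (0 floor).
    actually_reduced = 0.0
    for col, requested in NEGATIVE_DELTAS.items():
        taken = min(requested, boosted[col])
        boosted[col] -= taken
        actually_reduced += taken  # Step 2: running total of what was really removed

    # Step 3: scale additions so they equal the reductions.
    scale = actually_reduced / sum(NEGATIVE_DELTAS.values())
    for col, delta in POSITIVE_DELTAS.items():
        boosted[col] += delta * scale

    # Step 4: log when any reduction was truncated.
    if scale < 1.0:
        logger.warning("batter boost truncated: scale=%.3f", scale)
    return boosted
```

Because the requested additions and reductions both total 2.0, scaling the additions by `actually_reduced / 2.0` makes them equal to the amount removed, which is what keeps the column sum unchanged. With `strikeout = 0` and `groundout_a = 0` the scale is 0 and the card is returned unchanged.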
### 5.3.2 Pitcher Boost — TB-Budget Priority Algorithm
Pitchers use a total-bases budget approach instead of fixed column deltas. Each tier awards a
**1.5 TB-unit budget**. The algorithm converts hit-allowed chances into strikeouts, iterating
through outcome types in a fixed priority order (doubles first, then singles and walks, then
home runs, triples, and HBP) until the budget is exhausted.
The implementation is in `apply_pitcher_boost()` in `database/app/services/refractor_boost.py`.
**Priority order and TB cost per chance:**
| Priority | Column | TB Cost |
|---|---|---|
| 1 | `double_cf` | 2 |
| 2 | `double_three` | 2 |
| 3 | `double_two` | 2 |
| 4 | `single_center` | 1 |
| 5 | `single_two` | 1 |
| 6 | `single_one` | 1 |
| 7 | `bp_single` | 1 |
| 8 | `walk` | 1 |
| 9 | `homerun` | 4 |
| 10 | `bp_homerun` | 4 |
| 11 | `triple` | 3 |
| 12 | `hbp` | 1 |
**Algorithm per tier:**
1. Start with `remaining = 1.5` TB budget.
2. Iterate priority list in order. Skip columns already at 0.
3. For each column: compute `chances_to_take = min(column_value, remaining / tb_cost)`.
4. Reduce the column by `chances_to_take`; add `chances_to_take` to `strikeout`.
5. Reduce `remaining` by `chances_to_take * tb_cost`.
6. Stop when `remaining <= 0` or the priority list is exhausted.
X-check columns (`xcheck_p` through `xcheck_rf`, always summing to 29) are never touched by
the boost algorithm.
**Budget not fully spent:** If all priority columns are already at zero before the budget is
exhausted (extremely rare), the remaining budget is discarded and a warning is logged.
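
The priority drain described above can be sketched as follows. This is an illustrative re-implementation based on the table and steps in this section, not the shipped `apply_pitcher_boost()`; the name `sketch_pitcher_boost`, the `EPS` float tolerance, and the log message are assumptions.

```python
import logging

logger = logging.getLogger(__name__)

# (column, TB cost per chance) in the priority order from the table above.
PRIORITY = [
    ("double_cf", 2), ("double_three", 2), ("double_two", 2),
    ("single_center", 1), ("single_two", 1), ("single_one", 1),
    ("bp_single", 1), ("walk", 1), ("homerun", 4), ("bp_homerun", 4),
    ("triple", 3), ("hbp", 1),
]
EPS = 1e-9  # assumed tolerance so float residue is not treated as unspent budget


def sketch_pitcher_boost(ratings: dict, tb_budget: float = 1.5) -> dict:
    """Illustrative only: convert hit-allowed chances to strikeouts until the TB budget is spent."""
    boosted = dict(ratings)  # copy so the caller's dict is left untouched
    remaining = tb_budget
    for col, tb_cost in PRIORITY:
        if remaining <= EPS:
            break
        value = boosted.get(col, 0.0)
        if value <= 0:
            continue  # nothing to take from this column
        chances = min(value, remaining / tb_cost)
        boosted[col] = value - chances
        boosted["strikeout"] = boosted.get("strikeout", 0.0) + chances
        remaining -= chances * tb_cost
    if remaining > EPS:
        logger.warning("pitcher boost left %.3f TB unspent", remaining)
    return boosted
```

As a worked case, a card with `double_cf = 0.5` and `double_three = 1.0` loses all 0.5 of `double_cf` (1.0 TB), then 0.25 of `double_three` (the remaining 0.5 TB), and gains 0.75 `strikeout`. Every chance removed is added to `strikeout`, so the outcome-column total does not change, and the x-check fields are never read or written.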
**No separate SP vs. RP logic:** The same algorithm applies to both starting pitchers and
relief pitchers. Card type (`sp` vs. `rp`) determines how the card is used in the game engine
but does not change the boost formula.
### 5.3.3 Function Signatures (Shipped)
The boost logic lives in the **database repo** (`database/app/services/refractor_boost.py`),
not in card-creation. The functions called per tier-up are:
```python
# Batter
apply_batter_boost(ratings_dict: dict) -> dict
# Pitcher (sp or rp)
apply_pitcher_boost(ratings_dict: dict, tb_budget: float = 1.5) -> dict
```
Both functions accept a dict of outcome column values and return a new dict with updated values
(all other keys passed through unchanged). They are pure functions — no DB access.
The orchestration function that applies the correct boost, creates the variant card row, updates
`RefractorCardState`, and writes the audit record is:
```python
apply_tier_boost(
    player_id: int,
    team_id: int,
    new_tier: int,
    card_type: str,  # 'batter', 'sp', or 'rp'
    ...injectable test stubs...
) -> dict  # {'variant_created': int, 'boost_deltas': dict}
```
The `card-creation` repo does not contain boost application code. The `pd_cards/evo/` package
referenced in the original spec was not created; the boost logic was implemented directly in the
database API service layer.
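
How `card_type` might map to a boost function can be sketched in a few lines. This is a hypothetical illustration of the dispatch only, assuming the three `card_type` values listed above; `pick_boost` is not a function in the repository, and the real `apply_tier_boost()` may select the boost differently.

```python
from typing import Callable

BoostFn = Callable[[dict], dict]


def pick_boost(card_type: str, batter_boost: BoostFn, pitcher_boost: BoostFn) -> BoostFn:
    """Hypothetical dispatcher: 'sp' and 'rp' share one pitcher algorithm, batters use the other."""
    if card_type == "batter":
        return batter_boost
    if card_type in ("sp", "rp"):
        return pitcher_boost
    raise ValueError(f"unknown card_type: {card_type!r}")
```

Passing the boost functions in as arguments keeps the sketch self-contained and mirrors the pure-function contract: the chosen function takes a ratings dict and returns a new one.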
## 5.4 Rarity Upgrade at T4
When a card completes T4, the card's rarity is upgraded by one tier (if below HoF):
- The `player.rarity_id` field is incremented by one step (e.g., Sta -> All)
- The card's base rating recalculation is skipped; only the T4 boost deltas are applied on top of the
accumulated evolved ratings
- The card cost field is NOT automatically recalculated (rarity upgrade is a gameplay reward, not
a market event; admin can manually adjust if needed)
- The rarity change is recorded in `evolution_card_state.final_rarity_id` for audit purposes
- **HoF cards cannot upgrade further** — they receive the T4 boost deltas but no rarity change
**Live series interaction:** If a card's rarity changes due to a live series update (e.g.,
Reserve → All-Star after a hot streak), the evolution rarity upgrade stacks on top of the
*current* rarity at the time T4 completes. The evolution system does not track or care about
historical rarity — it simply increments whatever the current rarity is by one step.
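
A one-step upgrade with a Hall of Fame ceiling can be sketched with an ordered ladder rather than arithmetic on IDs. The ladder below is an assumption for illustration: the names are the ones that appear in this document, but their order and completeness are not confirmed here, and the real system stores rarity as `player.rarity_id`.

```python
# Assumed ladder for illustration only. The actual rarity names, IDs, and order
# live in the database and may include tiers not listed here.
RARITY_LADDER = ["Replacement", "Reserve", "Starter", "All-Star", "Hall of Fame"]


def upgraded_rarity(current: str, ladder: list[str] = RARITY_LADDER) -> str:
    """Return the next rarity up the ladder; the top rarity is returned unchanged."""
    idx = ladder.index(current)  # raises ValueError for an unknown rarity
    return ladder[min(idx + 1, len(ladder) - 1)]
```

Under this assumed ladder, `upgraded_rarity("Starter")` yields `"All-Star"`, and a Hall of Fame card stays where it is. Because the function looks only at the rarity passed in, it upgrades from whatever the current rarity is at T4 completion.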
## 5.5 Variant System Usage (Hash-Based)
The existing `battingcard.variant` and `pitchingcard.variant` fields (integer, UNIQUE with player)
are currently always 0. The evolution system uses variant to store evolved versions, with the
variant number derived from a **deterministic hash** of all inputs that affect the card:
```python
import hashlib
def compute_variant_hash(player_id: int, evolution_tier: int,
                         cosmetics: list[str] | None) -> int:
    """Compute a stable variant number from evolution + cosmetic state."""
    inputs = {
        "player_id": player_id,
        "evolution_tier": evolution_tier,
        "cosmetics": sorted(cosmetics or []),
    }
    raw = hashlib.sha256(str(inputs).encode()).hexdigest()
    return int(raw[:8], 16)  # 32-bit unsigned integer from first 8 hex chars
```
- `variant = 0`: Base card (standard, shared across all teams)
- `variant = <hash>`: Evolution/cosmetic-specific card with boosted ratings and custom image
**Key property: two teams with the same player_id, same evolution tier, and same cosmetics
produce the same variant hash.** This means they share the same ratings rows and the same
rendered S3 image — no duplication. If either team changes any input (buys a cosmetic), the
hash changes, creating a new variant.
Each tier completion or cosmetic change computes the new variant hash, checks if a `battingcard`
row with that variant exists (reuse if so), and creates one if not. The `card` table instance
points to its current variant via `card.variant`.
Evolved rating rows coexist with the base card in the same `battingcardratings`/`pitchingcardratings`
tables, keyed by `(battingcard_id, vs_hand)` where `battingcard_id` points to the variant row.
No new columns needed on the ratings table itself.
**Image storage:** Each variant's rendered card image URL is stored on `battingcard.image_url`
and `pitchingcard.image_url` (new nullable columns). The bot's display logic checks `card.variant`:
if set, look up the variant's `battingcard.image_url`; if null, fall back to `player.image`.
Images are rendered once via the existing Playwright pipeline (with cosmetic CSS applied) and
uploaded to S3 at a predictable path: `cards/cardset-{id}/player-{player_id}/v{variant}/battingcard.png`.
The 5-6 second render cost is paid once per variant creation, not on every display.