docs: reconcile PRD boost spec with shipped implementation

Section 5.3 of 05-rating-boosts.md described profile-based boost distribution (power/contact/patient hitter profiles) that was never built. Updated to document the actual shipped algorithms: fixed column deltas for batters and TB-budget priority drain for pitchers, both in database/app/services/refractor_boost.py. Corrected the truncation behavior: the implementation scales positive deltas proportionally rather than discarding them, preserving the 108-sum exactly in all cases. Updated REFRACTOR_PHASE2_VALIDATION_SPEC.md T4-1 to reflect shipped function signatures and marked the case as complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:25:02 -05:00 · 3ad893c949
commit 3ad893c949
parent f868e5a329
2 changed files with 305 additions and 30 deletions
--- a/docs/REFRACTOR_PHASE2_VALIDATION_SPEC.md
+++ b/docs/REFRACTOR_PHASE2_VALIDATION_SPEC.md
@@ -61,40 +61,48 @@ the upgrade logic must use an ordered rarity ladder, not arithmetic.

 ---

-### T4-1: 108-sum preservation under profile-based boosts
+### T4-1: 108-sum preservation under batter and pitcher boosts

-**Status:** Pending — Phase 2
+**Status:** Shipped — Phase 2 complete
+
+> **Updated 2026-04-08:** Profile-based boost distribution was not implemented. The shipped
+> implementation uses `apply_batter_boost()` (fixed column deltas) and `apply_pitcher_boost()`
+> (TB-budget priority algorithm) in `database/app/services/refractor_boost.py`. There is no
+> `apply_evolution_boosts(card_ratings, boost_tier, player_profile)` function and no
+> `pd_cards/evo/boost_profiles.py` module. See `docs/prd-evolution/05-rating-boosts.md`
+> section 5.3 for the shipped algorithm details.

 **Scenario:**

-`apply_evolution_boosts(card_ratings, boost_tier, player_profile)` redistributes 1.0 chance per
-tier across outcome columns according to the player's detected profile (power hitter, contact
-hitter, patient hitter, starting pitcher, relief pitcher). Every combination of profile and tier
-must leave the 22-column sum exactly equal to 108 after the boost is applied. This must hold for
-all four tier applications, cumulative as well as individual.
+`apply_batter_boost(ratings_dict)` applies fixed deltas (+0.50 to `homerun`, `double_pull`,
+`single_one`, `walk`; -1.50 from `strikeout`, -0.50 from `groundout_a`) per tier. The 22-column
+sum must equal exactly 108 after every application.

-The edge case: a batter card where `flyout_a = 0`. The power and contact hitter profiles draw
-reductions from out columns including `flyout_a`. If the preferred reduction column is at zero,
-the implementation must not produce a negative value and must not silently drop the remainder of
-the budget. The 0-floor cap is enforced per column (see `05-rating-boosts.md` section 5.1:
-"Truncated points are lost, not redistributed").
+`apply_pitcher_boost(ratings_dict, tb_budget=1.5)` drains a 1.5 TB-unit budget by converting
+hit-allowed chances into strikeouts in priority order. The 18 variable outcome columns must sum
+to their pre-boost total (the conversion is chance-for-chance; only column identity changes,
+not the total).
+
+The edge case: a batter card where `strikeout = 0` and `groundout_a = 0`. The negative funding
+columns are both at zero, so no reduction can occur. The shipped implementation handles this by
+scaling the positive deltas to zero (`scale = 0`), leaving all columns unchanged. The 108-sum
+is preserved exactly. A warning is logged.

 Verify:
- After each of T1, T2, T3, T4 boost applications, `sum(all outcome columns) == 108` exactly.
- A card with `flyout_a = 0` does not raise an error and does not produce a column below 0.
- When truncation occurs (column already at 0), the lost budget is discarded, not moved
-  elsewhere — the post-boost sum will be less than 108 + budget_added only in the case of
-  truncation, but must never exceed 108.
+- After each of T1, T2, T3, T4 boost applications, `sum(all batter outcome columns) == 108`.
+- After each pitcher boost, `sum(pitcher outcome columns) + sum(xcheck columns) == 108`.
+- A batter card with `strikeout = 0` and `groundout_a = 0` does not raise an error, does not
+  produce any column below 0, and leaves the sum at exactly 108.
+- No column value falls below 0 under any input.

 **Expected Outcome:**

 Sum remains 108 after every boost under non-truncation conditions. Under truncation conditions
-(a column hits 0), the final column sum must equal exactly `108 - truncated_amount` — where
-`truncated_amount` is the portion of the 1.0-chance budget that was dropped due to the 0-floor
-cap. This is a single combined assertion: `sum(columns) == 108 - truncated_amount`. Checking
-"sum <= 108" and "truncated amount was discarded" as two independent conditions is insufficient
-— a test can pass both checks while the sum is wrong for an unrelated reason (e.g., a positive
-column also lost value due to a bug). No column value falls below 0.
+(funding columns already at or near zero), the positive deltas are scaled proportionally to the
+amount actually reduced — the 108-sum is preserved exactly (not approximately). The original
+spec's statement that "truncated points are lost, not redistributed" does not reflect the
+shipped behavior: positive deltas ARE scaled down to match what was taken, ensuring the sum
+invariant holds in all cases. No column value falls below 0.

 **Risk If Failed:**

@@ -105,10 +113,9 @@ corrupts game results without any visible failure.

 **Files Involved:**

- `docs/prd-evolution/05-rating-boosts.md` — boost budget, profile definitions, cap behavior
- Phase 2: `pd_cards/evo/boost_profiles.py` (to be created) — `apply_evolution_boosts`
- `batters/creation.py` — `battingcardratings` model column set (22 columns)
- `pitchers/creation.py` — `pitchingcardratings` model column set (18 columns + 9 x-checks)
+- `docs/prd-evolution/05-rating-boosts.md` — section 5.3 (shipped algorithm), section 5.1 (cap behavior)
+- `database/app/services/refractor_boost.py` — `apply_batter_boost`, `apply_pitcher_boost` (shipped)
+- `database/tests/test_refractor_boost.py` — existing test coverage for these functions

 ---

@@ -153,10 +160,10 @@ undermines the core design intent.

 **Files Involved:**

- `docs/prd-evolution/05-rating-boosts.md` — section 5.2 (boost budgets), section 5.3 (profiles)
+- `docs/prd-evolution/05-rating-boosts.md` — section 5.2 (boost budgets), section 5.3 (shipped algorithm)
 - `rarity_thresholds.py` — OPS boundary values used to assess whether evolution crosses a rarity
  threshold as a side effect (it should not for mid-range cards)
- Phase 2: `pd_cards/evo/boost_profiles.py` — boost distribution logic
+- `database/app/services/refractor_boost.py` — `apply_batter_boost`, `apply_pitcher_boost` (shipped)

 ---

@@ -455,7 +462,7 @@ heavily in T3 at season end will be angry when their progress disappears.

 | ID | Title | Status |
 |---|---|---|
-| T4-1 | 108-sum preservation under profile-based boosts | Pending — Phase 2 |
+| T4-1 | 108-sum preservation under batter and pitcher boosts | Shipped — Phase 2 complete |
 | T4-2 | D20 probability shift at T4 | Pending — Phase 2 |
 | T4-3 | T4 rarity upgrade — pipeline collision risk | Pending — Phase 2 |
 | T4-4 | T4 rarity cap for HoF cards | Pending — Phase 2 |
--- a/docs/prd-evolution/05-rating-boosts.md
+++ b/docs/prd-evolution/05-rating-boosts.md
@@ -0,0 +1,268 @@
+# 5. Rating Boost Mechanics
+
+[< Back to Index](README.md) | [Next: Database Schema >](06-database.md)
+
+---
+
+## 5.1 Rating Model Overview
+
+The card rating system is built on the `battingcardratings` and `pitchingcardratings` models.
+Each model defines outcome columns whose values represent chances out of a **108-chance total**
+(derived from the D20 probability system: 2d6 × 3 columns × 6 rows = 108 total chances).
+
+**Batter ratings** have **22 outcome columns** summing to 108:
+
+| Category | Columns |
+|---|---|
+| Hits | `homerun`, `bp_homerun`, `triple`, `double_three`, `double_two`, `double_pull`, `single_two`, `single_one`, `single_center`, `bp_single` |
+| On-base | `hbp`, `walk` |
+| Outs | `strikeout`, `lineout`, `popout`, `flyout_a`, `flyout_bq`, `flyout_lf_b`, `flyout_rf_b`, `groundout_a`, `groundout_b`, `groundout_c` |
+
+**Pitcher ratings** have **18 outcome columns + 9 x-check fields** summing to 108:
+
+| Category | Columns |
+|---|---|
+| Hits allowed | `homerun`, `bp_homerun`, `triple`, `double_three`, `double_two`, `double_cf`, `single_two`, `single_one`, `single_center`, `bp_single` |
+| On-base | `hbp`, `walk` |
+| Outs | `strikeout`, `flyout_lf_b`, `flyout_cf_b`, `flyout_rf_b`, `groundout_a`, `groundout_b` |
+| X-checks | `xcheck_p` (1), `xcheck_c` (3), `xcheck_1b` (2), `xcheck_2b` (6), `xcheck_3b` (3), `xcheck_ss` (7), `xcheck_lf` (2), `xcheck_cf` (3), `xcheck_rf` (2) — always sum to 29 |
+
+**Key differences:** Batters have `double_pull`, pitchers have `double_cf`. Batters have
+`lineout`, `popout`, `flyout_a`, `flyout_bq`, `groundout_c` — pitchers do not. Pitchers have
+`flyout_cf_b` and x-check fields — batters do not.
+
+Evolution boosts apply **flat deltas to individual result columns** within these models. The
+108-sum constraint must be maintained: any increase to a positive outcome column requires an
+equal decrease to a negative outcome column.
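
The sum constraint can be checked mechanically. The following is a minimal sketch, assuming ratings arrive as a plain dict keyed by the column names in the tables above. The helper name `check_sum_invariant` and the tolerance value are illustrative assumptions, not part of the shipped code.

```python
import math

# Column names copied from the tables in section 5.1.
BATTER_COLUMNS = [
    "homerun", "bp_homerun", "triple", "double_three", "double_two",
    "double_pull", "single_two", "single_one", "single_center", "bp_single",
    "hbp", "walk",
    "strikeout", "lineout", "popout", "flyout_a", "flyout_bq",
    "flyout_lf_b", "flyout_rf_b", "groundout_a", "groundout_b", "groundout_c",
]

PITCHER_COLUMNS = [
    "homerun", "bp_homerun", "triple", "double_three", "double_two",
    "double_cf", "single_two", "single_one", "single_center", "bp_single",
    "hbp", "walk",
    "strikeout", "flyout_lf_b", "flyout_cf_b", "flyout_rf_b",
    "groundout_a", "groundout_b",
]

XCHECK_COLUMNS = [
    "xcheck_p", "xcheck_c", "xcheck_1b", "xcheck_2b", "xcheck_3b",
    "xcheck_ss", "xcheck_lf", "xcheck_cf", "xcheck_rf",
]

TOTAL_CHANCES = 108.0


def check_sum_invariant(ratings: dict, is_pitcher: bool, tol: float = 1e-9) -> bool:
    """Hypothetical helper: True if no column is negative and the sum is 108."""
    columns = PITCHER_COLUMNS + XCHECK_COLUMNS if is_pitcher else BATTER_COLUMNS
    values = [float(ratings.get(name, 0.0)) for name in columns]
    if any(value < 0 for value in values):
        return False
    # Compare with a tolerance because column values are floats.
    return math.isclose(sum(values), TOTAL_CHANCES, abs_tol=tol)
```

Running a check like this before and after each boost application is one way to catch a delta that was added without a matching reduction.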
+
+### Rating Cap Enforcement
+
+All boosts are subject to the existing hard caps on individual stat columns. If applying a delta
+would push a value past its cap, the delta is **truncated** to the cap value.
+
+**Key caps (from existing card creation system):**
+
+| Stat | Cap | Direction | Example |
+|---|---|---|---|
+| Hold rating (pitcher) | -5 | Lower is better | A pitcher at -4 hold can only receive -1 more |
+| Result columns | 0 floor | Cannot go negative | A 0.1 strikeout column can only lose 0.1 |
+
+**Truncated points are lost, not redistributed.** If a boost would push a stat past its cap, the
+delta is truncated and the excess is simply discarded. This is an intentional soft penalty for
+cards that are already near their ceiling: they receive less because they are already that
+good. Lower-rated cards have more headroom and benefit more from the same flat delta. For the
+0-floor on batter result columns, the shipped implementation also scales the positive deltas
+down to match what was actually removed (see section 5.3.1), so the 108-sum is preserved.
+
+## 5.2 Boost Budgets Per Tier
+
+Rating boosts are defined as **flat deltas to specific result columns** within the 108-sum model.
+For batters, the budget per tier is the number of chances shifted from negative outcomes (outs)
+to positive outcomes (hits, on-base). For pitchers, the budget is expressed in total-base (TB)
+units and is spent converting hit, walk, and HBP chances into strikeouts (see section 5.3.2).
+
+| Tier | Batter Budget | Pitcher TB Budget | Notes |
+|------|--------------|-------------------|---------------|
+| T1 | 2.0 chances shifted (+2.0 pos, -2.0 neg) | 1.5 TB units | Fixed deltas / priority drain |
+| T2 | 2.0 chances shifted | 1.5 TB units | Same, consistent per-tier reward |
+| T3 | 2.0 chances shifted | 1.5 TB units | Same, consistent per-tier reward |
+| T4 | 2.0 chances shifted | 1.5 TB units | Same, plus rarity upgrade |
+| **Total** | **8.0 chances shifted** | **6.0 TB units** | **~7.4% of the 108 chances shifted (batter)** |
+
+Every tier provides the same fixed boost. T4 is distinguished not by a larger delta but by the
+rarity upgrade, which is the real capstone reward.
+
+**Flat delta design rationale:** All cards receive the same absolute boost regardless of rarity.
+A Replacement card (where `homerun` might be 0.3) gains much more relative value from a fixed
+0.50 HR boost than a Hall of Fame card (where `homerun` might be 5.0). This intentionally
+incentivizes using lower-rated cards and prevents elite cards from becoming god-tier. Cards
+already near column caps receive even less due to truncation.
+
+**Example — T1 batter boost:**
+```
+homerun:     +0.50  (from 2.0 → 2.50)
+double_pull: +0.50  (from 3.5 → 4.00)
+single_one:  +0.50  (from 4.0 → 4.50)
+walk:        +0.50  (from 3.0 → 3.50)
+strikeout:   -1.50  (from 15.0 → 13.50)
+groundout_a: -0.50  (from 8.0 → 7.50)
+                     Net: +2.0 / -2.0 = 0, sum stays at 108
+```
+
+## 5.3 Shipped Boost Distribution
+
+> **Updated 2026-04-08 to reflect shipped implementation.**
+> The original spec described profile-based boost distribution (power hitter, contact hitter,
+> patient hitter profiles). The implementation uses a simpler, more predictable approach:
+> fixed deltas for batters and a TB-budget priority algorithm for pitchers. Profile detection
+> was not implemented.
+
+### 5.3.1 Batter Boost — Fixed Column Deltas
+
+Every batter receives identical fixed deltas per tier regardless of their profile. There is no
+player-style detection. The implementation is in `apply_batter_boost()` in
+`database/app/services/refractor_boost.py`.
+
+**Positive deltas (applied each tier):**
+
+| Column | Delta |
+|---|---|
+| `homerun` | +0.50 |
+| `double_pull` | +0.50 |
+| `single_one` | +0.50 |
+| `walk` | +0.50 |
+
+**Negative deltas (funding source):**
+
+| Column | Delta |
+|---|---|
+| `strikeout` | -1.50 |
+| `groundout_a` | -0.50 |
+
+**0-floor truncation behavior:** If `strikeout` or `groundout_a` cannot supply their full
+requested reduction (because the column is already near zero), the positive deltas are scaled
+proportionally so the 108-sum invariant is always preserved. Specifically:
+
+1. Negative deltas are applied first, each capped at the column's current value (0 floor).
+2. The total amount actually reduced is computed.
+3. Positive deltas are scaled by `actually_reduced / total_requested_reduction` so that
+   additions always equal reductions.
+4. A warning is logged when truncation occurs.
+
+This differs from the original spec's statement that "truncated points are lost, not
+redistributed." In the shipped implementation, positive deltas are scaled down to match what
+was actually taken — the 108-sum is always exactly preserved.
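
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped `apply_batter_boost()`: the name `apply_batter_boost_sketch`, the module-level constants, and the log message are hypothetical, and only the delta values and the scaling rule come from this section.

```python
import logging

logger = logging.getLogger(__name__)

# Fixed per-tier deltas from the tables in section 5.3.1.
POSITIVE_DELTAS = {"homerun": 0.5, "double_pull": 0.5, "single_one": 0.5, "walk": 0.5}
NEGATIVE_DELTAS = {"strikeout": 1.5, "groundout_a": 0.5}  # amounts to remove


def apply_batter_boost_sketch(ratings: dict) -> dict:
    """Illustrative version of the scaling rule; not the shipped code."""
    boosted = dict(ratings)  # leave the caller's dict untouched

    # Step 1: apply reductions, each capped at the column's current value.
    actually_reduced = 0.0
    for column, requested in NEGATIVE_DELTAS.items():
        current = max(boosted.get(column, 0.0), 0.0)
        taken = min(requested, current)
        boosted[column] = current - taken
        actually_reduced += taken

    # Steps 2-3: scale additions so they equal what was actually removed.
    total_requested = sum(NEGATIVE_DELTAS.values())
    scale = actually_reduced / total_requested
    for column, delta in POSITIVE_DELTAS.items():
        boosted[column] = boosted.get(column, 0.0) + delta * scale

    # Step 4: warn when the funding columns could not supply the full amount.
    if scale < 1.0:
        logger.warning("Batter boost truncated: scale=%.3f", scale)
    return boosted
```

With `strikeout = 0.5` and `groundout_a = 0.25`, only 0.75 of the requested 2.0 can be removed, so `scale` is 0.375 and each positive column gains 0.1875 instead of 0.50. With both funding columns at zero, `scale` is 0 and the card is returned unchanged.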
+
+### 5.3.2 Pitcher Boost — TB-Budget Priority Algorithm
+
+Pitchers use a total-bases budget approach instead of fixed column deltas. Each tier awards a
+**1.5 TB-unit budget**. The algorithm converts hit, walk, and HBP chances into strikeouts,
+iterating through outcome types in the priority order listed below (doubles first, then singles
+and walks, then home runs, triples, and HBP) until the budget is exhausted.
+
+The implementation is in `apply_pitcher_boost()` in `database/app/services/refractor_boost.py`.
+
+**Priority order and TB cost per chance:**
+
+| Priority | Column | TB Cost |
+|---|---|---|
+| 1 | `double_cf` | 2 |
+| 2 | `double_three` | 2 |
+| 3 | `double_two` | 2 |
+| 4 | `single_center` | 1 |
+| 5 | `single_two` | 1 |
+| 6 | `single_one` | 1 |
+| 7 | `bp_single` | 1 |
+| 8 | `walk` | 1 |
+| 9 | `homerun` | 4 |
+| 10 | `bp_homerun` | 4 |
+| 11 | `triple` | 3 |
+| 12 | `hbp` | 1 |
+
+**Algorithm per tier:**
+1. Start with `remaining = 1.5` TB budget.
+2. Iterate priority list in order. Skip columns already at 0.
+3. For each column: compute `chances_to_take = min(column_value, remaining / tb_cost)`.
+4. Reduce the column by `chances_to_take`; add `chances_to_take` to `strikeout`.
+5. Reduce `remaining` by `chances_to_take * tb_cost`.
+6. Stop when `remaining <= 0` or the priority list is exhausted.
+
+X-check columns (`xcheck_p` through `xcheck_rf`, always summing to 29) are never touched by
+the boost algorithm.
+
+**Budget not fully spent:** If all priority columns are already at zero before the budget is
+exhausted (extremely rare), the remaining budget is discarded and a warning is logged.
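
The priority drain described above can be sketched as follows. This is an illustration under stated assumptions, not the shipped `apply_pitcher_boost()`: the name `apply_pitcher_boost_sketch`, the `EPSILON` guard against float residue, and the log message are hypothetical. The priority list and TB costs are taken from the table in this section.

```python
import logging

logger = logging.getLogger(__name__)

# (column, TB cost per chance) in the priority order from section 5.3.2.
PRIORITY = [
    ("double_cf", 2), ("double_three", 2), ("double_two", 2),
    ("single_center", 1), ("single_two", 1), ("single_one", 1),
    ("bp_single", 1), ("walk", 1),
    ("homerun", 4), ("bp_homerun", 4), ("triple", 3), ("hbp", 1),
]

EPSILON = 1e-9  # assumption: treat float residue below this as "budget spent"


def apply_pitcher_boost_sketch(ratings: dict, tb_budget: float = 1.5) -> dict:
    """Illustrative version of the TB-budget drain; not the shipped code."""
    boosted = dict(ratings)  # x-check keys pass through untouched
    remaining = tb_budget
    for column, tb_cost in PRIORITY:
        if remaining <= EPSILON:
            break  # budget exhausted
        available = boosted.get(column, 0.0)
        if available <= 0:
            continue  # skip columns already at zero
        # Take as many chances as the column holds or the budget affords.
        chances = min(available, remaining / tb_cost)
        boosted[column] = available - chances
        boosted["strikeout"] = boosted.get("strikeout", 0.0) + chances
        remaining -= chances * tb_cost
    if remaining > EPSILON:
        logger.warning("Pitcher boost left %.3f TB unspent", remaining)
    return boosted
```

For a card with `double_cf = 0.5` and `single_center = 2.0`, the first 1.0 TB is spent removing all 0.5 chances of `double_cf`, and the remaining 0.5 TB removes 0.5 chances of `single_center`, so `strikeout` rises by 1.0 in total. Because every removed chance is added to `strikeout`, the column total does not change.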
+
+**No separate SP vs. RP logic:** The same algorithm applies to both starting pitchers and
+relief pitchers. Card type (`sp` vs. `rp`) determines how the card is used in the game engine
+but does not change the boost formula.
+
+### 5.3.3 Function Signatures (Shipped)
+
+The boost logic lives in the **database repo** (`database/app/services/refractor_boost.py`),
+not in card-creation. The functions called per tier-up are:
+
+```python
+# Batter
+apply_batter_boost(ratings_dict: dict) -> dict
+
+# Pitcher (sp or rp)
+apply_pitcher_boost(ratings_dict: dict, tb_budget: float = 1.5) -> dict
+```
+
+Both functions accept a dict of outcome column values and return a new dict with updated values
+(all other keys passed through unchanged). They are pure functions — no DB access.
+
+The orchestration function that applies the correct boost, creates the variant card row, updates
+`RefractorCardState`, and writes the audit record is:
+
+```python
+apply_tier_boost(
+    player_id: int,
+    team_id: int,
+    new_tier: int,
+    card_type: str,  # 'batter', 'sp', or 'rp'
+    ...injectable test stubs...
+) -> dict  # {'variant_created': int, 'boost_deltas': dict}
+```
+
+The `card-creation` repo does not contain boost application code. The `pd_cards/evo/` package
+referenced in the original spec was not created; the boost logic was implemented directly in the
+database API service layer.
+
+## 5.4 Rarity Upgrade at T4
+
+When a card completes T4, the card's rarity is upgraded by one tier (if below HoF):
+
+- The `player.rarity_id` field is advanced one step up the ordered rarity ladder (e.g., Sta -> All), not by arithmetic on the ID
+- The card's base rating recalculation is skipped; only the T4 boost deltas are applied on top of the
+  accumulated evolved ratings
+- The card cost field is NOT automatically recalculated (rarity upgrade is a gameplay reward, not
+  a market event; admin can manually adjust if needed)
+- The rarity change is recorded in `evolution_card_state.final_rarity_id` for audit purposes
+- **HoF cards cannot upgrade further** — they receive the T4 boost deltas but no rarity change
+
+**Live series interaction:** If a card's rarity changes due to a live series update (e.g.,
+Reserve → All-Star after a hot streak), the evolution rarity upgrade stacks on top of the
+*current* rarity at the time T4 completes. The evolution system does not track or care about
+historical rarity — it simply increments whatever the current rarity is by one step.
+
+## 5.5 Variant System Usage (Hash-Based)
+
+The existing `battingcard.variant` and `pitchingcard.variant` fields (integer, UNIQUE with player)
+are currently always 0. The evolution system uses variant to store evolved versions, with the
+variant number derived from a **deterministic hash** of all inputs that affect the card:
+
+```python
+import hashlib
+
+def compute_variant_hash(player_id: int, evolution_tier: int,
+                         cosmetics: list[str] | None) -> int:
+    """Compute a stable variant number from evolution + cosmetic state."""
+    inputs = {
+        "player_id": player_id,
+        "evolution_tier": evolution_tier,
+        "cosmetics": sorted(cosmetics or []),
+    }
+    raw = hashlib.sha256(str(inputs).encode()).hexdigest()
+    return int(raw[:8], 16)  # 32-bit unsigned integer from first 8 hex chars
+```
+
+- `variant = 0`: Base card (standard, shared across all teams)
+- `variant = <hash>`: Evolution/cosmetic-specific card with boosted ratings and custom image
+
+**Key property: two teams with the same player_id, same evolution tier, and same cosmetics
+produce the same variant hash.** This means they share the same ratings rows and the same
+rendered S3 image — no duplication. If either team changes any input (buys a cosmetic), the
+hash changes, creating a new variant.
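
The sharing property can be demonstrated with a short sketch. The function body below restates `compute_variant_hash` from the block above (with `Optional` typing so it also runs on older Python versions). The player ID and cosmetic names are made-up values for illustration only.

```python
import hashlib
from typing import List, Optional


def compute_variant_hash(player_id: int, evolution_tier: int,
                         cosmetics: Optional[List[str]]) -> int:
    """Same logic as the function shown above, restated so this runs alone."""
    inputs = {
        "player_id": player_id,
        "evolution_tier": evolution_tier,
        "cosmetics": sorted(cosmetics or []),
    }
    raw = hashlib.sha256(str(inputs).encode()).hexdigest()
    return int(raw[:8], 16)


# Hypothetical inputs: two teams own the same player at the same tier and
# hold the same cosmetics, listed in a different order.
team_a_variant = compute_variant_hash(1234, 2, ["gold_border", "foil"])
team_b_variant = compute_variant_hash(1234, 2, ["foil", "gold_border"])

# Sorting the cosmetics makes the hash order-independent, so both teams
# resolve to the same variant row and the same rendered image.
same_variant = team_a_variant == team_b_variant

# No cosmetics can be passed as None or as an empty list; both hash alike.
no_cosmetics_match = compute_variant_hash(1234, 2, None) == compute_variant_hash(1234, 2, [])
```

Because the result is taken from the first 8 hex characters of the digest, it always fits in an unsigned 32-bit range.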
+
+Each tier completion or cosmetic change computes the new variant hash, checks if a `battingcard`
+row with that variant exists (reuse if so), and creates one if not. The `card` table instance
+points to its current variant via `card.variant`.
+
+Evolved rating rows coexist with the base card in the same `battingcardratings`/`pitchingcardratings`
+tables, keyed by `(battingcard_id, vs_hand)` where `battingcard_id` points to the variant row.
+No new columns needed on the ratings table itself.
+
+**Image storage:** Each variant's rendered card image URL is stored on `battingcard.image_url`
+and `pitchingcard.image_url` (new nullable columns). The bot's display logic checks `card.variant`:
+if set, look up the variant's `battingcard.image_url`; if null, fall back to `player.image`.
+Images are rendered once via the existing Playwright pipeline (with cosmetic CSS applied) and
+uploaded to S3 at a predictable path: `cards/cardset-{id}/player-{player_id}/v{variant}/battingcard.png`.
+The 5-6 second render cost is paid once per variant creation, not on every display.