fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER

compute_variant_hash took the first 8 hex chars of a SHA-256 digest and
cast to int, producing values up to 2^32 - 1. The variant columns on
Card, BattingCard, PitchingCard, and RefractorCardState are Peewee
IntegerField → Postgres INTEGER, which is signed 32-bit (max 2^31 - 1).
Roughly half of all player/tier combinations hash into the range
[2^31, 2^32 - 1] and crash tier-up writes with:

  peewee.DataError: integer out of range
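
The "roughly half" figure can be checked with a minimal sketch of the pre-fix hash. The function name unmasked_variant_hash and the "tier" key are assumptions for illustration (the diff does not show that line of the inputs dict), so individual values may differ from the real function even though the distribution argument still holds:

```python
import hashlib
import json

INT32_MAX = 2**31 - 1  # ceiling of a Postgres INTEGER column


def unmasked_variant_hash(player_id, tier, cosmetics=None):
    """Sketch of the pre-fix behaviour: first 8 hex chars of SHA-256 as an int."""
    inputs = {
        "player_id": player_id,
        "tier": tier,  # assumed key name; this line is not visible in the diff
        "cosmetics": sorted(cosmetics or []),
    }
    raw = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    return int(raw[:8], 16)  # anywhere in [0, 2^32 - 1]


# Same sample shape as the regression test: 10,000 player IDs x 5 tiers.
hashes = [unmasked_variant_hash(p, t) for p in range(10_000) for t in range(5)]
overflow_fraction = sum(h > INT32_MAX for h in hashes) / len(hashes)
print(f"{overflow_fraction:.1%} of sampled hashes exceed INT32_MAX")
```

Because the top bit of a SHA-256 prefix is effectively a coin flip, the printed fraction should land close to 50%.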

Surfaced via /dev refractor-test card_id:64460 (Charles Nagy,
player_id=10795), whose tier-1 hash was 2874960417. The outer
exception handler in refractor.evaluate_game caught the error and
logged a warning, so the tier-up was silently dropped — the test
harness reported "No tier-up detected (evaluated 2 cards)" while
apply_tier_boost was actually failing mid-write.

Fix: mask the hash with & 0x7FFFFFFF, dropping one bit of entropy.
~2.1B distinct values remain, still more than enough for collision safety.
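
As a worked example of the mask, here is the arithmetic applied to the one overflowing value quoted above. The constant names are illustrative only:

```python
INT32_MAX = 2**31 - 1      # 2147483647, Postgres INTEGER ceiling
MASK_31_BITS = 0x7FFFFFFF  # same numeric value as INT32_MAX

overflowing = 2874960417   # tier-1 hash reported for card_id 64460
masked = overflowing & MASK_31_BITS  # clears bit 31, i.e. subtracts 2^31 here

print(overflowing > INT32_MAX)  # True
print(masked)                   # 727476769
print(masked <= INT32_MAX)      # True
```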

Backwards-compatible: all 9 existing refractor_boost_audit rows and
9 persisted non-zero variants have hashes where the high bit was
already 0 (those tier-ups happened to land in the safe half). Masking
leaves those values unchanged.
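
The compatibility argument rests on the mask being a no-op for any value below 2^31. A small sketch, using hypothetical stand-in values rather than the actual persisted rows (which are not reproduced in this message):

```python
MASK_31_BITS = 0x7FFFFFFF

# Hypothetical examples of variants whose high bit is already 0.
sample_variants = [1, 727476769, 2**31 - 1]

# Masking is the identity for every value in [0, 2^31 - 1].
unchanged = all((v & MASK_31_BITS) == v for v in sample_variants)

# Only values with bit 31 set are altered by the mask.
high_bit_value = 2**31
altered = (high_bit_value & MASK_31_BITS) != high_bit_value

print(unchanged, altered)  # True True
```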

Added regression test test_fits_postgres_int32 covering 10,000 player
IDs × 5 tiers = 50,000 combinations, all asserted ≤ 2,147,483,647.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cal Corum 2026-04-11 13:11:40 -05:00
parent 1c2243602d
commit ab15228c44
2 changed files with 31 additions and 2 deletions

@@ -322,7 +322,7 @@ def compute_variant_hash(
             identifiers). Order is normalised; callers need not sort.

     Returns:
-        A positive integer in the range [1, 2^32 - 1].
+        A positive integer in the range [1, 2^31 - 1].
     """
     inputs = {
         "player_id": player_id,
@@ -330,7 +330,11 @@ def compute_variant_hash(
         "cosmetics": sorted(cosmetics or []),
     }
     raw = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
-    result = int(raw[:8], 16)
+    # Mask to 31 bits so the result fits in Postgres signed INTEGER
+    # (the type used by every variant column). Without this mask, roughly
+    # half of all players would produce hashes above 2^31 - 1 and crash
+    # tier-up writes with peewee.DataError: integer out of range.
+    result = int(raw[:8], 16) & 0x7FFFFFFF
     return result if result != 0 else 1  # variant=0 is reserved

@@ -878,6 +878,31 @@ class TestVariantHash:
                     f"compute_variant_hash({player_id}, {tier}) returned 0"
                 )

+    def test_fits_postgres_int32(self):
+        """Hash always fits in Postgres signed INTEGER range (2^31 - 1).
+
+        What: Generate hashes for player_ids 0-9999 at tiers 0-4 and assert
+        every result is <= 2147483647.
+
+        Why: The variant columns on Card, BattingCard, PitchingCard, and
+        RefractorCardState are Peewee IntegerField → Postgres INTEGER, which
+        is signed 32-bit. Before the 31-bit mask was added, the SHA-256-derived
+        hash could return values up to 2^32 - 1, causing roughly half of all
+        tier-up attempts to crash with peewee.DataError: integer out of range.
+        Discovered via /dev refractor-test on card_id=64460 (Charles Nagy),
+        whose tier-1 hash is 2874960417, above the int32 ceiling. Masking
+        to 31 bits drops one bit of entropy but keeps ~2.1B distinct values,
+        more than enough for collision safety.
+        """
+        int32_max = 2147483647
+        for player_id in range(10000):
+            for tier in range(5):
+                result = compute_variant_hash(player_id, tier)
+                assert result <= int32_max, (
+                    f"compute_variant_hash({player_id}, {tier}) returned "
+                    f"{result}, which exceeds Postgres INTEGER max {int32_max}"
+                )
+
     def test_cosmetics_affect_hash(self):
         """Adding cosmetics to the same player/tier produces a different hash.