fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER #217

Merged
cal merged 1 commits from fix/variant-hash-int32-overflow into main 2026-04-11 18:16:09 +00:00
Owner

Problem

compute_variant_hash takes the first 8 hex chars of a SHA-256 digest
and casts to int, producing values up to 2^32 - 1. The variant columns
on Card, BattingCard, PitchingCard, and RefractorCardState are
Peewee IntegerField → Postgres INTEGER (signed 32-bit, max 2^31 - 1).
Roughly 50% of all player-tier combinations fall in the range
[2^31, 2^32 - 1] and crash the tier-up write:

peewee.DataError: integer out of range

Discovery

Surfaced during /dev refractor-test card_id:64460 (Charles Nagy,
player_id=10795). His tier-1 hash is 2874960417 — above the int32
ceiling. The outer exception handler in routers_v2/refractor.py
(evaluate_game) caught the DataError and logged a warning, so
the dev harness reported ⚠️ No tier-up detected while
apply_tier_boost was actually failing mid-write. Silently dropped
tier-ups in production would look like "some cards just never
tier up, others do."

Fix

One-line mask in compute_variant_hash:

result = int(raw[:8], 16) & 0x7FFFFFFF

Drops 1 bit of entropy (~2.1B distinct values remain, still
astronomically collision-safe) and guarantees every return value
fits in Postgres INTEGER.

Backwards compatibility

All 9 existing refractor_boost_audit rows and 9 persisted
non-zero variants have hashes ≤ 1.88B — their high bit was
already 0. Masking leaves those values unchanged. No migration
required.

Test

New regression test test_fits_postgres_int32 covers 10,000
player IDs × 5 tiers = 50,000 combinations, all asserted
≤ 2,147,483,647. Full test_refractor_boost.py +
test_refractor_boost_integration.py suites pass (72/72).

## Problem `compute_variant_hash` takes the first 8 hex chars of a SHA-256 digest and casts to int, producing values up to 2^32 - 1. The variant columns on `Card`, `BattingCard`, `PitchingCard`, and `RefractorCardState` are Peewee `IntegerField` → Postgres `INTEGER` (signed 32-bit, max 2^31 - 1). Roughly 50% of all player-tier combinations fall in the range [2^31, 2^32 - 1] and crash the tier-up write: ``` peewee.DataError: integer out of range ``` ## Discovery Surfaced during `/dev refractor-test card_id:64460` (Charles Nagy, player_id=10795). His tier-1 hash is 2874960417 — above the int32 ceiling. The outer exception handler in `routers_v2/refractor.py` (`evaluate_game`) caught the `DataError` and logged a warning, so the dev harness reported `⚠️ No tier-up detected` while `apply_tier_boost` was actually failing mid-write. Silently dropped tier-ups in production would look like "some cards just never tier up, others do." ## Fix One-line mask in `compute_variant_hash`: ```python result = int(raw[:8], 16) & 0x7FFFFFFF ``` Drops 1 bit of entropy (~2.1B distinct values remain, still astronomically collision-safe) and guarantees every return value fits in Postgres INTEGER. ## Backwards compatibility All 9 existing `refractor_boost_audit` rows and 9 persisted non-zero variants have hashes ≤ 1.88B — their high bit was already 0. Masking leaves those values unchanged. No migration required. ## Test New regression test `test_fits_postgres_int32` covers 10,000 player IDs × 5 tiers = 50,000 combinations, all asserted ≤ 2,147,483,647. Full `test_refractor_boost.py` + `test_refractor_boost_integration.py` suites pass (72/72).
cal added 1 commit 2026-04-11 18:13:23 +00:00
compute_variant_hash took the first 8 hex chars of a SHA-256 digest and
cast to int, producing values up to 2^32 - 1. The variant columns on
Card, BattingCard, PitchingCard, and RefractorCardState are Peewee
IntegerField → Postgres INTEGER, which is signed 32-bit (max 2^31 - 1).
Roughly half of all players (~50%) would hash into the range [2^31,
2^32 - 1] and crash tier-up writes with:

  peewee.DataError: integer out of range

Surfaced via /dev refractor-test card_id:64460 (Charles Nagy,
player_id=10795), whose tier-1 hash was 2874960417. The outer
exception handler in refractor.evaluate_game caught the error and
logged a warning, so the tier-up was silently dropped — the test
harness reported "No tier-up detected (evaluated 2 cards)" while
apply_tier_boost was actually failing mid-write.

Fix: mask the hash with & 0x7FFFFFFF, dropping one bit of entropy.
~2.1B distinct values remain — still astronomically collision-safe.

Backwards-compatible: all 9 existing refractor_boost_audit rows and
9 persisted non-zero variants have hashes where the high bit was
already 0 (those tier-ups happened to land in the safe half). Masking
leaves those values unchanged.

Added regression test test_fits_postgres_int32 covering 10,000 player
IDs × 5 tiers = 50,000 combinations, all asserted ≤ 2,147,483,647.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
Owner

PR #217 Review — fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER

Self-review restriction applies — posting as COMMENT. No blocking issues. Ready to merge.


Fix correctness

0x7FFFFFFF = 2^31 − 1 = 2,147,483,647. Masking the raw 32-bit SHA-256 slice to 31 bits zeroes the sign bit, so every result is a non-negative value in [0, 2^31 − 1] — exactly the Postgres INTEGER range. The existing zero-remap (return result if result != 0 else 1) runs after the mask, so the reserved-zero contract is preserved. Arithmetic is correct.

Model coverage

Confirmed in db_engine.py — every column that receives a value from compute_variant_hash is IntegerField (Postgres INTEGER):

Model Column Type
Card variant IntegerField(null=True)
BattingCard variant IntegerField()
PitchingCard variant IntegerField()
RefractorCardState variant IntegerField(null=True)
RefractorBoostAudit variant_created IntegerField()

All five are covered by the single-function fix. CardPosition.variant (also IntegerField) is not written to by apply_tier_boost, so it is out of scope and unaffected.

Regression test

test_fits_postgres_int32 generates 10,000 player IDs × 5 tiers = 50,000 combinations and asserts every result ≤ 2,147,483,647. Coverage is adequate. The known reproducer (player_id=10795, Charles Nagy) falls outside the range(10000) loop, but the mask is unconditional — if 0–9999 all pass, 10795 passes too. Would be a minor nicety to add compute_variant_hash(10795, 1) as a concrete named example inside the test (documents the original incident), but it is not required.

Backwards compatibility

PR body is correct: any existing hash ≤ 1.88B has its high bit already clear. value & 0x7FFFFFFF == value for all such values. No migration required, no data drift.

Docstring

Return range updated to [1, 2^31 - 1] — accurate after the mask.

Merge status

mergeable: true


No blocking issues.

## PR #217 Review — fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER **Self-review restriction applies — posting as COMMENT. No blocking issues. Ready to merge.** --- ### Fix correctness ✅ `0x7FFFFFFF` = 2^31 − 1 = 2,147,483,647. Masking the raw 32-bit SHA-256 slice to 31 bits zeroes the sign bit, so every result is a non-negative value in [0, 2^31 − 1] — exactly the Postgres `INTEGER` range. The existing zero-remap (`return result if result != 0 else 1`) runs after the mask, so the reserved-zero contract is preserved. Arithmetic is correct. ### Model coverage ✅ Confirmed in `db_engine.py` — every column that receives a value from `compute_variant_hash` is `IntegerField` (Postgres `INTEGER`): | Model | Column | Type | |---|---|---| | `Card` | `variant` | `IntegerField(null=True)` | | `BattingCard` | `variant` | `IntegerField()` | | `PitchingCard` | `variant` | `IntegerField()` | | `RefractorCardState` | `variant` | `IntegerField(null=True)` | | `RefractorBoostAudit` | `variant_created` | `IntegerField()` | All five are covered by the single-function fix. `CardPosition.variant` (also `IntegerField`) is not written to by `apply_tier_boost`, so it is out of scope and unaffected. ### Regression test ✅ `test_fits_postgres_int32` generates 10,000 player IDs × 5 tiers = 50,000 combinations and asserts every result ≤ 2,147,483,647. Coverage is adequate. The known reproducer (player_id=10795, Charles Nagy) falls outside the `range(10000)` loop, but the mask is unconditional — if 0–9999 all pass, 10795 passes too. Would be a minor nicety to add `compute_variant_hash(10795, 1)` as a concrete named example inside the test (documents the original incident), but it is not required. ### Backwards compatibility ✅ PR body is correct: any existing hash ≤ 1.88B has its high bit already clear. `value & 0x7FFFFFFF == value` for all such values. No migration required, no data drift. ### Docstring ✅ Return range updated to `[1, 2^31 - 1]` — accurate after the mask. ### Merge status `mergeable: true` --- No blocking issues.
Claude added the
ai-reviewing
label 2026-04-11 18:15:50 +00:00
Author
Owner

PR #217 Review — fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER

Self-review restriction applies — posting as COMMENT. No blocking issues. Ready to merge.


Fix correctness

0x7FFFFFFF = 2^31 - 1 = 2,147,483,647. Masking the raw 32-bit SHA-256 slice to 31 bits zeroes the sign bit, so every result is a non-negative value in [0, 2^31 - 1] — exactly the Postgres INTEGER range. The existing zero-remap runs after the mask, so the reserved-zero contract is preserved.

Model coverage

Confirmed in db_engine.py — every column that receives a value from compute_variant_hash is IntegerField:

  • Card.variant (IntegerField, null=True)
  • BattingCard.variant (IntegerField)
  • PitchingCard.variant (IntegerField)
  • RefractorCardState.variant (IntegerField, null=True)
  • RefractorBoostAudit.variant_created (IntegerField)

CardPosition.variant is also IntegerField but is not written to by apply_tier_boost — out of scope.

Regression test

test_fits_postgres_int32 covers 50,000 combinations (10,000 player IDs x 5 tiers). The known reproducer (player_id=10795) falls outside range(10000), but the mask is unconditional — if 0-9999 all pass, 10795 passes too. Minor nicety: could add compute_variant_hash(10795, 1) as a named concrete example, but not required.

Backwards compatibility

Any existing hash <= 1.88B has high bit already clear — masking is a no-op for those values. No migration required.

Docstring

Return range updated to [1, 2^31 - 1] — accurate after the mask.

mergeable: true — no blocking issues.

## PR #217 Review — fix(refractor): mask variant hash to 31 bits to fit Postgres INTEGER **Self-review restriction applies — posting as COMMENT. No blocking issues. Ready to merge.** --- ### Fix correctness `0x7FFFFFFF` = 2^31 - 1 = 2,147,483,647. Masking the raw 32-bit SHA-256 slice to 31 bits zeroes the sign bit, so every result is a non-negative value in [0, 2^31 - 1] — exactly the Postgres INTEGER range. The existing zero-remap runs after the mask, so the reserved-zero contract is preserved. ### Model coverage Confirmed in db_engine.py — every column that receives a value from `compute_variant_hash` is IntegerField: - Card.variant (IntegerField, null=True) - BattingCard.variant (IntegerField) - PitchingCard.variant (IntegerField) - RefractorCardState.variant (IntegerField, null=True) - RefractorBoostAudit.variant_created (IntegerField) CardPosition.variant is also IntegerField but is not written to by `apply_tier_boost` — out of scope. ### Regression test `test_fits_postgres_int32` covers 50,000 combinations (10,000 player IDs x 5 tiers). The known reproducer (player_id=10795) falls outside `range(10000)`, but the mask is unconditional — if 0-9999 all pass, 10795 passes too. Minor nicety: could add `compute_variant_hash(10795, 1)` as a named concrete example, but not required. ### Backwards compatibility Any existing hash <= 1.88B has high bit already clear — masking is a no-op for those values. No migration required. ### Docstring Return range updated to [1, 2^31 - 1] — accurate after the mask. `mergeable: true` — no blocking issues.
cal merged commit f53bc09cbf into main 2026-04-11 18:16:09 +00:00
cal deleted branch fix/variant-hash-int32-overflow 2026-04-11 18:16:10 +00:00
Claude reviewed 2026-04-11 18:16:48 +00:00
Claude left a comment
Collaborator

AI Code Review

Files Reviewed

  • app/services/refractor_boost.py (modified)
  • tests/test_refractor_boost.py (modified)

Findings

Correctness

  • Fix is mathematically correct. int(raw[:8], 16) & 0x7FFFFFFF clears bit 31, capping the result at 2,147,483,647 — exactly the Postgres INTEGER ceiling. The zero-guard (return result if result != 0 else 1) still functions correctly after masking since 0 remains reachable (e.g. when the 8-char hex prefix is 00000000) and is properly remapped to 1.
  • Docstring range updated from [1, 2^32 - 1] to [1, 2^31 - 1] — correct.
  • Backwards compatibility claim is sound: masking a value whose high bit was already 0 is a no-op. Existing rows ≤ 1.88B are unaffected.

Security

  • No issues found.

Style & Conventions

  • Explanatory comment in code is clear and accurate — matches the level of detail warranted for a non-obvious numeric constraint.
  • New test docstring follows the project's What/Why convention per CLAUDE.md.
  • No lazy imports introduced.

Suggestions

  • test_fits_postgres_int32 runs 50,000 pure-Python iterations — fast in practice, but could be marked with a size marker or reduced to a smaller but still statistically representative sample (e.g. 1,000 × 5) if CI time becomes a concern. Non-blocking.

Verdict: COMMENT

Self-review restriction applies (author == cal). No correctness, security, or convention issues found. The one-line fix is correct, well-commented, and backed by a solid regression test. Ready to merge.


Automated review by Claude PR Reviewer

## AI Code Review ### Files Reviewed - `app/services/refractor_boost.py` (modified) - `tests/test_refractor_boost.py` (modified) ### Findings #### Correctness - Fix is mathematically correct. `int(raw[:8], 16) & 0x7FFFFFFF` clears bit 31, capping the result at 2,147,483,647 — exactly the Postgres `INTEGER` ceiling. The zero-guard (`return result if result != 0 else 1`) still functions correctly after masking since 0 remains reachable (e.g. when the 8-char hex prefix is `00000000`) and is properly remapped to 1. - Docstring range updated from `[1, 2^32 - 1]` to `[1, 2^31 - 1]` — correct. - Backwards compatibility claim is sound: masking a value whose high bit was already 0 is a no-op. Existing rows ≤ 1.88B are unaffected. #### Security - No issues found. #### Style & Conventions - Explanatory comment in code is clear and accurate — matches the level of detail warranted for a non-obvious numeric constraint. - New test docstring follows the project's What/Why convention per CLAUDE.md. ✅ - No lazy imports introduced. ✅ #### Suggestions - `test_fits_postgres_int32` runs 50,000 pure-Python iterations — fast in practice, but could be marked with a size marker or reduced to a smaller but still statistically representative sample (e.g. 1,000 × 5) if CI time becomes a concern. Non-blocking. ### Verdict: COMMENT Self-review restriction applies (author == cal). No correctness, security, or convention issues found. The one-line fix is correct, well-commented, and backed by a solid regression test. Ready to merge. --- *Automated review by Claude PR Reviewer*
Claude added
ai-reviewed
and removed
ai-reviewing
labels 2026-04-11 18:17:03 +00:00
Sign in to join this conversation.
No description provided.