paper-dynasty-card-creation/docs/formula_weight_comparison.md
2025-11-23 01:28:33 -06:00

Formula Weight Comparison - Before vs After

The Problem

The original formula over-weighted volume (home throws and total assists) and under-weighted rate stats:

# OLD - Volume Dominant
raw_score = (
    (assist_rate * 30) +           # Weak: 0.86 pts for a 2.87% rate
    (throwout_rate * 5) +          # Moderate
    (home_throws * 2) +            # DOMINANT: 16 pts for 8 home throws
    (batter_extra_outs * 1.5) +    # Moderate
    (total_assists * 0.5)          # Strong: 4 pts for 8 assists
)
# Result: Volume (20 pts) overwhelmed rate (0.86 pts) by 23x

Old Formula Results

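| Rank | Player | Score | Assist Rate | Home Throws | Total Assists |
|------|--------|-------|-------------|-------------|---------------|
| 1 | Lieberthal | 25.86 | 2.87% | 8 ✓ | 8 ✓ |
| 2 | Matthews | 23.30 | 2.66% | 7 ✓ | 7 ✓ |
| 3 | gathj001 | 21.00 | 6.67% ✓ | 4 | 6 |
| 4 | Taguchi | 20.88 | 8.64% ✓✓ | 3 | 7 |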

Problem: Taguchi's assist rate is 3x Lieberthal's, yet he ranks last because of low volume!
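The old scores for the top and bottom of this table can be reproduced from the per-player inputs listed in the component tables later in this document. The following is a minimal sketch, not the project's actual code: `old_raw_score` is a hypothetical helper name, rates are assumed to be passed as fractions (2.87% as 0.0287), and Taguchi's 86% throwout rate is assumed to be 6/7, which is what the 4.29 pts figure implies.

```python
def old_raw_score(assist_rate, throwout_rate, home_throws,
                  batter_extra_outs, total_assists):
    """Volume-dominant score using the old weights quoted above."""
    return (
        assist_rate * 30
        + throwout_rate * 5
        + home_throws * 2
        + batter_extra_outs * 1.5
        + total_assists * 0.5
    )


# Inputs copied from the component tables in this document.
lieberthal = old_raw_score(0.0287, 1.0, 8, 0, 8)
taguchi = old_raw_score(0.0864, 6 / 7, 3, 3, 7)  # 6/7 is an assumed reading of "86%"

print(f"Lieberthal: {lieberthal:.2f}")  # 25.86
print(f"Taguchi:    {taguchi:.2f}")     # 20.88
```

Under these assumptions the sketch matches the 25.86 and 20.88 shown in the table, with 20 of Lieberthal's points coming from home throws and total assists alone.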


The Solution

New Formula prioritizes rate stats while giving credit for quality:

# NEW - Rate Dominant (Simplified)
raw_score = (
    (assist_rate * 300) +          # DOMINANT: 25.9 pts for elite 8.64% rate
    (home_throws * 1.0) +          # Quality bonus: 8 pts for 8 home throws
    (batter_extra_outs * 1.0) +    # Quality bonus
    (total_assists * 0.1)          # Minimal volume: 0.8 pts for 8 assists
)
# Result: Rate (25.9 pts) dominates, quality/volume are tiebreakers
# Note: Removed throwout_rate - redundant since assists = outs

New Formula Results

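| Rank | Player | Score | Assist Rate | Home Throws | Total Assists | Why? |
|------|--------|-------|-------------|-------------|---------------|------|
| 1 | Taguchi | 26.52 | 8.64% ✓✓ | 3 | 7 | Elite rate rewarded! |
| 2 | gathj001 | 24.61 | 6.67% ✓ | 4 | 6 | Good rate |
| 3 | Lieberthal | 21.11 | 2.87% | 8 ✓ | 8 | Volume can't overcome low rate |
| 4 | Matthews | 19.69 | 2.66% | 7 ✓ | 7 | Lowest rate = lowest score |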

Solution: Players are now ranked by assist rate first, with quality and volume second!
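One quick way to check the "rate first" claim is to compare the score ordering against the assist-rate ordering. The sketch below uses only the four rows from the two results tables; the tuple layout and the `order_by` helper are illustrative choices, not part of the project.

```python
# (player, assist rate %, old score, new score), copied from the two results tables
players = [
    ("Lieberthal", 2.87, 25.86, 21.11),
    ("Matthews", 2.66, 23.30, 19.69),
    ("gathj001", 6.67, 21.00, 24.61),
    ("Taguchi", 8.64, 20.88, 26.52),
]


def order_by(index):
    """Player names sorted from highest to lowest on the given column."""
    return [p[0] for p in sorted(players, key=lambda p: p[index], reverse=True)]


by_rate = order_by(1)
by_old = order_by(2)
by_new = order_by(3)

print("by rate:", by_rate)
print("old    :", by_old, "| matches rate order:", by_old == by_rate)
print("new    :", by_new, "| matches rate order:", by_new == by_rate)
```

For these four players the new scores sort exactly as the assist rates do, while the old scores do not. That holds for this small sample and is not a guarantee for the full player pool.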


Weight Changes Breakdown

| Component | Old Weight | New Weight | Change | Impact |
|-----------|------------|------------|--------|--------|
| Assist Rate | 30 | 300 | +900% | Dominant driver |
| Throwout Rate | 5 | REMOVED | -100% | Redundant (assists = outs) |
| Home Throws | 2.0 | 1.0 | -50% | Minor quality indicator |
| Batter Extra Outs | 1.5 | 1.0 | -33% | Minor quality indicator |
| Volume (Total Assists) | 0.5 | 0.1 | -80% | Minimal bonus only |
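The Change column is a plain percentage change between the old and new weight. As a small sketch (the dictionary keys mirror the variable names in the formulas, and the removed throwout rate is treated here as a new weight of 0, which is an assumption made only so the arithmetic applies):

```python
# component: (old weight, new weight)
weights = {
    "assist_rate": (30, 300),
    "throwout_rate": (5, 0),  # removed; modeled as weight 0 for this calculation
    "home_throws": (2.0, 1.0),
    "batter_extra_outs": (1.5, 1.0),
    "total_assists": (0.5, 0.1),
}


def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Round to whole percentages, as in the table above.
changes = {name: round(pct_change(old, new)) for name, (old, new) in weights.items()}

for name, change in changes.items():
    print(f"{name:18s} {change:+d}%")
```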

Component Contribution Examples

Elite Rate Player (8.64% rate, 3 home throws, 7 assists)

Example High Assist Rate Player

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (8.64%) | 2.59 pts | 25.92 pts |
| Throwout Rate (86%) | 4.29 pts | REMOVED |
| Home Throws (3) | 6.0 pts | 3.0 pts |
| Batter Extra (3) | 4.5 pts | 3.0 pts |
| Volume (7) | 3.5 pts | 0.7 pts |
| TOTAL | 20.88 | 32.62 |

High Volume Player (2.87% rate, 8 home throws, 8 assists)

Example Volume-Based Player

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (2.87%) | 0.86 pts | 8.61 pts |
| Throwout Rate (100%) | 5.0 pts | REMOVED |
| Home Throws (8) | 16.0 pts | 8.0 pts ✓ |
| Batter Extra (0) | 0.0 pts | 0.0 pts |
| Volume (8) | 4.0 pts | 0.8 pts ✓ |
| TOTAL | 25.86 | 17.41 |

Key Difference:

  • Old: Volume (20 pts) >> Rate (0.86 pts) - Wrong priority!
  • New: Rate (8.61 pts) is on par with quality and volume combined (8.8 pts) for the high-volume player, and dominates for the elite-rate player (25.92 of 32.62 pts)
  • Removed: Throwout rate (redundant with assists)
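The New Formula column of both component tables can be recomputed directly from the stated weights. This is a sketch under the same assumption as the formulas (rates expressed as fractions); `NEW_WEIGHTS` and `new_breakdown` are hypothetical names used for illustration.

```python
NEW_WEIGHTS = {
    "assist_rate": 300,
    "home_throws": 1.0,
    "batter_extra_outs": 1.0,
    "total_assists": 0.1,
}


def new_breakdown(stats):
    """Return the points each component contributes under the new weights."""
    return {name: stats[name] * weight for name, weight in NEW_WEIGHTS.items()}


# Stat lines copied from the two component tables above.
elite_rate = {"assist_rate": 0.0864, "home_throws": 3,
              "batter_extra_outs": 3, "total_assists": 7}
high_volume = {"assist_rate": 0.0287, "home_throws": 8,
               "batter_extra_outs": 0, "total_assists": 8}

for label, stats in (("elite rate", elite_rate), ("high volume", high_volume)):
    parts = new_breakdown(stats)
    total = sum(parts.values())
    rate_share = parts["assist_rate"] / total
    print(f"{label}: total={total:.2f}, rate share={rate_share:.0%}")
```

The totals come out to 32.62 and 17.41, matching the TOTAL rows of the component tables. They do not match the scores in the New Formula Results table (26.52 and 21.11), so this sketch should be read as an illustration of the stated weights rather than a reproduction of the production scoring.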

Design Philosophy

Simplified Rate-First Approach

  1. Assist Rate (300x) - How often do they throw out runners per opportunity?

    • Elite (8%+): 24+ points
    • Average (3%): ~9 points
    • Poor (1%): ~3 points
  2. Quality Indicators (1.0x each) - What types of plays do they make?

    • Home throws: 1 point each (minor bonus)
    • Batter extra outs: 1 point each (minor bonus)
  3. Volume Bonus (0.1x) - Minimal credit for total assists

    • Only used as tiebreaker
    • 10 assists = 1 point
  4. Throwout Rate - REMOVED (redundant)

    • Assists are already outs by definition
    • No need to measure "efficiency" of assists
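The tier values above follow directly from the 300x weight, and the same arithmetic shows how much quality or volume would be needed to offset a rate gap. A minimal sketch, with constant and function names chosen for illustration only:

```python
RATE_WEIGHT = 300      # points per unit of assist rate (rate as a fraction)
QUALITY_WEIGHT = 1.0   # points per home throw or batter extra out
VOLUME_WEIGHT = 0.1    # points per assist


def rate_points(assist_rate):
    """Points contributed by assist rate alone."""
    return assist_rate * RATE_WEIGHT


tiers = {"elite": 0.08, "average": 0.03, "poor": 0.01}
for name, rate in tiers.items():
    print(f"{name:8s} {rate:.0%} -> {rate_points(rate):.0f} pts")

# Gap between an elite and an average rate, and what it takes to close it.
gap = rate_points(0.08) - rate_points(0.03)
print(f"quality plays needed to close the gap: {gap / QUALITY_WEIGHT:.0f}")
print(f"assists needed to close the gap:       {gap / VOLUME_WEIGHT:.0f}")
```

With these weights, an average-rate player would need about 15 extra quality plays, or about 150 extra assists, to match an elite-rate player on rate points alone, which is why volume acts only as a tiebreaker.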

Expected Behavior

  • High rate, low volume (platoon player with cannon arm): Ranks high ✓
  • High rate, high volume (everyday elite arm): Ranks highest ✓
  • Low rate, high volume (lucky positioning): Ranks lower ✓
  • Low rate, low volume (weak arm): Ranks lowest ✓
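The four cases above can be sanity-checked with made-up stat lines. Every number in the sketch below is hypothetical and invented for illustration; only the weights come from the new formula.

```python
def new_raw_score(assist_rate, home_throws, batter_extra_outs, total_assists):
    """Rate-dominant score using the new weights from this document."""
    return (
        assist_rate * 300
        + home_throws * 1.0
        + batter_extra_outs * 1.0
        + total_assists * 0.1
    )


# Hypothetical stat lines: (assist_rate, home_throws, batter_extra_outs, total_assists)
archetypes = {
    "high rate, high volume": (0.080, 8, 2, 12),
    "high rate, low volume": (0.080, 2, 1, 4),
    "low rate, high volume": (0.025, 6, 1, 10),
    "low rate, low volume": (0.020, 1, 0, 2),
}

ranking = sorted(archetypes, key=lambda k: new_raw_score(*archetypes[k]), reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {new_raw_score(*archetypes[name]):.1f}")
```

The ordering matches the expected behavior for these inputs, but it depends on the values chosen: because home throws and batter extra outs are uncapped, a low-rate player with enough of them could still outscore a high-rate player.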

Validation Questions

Run `python test_retrosheet_arms.py` and check:

  1. ✓ Are players with 8%+ assist rates in the top tier?
  2. ✓ Are players with <2% assist rates in the bottom tier?
  3. ✓ Do home throws provide meaningful but not dominant bonuses?
  4. ✓ Is the distribution still normal (-6 to +5)?
  5. ✓ Do known strong arms (Ichiro, Guerrero) rank appropriately?

Updated: 2025-11-15

Status: Rate-dominant formula implemented

Next Step: Run validation script to confirm distribution