# Formula Weight Comparison - Before vs After

## The Problem

**Original Formula** over-weighted volume (total assists) and under-weighted rate stats:

```python
# OLD - Volume Dominant
raw_score = (
    (assist_rate * 30) +          # Weak: 0.86 pts for a 2.87% rate
    (throwout_rate * 5) +         # Moderate
    (home_throws * 2) +           # DOMINANT: 16 pts for 8 home throws
    (batter_extra_outs * 1.5) +   # Moderate
    (total_assists * 0.5)         # Strong: 4 pts for 8 assists
)
# Result: Volume (20 pts) overwhelmed rate (0.86 pts) by 23x
```

### Old Formula Results

| Rank | Player | Score | Assist Rate | Home Throws | Total Assists |
|------|--------|-------|-------------|-------------|---------------|
| 1 | Lieberthal | 25.86 | 2.87% ❌ | 8 ✓ | 8 ✓ |
| 2 | Matthews | 23.30 | 2.66% ❌ | 7 ✓ | 7 ✓ |
| 3 | gathj001 | 21.00 | 6.67% ✓ | 4 | 6 |
| 4 | Taguchi | 20.88 | 8.64% ✓✓ | 3 | 7 |

**Problem:** Taguchi has a **3x better assist rate** than Lieberthal but ranks **last** because of low volume!

---

## The Solution

**New Formula** prioritizes rate stats while giving credit for quality:

```python
# NEW - Rate Dominant (Simplified)
raw_score = (
    (assist_rate * 300) +         # DOMINANT: 25.9 pts for elite 8.64% rate
    (home_throws * 1.0) +         # Quality bonus: 8 pts for 8 home throws
    (batter_extra_outs * 1.0) +   # Quality bonus
    (total_assists * 0.1)         # Minimal volume: 0.8 pts for 8 assists
)
# Result: Rate (25.9 pts) dominates, quality/volume are tiebreakers
# Note: Removed throwout_rate - redundant since assists = outs
```

### New Formula Results

| Rank | Player | Score | Assist Rate | Home Throws | Total Assists | Why? |
|------|--------|-------|-------------|-------------|---------------|------|
| 1 | **Taguchi** | **26.52** | **8.64%** ✓✓ | 3 | 7 | Elite rate rewarded! |
| 2 | gathj001 | 24.61 | 6.67% ✓ | 4 | 6 | Good rate |
| 3 | Lieberthal | 21.11 | 2.87% | 8 ✓ | 8 | Volume can't overcome low rate |
| 4 | Matthews | 19.69 | 2.66% | 7 ✓ | 7 | Lowest rate = lowest score |

**Solution:** Players are ranked by **assist rate** first, volume second!

---

## Weight Changes Breakdown

| Component | Old Weight | New Weight | Change | Impact |
|-----------|-----------|-----------|--------|---------|
| **Assist Rate** | 30 | **300** | +900% | Dominant driver |
| Throwout Rate | 5 | **REMOVED** | -100% | Redundant (assists = outs) |
| Home Throws | 2.0 | **1.0** | -50% | Minor quality indicator |
| Batter Extra Outs | 1.5 | **1.0** | -33% | Minor quality indicator |
| Volume (Total Assists) | 0.5 | **0.1** | -80% | Minimal bonus only |

---

## Component Contribution Examples

### Elite Rate Player (8.64% rate, 3 home throws, 7 assists)

**Example High Assist Rate Player**

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (8.64%) | 2.59 pts | **25.92 pts** ✓ |
| Throwout Rate (86%) | 4.29 pts | **REMOVED** |
| Home Throws (3) | 6.0 pts | 3.0 pts |
| Batter Extra (3) | 4.5 pts | 3.0 pts |
| Volume (7) | 3.5 pts | 0.7 pts |
| **TOTAL** | **20.88** | **32.62** ✓ |

### High Volume Player (2.87% rate, 8 home throws, 8 assists)

**Example Volume-Based Player**

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (2.87%) | 0.86 pts | **8.61 pts** |
| Throwout Rate (100%) | 5.0 pts | **REMOVED** |
| Home Throws (8) | **16.0 pts** ❌ | 8.0 pts ✓ |
| Batter Extra (0) | 0.0 pts | 0.0 pts |
| Volume (8) | **4.0 pts** | 0.8 pts ✓ |
| **TOTAL** | **25.86** ❌ | **17.41** ✓ |

**Key Difference:**

- **Old:** Volume (20 pts) >> Rate (0.86 pts) - Wrong priority!
- **New:** Rate (8.61 pts) now roughly matches quality and volume combined (8.8 pts) for this player, and dominates for the elite-rate player (25.92 of 32.62 pts)
- **Removed:** Throwout rate (redundant with assists)

---

## Design Philosophy

### Simplified Rate-First Approach

1. **Assist Rate (300x)** - How often do they throw out runners per opportunity?
   - Elite (8%+): 24+ points
   - Average (3%): ~9 points
   - Poor (1%): ~3 points
2. **Quality Indicators (1.0x each)** - What types of plays do they make?
   - Home throws: 1 point each (minor bonus)
   - Batter extra outs: 1 point each (minor bonus)
3. **Volume Bonus (0.1x)** - Minimal credit for total assists
   - Only used as tiebreaker
   - 10 assists = 1 point
4. **Throwout Rate** - REMOVED (redundant)
   - Assists are already outs by definition
   - No need to measure "efficiency" of assists

### Expected Behavior

- **High rate, low volume** (platoon player with cannon arm): Ranks high ✓
- **High rate, high volume** (everyday elite arm): Ranks highest ✓
- **Low rate, high volume** (lucky positioning): Ranks lower ✓
- **Low rate, low volume** (weak arm): Ranks lowest ✓

---

## Validation Questions

Run `python test_retrosheet_arms.py` and check:

1. ✓ Are players with 8%+ assist rates in the top tier?
2. ✓ Are players with <2% assist rates in the bottom tier?
3. ✓ Do home throws provide meaningful but not dominant bonuses?
4. ✓ Is the distribution still normal (-6 to +5)?
5. ✓ Do known strong arms (Ichiro, Guerrero) rank appropriately?

---

**Updated:** 2025-11-15

**Status:** Rate-dominant formula implemented

**Next Step:** Run validation script to confirm distribution
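
### Reproducing the Component Totals

The arithmetic in the component contribution tables can be checked with a short script. This is a minimal sketch, not the project's actual implementation: `ArmStats`, `old_score`, and `new_score` are hypothetical names chosen for illustration, the weights are copied from the two formulas in this document, and the elite-rate player's 86% throwout rate is assumed to be 6/7 because that value reproduces the listed 4.29 pts.

```python
from dataclasses import dataclass


@dataclass
class ArmStats:
    """Per-player inputs (hypothetical container; fields mirror the formula variables)."""
    assist_rate: float       # fraction, e.g. 0.0864 for 8.64%
    throwout_rate: float     # fraction; only the old formula uses it
    home_throws: int
    batter_extra_outs: int
    total_assists: int


def old_score(s: ArmStats) -> float:
    """Volume-dominant formula (old weights)."""
    return (
        s.assist_rate * 30
        + s.throwout_rate * 5
        + s.home_throws * 2
        + s.batter_extra_outs * 1.5
        + s.total_assists * 0.5
    )


def new_score(s: ArmStats) -> float:
    """Rate-dominant formula (new weights, throwout rate dropped)."""
    return (
        s.assist_rate * 300
        + s.home_throws * 1.0
        + s.batter_extra_outs * 1.0
        + s.total_assists * 0.1
    )


# Inputs taken from the two component contribution tables.
# Assumption: "86%" throwout rate is 6/7, which matches the 4.29 pts shown.
elite_rate = ArmStats(0.0864, 6 / 7, 3, 3, 7)
high_volume = ArmStats(0.0287, 1.0, 8, 0, 8)

for label, stats in [("elite rate", elite_rate), ("high volume", high_volume)]:
    print(f"{label}: old={old_score(stats):.2f} new={new_score(stats):.2f}")
# elite rate: old=20.88 new=32.62
# high volume: old=25.86 new=17.41
```

The printed totals match the component contribution tables and show the ranking flip between the two players. They do not match the scores in the New Formula Results table, which may have been produced with different weights or inputs.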