# Formula Weight Comparison - Before vs After

## The Problem

The **Original Formula** over-weighted volume (home throws and total assists) and under-weighted rate stats:

```python
# OLD - Volume Dominant
# Rates are fractions here: a 2.87% assist rate is 0.0287.
raw_score = (
    (assist_rate * 30) +          # Weak: 0.86 pts for a 2.87% rate
    (throwout_rate * 5) +         # Moderate
    (home_throws * 2) +           # DOMINANT: 16 pts for 8 home throws
    (batter_extra_outs * 1.5) +   # Moderate
    (total_assists * 0.5)         # Strong: 4 pts for 8 assists
)
# Result: Volume (20 pts) overwhelmed rate (0.86 pts) by 23x
```
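
To make the imbalance concrete, the old formula can be wrapped in a function and applied to the high-volume example used in this document (2.87% assist rate, 100% throwout rate, 8 home throws, 0 batter extra outs, 8 assists). This is a minimal sketch: `old_raw_score` is a hypothetical helper name, and rates are assumed to be passed as fractions.

```python
def old_raw_score(assist_rate, throwout_rate, home_throws,
                  batter_extra_outs, total_assists):
    """Old volume-dominant score. Rates are fractions (2.87% -> 0.0287)."""
    return (
        assist_rate * 30
        + throwout_rate * 5
        + home_throws * 2
        + batter_extra_outs * 1.5
        + total_assists * 0.5
    )

# High-volume example: 2.87% rate, 100% throwout rate, 8 home throws, 8 assists
high_volume = old_raw_score(0.0287, 1.0, 8, 0, 8)

# Points from the two counting stats vs. points from assist rate
counting_points = 8 * 2 + 8 * 0.5
rate_points = 0.0287 * 30

print(round(high_volume, 2))                    # 25.86
print(round(counting_points / rate_points, 1))  # 23.2
```

The second number is the 23x gap quoted in the comment above: 20 points from counting stats against 0.86 points from rate.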

### Old Formula Results
| Rank | Player | Score | Assist Rate | Home Throws | Total Assists |
|------|--------|-------|-------------|-------------|---------------|
| 1 | Lieberthal | 25.86 | 2.87% ❌ | 8 ✓ | 8 ✓ |
| 2 | Matthews | 23.30 | 2.66% ❌ | 7 ✓ | 7 ✓ |
| 3 | gathj001 | 21.00 | 6.67% ✓ | 4 | 6 |
| 4 | Taguchi | 20.88 | 8.64% ✓✓ | 3 | 7 |

**Problem:** Taguchi has a **3x better assist rate** than Lieberthal but ranks **last** of the four because of low volume!

---

## The Solution

**New Formula** prioritizes rate stats while giving credit for quality:

```python
# NEW - Rate Dominant (Simplified)
raw_score = (
    (assist_rate * 300) +         # DOMINANT: 25.9 pts for elite 8.64% rate
    (home_throws * 1.0) +         # Quality bonus: 8 pts for 8 home throws
    (batter_extra_outs * 1.0) +   # Quality bonus
    (total_assists * 0.1)         # Minimal volume: 0.8 pts for 8 assists
)
# Result: Rate (25.9 pts) dominates, quality/volume are tiebreakers
# Note: Removed throwout_rate - redundant since assists = outs
```
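
As a rough sketch, the same expression can be written as a function and applied to the two example stat lines from the component breakdown in this document. `new_raw_score` is a hypothetical helper name, and `assist_rate` is assumed to be a fraction. The sketch reproduces the component-table totals only; the ranked results are not recomputed here.

```python
def new_raw_score(assist_rate, home_throws, batter_extra_outs, total_assists):
    """Rate-dominant score. assist_rate is a fraction (8.64% -> 0.0864)."""
    return (
        assist_rate * 300
        + home_throws * 1.0
        + batter_extra_outs * 1.0
        + total_assists * 0.1
    )

# Elite-rate example: 8.64% rate, 3 home throws, 3 batter extra outs, 7 assists
elite_rate = new_raw_score(0.0864, home_throws=3, batter_extra_outs=3, total_assists=7)

# High-volume example: 2.87% rate, 8 home throws, 0 batter extra outs, 8 assists
high_volume = new_raw_score(0.0287, home_throws=8, batter_extra_outs=0, total_assists=8)

print(round(elite_rate, 2))   # 32.62
print(round(high_volume, 2))  # 17.41
```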

### New Formula Results
| Rank | Player | Score | Assist Rate | Home Throws | Total Assists | Why? |
|------|--------|-------|-------------|-------------|---------------|------|
| 1 | **Taguchi** | **26.52** | **8.64%** ✓✓ | 3 | 7 | Elite rate rewarded! |
| 2 | gathj001 | 24.61 | 6.67% ✓ | 4 | 6 | Good rate |
| 3 | Lieberthal | 21.11 | 2.87% | 8 ✓ | 8 | Volume can't overcome low rate |
| 4 | Matthews | 19.69 | 2.66% | 7 ✓ | 7 | Lowest rate = lowest score |

**Solution:** Players are now ranked by **assist rate** first and volume second!

---

## Weight Changes Breakdown

| Component | Old Weight | New Weight | Change | Impact |
|-----------|-----------|-----------|--------|---------|
| **Assist Rate** | 30 | **300** | +900% | Dominant driver |
| Throwout Rate | 5 | **REMOVED** | -100% | Redundant (assists = outs) |
| Home Throws | 2.0 | **1.0** | -50% | Minor quality indicator |
| Batter Extra Outs | 1.5 | **1.0** | -33% | Minor quality indicator |
| Volume (Total Assists) | 0.5 | **0.1** | -80% | Minimal bonus only |

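
The Change column is the relative change between old and new weight. A small sketch of that arithmetic follows, assuming a removed component can be modeled as a new weight of 0; the dictionary keys are illustrative names, not identifiers confirmed by the project.

```python
# (old_weight, new_weight); a removed component is modeled as weight 0.
weights = {
    "assist_rate": (30, 300),
    "throwout_rate": (5, 0),
    "home_throws": (2.0, 1.0),
    "batter_extra_outs": (1.5, 1.0),
    "total_assists": (0.5, 0.1),
}

def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Round to whole percents, as in the table
changes = {name: round(pct_change(old, new)) for name, (old, new) in weights.items()}

for name, change in changes.items():
    print(f"{name}: {change:+d}%")
# assist_rate: +900%
# throwout_rate: -100%
# home_throws: -50%
# batter_extra_outs: -33%
# total_assists: -80%
```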

---

## Component Contribution Examples

### Elite Rate Player (8.64% rate, 3 home throws, 7 assists)
**Example High Assist Rate Player**

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (8.64%) | 2.59 pts | **25.92 pts** ✓ |
| Throwout Rate (86%) | 4.29 pts | **REMOVED** |
| Home Throws (3) | 6.0 pts | 3.0 pts |
| Batter Extra (3) | 4.5 pts | 3.0 pts |
| Volume (7) | 3.5 pts | 0.7 pts |
| **TOTAL** | **20.88** | **32.62** ✓ |

### High Volume Player (2.87% rate, 8 home throws, 8 assists)
**Example Volume-Based Player**

| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (2.87%) | 0.86 pts | **8.61 pts** |
| Throwout Rate (100%) | 5.0 pts | **REMOVED** |
| Home Throws (8) | **16.0 pts** ❌ | 8.0 pts ✓ |
| Batter Extra (0) | 0.0 pts | 0.0 pts |
| Volume (8) | **4.0 pts** | 0.8 pts ✓ |
| **TOTAL** | **25.86** ❌ | **17.41** ✓ |

**Key Difference:**
- **Old:** Volume (20 pts) >> Rate (0.86 pts) - Wrong priority!
- **New:** Rate (8.61 pts) now roughly matches home throws plus volume (8.8 pts) for this player, and dominates for the elite-rate player (25.92 of 32.62 pts)
- **Removed:** Throwout rate (redundant with assists)
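
One way to summarize the two tables is the share of each total that comes from assist rate. The sketch below recomputes that share from the table inputs; `rate_share` and the weight dictionaries are illustrative names, and the elite player's throwout rate is taken as 0.858, the value implied by the 4.29 pts entry.

```python
OLD_WEIGHTS = {"assist_rate": 30, "throwout_rate": 5, "home_throws": 2.0,
               "batter_extra_outs": 1.5, "total_assists": 0.5}
NEW_WEIGHTS = {"assist_rate": 300, "home_throws": 1.0,
               "batter_extra_outs": 1.0, "total_assists": 0.1}

def rate_share(stats, weights):
    """Percent of the raw score contributed by assist rate."""
    points = {name: stats[name] * weight for name, weight in weights.items()}
    return 100 * points["assist_rate"] / sum(points.values())

# Stat lines from the two component tables (rates as fractions)
elite = {"assist_rate": 0.0864, "throwout_rate": 0.858, "home_throws": 3,
         "batter_extra_outs": 3, "total_assists": 7}
volume = {"assist_rate": 0.0287, "throwout_rate": 1.0, "home_throws": 8,
          "batter_extra_outs": 0, "total_assists": 8}

shares = {
    "elite": (round(rate_share(elite, OLD_WEIGHTS), 1),
              round(rate_share(elite, NEW_WEIGHTS), 1)),
    "volume": (round(rate_share(volume, OLD_WEIGHTS), 1),
               round(rate_share(volume, NEW_WEIGHTS), 1)),
}
print(shares["elite"])   # (12.4, 79.5)
print(shares["volume"])  # (3.3, 49.5)
```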

---

## Design Philosophy

### Simplified Rate-First Approach
1. **Assist Rate (300x)** - How often do they throw out runners per opportunity?
   - Elite (8%+): 24+ points
   - Average (3%): ~9 points
   - Poor (1%): ~3 points

2. **Quality Indicators (1.0x each)** - What types of plays do they make?
   - Home throws: 1 point each (minor bonus)
   - Batter extra outs: 1 point each (minor bonus)

3. **Volume Bonus (0.1x)** - Minimal credit for total assists
   - Only used as tiebreaker
   - 10 assists = 1 point

4. **Throwout Rate** - REMOVED (redundant)
   - Assists are already outs by definition
   - No need to measure "efficiency" of assists

### Expected Behavior
- **High rate, low volume** (platoon player with cannon arm): Ranks high ✓
- **High rate, high volume** (everyday elite arm): Ranks highest ✓
- **Low rate, high volume** (lucky positioning): Ranks lower ✓
- **Low rate, low volume** (weak arm): Ranks lowest ✓
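
The four archetypes above can be checked against the new weights with made-up stat lines. Every input below is hypothetical and chosen only for illustration (8% vs. 2% assist rate, a handful of home throws and assists); the ordering holds for these inputs and is not guaranteed for every possible stat line.

```python
def new_raw_score(assist_rate, home_throws, batter_extra_outs, total_assists):
    """Rate-dominant score with the weights from this document."""
    return (assist_rate * 300 + home_throws * 1.0
            + batter_extra_outs * 1.0 + total_assists * 0.1)

# Hypothetical stat lines:
# (assist_rate, home_throws, batter_extra_outs, total_assists)
archetypes = {
    "high rate, low volume": (0.08, 1, 1, 3),
    "high rate, high volume": (0.08, 4, 3, 12),
    "low rate, high volume": (0.02, 5, 1, 12),
    "low rate, low volume": (0.02, 0, 0, 2),
}

scores = {name: round(new_raw_score(*stats), 1) for name, stats in archetypes.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]}")
# high rate, high volume: 32.2
# high rate, low volume: 26.3
# low rate, high volume: 13.2
# low rate, low volume: 6.2
```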

---

## Validation Questions

Run `python test_retrosheet_arms.py` and check:

1. ✓ Are players with 8%+ assist rates in the top tier?
2. ✓ Are players with <2% assist rates in the bottom tier?
3. ✓ Do home throws provide meaningful but not dominant bonuses?
4. ✓ Is the distribution still normal (-6 to +5)?
5. ✓ Do known strong arms (Ichiro, Guerrero) rank appropriately?

---

**Updated:** 2025-11-15
**Status:** Rate-dominant formula implemented
**Next Step:** Run validation script to confirm distribution