paper-dynasty-card-creation/docs/formula_weight_comparison.md
2025-11-23 01:28:33 -06:00


# Formula Weight Comparison - Before vs After
## The Problem
**Original Formula** over-weighted volume (home throws and total assists) and under-weighted rate stats:
```python
# OLD - Volume Dominant
# assist_rate and throwout_rate are fractions (2.87% -> 0.0287)
raw_score = (
    (assist_rate * 30) +           # Weak: 0.86 pts for a 2.87% rate
    (throwout_rate * 5) +          # Moderate
    (home_throws * 2) +            # DOMINANT: 16 pts for 8 home throws
    (batter_extra_outs * 1.5) +    # Moderate
    (total_assists * 0.5)          # Strong: 4 pts for 8 assists
)
# Result: Volume (16 + 4 = 20 pts) overwhelmed rate (0.86 pts) by ~23x
```
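The 23x figure in the comment above can be checked directly. This is a minimal sketch that plugs in the high-volume stat line quoted in the comments (2.87% rate, 8 home throws, 8 assists); the variable names `rate_pts`, `volume_pts`, and `ratio` are introduced here for illustration only.

```python
# High-volume profile from the comments above:
# 2.87% assist rate, 8 home throws, 8 total assists.
assist_rate = 0.0287
home_throws = 8
total_assists = 8

# Old weights: 30 for assist rate, 2 per home throw, 0.5 per assist.
rate_pts = assist_rate * 30
volume_pts = home_throws * 2 + total_assists * 0.5

# How many times larger the volume contribution is than the rate contribution.
ratio = volume_pts / rate_pts
print(f"rate={rate_pts:.2f} volume={volume_pts:.1f} ratio={ratio:.1f}x")
```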
### Old Formula Results
| Rank | Player | Score | Assist Rate | Home Throws | Total Assists |
|------|--------|-------|-------------|-------------|---------------|
| 1 | Lieberthal | 25.86 | 2.87% ❌ | 8 ✓ | 8 ✓ |
| 2 | Matthews | 23.30 | 2.66% ❌ | 7 ✓ | 7 ✓ |
| 3 | gathj001 | 21.00 | 6.67% ✓ | 4 | 6 |
| 4 | Taguchi | 20.88 | 8.64% ✓✓ | 3 | 7 |

**Problem:** Taguchi's assist rate is **3x higher** than Lieberthal's (8.64% vs 2.87%), yet he ranks **last** because of low volume (3 home throws vs 8).

---
## The Solution
**New Formula** prioritizes rate stats while giving credit for quality:
```python
# NEW - Rate Dominant (Simplified)
raw_score = (
    (assist_rate * 300) +          # DOMINANT: 25.9 pts for elite 8.64% rate
    (home_throws * 1.0) +          # Quality bonus: 8 pts for 8 home throws
    (batter_extra_outs * 1.0) +    # Quality bonus
    (total_assists * 0.1)          # Minimal volume: 0.8 pts for 8 assists
)
# Result: Rate (25.9 pts) dominates, quality/volume are tiebreakers
# Note: Removed throwout_rate - redundant since assists = outs
```
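One way to read the reweighting is as an exchange rate: how many home throws are worth one percentage point of assist rate. The sketch below derives that from the weights in the two formulas; the helper `home_throws_per_rate_point` and the `OLD`/`NEW` dictionaries are illustrative names, not part of the project code.

```python
# Weights taken from the old and new formulas above.
OLD = {"assist_rate": 30, "home_throws": 2.0}
NEW = {"assist_rate": 300, "home_throws": 1.0}


def home_throws_per_rate_point(weights):
    """Home throws needed to match +1 percentage point of assist rate."""
    pts_per_rate_point = weights["assist_rate"] * 0.01
    return pts_per_rate_point / weights["home_throws"]


old_exchange = home_throws_per_rate_point(OLD)
new_exchange = home_throws_per_rate_point(NEW)
print(f"old: {old_exchange:.2f} home throws per rate point")
print(f"new: {new_exchange:.2f} home throws per rate point")
```

Under the old weights a single home throw outweighed several percentage points of assist rate; under the new weights it takes about three home throws to match one point.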
### New Formula Results
| Rank | Player | Score | Assist Rate | Home Throws | Total Assists | Why? |
|------|--------|-------|-------------|-------------|---------------|------|
| 1 | **Taguchi** | **26.52** | **8.64%** ✓✓ | 3 | 7 | Elite rate rewarded! |
| 2 | gathj001 | 24.61 | 6.67% ✓ | 4 | 6 | Good rate |
| 3 | Lieberthal | 21.11 | 2.87% | 8 ✓ | 8 | Volume can't overcome low rate |
| 4 | Matthews | 19.69 | 2.66% | 7 ✓ | 7 | Lowest rate = lowest score |

**Solution:** Players are ranked by **assist rate** first and volume second.

---
## Weight Changes Breakdown
| Component | Old Weight | New Weight | Change | Impact |
|-----------|-----------|-----------|--------|---------|
| **Assist Rate** | 30 | **300** | +900% | Dominant driver |
| Throwout Rate | 5 | **REMOVED** | -100% | Redundant (assists = outs) |
| Home Throws | 2.0 | **1.0** | -50% | Minor quality indicator |
| Batter Extra Outs | 1.5 | **1.0** | -33% | Minor quality indicator |
| Volume (Total Assists) | 0.5 | **0.1** | -80% | Minimal bonus only |
---
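The Change column in the weight table can be reproduced with a few lines of arithmetic. This is a small sketch; the `weights` dictionary and `pct_change` helper are names made up for this example, and the removed throwout-rate term is modelled as a new weight of 0.

```python
# (old_weight, new_weight) pairs from the Weight Changes Breakdown table.
weights = {
    "assist_rate": (30, 300),
    "throwout_rate": (5, 0),  # removed; modelled here as weight 0
    "home_throws": (2.0, 1.0),
    "batter_extra_outs": (1.5, 1.0),
    "total_assists": (0.5, 0.1),
}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


changes = {
    name: round(pct_change(old, new)) for name, (old, new) in weights.items()
}
for name, change in changes.items():
    print(f"{name}: {change:+d}%")
```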
## Component Contribution Examples
### Elite Rate Player (8.64% rate, 3 home throws, 7 assists)
**Example High Assist Rate Player**
| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (8.64%) | 2.59 pts | **25.92 pts** ✓ |
| Throwout Rate (86%) | 4.29 pts | **REMOVED** |
| Home Throws (3) | 6.0 pts | 3.0 pts |
| Batter Extra (3) | 4.5 pts | 3.0 pts |
| Volume (7) | 3.5 pts | 0.7 pts |
| **TOTAL** | **20.88** | **32.62** ✓ |
### High Volume Player (2.87% rate, 8 home throws, 8 assists)
**Example Volume-Based Player**
| Component | Old Formula | New Formula (Simplified) |
|-----------|-------------|--------------------------|
| Assist Rate (2.87%) | 0.86 pts | **8.61 pts** |
| Throwout Rate (100%) | 5.0 pts | **REMOVED** |
| Home Throws (8) | **16.0 pts** ❌ | 8.0 pts ✓ |
| Batter Extra (0) | 0.0 pts | 0.0 pts |
| Volume (8) | **4.0 pts** | 0.8 pts ✓ |
| **TOTAL** | **25.86** ❌ | **17.41** ✓ |

**Key Difference:**
- **Old:** Volume (20 pts) >> Rate (0.86 pts) - Wrong priority!
- **New:** Rate (8.61 pts) now roughly matches quality + volume (8.8 pts) for this player, and for the elite-rate player it supplies 25.92 of 32.62 pts
- **Removed:** Throwout rate (redundant with assists)
---
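The totals in the two component tables can be recomputed from the weights. The sketch below is a standalone illustration, not the project's implementation: `contributions`, `raw_score`, and the two stat dictionaries are hypothetical names, and the elite-rate player's throwout rate is assumed to be 6/7 (about 86%), which is the value that reproduces the 4.29 pts shown in the table.

```python
OLD_WEIGHTS = {
    "assist_rate": 30,
    "throwout_rate": 5,
    "home_throws": 2.0,
    "batter_extra_outs": 1.5,
    "total_assists": 0.5,
}
NEW_WEIGHTS = {
    "assist_rate": 300,
    "home_throws": 1.0,
    "batter_extra_outs": 1.0,
    "total_assists": 0.1,
}


def contributions(stats, weights):
    """Points contributed by each weighted component."""
    return {name: stats[name] * weight for name, weight in weights.items()}


def raw_score(stats, weights):
    """Sum of all component contributions."""
    return sum(contributions(stats, weights).values())


# Stat lines from the two component tables; rates are fractions.
elite_rate = {
    "assist_rate": 0.0864,
    "throwout_rate": 6 / 7,  # assumption: 86% in the table, 4.29 pts
    "home_throws": 3,
    "batter_extra_outs": 3,
    "total_assists": 7,
}
high_volume = {
    "assist_rate": 0.0287,
    "throwout_rate": 1.0,
    "home_throws": 8,
    "batter_extra_outs": 0,
    "total_assists": 8,
}

for label, stats in [("elite rate", elite_rate), ("high volume", high_volume)]:
    old = raw_score(stats, OLD_WEIGHTS)
    new = raw_score(stats, NEW_WEIGHTS)
    print(f"{label}: old={old:.2f} new={new:.2f}")
```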
## Design Philosophy
### Simplified Rate-First Approach
1. **Assist Rate (300x)** - How often do they throw out runners per opportunity?
   - Elite (8%+): 24+ points
   - Average (3%): ~9 points
   - Poor (1%): ~3 points
2. **Quality Indicators (1.0x each)** - What types of plays do they make?
   - Home throws: 1 point each (minor bonus)
   - Batter extra outs: 1 point each (minor bonus)
3. **Volume Bonus (0.1x)** - Minimal credit for total assists
   - Only used as tiebreaker
   - 10 assists = 1 point
4. **Throwout Rate** - REMOVED (redundant)
   - Assists are already outs by definition
   - No need to measure "efficiency" of assists
### Expected Behavior
- **High rate, low volume** (platoon player with cannon arm): Ranks high ✓
- **High rate, high volume** (everyday elite arm): Ranks highest ✓
- **Low rate, high volume** (lucky positioning): Ranks lower ✓
- **Low rate, low volume** (weak arm): Ranks lowest ✓
---
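The four expected behaviors can be illustrated with the new formula and one made-up stat line per archetype. The numbers in `archetypes` are hypothetical and chosen only to represent each category; they are not real player data, and a different choice (for example, a low-rate player with many home throws) could change the order.

```python
def new_raw_score(assist_rate, home_throws, batter_extra_outs, total_assists):
    """Rate-dominant formula with the new weights (assist_rate is a fraction)."""
    return (
        assist_rate * 300
        + home_throws * 1.0
        + batter_extra_outs * 1.0
        + total_assists * 0.1
    )


# Hypothetical stat lines:
# (assist_rate, home_throws, batter_extra_outs, total_assists)
archetypes = {
    "high rate, high volume": (0.080, 6, 3, 15),
    "high rate, low volume": (0.080, 2, 1, 5),
    "low rate, high volume": (0.025, 8, 0, 10),
    "low rate, low volume": (0.015, 1, 0, 2),
}

# Sort archetype labels from highest to lowest raw score.
ranking = sorted(
    archetypes,
    key=lambda label: new_raw_score(*archetypes[label]),
    reverse=True,
)
for label in ranking:
    print(f"{label}: {new_raw_score(*archetypes[label]):.1f}")
```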
## Validation Questions
Run `python test_retrosheet_arms.py` and check:
1. ✓ Are players with 8%+ assist rates in the top tier?
2. ✓ Are players with <2% assist rates in the bottom tier?
3. Do home throws provide meaningful but not dominant bonuses?
4. Is the distribution still normal (-6 to +5)?
5. Do known strong arms (Ichiro, Guerrero) rank appropriately?
---
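A quick sanity check for the first two validation questions is to confirm that ranking by score agrees with ranking by assist rate. The sketch below does not call `test_retrosheet_arms.py`, whose output format is not shown here; it only uses the four rows from the old and new results tables, and `rank_agrees_with_rate` is a hypothetical helper.

```python
# (player, assist_rate, score) rows copied from the results tables.
old_results = [
    ("Lieberthal", 0.0287, 25.86),
    ("Matthews", 0.0266, 23.30),
    ("gathj001", 0.0667, 21.00),
    ("Taguchi", 0.0864, 20.88),
]
new_results = [
    ("Taguchi", 0.0864, 26.52),
    ("gathj001", 0.0667, 24.61),
    ("Lieberthal", 0.0287, 21.11),
    ("Matthews", 0.0266, 19.69),
]


def rank_agrees_with_rate(results):
    """True if ordering by score matches ordering by assist rate."""
    by_score = [r[0] for r in sorted(results, key=lambda r: r[2], reverse=True)]
    by_rate = [r[0] for r in sorted(results, key=lambda r: r[1], reverse=True)]
    return by_score == by_rate


print("old formula agrees with rate:", rank_agrees_with_rate(old_results))
print("new formula agrees with rate:", rank_agrees_with_rate(new_results))
```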
**Updated:** 2025-11-15
**Status:** Rate-dominant formula implemented
**Next Step:** Run validation script to confirm distribution