strat-gameplay-webapp/.claude/implementation/STAT_SYSTEM_MATERIALIZED_VIEWS.md
Cal Corum b5677d0c55 CLAUDE: Phase 3.5 Planning - Code Polish & Statistics System
Completed comprehensive planning for Phase 3.5 with focus on production
readiness through materialized views approach for statistics.

Planning Documents Created:
- STAT_SYSTEM_ANALYSIS.md: Analysis of existing major-domo schema
  * Reviewed legacy BattingStat/PitchingStat tables (deprecated)
  * Analyzed existing /plays/batting and /plays/pitching endpoints
  * Evaluated 3 approaches (legacy port, modern, hybrid)

- STAT_SYSTEM_MATERIALIZED_VIEWS.md: Recommended approach
  * PostgreSQL materialized views (following major-domo pattern)
  * Add stat fields to plays table (18 new columns)
  * 3 views: batting_game_stats, pitching_game_stats, game_stats
  * PlayStatCalculator service (~150 lines vs 400+ for StatTracker)
  * 80% less code, single source of truth, always consistent

- phase-3.5-polish-stats.md: Complete implementation plan
  * Task 1: Game Statistics System (materialized views)
  * Task 2: Authorization Framework (WebSocket security)
  * Task 3: Uncapped Hit Decision Trees
  * Task 4: Code Cleanup (remove TODOs, integrate features)
  * Task 5: Integration Test Infrastructure
  * Estimated: 16-24 hours (2-3 days)

NEXT_SESSION.md Updates:
- Phase 3.5 ready to begin (0% → implementation phase)
- Complete task breakdown with acceptance criteria
- Materialized view approach detailed
- Commit strategy for 3 separate commits
- Files to review before starting

Implementation Status Updates:
- Phase 3: 100% Complete (688 tests passing)
- Phase 3F: Substitution system fully tested
- Phase 3.5: Planning complete, ready for implementation
- Updated component status table with Phase 3 completion

Key Decisions:
- Use materialized views (not separate stat tables)
- Add stat fields to plays table
- Refresh views after game completion + on-demand
- Use legacy field names (pa, ab, run, hit) for compatibility
- Skip experimental fields (bphr, xba, etc.) for MVP

Benefits of Materialized Views:
- 80% less code (~400 lines → ~150 lines)
- Single source of truth (plays table)
- Always consistent (stats derived, not tracked)
- Follows existing major-domo pattern
- PostgreSQL optimized (indexed, cached)

Next Steps:
1. Implement PlayStatCalculator (map PlayOutcome → stats)
2. Add stat fields to plays table (migration 004)
3. Create materialized views (migration 005)
4. Create BoxScoreService (query views)
5. Refresh logic after game completion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06 16:08:23 -06:00

22 KiB

Statistics System: Materialized Views Approach

Date: 2025-11-06 Approach: PostgreSQL Materialized Views (following legacy API pattern)


Why Materialized Views?

Legacy API Pattern

The existing major-domo API already does this:

  • /plays/batting - Aggregates plays on-the-fly with SQL SUM/COUNT
  • /plays/pitching - Aggregates plays on-the-fly with SQL SUM/COUNT
  • BattingStat/PitchingStat tables - Deprecated in favor of play aggregation

Benefits

  1. Single source of truth: Plays table is the only data we write
  2. No stat tracking code: PostgreSQL does the aggregation
  3. Always consistent: Stats are derived, not stored separately
  4. Fast queries: Materialized views are indexed and cached
  5. Refresh on demand: Update views when needed (after games, on schedule)
  6. No sync issues: Can't have play/stat mismatches

Drawbacks

  • ⚠️ Refresh required: Views must be refreshed to show latest data
  • ⚠️ PostgreSQL specific: Can't easily port to other databases
  • ⚠️ Migration complexity: Need to manage view schema changes

Verdict: Benefits far outweigh drawbacks for this use case


Proposed Schema

Existing Tables (Already Have)

-- We already have these in the web app
CREATE TABLE games (
    id UUID PRIMARY KEY,
    league_id VARCHAR(50),
    home_team_id INT,
    away_team_id INT,
    status VARCHAR(20),
    current_inning INT,
    current_half VARCHAR(3),
    home_score INT,
    away_score INT,
    -- ... other game fields
);

CREATE TABLE lineups (
    id SERIAL PRIMARY KEY,
    game_id UUID REFERENCES games(id),
    card_id INT,  -- Player card ID
    team_id INT,
    position VARCHAR(3),
    batting_order INT,
    is_active BOOLEAN,
    is_starter BOOLEAN,
    -- ... other lineup fields
);

CREATE TABLE plays (
    id SERIAL PRIMARY KEY,
    game_id UUID REFERENCES games(id),
    play_num INT,
    inning INT,
    half VARCHAR(3),
    batter_lineup_id INT REFERENCES lineups(id),
    pitcher_lineup_id INT REFERENCES lineups(id),
    play_result VARCHAR(50),  -- PlayOutcome enum value
    -- ... other play fields (runners, outs, scores, etc.)
);

Need to add to Plays table (currently missing):

ALTER TABLE plays ADD COLUMN IF NOT EXISTS pa INT DEFAULT 0;           -- Plate appearance (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS ab INT DEFAULT 0;           -- At bat (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS hit INT DEFAULT 0;          -- Hit (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS run INT DEFAULT 0;          -- Runs scored this play
ALTER TABLE plays ADD COLUMN IF NOT EXISTS rbi INT DEFAULT 0;          -- RBIs this play
ALTER TABLE plays ADD COLUMN IF NOT EXISTS double INT DEFAULT 0;       -- Double (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS triple INT DEFAULT 0;       -- Triple (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS hr INT DEFAULT 0;           -- Home run (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS bb INT DEFAULT 0;           -- Walk (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS so INT DEFAULT 0;           -- Strikeout (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS hbp INT DEFAULT 0;          -- Hit by pitch (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS sac INT DEFAULT 0;          -- Sacrifice (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS sb INT DEFAULT 0;           -- Stolen base (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS cs INT DEFAULT 0;           -- Caught stealing (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS gidp INT DEFAULT 0;         -- Ground into DP (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS error INT DEFAULT 0;        -- Error this play
ALTER TABLE plays ADD COLUMN IF NOT EXISTS wp INT DEFAULT 0;           -- Wild pitch (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS pb INT DEFAULT 0;           -- Passed ball (0 or 1)
ALTER TABLE plays ADD COLUMN IF NOT EXISTS outs_recorded INT DEFAULT 0; -- Outs recorded this play

Why add stat fields to plays?

  • Each play records the statistical events that occurred
  • Makes aggregation queries simple and fast
  • Matches legacy StratPlay model exactly
  • Can be calculated from PlayOutcome when play is recorded

New Materialized Views

1. Batting Stats by Player-Game

CREATE MATERIALIZED VIEW batting_game_stats AS
SELECT
    l.game_id,
    l.id AS lineup_id,
    l.card_id AS player_card_id,
    l.team_id,

    -- Aggregate batting stats
    COUNT(*) FILTER (WHERE p.pa = 1) AS pa,
    COUNT(*) FILTER (WHERE p.ab = 1) AS ab,
    SUM(p.run) AS run,
    SUM(p.hit) AS hit,
    SUM(p.rbi) AS rbi,
    SUM(p.double) AS double,
    SUM(p.triple) AS triple,
    SUM(p.hr) AS hr,
    SUM(p.bb) AS bb,
    SUM(p.so) AS so,
    SUM(p.hbp) AS hbp,
    SUM(p.sac) AS sac,
    SUM(p.sb) AS sb,
    SUM(p.cs) AS cs,
    SUM(p.gidp) AS gidp

FROM lineups l
JOIN plays p ON p.batter_lineup_id = l.id
WHERE l.is_active = TRUE OR l.is_starter = TRUE  -- Include all who played
GROUP BY l.game_id, l.id, l.card_id, l.team_id;

-- Indexes for fast lookups
CREATE UNIQUE INDEX idx_batting_game_stats_lineup ON batting_game_stats(lineup_id);
CREATE INDEX idx_batting_game_stats_game ON batting_game_stats(game_id);
CREATE INDEX idx_batting_game_stats_player ON batting_game_stats(player_card_id);

Usage: SELECT * FROM batting_game_stats WHERE game_id = ?


2. Pitching Stats by Player-Game

CREATE MATERIALIZED VIEW pitching_game_stats AS
SELECT
    l.game_id,
    l.id AS lineup_id,
    l.card_id AS player_card_id,
    l.team_id,

    -- Aggregate pitching stats
    COUNT(*) FILTER (WHERE p.pa = 1) AS batters_faced,
    SUM(p.hit) AS hit_allowed,
    SUM(p.run) AS run_allowed,
    -- TODO: Calculate earned runs (need error tracking)
    SUM(p.run) AS erun,  -- Placeholder: all runs as earned for now
    SUM(p.bb) AS bb,
    SUM(p.so) AS so,
    SUM(p.hbp) AS hbp,
    SUM(p.hr) AS hr_allowed,
    SUM(p.wp) AS wp,

    -- Calculate innings pitched from outs
    -- IP = outs / 3.0 (e.g., 16 outs = 5.1 IP)
    (SUM(p.outs_recorded)::float / 3.0) AS ip

FROM lineups l
JOIN plays p ON p.pitcher_lineup_id = l.id
WHERE l.position = 'P' AND (l.is_active = TRUE OR l.is_starter = TRUE)
GROUP BY l.game_id, l.id, l.card_id, l.team_id;

-- Indexes
CREATE UNIQUE INDEX idx_pitching_game_stats_lineup ON pitching_game_stats(lineup_id);
CREATE INDEX idx_pitching_game_stats_game ON pitching_game_stats(game_id);
CREATE INDEX idx_pitching_game_stats_player ON pitching_game_stats(player_card_id);

Usage: SELECT * FROM pitching_game_stats WHERE game_id = ?


3. Team Game Stats (Summary + Linescore)

For box scores, we need team-level aggregates and linescore (runs by inning):

CREATE MATERIALIZED VIEW game_stats AS
WITH inning_scores AS (
    -- Calculate runs scored per team per inning
    SELECT
        game_id,
        inning,
        half,
        SUM(run) AS runs_scored
    FROM plays
    GROUP BY game_id, inning, half
),
linescores AS (
    -- Build linescore JSON arrays
    SELECT
        game_id,
        json_agg(runs_scored ORDER BY inning) FILTER (WHERE half = 'top') AS away_linescore,
        json_agg(runs_scored ORDER BY inning) FILTER (WHERE half = 'bot') AS home_linescore
    FROM inning_scores
    GROUP BY game_id
),
batting_totals AS (
    -- Team batting totals
    SELECT
        game_id,
        l.team_id,
        SUM(p.hit) AS hits,
        SUM(p.error) AS errors
    FROM plays p
    JOIN lineups l ON p.batter_lineup_id = l.id
    GROUP BY game_id, l.team_id
)
SELECT
    g.id AS game_id,
    g.home_score AS home_runs,
    g.away_score AS away_runs,

    -- Linescore
    COALESCE(ls.home_linescore, '[]'::json) AS home_linescore,
    COALESCE(ls.away_linescore, '[]'::json) AS away_linescore,

    -- Home team totals
    MAX(bth.hits) AS home_hits,
    MAX(bth.errors) AS home_errors,

    -- Away team totals
    MAX(bta.hits) AS away_hits,
    MAX(bta.errors) AS away_errors

FROM games g
LEFT JOIN linescores ls ON g.id = ls.game_id
LEFT JOIN batting_totals bth ON g.id = bth.game_id AND bth.team_id = g.home_team_id
LEFT JOIN batting_totals bta ON g.id = bta.game_id AND bta.team_id = g.away_team_id
GROUP BY g.id, ls.home_linescore, ls.away_linescore;

-- Index
CREATE UNIQUE INDEX idx_game_stats_game ON game_stats(game_id);

Usage: SELECT * FROM game_stats WHERE game_id = ?

Linescore Format: [0, 1, 0, 3, 0, 0, 2, 1, 0] (9 innings)


Implementation Strategy

Phase 3.5 Changes

1. Add Stat Fields to Plays Table (Migration)

File: backend/alembic/versions/004_add_play_stat_fields.py

"""Add statistical fields to plays table.

Revision ID: 004
Revises: 003
"""

def upgrade():
    # Add batting stat fields
    op.add_column('plays', sa.Column('pa', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('ab', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('hit', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('run', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('rbi', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('double', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('triple', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('hr', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('bb', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('so', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('hbp', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('sac', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('sb', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('cs', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('gidp', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('error', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('wp', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('pb', sa.Integer(), default=0))
    op.add_column('plays', sa.Column('outs_recorded', sa.Integer(), default=0))

def downgrade():
    # Remove all stat columns
    op.drop_column('plays', 'pa')
    # ... etc

2. Update Play Model

File: backend/app/models/db_models.py

class Play(Base):
    __tablename__ = "plays"

    # Existing fields
    id = Column(Integer, primary_key=True)
    game_id = Column(UUID, ForeignKey("games.id"))
    # ... etc

    # NEW: Statistical fields (added in migration 004)
    pa = Column(Integer, default=0)
    ab = Column(Integer, default=0)
    hit = Column(Integer, default=0)
    run = Column(Integer, default=0)
    rbi = Column(Integer, default=0)
    double = Column(Integer, default=0)
    triple = Column(Integer, default=0)
    hr = Column(Integer, default=0)
    bb = Column(Integer, default=0)
    so = Column(Integer, default=0)
    hbp = Column(Integer, default=0)
    sac = Column(Integer, default=0)
    sb = Column(Integer, default=0)
    cs = Column(Integer, default=0)
    gidp = Column(Integer, default=0)
    error = Column(Integer, default=0)
    wp = Column(Integer, default=0)
    pb = Column(Integer, default=0)
    outs_recorded = Column(Integer, default=0)

3. Create Stat Calculator

File: backend/app/services/play_stat_calculator.py (NEW)

"""
Calculate statistical fields from PlayOutcome.

Called when recording a play to populate stat fields.
"""
from app.config.play_outcome import PlayOutcome
from app.models.game_models import PlayResult, GameState

class PlayStatCalculator:
    """
    Converts play outcome to statistical fields for database.
    """

    @staticmethod
    def calculate_stats(
        outcome: PlayOutcome,
        result: PlayResult,
        state_before: GameState,
        state_after: GameState
    ) -> dict:
        """
        Calculate all stat fields for a play.

        Returns:
            dict with stat fields (pa, ab, hit, run, etc.)
        """
        stats = {
            'pa': 0, 'ab': 0, 'hit': 0, 'run': 0, 'rbi': 0,
            'double': 0, 'triple': 0, 'hr': 0, 'bb': 0, 'so': 0,
            'hbp': 0, 'sac': 0, 'sb': 0, 'cs': 0, 'gidp': 0,
            'error': 0, 'wp': 0, 'pb': 0, 'outs_recorded': 0
        }

        # Plate appearance (almost always 1, except...)
        if outcome not in [PlayOutcome.STOLEN_BASE_SUCCESS,
                           PlayOutcome.CAUGHT_STEALING,
                           PlayOutcome.WILD_PITCH,
                           PlayOutcome.PASSED_BALL]:
            stats['pa'] = 1

        # At bat (PA minus walks, HBP, sac)
        if stats['pa'] == 1 and outcome not in [
            PlayOutcome.WALK, PlayOutcome.HBP, PlayOutcome.SACRIFICE
        ]:
            stats['ab'] = 1

        # Hits
        if outcome in PlayOutcome.get_hit_outcomes():
            stats['hit'] = 1

            if outcome in PlayOutcome.get_double_outcomes():
                stats['double'] = 1
            elif outcome in PlayOutcome.get_triple_outcomes():
                stats['triple'] = 1
            elif outcome in PlayOutcome.get_homerun_outcomes():
                stats['hr'] = 1

        # Other outcomes
        if outcome == PlayOutcome.WALK:
            stats['bb'] = 1
        elif outcome == PlayOutcome.STRIKEOUT:
            stats['so'] = 1
        elif outcome == PlayOutcome.HBP:
            stats['hbp'] = 1
        elif outcome == PlayOutcome.SACRIFICE:
            stats['sac'] = 1
        elif outcome == PlayOutcome.STOLEN_BASE_SUCCESS:
            stats['sb'] = 1
        elif outcome == PlayOutcome.CAUGHT_STEALING:
            stats['cs'] = 1
        elif outcome in [PlayOutcome.GROUNDBALL_GDP]:
            stats['gidp'] = 1
        elif outcome == PlayOutcome.WILD_PITCH:
            stats['wp'] = 1
        elif outcome == PlayOutcome.PASSED_BALL:
            stats['pb'] = 1

        # Calculate runs and RBIs from state change
        stats['run'] = state_after.home_score - state_before.home_score
        if state_after.away_score > state_before.away_score:
            stats['run'] = state_after.away_score - state_before.away_score

        # RBI logic (runs scored minus errors)
        if not result.error_occurred:
            stats['rbi'] = stats['run']

        # Outs recorded
        stats['outs_recorded'] = state_after.outs - state_before.outs

        return stats

4. Update GameEngine to Calculate Stats

File: backend/app/core/game_engine.py

from app.services.play_stat_calculator import PlayStatCalculator

class GameEngine:
    async def record_play(
        self,
        game_id: UUID,
        outcome: PlayOutcome,
        ...
    ) -> PlayResult:
        """Record play and calculate stats."""

        state_before = self.state_manager.get_state(game_id)

        # Resolve play (existing logic)
        result = self._resolve_play(...)

        state_after = self.state_manager.get_state(game_id)

        # NEW: Calculate stat fields
        stats = PlayStatCalculator.calculate_stats(
            outcome, result, state_before, state_after
        )

        # Save play to database with stats
        await self.db_ops.create_play(
            game_id=game_id,
            play_num=state_after.play_count,
            inning=state_after.current_inning,
            half=state_after.current_half,
            batter_lineup_id=state_before.current_batter.lineup_id,
            pitcher_lineup_id=state_before.current_pitcher.lineup_id,
            play_result=outcome.value,
            # NEW: Add stat fields
            **stats
        )

        return result

5. Create Materialized Views (Migration)

File: backend/alembic/versions/005_create_stat_views.py

"""Create materialized views for statistics.

Revision ID: 005
Revises: 004
"""

def upgrade():
    # Create batting_game_stats view
    op.execute("""
        CREATE MATERIALIZED VIEW batting_game_stats AS
        SELECT
            l.game_id,
            l.id AS lineup_id,
            l.card_id AS player_card_id,
            l.team_id,
            COUNT(*) FILTER (WHERE p.pa = 1) AS pa,
            COUNT(*) FILTER (WHERE p.ab = 1) AS ab,
            SUM(p.run) AS run,
            SUM(p.hit) AS hit,
            SUM(p.rbi) AS rbi,
            SUM(p.double) AS double,
            SUM(p.triple) AS triple,
            SUM(p.hr) AS hr,
            SUM(p.bb) AS bb,
            SUM(p.so) AS so,
            SUM(p.hbp) AS hbp,
            SUM(p.sac) AS sac,
            SUM(p.sb) AS sb,
            SUM(p.cs) AS cs,
            SUM(p.gidp) AS gidp
        FROM lineups l
        JOIN plays p ON p.batter_lineup_id = l.id
        GROUP BY l.game_id, l.id, l.card_id, l.team_id
    """)

    op.execute("""
        CREATE UNIQUE INDEX idx_batting_game_stats_lineup
        ON batting_game_stats(lineup_id)
    """)

    # Create pitching_game_stats view
    op.execute("""
        CREATE MATERIALIZED VIEW pitching_game_stats AS
        ... (full SQL from above)
    """)

    # Create game_stats view
    op.execute("""
        CREATE MATERIALIZED VIEW game_stats AS
        ... (full SQL from above)
    """)

def downgrade():
    op.execute("DROP MATERIALIZED VIEW IF EXISTS batting_game_stats")
    op.execute("DROP MATERIALIZED VIEW IF EXISTS pitching_game_stats")
    op.execute("DROP MATERIALIZED VIEW IF EXISTS game_stats")

6. Create View Refresh Service

File: backend/app/services/stat_view_refresher.py (NEW)

"""
Service to refresh materialized views.
"""
from sqlalchemy import text
from app.database.database import async_session

class StatViewRefresher:
    """Manages refreshing of statistical materialized views."""

    @staticmethod
    async def refresh_game_stats(game_id: UUID) -> None:
        """
        Refresh stats for a specific game (partial refresh).

        Note: PostgreSQL doesn't support partial refresh of materialized views.
        We'd need to implement this with regular views or partitioning.
        For now, refresh all views after game completion.
        """
        await StatViewRefresher.refresh_all()

    @staticmethod
    async def refresh_all() -> None:
        """Refresh all statistical views."""
        async with async_session() as session:
            await session.execute(text("REFRESH MATERIALIZED VIEW CONCURRENTLY batting_game_stats"))
            await session.execute(text("REFRESH MATERIALIZED VIEW CONCURRENTLY pitching_game_stats"))
            await session.execute(text("REFRESH MATERIALIZED VIEW CONCURRENTLY game_stats"))
            await session.commit()

7. Box Score Retrieval

File: backend/app/services/box_score_service.py (NEW)

"""
Service to retrieve formatted box scores.
"""
from sqlalchemy import text
from app.database.database import async_session

class BoxScoreService:
    """Retrieves box score data from materialized views."""

    @staticmethod
    async def get_box_score(game_id: UUID) -> dict:
        """
        Get complete box score for a game.

        Returns dict with:
            - game_stats: Team totals and linescore
            - batting_stats: List of batting lines
            - pitching_stats: List of pitching lines
        """
        async with async_session() as session:
            # Get game stats
            game_query = text("""
                SELECT * FROM game_stats WHERE game_id = :game_id
            """)
            game_result = await session.execute(game_query, {"game_id": str(game_id)})
            game_stats = game_result.fetchone()._asdict() if game_result else {}

            # Get batting stats
            batting_query = text("""
                SELECT * FROM batting_game_stats WHERE game_id = :game_id
                ORDER BY lineup_id
            """)
            batting_result = await session.execute(batting_query, {"game_id": str(game_id)})
            batting_stats = [row._asdict() for row in batting_result]

            # Get pitching stats
            pitching_query = text("""
                SELECT * FROM pitching_game_stats WHERE game_id = :game_id
                ORDER BY lineup_id
            """)
            pitching_result = await session.execute(pitching_query, {"game_id": str(game_id)})
            pitching_stats = [row._asdict() for row in pitching_result]

            return {
                'game_stats': game_stats,
                'batting_stats': batting_stats,
                'pitching_stats': pitching_stats
            }

Refresh Strategy

When to Refresh Views?

Option 1: After Each Play (Real-time)

  • Pros: Always up-to-date
  • Cons: Performance impact, too frequent

Option 2: After Game Completion (Recommended for MVP)

  • Pros: Good balance of freshness and performance
  • Cons: Stats not visible until game ends

Option 3: On Demand (Terminal client)

  • Pros: User controls refresh
  • Cons: Stale data between refreshes

Option 4: Scheduled (Every 5 minutes)

  • Pros: Predictable, low impact
  • Cons: 5-minute staleness

Recommended: Option 2 (game completion) + Option 3 (on-demand for testing)


Benefits of This Approach

  1. No Stat Tracker Needed: Eliminated ~400 lines of stat tracking code
  2. Single Source of Truth: Plays table is the only data we write
  3. Consistency Guaranteed: Stats are always derived from plays
  4. Simple Logic: PlayStatCalculator is straightforward field mapping
  5. SQL Optimized: PostgreSQL handles aggregation efficiently
  6. Flexible Queries: Can create any stat aggregation we want with new views
  7. Audit Trail: Play-by-play history is complete record

Migration Path

  1. Add stat fields to plays table (migration 004)
  2. Update Play model
  3. Create PlayStatCalculator
  4. Modify GameEngine to populate stat fields when recording plays
  5. Create materialized views (migration 005)
  6. Create BoxScoreService to query views
  7. Add refresh logic (after game completion)
  8. Update terminal client to show box scores

Estimated Time: 4-5 hours (much less than original 6-8 hours!)


Questions for Approval

  1. Approve materialized view approach (vs separate stat tables)?
  2. Add stat fields to plays table?
  3. Refresh strategy: after game completion + on-demand?
  4. Use legacy field names (pa, ab, run, hit) for compatibility?
  5. Skip experimental fields (bphr, xba, etc.) for Phase 3.5?

Once approved, I'll update phase-3.5-polish-stats.md with this approach.