# Code Review: game_engine.py **Date**: 2025-01-19 **Reviewer**: Engineer Agent **File**: `/mnt/NV2/Development/strat-gameplay-webapp/backend/app/core/game_engine.py` **Lines**: 1,123 ## Summary Comprehensive review identified 24 issues. The game engine is well-structured but has architectural concerns that could cause production issues. --- ## Critical Issues (3) ### 1. Silent Error Swallowing in Batch Save (Lines 903-906) **Severity**: CRITICAL **Impact**: Audit data loss - dice rolls not persisted to database ```python except Exception as e: logger.error(f"Error batch saving inning rolls: {e}") # Silently continues - rolls are lost! ``` **Problem**: When database save fails, the error is logged but the method continues. This means dice rolls are lost with no indication to the caller. **Fix**: Re-raise the exception or return a failure indicator so callers can handle appropriately. --- ### 2. Race Condition in Decision Workflow (Lines 220-227, 256-263) **Severity**: CRITICAL **Impact**: Concurrent decision submissions could conflict ```python # In submit_defensive_setup: state.pending_defensive_decision = decision state.decisions_this_play['defense'] = True # In submit_offensive_decision: state.pending_offensive_decision = decision state.decisions_this_play['offense'] = True ``` **Problem**: No mutex/lock protecting the decision submission flow. If both managers submit simultaneously, state could become inconsistent. **Fix**: Add asyncio.Lock per game to serialize decision submissions: ```python async with self._game_locks[game_id]: # decision submission logic ``` --- ### 3. Memory Leak in _rolls_this_inning (Line 44) **Severity**: CRITICAL **Impact**: Unbounded dictionary growth for completed games ```python self._rolls_this_inning: dict[str, list[AbRoll]] = {} ``` **Problem**: Rolls are accumulated per game_id but never cleaned up when games complete. Over time, this causes memory growth. **Fix**: Clear entries when game completes in `_handle_game_end`: ```python if game_id in self._rolls_this_inning: del self._rolls_this_inning[game_id] ``` --- ## High Severity Issues (8) ### 4. No Transaction Handling (Lines 465-512) **Severity**: HIGH **Impact**: Partial database state on failure Multi-step database operations (save play, update game, update lineups) are not wrapped in a transaction. If any step fails, database is left in inconsistent state. **Fix**: Use SQLAlchemy session transaction context. --- ### 5. Code Duplication - Resolution Methods **Severity**: HIGH **Impact**: Maintenance burden, divergent behavior `resolve_play_auto` and `resolve_play_manual` share ~70% of their logic but are separate methods. Changes must be made in both places. **Fix**: Extract common logic to private method like `_resolve_play_common()`. --- ### 6. Missing Input Validation - submit_defensive_setup (Line 210-240) **Severity**: HIGH **Impact**: Invalid state could be accepted No validation that: - `hold_runners` bases actually have runners - Positioning values are valid enums **Fix**: Add validation before accepting decision. --- ### 7. Missing Input Validation - submit_offensive_decision (Line 245-280) **Severity**: HIGH **Impact**: Invalid steal attempts could be submitted No validation that `steal_attempts` bases actually have runners. **Fix**: Validate steal_attempts against current runner state. --- ### 8. Hardcoded Inning Limit (Line 580) **Severity**: HIGH **Impact**: Can't configure game length ```python if state.inning >= 9 and state.half == 'bottom': ``` **Fix**: Move to configuration or league settings. --- ### 9. No Cleanup on Game Abandon **Severity**: HIGH **Impact**: Resources not released for abandoned games When a game is abandoned, `_rolls_this_inning` and other per-game state is not cleaned up. **Fix**: Add cleanup method called on game end/abandon. --- ### 10. Direct State Mutation Throughout **Severity**: HIGH **Impact**: Hard to track state changes, no audit trail State is mutated directly throughout the code. No clear boundaries or logging of what changed. **Fix**: Consider state mutation through defined methods with logging. --- ### 11. Logger Instance Per Method Call Possible **Severity**: HIGH **Impact**: Performance overhead If module-level logger is recreated, could cause performance issues. **Fix**: Verify logger is module-level singleton. --- ## Medium Severity Issues (9) ### 12. Long Methods (>100 lines) - `resolve_play_auto`: ~120 lines - `resolve_play_manual`: ~100 lines - `_advance_to_next_play`: ~80 lines **Fix**: Extract sub-methods for readability. --- ### 13. Magic Numbers - Line 580: `9` (innings) - Line 645: `3` (outs) - Various timeout values **Fix**: Define as named constants or configuration. --- ### 14. Missing Type Hints Some internal methods lack return type hints or parameter types. **Fix**: Add comprehensive type hints. --- ### 15. Inconsistent Error Handling Some methods raise exceptions, others return None, others log and continue. **Fix**: Establish consistent error handling pattern. --- ### 16. No Retry Logic for Database Operations Database saves could fail transiently but no retry mechanism exists. **Fix**: Add retry with exponential backoff for transient failures. --- ### 17. Tight Coupling to PlayResolver GameEngine directly instantiates PlayResolver. Hard to mock for testing. **Fix**: Inject PlayResolver as dependency. --- ### 18. No Metrics/Observability No performance metrics, timing, or counters for monitoring. **Fix**: Add instrumentation for production monitoring. --- ### 19. Comments Could Be Docstrings Several inline comments explain method behavior but aren't in docstring format. **Fix**: Convert to proper docstrings. --- ### 20. Potential Division by Zero In statistics calculations, no guard against zero denominators. **Fix**: Add zero checks. --- ## Low Severity Issues (4) ### 21. Unused Imports May have imports not used in the file. **Fix**: Run import linter. --- ### 22. Variable Naming Some variable names are not descriptive (e.g., `d` for decision). **Fix**: Use descriptive names. --- ### 23. No __all__ Export Module doesn't define `__all__` for explicit public API. **Fix**: Add `__all__` list. --- ### 24. Test Helper Methods Some methods seem designed for testing but are public API. **Fix**: Prefix with `_` or move to test utilities. --- ## Remediation Priority ### Immediate (Before Production) 1. Fix silent error swallowing (Issue #1) 2. Add game locks for race conditions (Issue #2) 3. Implement memory cleanup (Issue #3) ### Next Sprint 4. Add transaction handling (Issue #4) 5. Extract common resolution logic (Issue #5) 6. Add input validation (Issues #6, #7) 7. Make inning limit configurable (Issue #8) ### Technical Debt - Remaining medium/low issues - Consider overall architectural refactor for testability --- ## Status - [x] Review completed - [x] Critical issues fixed (2025-01-19) - Issue #1: Re-raise exception in `_batch_save_inning_rolls` - Issue #2: Added `_game_locks` dict and `_get_game_lock()` method, wrapped decision submissions with `async with` - Issue #3: Added `_cleanup_game_resources()` method, called on game completion in `resolve_play`, `resolve_manual_play`, and `end_game` - [x] High issue #4 fixed (2025-01-19) - Added optional `session` parameter to `save_play` and `update_game_state` in DatabaseOperations - Wrapped multi-step DB ops in `resolve_play` and `resolve_manual_play` with single transaction - Transaction commits atomically or rolls back entirely on failure - [x] High issue #5 fixed (2025-11-19) - Extracted common logic to `_finalize_play()` method - `resolve_play` reduced from ~150 lines to ~60 lines - `resolve_manual_play` reduced from ~135 lines to ~60 lines - Single source of truth for: roll tracking, state capture, transaction handling, inning advance, cleanup - [x] High issues #6, #7 already implemented (2025-11-19) - Validation exists in validators.py: hold_runners validates against occupied bases (lines 71-77) - Validation exists in validators.py: steal_attempts validates against runner state (lines 156-165) - Validators called in submit_defensive_decision (line 225) and submit_offensive_decision (line 263) - [x] High issue #8 deferred to next sprint (2025-11-19) - Hardcoded inning limit (9) used in validators.py and game_models.py - Requires config system changes - appropriate for technical debt phase - [x] High issue #9 already fixed (part of Issue #3) - `_cleanup_game_resources()` called in `end_game()` at line 1109 - [x] High issue #10 acknowledged (2025-11-19) - Direct state mutation is current architecture pattern - Would require significant refactor for immutable state pattern - State changes are logged at debug level throughout - Consider for v2 if auditability becomes a requirement - [x] High issue #11 verified correct (2025-11-19) - Logger is module-level singleton at line 34 (outside class) - Pattern is correct: `logger = logging.getLogger(f'{__name__}.GameEngine')` - [ ] Medium issues addressed - [ ] Low issues addressed **Tests**: 739/739 passing after fixes (100%) **Last Updated**: 2025-11-19