# Plan 005: Replace Broad Exception Handling **Priority**: HIGH **Effort**: 2-3 hours **Status**: NOT STARTED **Risk Level**: MEDIUM - Hides bugs --- ## Problem Statement Multiple locations catch bare `Exception` instead of specific types, which: - Hides bugs by catching `SystemExit`, `KeyboardInterrupt` - Can't distinguish recoverable vs fatal errors - Swallows development errors that should fail fast **Locations Identified** (6+ instances): - `backend/app/core/substitution_manager.py` - `backend/app/core/game_engine.py` - `backend/app/api/routes.py` - `backend/app/websocket/handlers.py` ## Impact - **Debugging**: Hard to diagnose root cause of failures - **Reliability**: Silent failures instead of clear errors - **Development**: Bugs hidden during development ## Files to Modify | File | Instances | |------|-----------| | `backend/app/core/substitution_manager.py` | 2-3 | | `backend/app/core/game_engine.py` | 1-2 | | `backend/app/api/routes.py` | 1-2 | | `backend/app/websocket/handlers.py` | 2-3 | ## Implementation Steps ### Step 1: Identify Specific Exceptions (30 min) Map each `except Exception` to specific types: ```python # Database operations from sqlalchemy.exc import SQLAlchemyError, IntegrityError, OperationalError # Validation from pydantic import ValidationError # Async operations import asyncio asyncio.TimeoutError asyncio.CancelledError # HTTP client import httpx httpx.HTTPError httpx.TimeoutException # Custom application errors from app.core.exceptions import ( GameNotFoundError, InvalidStateError, AuthorizationError ) ``` ### Step 2: Create Custom Exception Classes (20 min) Create `backend/app/core/exceptions.py`: ```python """Custom exceptions for the game engine.""" class GameEngineError(Exception): """Base class for game engine errors.""" pass class GameNotFoundError(GameEngineError): """Raised when game doesn't exist.""" def __init__(self, game_id): self.game_id = game_id super().__init__(f"Game not found: {game_id}") class InvalidStateError(GameEngineError): """Raised when game is in invalid state for operation.""" def __init__(self, message: str, current_state: str = None): self.current_state = current_state super().__init__(message) class SubstitutionError(GameEngineError): """Raised when substitution is invalid.""" pass class AuthorizationError(GameEngineError): """Raised when user lacks permission.""" pass class ExternalAPIError(GameEngineError): """Raised when external API call fails.""" def __init__(self, service: str, message: str): self.service = service super().__init__(f"{service} API error: {message}") ``` ### Step 3: Update substitution_manager.py (30 min) **Before**: ```python except Exception as e: logger.error(f"Substitution failed: {e}") return SubstitutionResult(success=False, error=str(e)) ``` **After**: ```python from sqlalchemy.exc import SQLAlchemyError from pydantic import ValidationError from app.core.exceptions import SubstitutionError, InvalidStateError try: # ... substitution logic ... except ValidationError as e: logger.warning(f"Invalid substitution data: {e}") return SubstitutionResult(success=False, error="Invalid substitution data") except SQLAlchemyError as e: logger.error(f"Database error during substitution: {e}") return SubstitutionResult(success=False, error="Database error") except SubstitutionError as e: logger.info(f"Substitution rejected: {e}") return SubstitutionResult(success=False, error=str(e)) except InvalidStateError as e: logger.warning(f"Invalid game state for substitution: {e}") return SubstitutionResult(success=False, error=str(e)) # Let unexpected errors propagate! ``` ### Step 4: Update game_engine.py (30 min) **Before**: ```python except Exception as e: logger.error(f"Play resolution failed: {e}") raise ``` **After**: ```python from sqlalchemy.exc import SQLAlchemyError from app.core.exceptions import InvalidStateError, GameNotFoundError try: # ... play resolution logic ... except InvalidStateError as e: logger.warning(f"Invalid state for play: {e}") raise except SQLAlchemyError as e: logger.error(f"Database error during play resolution: {e}") raise InvalidStateError(f"Database error: {e}") except asyncio.TimeoutError: logger.error("Play resolution timed out") raise InvalidStateError("Operation timed out") # Let unexpected errors propagate with full traceback! ``` ### Step 5: Update WebSocket handlers.py (30 min) **Before**: ```python @sio.event async def submit_defensive_decision(sid, data): try: # ... handler logic ... except Exception as e: logger.error(f"Error in submit_defensive_decision: {e}") await sio.emit("error", {"message": "Internal error"}, to=sid) ``` **After**: ```python from pydantic import ValidationError from sqlalchemy.exc import SQLAlchemyError from app.core.exceptions import ( GameNotFoundError, InvalidStateError, AuthorizationError ) @sio.event async def submit_defensive_decision(sid, data): try: # ... handler logic ... except ValidationError as e: logger.info(f"Invalid decision data from {sid}: {e}") await sio.emit("error", {"message": "Invalid decision format"}, to=sid) except GameNotFoundError as e: logger.warning(f"Game not found: {e.game_id}") await sio.emit("error", {"message": "Game not found"}, to=sid) except InvalidStateError as e: logger.info(f"Invalid state for decision: {e}") await sio.emit("error", {"message": str(e)}, to=sid) except AuthorizationError as e: logger.warning(f"Unauthorized action from {sid}: {e}") await sio.emit("error", {"message": "Not authorized"}, to=sid) except SQLAlchemyError as e: logger.error(f"Database error in handler: {e}") await sio.emit("error", {"message": "Database error"}, to=sid) # Let unexpected errors propagate to Socket.io error handler! ``` ### Step 6: Add Global Error Handler (20 min) Update `backend/app/main.py`: ```python from fastapi import FastAPI, Request from fastapi.responses import JSONResponse from app.core.exceptions import GameEngineError @app.exception_handler(GameEngineError) async def game_engine_exception_handler(request: Request, exc: GameEngineError): """Handle all game engine errors.""" return JSONResponse( status_code=400, content={"error": exc.__class__.__name__, "message": str(exc)} ) # Socket.io error handler @sio.event async def error(sid, error): """Global Socket.io error handler.""" logger.error(f"Unhandled Socket.io error for {sid}: {error}") ``` ### Step 7: Write Tests (30 min) Create `backend/tests/unit/core/test_exception_handling.py`: ```python import pytest from app.core.exceptions import ( GameNotFoundError, InvalidStateError, SubstitutionError ) class TestExceptionHandling: """Tests for custom exception handling.""" def test_game_not_found_includes_id(self): """GameNotFoundError includes game ID.""" exc = GameNotFoundError("abc-123") assert "abc-123" in str(exc) assert exc.game_id == "abc-123" def test_invalid_state_includes_current_state(self): """InvalidStateError can include current state.""" exc = InvalidStateError("Cannot roll", current_state="waiting_decision") assert exc.current_state == "waiting_decision" @pytest.mark.asyncio async def test_handler_returns_specific_error_for_validation(self): """Handler returns validation-specific error message.""" # Test that ValidationError is caught and converted pass @pytest.mark.asyncio async def test_unexpected_errors_propagate(self): """Unexpected exceptions are not swallowed.""" # Test that RuntimeError propagates up pass ``` ## Verification Checklist - [ ] All `except Exception` replaced with specific types - [ ] Custom exception classes created - [ ] Error messages are user-friendly (no stack traces) - [ ] Unexpected errors propagate (not swallowed) - [ ] Logging includes appropriate level (info/warning/error) - [ ] Tests verify exception handling ## Pattern Reference ```python # GOOD: Specific exceptions with appropriate actions try: await db_ops.save_play(play_data) except IntegrityError: logger.warning("Duplicate play detected") raise InvalidStateError("Play already exists") except OperationalError: logger.error("Database connection failed") raise # Let it propagate for retry logic # BAD: Bare exception catch try: await db_ops.save_play(play_data) except Exception as e: logger.error(f"Error: {e}") return None # Swallows the error! ``` ## Rollback Plan If issues arise: 1. Revert specific file changes 2. Add temporary broad catch as last resort 3. Log the unexpected exception type for future handling ## Dependencies - None (can be implemented independently) ## Notes - Consider adding exception tracking (Sentry) for production - May want to add retry logic for transient errors - Future: Add error codes for client-side handling