# Data Sanitization Template for PostgreSQL Migration

## Template Structure

Each data sanitization issue should follow this standardized format for consistent tracking and resolution.

---

## Issue Template

### Issue ID: [CATEGORY]-[TABLE]-[FIELD]-[NUMBER]

**Example**: `CONSTRAINT-CURRENT-INJURY_COUNT-001`

### 📊 Issue Classification

- **Category**: [SCHEMA|DATA_INTEGRITY|DATA_QUALITY|MIGRATION_LOGIC]
- **Priority**: [CRITICAL|HIGH|MEDIUM|LOW]
- **Impact**: [BLOCKS_MIGRATION|DATA_LOSS|PERFORMANCE|COSMETIC]
- **Table(s)**: [table_name, related_tables]
- **Field(s)**: [field_names]

### 🔍 Problem Description

**What happened:**
Clear description of the error or issue encountered.

**Error Message:**
```
Exact error message from logs
```

**Expected Behavior:**
What should happen in a successful migration.

**Current Behavior:**
What actually happens.

### 📈 Impact Assessment

**Data Affected:**
- Records: X out of Y total
- Percentage: Z%
- Critical data: YES/NO

**Business Impact:**
- User-facing features affected
- Operational impact
- Compliance/audit concerns

### 🔧 Root Cause Analysis

**Technical Cause:**
- SQLite vs. PostgreSQL difference
- Data model assumption
- Migration logic flaw

**Data Source:**
- How did this data get into this state?
- Is this expected or corrupted data?
- Historical context

### 💡 Solution Strategy

**Approach:** [TRANSFORM_DATA|FIX_SCHEMA|MIGRATION_LOGIC|SKIP_TABLE]

**Technical Solution:**
Detailed explanation of how to fix the issue.

**Data Transformation Required:**
```sql
-- Example transformation query
UPDATE table_name
SET field_name = COALESCE(field_name, default_value)
WHERE field_name IS NULL;
```

### ✅ Implementation Plan

**Steps:**
1. [ ] Back up the current state
2. [ ] Implement the fix
3. [ ] Test on sample data
4. [ ] Run a full migration test
5. [ ] Validate results
6. [ ] Document changes

**Rollback Plan:**
How to undo changes if something goes wrong.

### 🧪 Testing Strategy

**Test Cases:**

1.
Happy path: Normal data migrates correctly
2. Edge case: Problem data is handled properly
3. Regression: Previous fixes still work

**Validation Queries:**
```sql
-- Query to verify the fix worked
SELECT COUNT(*) FROM table_name WHERE condition;
```

### 📋 Resolution Status

- **Status**: [IDENTIFIED|IN_PROGRESS|TESTING|RESOLVED|DEFERRED]
- **Assigned To**: [team_member]
- **Date Identified**: YYYY-MM-DD
- **Date Resolved**: YYYY-MM-DD
- **Solution Applied**: [description]

---

## 📚 Example Issues (From Our Testing)

### Issue ID: CONSTRAINT-CURRENT-INJURY_COUNT-001

**Category**: SCHEMA
**Priority**: HIGH
**Impact**: BLOCKS_MIGRATION

**Problem Description:**
The `injury_count` field in the `current` table has NULL values in SQLite, but the PostgreSQL schema requires NOT NULL.

**Error Message:**
```
null value in column "injury_count" of relation "current" violates not-null constraint
```

**Solution Strategy:** TRANSFORM_DATA
```sql
-- Transform NULL values to 0 before migration
UPDATE current SET injury_count = 0 WHERE injury_count IS NULL;
```

**Implementation:**
1. Add a data transformation step to the migration script
2. Set a default value for future records
3. Update the schema if business logic allows NULL

---

### Issue ID: DATA_QUALITY-PLAYER-NAME-001

**Category**: DATA_QUALITY
**Priority**: MEDIUM
**Impact**: DATA_LOSS

**Problem Description:**
Player names exceed the PostgreSQL VARCHAR(255) limit, causing truncation.

**Error Message:**
```
value too long for type character varying(255)
```

**Solution Strategy:** FIX_SCHEMA
```sql
-- Increase column size in PostgreSQL
ALTER TABLE player ALTER COLUMN name TYPE VARCHAR(500);
```

**Implementation:**
1. Analyze maximum string lengths in SQLite
2. Update the PostgreSQL schema with appropriate limits
3. Add validation to prevent future overruns

---

### Issue ID: MIGRATION_LOGIC-TEAM-INTEGER-001

**Category**: MIGRATION_LOGIC
**Priority**: HIGH
**Impact**: BLOCKS_MIGRATION

**Problem Description:**
Large integer values in SQLite exceed the PostgreSQL INTEGER range.
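Such values can be found before the migration is attempted. A minimal pre-flight scan, sketched with Python's standard `sqlite3` module against the example's `team.large_field` column (the sample data here is illustrative, not from a real source database):

```python
import sqlite3

# PostgreSQL INTEGER is a signed 32-bit type; SQLite integers can be 64-bit.
PG_INT_MIN, PG_INT_MAX = -2**31, 2**31 - 1

def find_out_of_range(conn, table, column):
    """Return (rowid, value) pairs that will not fit in a PostgreSQL INTEGER."""
    # Identifiers cannot be bound as parameters; use trusted names only.
    query = (
        f"SELECT rowid, {column} FROM {table} "
        f"WHERE {column} < ? OR {column} > ?"
    )
    return conn.execute(query, (PG_INT_MIN, PG_INT_MAX)).fetchall()

# Demo on an in-memory database with one in-range and one out-of-range value
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team (large_field INTEGER)")
conn.executemany("INSERT INTO team VALUES (?)", [(42,), (2**40,)])
print(find_out_of_range(conn, "team", "large_field"))  # -> [(2, 1099511627776)]
```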
**Error Message:**
```
integer out of range
```

**Solution Strategy:** FIX_SCHEMA
```sql
-- Use BIGINT instead of INTEGER
ALTER TABLE team ALTER COLUMN large_field TYPE BIGINT;
```

**Implementation:**
1. Identify fields with large values
2. Update the schema to use BIGINT
3. Verify no application code assumes INTEGER size

---

## 🎯 Standard Solution Patterns

### Pattern 1: NULL Constraint Violations

```python
# Pre-migration data cleaning.
# Note: table and column names cannot be bound as query parameters, so they
# are interpolated here; only pass trusted, hard-coded identifiers.
def clean_null_constraints(table_name, field_name, default_value):
    query = f"UPDATE {table_name} SET {field_name} = ? WHERE {field_name} IS NULL"
    sqlite_db.execute_sql(query, (default_value,))
```

### Pattern 2: String Length Overruns

```python
# Schema adjustment
def adjust_varchar_limits(table_name, field_name, new_limit):
    query = f"ALTER TABLE {table_name} ALTER COLUMN {field_name} TYPE VARCHAR({new_limit})"
    postgres_db.execute_sql(query)
```

### Pattern 3: Integer Range Issues

```python
# Type upgrade
def upgrade_integer_fields(table_name, field_name):
    query = f"ALTER TABLE {table_name} ALTER COLUMN {field_name} TYPE BIGINT"
    postgres_db.execute_sql(query)
```

### Pattern 4: Missing Table Handling

```python
# Graceful table skipping: a missing source table is logged and skipped,
# while any other failure is re-raised.
def safe_table_migration(model_class):
    try:
        migrate_table_data(model_class)
    except Exception as e:
        if "no such table" in str(e):
            logger.warning(f"Table {model_class._meta.table_name} doesn't exist in source")
            return True
        raise
```

## 📊 Issue Tracking Spreadsheet Template

| Issue ID | Category | Priority | Table | Field | Status | Date Found | Date Fixed | Notes |
|----------|----------|----------|-------|-------|--------|------------|------------|-------|
| CONSTRAINT-CURRENT-INJURY_COUNT-001 | SCHEMA | HIGH | current | injury_count | RESOLVED | 2025-01-15 | 2025-01-15 | Set NULL to 0 |
| DATA_QUALITY-PLAYER-NAME-001 | DATA_QUALITY | MEDIUM | player | name | IN_PROGRESS | 2025-01-15 | | Increase VARCHAR limit |

---

*This template ensures consistent documentation and systematic resolution of
migration issues.*
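
The standard solution patterns above can be combined into a single configurable pre-migration pass. The sketch below applies Pattern 1 with Python's standard `sqlite3` module; the cleanup plan and demo data are illustrative (only `current.injury_count` comes from the examples, the rest is assumed):

```python
import sqlite3

# Illustrative cleanup plan: (table, field, default value for NULLs)
NULL_FIXES = [
    ("current", "injury_count", 0),
]

def run_null_fixes(conn, fixes):
    """Apply the NULL-constraint cleanup pattern for each configured field."""
    fixed = {}
    for table, field, default in fixes:
        # Identifiers are interpolated (not bindable); use trusted names only.
        cur = conn.execute(
            f"UPDATE {table} SET {field} = ? WHERE {field} IS NULL", (default,)
        )
        fixed[f"{table}.{field}"] = cur.rowcount  # rows rewritten per field
    return fixed

# Demo on an in-memory database: two NULL rows get the default applied
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE current (injury_count INTEGER)")
conn.executemany("INSERT INTO current VALUES (?)", [(None,), (2,), (None,)])
print(run_null_fixes(conn, NULL_FIXES))  # -> {'current.injury_count': 2}
```

Reporting the per-field row counts makes the cleanup auditable: the numbers can be copied straight into the "Data Affected" section of an issue.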