# Data Sanitization Template for PostgreSQL Migration

## Template Structure
Each data sanitization issue should follow this standardized format for consistent tracking and resolution.

---

## Issue Template

### Issue ID: [CATEGORY]-[TABLE]-[FIELD]-[NUMBER]
**Example**: `CONSTRAINT-CURRENT-INJURY_COUNT-001`

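Because hyphens separate the four parts while underscores can appear inside a part (`DATA_QUALITY`, `INJURY_COUNT`), an ID in this format can be split mechanically. The following is a minimal sketch; `IssueId` and `parse_issue_id` are hypothetical names, not part of the migration scripts. It deliberately does not check the prefix against the Category list, since the example IDs in this document use prefixes such as `CONSTRAINT`.

```python
import re
from typing import NamedTuple


class IssueId(NamedTuple):
    """Hypothetical container for the four parts of an issue ID."""
    category: str
    table: str
    field: str
    number: int


# Each part is upper-case letters, digits, or underscores; hyphens only separate parts.
PART = r"[A-Z][A-Z0-9_]*"
ISSUE_ID_RE = re.compile(rf"({PART})-({PART})-({PART})-(\d+)")


def parse_issue_id(issue_id: str) -> IssueId:
    """Split an ID such as CONSTRAINT-CURRENT-INJURY_COUNT-001 into its parts."""
    match = ISSUE_ID_RE.fullmatch(issue_id)
    if match is None:
        raise ValueError(f"Malformed issue ID: {issue_id!r}")
    category, table, field, number = match.groups()
    return IssueId(category, table, field, int(number))


print(parse_issue_id("CONSTRAINT-CURRENT-INJURY_COUNT-001"))
```

A parser like this could be run over the tracking table to catch IDs that drift from the format.
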
### 📊 Issue Classification
- **Category**: [SCHEMA|DATA_INTEGRITY|DATA_QUALITY|MIGRATION_LOGIC]
- **Priority**: [CRITICAL|HIGH|MEDIUM|LOW]
- **Impact**: [BLOCKS_MIGRATION|DATA_LOSS|PERFORMANCE|COSMETIC]
- **Table(s)**: [table_name, related_tables]
- **Field(s)**: [field_names]

### 🔍 Problem Description
**What happened:**
Clear description of the error or issue encountered.

**Error Message:**
```
Exact error message from logs
```

**Expected Behavior:**
What should happen in a successful migration.

**Current Behavior:**
What actually happens.

### 📈 Impact Assessment
**Data Affected:**
- Records: X out of Y total
- Percentage: Z%
- Critical data: YES/NO

**Business Impact:**
- User-facing features affected
- Operational impact
- Compliance/audit concerns

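The "X out of Y" and "Z%" figures can be read straight from the source database. The sketch below assumes the issue is a NULL-value problem and uses Python's standard `sqlite3` module; `null_impact` is a hypothetical helper, and the in-memory table only imitates the `current` table from the examples.

```python
import sqlite3


def null_impact(conn: sqlite3.Connection, table: str, field: str) -> tuple[int, int, float]:
    """Return (affected, total, percentage) for NULL values in one column."""
    # Identifiers cannot be bound as parameters, so pass only trusted names here.
    affected, total = conn.execute(
        f'SELECT SUM("{field}" IS NULL), COUNT(*) FROM "{table}"'
    ).fetchone()
    affected = affected or 0  # SUM() yields NULL when the table is empty
    percentage = 100.0 * affected / total if total else 0.0
    return affected, total, percentage


# Demo data: an in-memory stand-in for the SQLite source database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "current" (id INTEGER PRIMARY KEY, injury_count INTEGER)')
conn.executemany(
    'INSERT INTO "current" (injury_count) VALUES (?)',
    [(0,), (None,), (2,), (None,)],
)

affected, total, pct = null_impact(conn, "current", "injury_count")
print(f"Records: {affected} out of {total} total ({pct:.1f}%)")
```
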
### 🔧 Root Cause Analysis
**Technical Cause:**
- SQLite vs PostgreSQL difference
- Data model assumption
- Migration logic flaw

**Data Source:**
- How did this data get into this state?
- Is this expected or corrupted data?
- Historical context

### 💡 Solution Strategy
**Approach:** [TRANSFORM_DATA|FIX_SCHEMA|MIGRATION_LOGIC|SKIP_TABLE]

**Technical Solution:**
Detailed explanation of how to fix the issue.

**Data Transformation Required:**
```sql
-- Example transformation query
UPDATE table_name
SET field_name = COALESCE(field_name, default_value)
WHERE field_name IS NULL;
```

### ✅ Implementation Plan
**Steps:**
1. [ ] Back up current state
2. [ ] Implement fix
3. [ ] Test on sample data
4. [ ] Run full migration test
5. [ ] Validate results
6. [ ] Document changes

**Rollback Plan:**
How to undo changes if something goes wrong.

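For the backup step, one option on the SQLite side is the online backup API in Python's standard `sqlite3` module. This is a sketch under the assumption that "current state" includes the SQLite source file; `backup_sqlite` and both path arguments are hypothetical. Restoring the copy would then serve as the rollback for any in-place data transformation.

```python
import sqlite3


def backup_sqlite(source_path: str, backup_path: str) -> None:
    """Copy a SQLite database file to backup_path using the online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup() copies the database page by page and yields a
        # consistent snapshot even if the source is being read elsewhere.
        source.backup(target)
    finally:
        target.close()
        source.close()
```
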
### 🧪 Testing Strategy
**Test Cases:**
1. Happy path: Normal data migrates correctly
2. Edge case: Problem data is handled properly
3. Regression: Previous fixes still work

**Validation Queries:**
```sql
-- Query to verify fix worked
SELECT COUNT(*) FROM table_name WHERE condition;
```

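A common validation beyond a single `COUNT(*)` is to compare row counts between source and target for every migrated table. The sketch below is an illustration only: `count_rows` and `row_count_mismatches` are hypothetical helpers written against the generic DB-API cursor interface, and two in-memory SQLite databases stand in for the real SQLite source and PostgreSQL target.

```python
import sqlite3


def count_rows(conn, table):
    """Count rows in one table through any DB-API 2.0 connection."""
    cur = conn.cursor()
    cur.execute(f'SELECT COUNT(*) FROM "{table}"')  # trusted table names only
    return cur.fetchone()[0]


def row_count_mismatches(source, target, tables):
    """Return {table: (source_count, target_count)} where the counts differ."""
    mismatches = {}
    for table in tables:
        counts = (count_rows(source, table), count_rows(target, table))
        if counts[0] != counts[1]:
            mismatches[table] = counts
    return mismatches


# Demo: the "target" is missing one row, as if a record had been rejected.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn, rows in ((source, 3), (target, 2)):
    conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO player (name) VALUES (?)", [(f"p{i}",) for i in range(rows)]
    )

print(row_count_mismatches(source, target, ["player"]))
```

An empty result means every listed table has the same number of rows on both sides; it does not prove the row contents match.
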
### 📋 Resolution Status
- **Status**: [IDENTIFIED|IN_PROGRESS|TESTING|RESOLVED|DEFERRED]
- **Assigned To**: [team_member]
- **Date Identified**: YYYY-MM-DD
- **Date Resolved**: YYYY-MM-DD
- **Solution Applied**: [description]

---

## 📚 Example Issues (From Our Testing)

### Issue ID: CONSTRAINT-CURRENT-INJURY_COUNT-001
- **Category**: SCHEMA
- **Priority**: HIGH
- **Impact**: BLOCKS_MIGRATION

**Problem Description:**
The `injury_count` field in the `current` table contains NULL values in SQLite, but the PostgreSQL schema declares the column NOT NULL.

**Error Message:**
```
null value in column "injury_count" of relation "current" violates not-null constraint
```

**Solution Strategy:** TRANSFORM_DATA
```sql
-- Transform NULL values to 0 before migration
UPDATE current SET injury_count = 0 WHERE injury_count IS NULL;
```

**Implementation:**
1. Add the data transformation to the migration script
2. Set a default value for future records
3. Alternatively, make the column nullable in the PostgreSQL schema if business logic allows NULL

---

### Issue ID: DATA_QUALITY-PLAYER-NAME-001
- **Category**: DATA_QUALITY
- **Priority**: MEDIUM
- **Impact**: DATA_LOSS

**Problem Description:**
Some player names exceed the PostgreSQL VARCHAR(255) limit. PostgreSQL rejects the insert instead of truncating the value, so the affected rows are not migrated.

**Error Message:**
```
value too long for type character varying(255)
```

**Solution Strategy:** FIX_SCHEMA
```sql
-- Increase column size in PostgreSQL
ALTER TABLE player ALTER COLUMN name TYPE VARCHAR(500);
```

**Implementation:**
1. Analyze max string lengths in SQLite
2. Update PostgreSQL schema with appropriate limits
3. Add validation to prevent future overruns

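Step 1 of the implementation above can be sketched with the standard `sqlite3` module. `max_text_length` is a hypothetical helper and the in-memory `player` table only imitates the source data. SQLite's `LENGTH()` counts characters for text values, which matches how PostgreSQL measures `VARCHAR(n)`, so the result can be compared directly with the column limit.

```python
import sqlite3


def max_text_length(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Return the longest value in a text column, measured in characters."""
    # Identifiers cannot be bound as parameters, so pass only trusted names here.
    row = conn.execute(f'SELECT MAX(LENGTH("{column}")) FROM "{table}"').fetchone()
    return row[0] or 0  # MAX() yields NULL for an empty table


# Demo data: one name is longer than the VARCHAR(255) limit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO player (name) VALUES (?)",
    [("Short Name",), ("x" * 300,), (None,)],
)

longest = max_text_length(conn, "player", "name")
print(longest, "exceeds limit" if longest > 255 else "fits")
```
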

---

### Issue ID: MIGRATION_LOGIC-TEAM-INTEGER-001
- **Category**: MIGRATION_LOGIC
- **Priority**: HIGH
- **Impact**: BLOCKS_MIGRATION

**Problem Description:**
SQLite INTEGER columns can hold 64-bit values, and some stored values exceed the PostgreSQL INTEGER range (-2147483648 to 2147483647).

**Error Message:**
```
integer out of range
```

**Solution Strategy:** FIX_SCHEMA
```sql
-- Use BIGINT instead of INTEGER
ALTER TABLE team ALTER COLUMN large_field TYPE BIGINT;
```

**Implementation:**
1. Identify fields with large values
2. Update schema to use BIGINT
3. Verify no application code assumes INTEGER size

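Step 1 of the implementation above amounts to counting values outside the 32-bit signed range. The following sketch uses the standard `sqlite3` module; `count_out_of_int32_range` is a hypothetical helper, and the in-memory `team` table with its `large_field` column mirrors the placeholder names used in this example. NULL values are not counted, because `NOT BETWEEN` evaluates to NULL for them.

```python
import sqlite3

# PostgreSQL INTEGER is a 32-bit signed type.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def count_out_of_int32_range(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count values that would not fit in a PostgreSQL INTEGER column."""
    # Identifiers cannot be bound as parameters, so pass only trusted names here.
    query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" NOT BETWEEN ? AND ?'
    return conn.execute(query, (INT32_MIN, INT32_MAX)).fetchone()[0]


# Demo data: two of the five rows overflow a 32-bit integer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team (id INTEGER PRIMARY KEY, large_field INTEGER)")
conn.executemany(
    "INSERT INTO team (large_field) VALUES (?)",
    [(42,), (2147483647,), (2147483648,), (123456789012345678,), (None,)],
)

print(count_out_of_int32_range(conn, "team", "large_field"))
```
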

---

## 🎯 Standard Solution Patterns

### Pattern 1: NULL Constraint Violations
```python
# Pre-migration data cleaning
# sqlite_db is the source database handle from the migration script (not defined here).
def clean_null_constraints(table_name, field_name, default_value):
    # Identifiers cannot be bound as parameters; pass only trusted table/field names.
    query = f"UPDATE {table_name} SET {field_name} = ? WHERE {field_name} IS NULL"
    sqlite_db.execute_sql(query, (default_value,))
```

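Pattern 1 interpolates the table and field names into the SQL text, which is only safe when those names are trusted. One way to guard against that is to validate identifiers before building the query. The sketch below is a variant written against the standard `sqlite3` module rather than the `sqlite_db` handle; `safe_identifier` is a hypothetical helper and the demo table is made up.

```python
import re
import sqlite3

# Accept only plain identifiers: letters, digits, and underscores.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str) -> str:
    """Return the name double-quoted for SQL, or raise if it is not a plain identifier."""
    if not IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"Unsafe SQL identifier: {name!r}")
    return f'"{name}"'


def clean_null_constraints(conn, table_name, field_name, default_value):
    """Replace NULLs with default_value and return the number of rows changed."""
    table, field = safe_identifier(table_name), safe_identifier(field_name)
    cursor = conn.execute(
        f"UPDATE {table} SET {field} = ? WHERE {field} IS NULL", (default_value,)
    )
    return cursor.rowcount


# Demo data: two of the three rows have a NULL injury_count.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "current" (id INTEGER PRIMARY KEY, injury_count INTEGER)')
conn.executemany(
    'INSERT INTO "current" (injury_count) VALUES (?)', [(None,), (3,), (None,)]
)

print(clean_null_constraints(conn, "current", "injury_count", 0))
```

Quoting the validated name also keeps keywords such as `current` usable as table names.
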
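### Pattern 2: String Length Overruns
```python
# Schema adjustment
# postgres_db is the target database handle from the migration script (not defined here).
def adjust_varchar_limits(table_name, field_name, new_limit):
    # int() rejects non-numeric limits; table and field names must come from trusted input.
    query = f"ALTER TABLE {table_name} ALTER COLUMN {field_name} TYPE VARCHAR({int(new_limit)})"
    postgres_db.execute_sql(query)
```
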
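### Pattern 3: Integer Range Issues
```python
# Type upgrade
# postgres_db is the target database handle from the migration script (not defined here).
def upgrade_integer_fields(table_name, field_name):
    # Table and field names are interpolated directly, so pass only trusted values.
    query = f"ALTER TABLE {table_name} ALTER COLUMN {field_name} TYPE BIGINT"
    postgres_db.execute_sql(query)
```
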
### Pattern 4: Missing Table Handling
```python
import logging

logger = logging.getLogger(__name__)

# Graceful table skipping
def safe_table_migration(model_class):
    try:
        migrate_table_data(model_class)  # defined elsewhere in the migration script
    except Exception as e:
        # SQLite reports a missing source table as "no such table: <name>"
        if "no such table" in str(e):
            logger.warning(f"Table {model_class._meta.table_name} doesn't exist in source")
            return True
        raise
```

## 📊 Issue Tracking Spreadsheet Template

| Issue ID | Category | Priority | Table | Field | Status | Date Found | Date Fixed | Notes |
|----------|----------|----------|-------|-------|--------|------------|------------|-------|
| CONSTRAINT-CURRENT-INJURY_COUNT-001 | SCHEMA | HIGH | current | injury_count | RESOLVED | 2025-01-15 | 2025-01-15 | Set NULL to 0 |
| DATA_QUALITY-PLAYER-NAME-001 | DATA_QUALITY | MEDIUM | player | name | IN_PROGRESS | 2025-01-15 | | Increase VARCHAR limit |

---

*This template ensures consistent documentation and systematic resolution of migration issues.*