# VM to LXC Migration Testing Checklist

Comprehensive validation checklist for VM to LXC container migrations.

## Pre-Migration Checklist

### Planning Phase

- [ ] VM analyzed with migration tool: `python3 migrate_vm_to_lxc.py analyze --vmid <vmid>`
- [ ] Migration suitability confirmed (excellent or good)
- [ ] Migration plan generated and reviewed
- [ ] Target LXC container ID selected (not in use)
- [ ] Static IP address planned (if needed)
- [ ] Maintenance window scheduled (low-traffic period)
- [ ] Stakeholders notified (if production service)
- [ ] Rollback plan documented and understood

### Backup Phase

- [ ] VM snapshot created: `snapshot-name: pre-migration-YYYY-MM-DD`
- [ ] VM snapshot verified in Proxmox UI
- [ ] Docker Compose files backed up from VM
- [ ] Docker volumes/data backed up (if applicable)
- [ ] List of running containers documented
- [ ] Environment variables documented
- [ ] Network configuration documented (IP, ports, DNS)
- [ ] External dependencies documented (databases, APIs, etc.)

### Infrastructure Validation

- [ ] Docker LXC template exists (ID 9001 or custom)
- [ ] Target container ID available
- [ ] Sufficient storage space on target storage pool
- [ ] Network configuration confirmed (VLAN, bridge, gateway)
- [ ] DNS entries documented (update after migration if needed)
- [ ] Firewall rules documented
- [ ] Reverse proxy configuration backed up (if using NPM/Traefik)

---

## Migration Execution Checklist

### Phase 1: Pre-Migration Testing

- [ ] VM is running and healthy
- [ ] All services responding normally
- [ ] No errors in VM logs
- [ ] Docker containers all running: `docker ps -a`
- [ ] Resource usage documented (CPU, RAM, disk)
- [ ] Performance baseline captured (response times, etc.)

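
The Phase 1 check on running containers lends itself to a small script. The following is a minimal sketch, not part of `migrate_vm_to_lxc.py`: it assumes Docker's `--format` option with the `{{.Names}}` and `{{.Status}}` template fields, and the helper names `parse_docker_ps`, `not_running`, and `capture_baseline` are hypothetical. It records each container's status and flags any container that is not `Up`.

```python
import subprocess

# Tab-separated name and status for every container, including stopped ones.
DOCKER_PS_CMD = ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"]


def parse_docker_ps(output: str) -> dict:
    """Map container name to status text from tab-separated `docker ps` output."""
    containers = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, status = line.partition("\t")
        containers[name.strip()] = status.strip()
    return containers


def not_running(containers: dict) -> list:
    """Return names whose status does not start with 'Up' (exited, restarting, created)."""
    return sorted(name for name, status in containers.items()
                  if not status.startswith("Up"))


def capture_baseline() -> dict:
    """Run docker on the VM and return the parsed container list (requires Docker)."""
    result = subprocess.run(DOCKER_PS_CMD, capture_output=True, text=True, check=True)
    return parse_docker_ps(result.stdout)


# Illustrative input only; real data would come from capture_baseline().
sample = "web\tUp 3 hours\ndb\tUp 3 hours (healthy)\nworker\tExited (1) 5 minutes ago\n"
print(not_running(parse_docker_ps(sample)))
```

Saving the parsed dictionary on the VM (for example as JSON) gives a container list to compare against the same capture on the LXC once the containers are deployed.
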
### Phase 2: VM Shutdown

- [ ] Services gracefully stopped (if order matters)
- [ ] Docker containers stopped: `docker compose down` (optional)
- [ ] VM shut down gracefully: `shutdown -h now` or Proxmox
- [ ] VM status confirmed: `stopped`
- [ ] Snapshot remains intact

### Phase 3: LXC Creation

- [ ] LXC created from template
- [ ] Container ID matches plan
- [ ] Hostname configured correctly
- [ ] Memory allocation set (estimated from analysis)
- [ ] CPU cores allocated (match or reduce from VM)
- [ ] Storage configured correctly
- [ ] Network configured (static IP or DHCP)
- [ ] Docker features enabled: `nesting=1,keyctl=1`
- [ ] Container set to privileged mode (unprivileged=0)
- [ ] Container configuration reviewed in Proxmox UI

### Phase 4: LXC Initial Start

- [ ] Container started successfully
- [ ] Container status: `running`
- [ ] Container accessible via SSH
- [ ] Network connectivity confirmed: `ping 8.8.8.8`
- [ ] DNS resolution working: `nslookup google.com`
- [ ] Docker service running: `systemctl status docker`
- [ ] Docker working: `docker ps` (should be empty initially)

---

## Service Migration Checklist

### Phase 5: Docker Configuration Transfer

- [ ] Docker Compose files copied to LXC
- [ ] Directory structure matches VM layout
- [ ] File permissions verified
- [ ] Environment files copied (.env files)
- [ ] Docker volumes path confirmed
- [ ] Data directories created (if needed)
- [ ] Configuration files reviewed for absolute paths

### Phase 6: Docker Containers Deployment

- [ ] Docker Compose files validated: `docker compose config`
- [ ] Images pulled successfully: `docker compose pull`
- [ ] Containers created: `docker compose up -d`
- [ ] All containers started: `docker compose ps`
- [ ] No container restart loops: `docker ps` (check STATUS)
- [ ] Container logs checked: `docker compose logs`
- [ ] No error messages in logs

### Phase 7: Service Validation

- [ ] All expected containers running
- [ ] Services responding on correct ports
- [ ] Web interfaces accessible (if applicable)
- [ ] APIs responding correctly (if applicable)
- [ ] Health check endpoints passing (if configured)
- [ ] Data persistence verified (check databases, files)
- [ ] Inter-container communication working
- [ ] External service connections working (databases, APIs)

---

## Network & Connectivity Checklist

### Phase 8: Network Validation

- [ ] LXC has correct IP address: `ip addr show`
- [ ] Gateway reachable: `ping <gateway-ip>`
- [ ] Internal network access verified
- [ ] Internet access confirmed
- [ ] DNS resolution working for all required domains
- [ ] Ports accessible from other hosts: `nc -zv <host> <port>`
- [ ] Firewall rules applied (if needed)

### Phase 9: External Access

- [ ] Service accessible from local network
- [ ] Service accessible from internet (if required)
- [ ] Reverse proxy updated (if using NPM/Traefik)
- [ ] SSL certificates working (if HTTPS)
- [ ] Domain names resolving correctly
- [ ] Load balancer updated (if applicable)

---

## Performance & Stability Checklist

### Phase 10: Performance Validation

- [ ] CPU usage reasonable: `top` or `htop`
- [ ] Memory usage lower than VM: `free -h`
- [ ] Disk I/O acceptable: `iostat` or monitor in Proxmox
- [ ] Network throughput adequate: test with actual traffic
- [ ] Response times equal to or better than VM
- [ ] No performance degradation under load

### Phase 11: Resource Monitoring (First 24 Hours)

- [ ] Hour 1: Services stable, no crashes
- [ ] Hour 2: Resource usage normal
- [ ] Hour 4: No memory leaks detected
- [ ] Hour 8: Performance consistent
- [ ] Hour 24: All metrics stable
- [ ] Proxmox graphs show healthy trends
- [ ] No OOM (Out of Memory) kills: `dmesg | grep -i oom`
- [ ] No kernel errors: `dmesg | grep -i error`

### Phase 12: Functional Testing

- [ ] Primary functionality tested end-to-end
- [ ] User workflows validated
- [ ] Scheduled jobs running (cron, etc.)
- [ ] Backups configured and tested
- [ ] Monitoring alerts configured
- [ ] Logging working correctly
- [ ] Integrations with other services functioning

---

## Data Integrity Checklist

### Phase 13: Data Validation

- [ ] Database connections working
- [ ] Data readable and writable
- [ ] File uploads/downloads working
- [ ] Cache functioning correctly
- [ ] Sessions persisting correctly
- [ ] User data accessible
- [ ] No data corruption detected
- [ ] Database migrations applied (if needed)

### Phase 14: Backup Validation

- [ ] Backup jobs configured for LXC
- [ ] Test backup created successfully
- [ ] Test restore validated
- [ ] Backup storage sufficient
- [ ] Backup retention policy set
- [ ] Backup monitoring alerts configured

---

## Extended Monitoring Checklist

### Phase 15: Week 1 Monitoring

- [ ] Day 1: Initial 24 hours stable
- [ ] Day 2: Resource usage patterns established
- [ ] Day 3: Performance benchmarks met
- [ ] Day 4: No unexpected issues
- [ ] Day 5: Load testing passed (if applicable)
- [ ] Day 6: Weekend operations normal (if applicable)
- [ ] Day 7: Weekly summary reviewed, all green

### Phase 16: Week 2 Validation

- [ ] Week 2: Continued stability
- [ ] No memory leaks over extended period
- [ ] Disk usage growth as expected
- [ ] No unexpected restarts or crashes
- [ ] Resource utilization optimized
- [ ] Documentation updated with final configuration

---

## Rollback Checklist (If Needed)

### Emergency Rollback

- [ ] Stop LXC container: `pct stop <ctid>`
- [ ] Start original VM: `qm start <vmid>`
- [ ] Verify VM services starting
- [ ] Validate VM functionality
- [ ] Restore network access (update DNS/proxy if changed)
- [ ] Document rollback reason for analysis
- [ ] Plan remediation before retry

---

## Final Migration Completion Checklist

### Phase 17: Production Validation

- [ ] 1-2 weeks of stable operation confirmed
- [ ] All stakeholders confirm service quality
- [ ] Performance metrics meet or exceed VM baseline
- [ ] No outstanding issues or concerns
- [ ] Monitoring and alerting fully operational
- [ ] Documentation complete and accurate

### Phase 18: Cleanup

- [ ] VM no longer needed, safe to remove
- [ ] VM snapshot retained for safety (30 days recommended)
- [ ] Original VM stopped and archived
- [ ] Resources freed up (document savings)
- [ ] Migration marked complete in tracking system
- [ ] Lessons learned documented

### Phase 19: Documentation Updates

- [ ] Network diagram updated (if exists)
- [ ] IP address spreadsheet updated
- [ ] Service inventory updated
- [ ] Runbooks updated for new LXC location
- [ ] Backup documentation updated
- [ ] Disaster recovery plan updated
- [ ] Team knowledge base updated

---

## Quick Reference: Common Issues & Solutions

### Issue: Container won't start

**Check:**

- [ ] Storage space available: `pvesm status`
- [ ] Container configuration valid: `pct config <ctid>`
- [ ] No resource limits exceeded
- [ ] Logs: `journalctl -u pve-container@<ctid>`

### Issue: Docker won't start

**Check:**

- [ ] Nesting enabled: `pct config <ctid> | grep features`
- [ ] Container is privileged: `pct config <ctid> | grep unprivileged`
- [ ] Docker service: `systemctl status docker`
- [ ] Logs: `journalctl -u docker`

### Issue: Network not working

**Check:**

- [ ] Network interface configured: `ip addr show`
- [ ] Gateway configured: `ip route show`
- [ ] DNS configured: `cat /etc/resolv.conf`
- [ ] Firewall rules: `iptables -L`

### Issue: Poor performance

**Check:**

- [ ] Resource allocation sufficient: `pct config <ctid>`
- [ ] CPU not overloaded: `cat /proc/loadavg` (compare load average to allocated cores)
- [ ] Memory not exhausted: `free -h`
- [ ] No I/O bottleneck: `iostat -x 1`

### Issue: Can't access services

**Check:**

- [ ] Containers running: `docker ps`
- [ ] Ports exposed: `docker ps` (PORTS column)
- [ ] Firewall rules: `iptables -L`
- [ ] Service binding: `netstat -tlnp | grep <port>`
- [ ] Reverse proxy config updated

---

## Service-Specific Checklists

### Discord Bots

- [ ] Bot token configured correctly
- [ ] Bot connected to Discord: check bot status
- [ ] Commands responding
- [ ] Database connections working (if applicable)
- [ ] Scheduled tasks running
- [ ] Logs showing normal operation

### Databases (PostgreSQL, MySQL, MongoDB)

- [ ] Database service running
- [ ] Data directory mounted correctly
- [ ] Connections from applications working
- [ ] Queries executing normally
- [ ] Backups configured
- [ ] Replication working (if applicable)
- [ ] Performance acceptable: run query benchmarks

### Plex Media Server

- [ ] Media libraries accessible
- [ ] Transcoding working (CPU or GPU)
- [ ] Streaming playback smooth
- [ ] Metadata refreshing
- [ ] Remote access configured (if needed)
- [ ] Hardware acceleration working (if configured)

### Docker-Based Web Apps

- [ ] Web interface accessible
- [ ] Login/authentication working
- [ ] Database connections functional
- [ ] File uploads working
- [ ] API endpoints responding
- [ ] SSL/TLS certificates valid
- [ ] Caching working correctly

---

## Migration Success Criteria

### Minimum Criteria (Must Have)

- ✅ All services running and accessible
- ✅ No data loss or corruption
- ✅ Performance equal to or better than VM
- ✅ 24 hours of stable operation
- ✅ No critical errors in logs
- ✅ Rollback plan tested and ready

### Optimal Criteria (Should Have)

- ✅ Resource usage reduced vs VM
- ✅ Faster startup times
- ✅ Improved I/O performance
- ✅ 1 week of stable operation
- ✅ Monitoring and alerts configured
- ✅ Documentation complete

### Excellence Criteria (Nice to Have)

- ✅ 2 weeks of flawless operation
- ✅ Measurable performance improvements
- ✅ Resource optimization completed
- ✅ Automated backups validated
- ✅ Team trained on new setup
- ✅ Migration lessons documented

---

## Notes & Best Practices

**Timing:**

- Migrate non-critical services first
- Schedule during low-traffic periods
- Allow extra time for first migration
- Plan for 2-4 hours per service initially

**Safety:**

- Always have VM snapshot before starting
- Keep VM stopped but available for 1-2 weeks
- Test rollback procedure before committing
- Document every step for repeatability

**Monitoring:**

- Watch resource usage closely for the first 48 hours
- Set up alerts for anomalies
- Compare to VM baseline metrics
- Keep detailed migration notes

**Optimization:**

- Start with conservative resource allocation
- Tune after monitoring actual usage
- Document optimal settings for future migrations
- Share learnings with team

---

**Checklist Version:** 1.0

**Last Updated:** 2025-01-11

**For:** Cal's Home Lab Proxmox Infrastructure
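
The "Compare to VM baseline metrics" note can be made concrete with a small comparison helper. This is a sketch under assumptions: the function name `compare_to_baseline`, the metric names, the 10% tolerance, and the sample numbers are all hypothetical, and every metric is treated as lower-is-better (response time, memory, CPU).

```python
def compare_to_baseline(baseline: dict, current: dict, tolerance: float = 0.10) -> dict:
    """Return metrics where the LXC value exceeds the VM baseline by more than `tolerance`.

    Both arguments map metric name to a number where lower is better.
    Metrics missing from `current` are skipped rather than reported.
    """
    regressions = {}
    for name, vm_value in baseline.items():
        if name not in current:
            continue
        lxc_value = current[name]
        if lxc_value > vm_value * (1 + tolerance):
            regressions[name] = (vm_value, lxc_value)
    return regressions


# Made-up numbers for illustration only; substitute the values recorded in Phase 1.
vm_baseline = {"response_ms": 120.0, "memory_mb": 2048.0, "cpu_percent": 15.0}
lxc_current = {"response_ms": 95.0, "memory_mb": 900.0, "cpu_percent": 18.0}

for metric, (vm, lxc) in compare_to_baseline(vm_baseline, lxc_current).items():
    print(f"{metric}: VM {vm} -> LXC {lxc} (worse than baseline)")
```

An empty result corresponds to the "Performance equal to or better than VM" criterion within the chosen tolerance; any listed metric is a candidate for tuning or, if severe, rollback.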