# VM to LXC Migration Guide
Complete guide for migrating Docker-based VMs to LXC containers in Cal's home lab.
## Overview
This guide walks through the entire process of migrating Docker-based virtual machines to LXC containers, achieving significant resource savings while maintaining identical functionality.
**Expected Benefits:**
- **25-40GB RAM savings** across all eligible VMs
- **10-20% reduction** in per-service memory usage
- **2-5 second startup** vs 30-60 second VM boot
- **Better I/O performance** for databases and disk-heavy workloads
- **Near-native CPU performance** (no hypervisor overhead)
---
## Prerequisites
### Knowledge Requirements
- Basic understanding of Proxmox VE
- Familiarity with Docker and Docker Compose
- SSH access to VMs and LXC containers
- Understanding of your network configuration (IPs, VLANs, etc.)
### Tools Required
- ✅ Proxmox skill installed: `~/.claude/skills/proxmox/`
- ✅ Python 3 with proxmoxer library
- ✅ SSH access to Proxmox host
- ✅ Backup storage for VM snapshots
### Infrastructure Requirements
- [ ] Docker LXC template created (see Template Creation section)
- [ ] Available container IDs planned (200-299 range recommended)
- [ ] Network configuration documented
- [ ] Sufficient storage space on target pool
---
## Phase 1: Assessment & Planning
### Step 1.1: Analyze All VMs
Run batch analysis to identify migration candidates:
```bash
cd ~/.claude/skills/proxmox/scripts
python3 migrate_vm_to_lxc.py batch-analyze
```
**Expected Output:**
- ✅ Excellent candidates (Docker hosts, bots)
- 🟢 Good candidates (databases)
- 🟡 Conditional candidates (Plex, based on transcoding)
- ❌ Poor candidates (GPU VMs, game servers)
**For your infrastructure, expect:**
- **Excellent:** docker-* VMs (8), discord-bots
- **Good:** databases-bots
- **Conditional:** Plex (check transcoding method)
- **Poor:** Tdarr (GPU), game servers, Home Assistant
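
The tiers above could be approximated with a small rule-based classifier. This is only a sketch: the `VMProfile` fields and the rules are assumptions chosen to mirror the list above, and they are not the logic used by `migrate_vm_to_lxc.py`.

```python
from dataclasses import dataclass


@dataclass
class VMProfile:
    """Hypothetical summary of a VM; every field name here is illustrative."""
    name: str
    runs_docker: bool = False
    is_database: bool = False
    is_media_server: bool = False
    uses_gpu: bool = False
    is_game_server: bool = False
    needs_full_os: bool = False  # e.g. appliance-style OS images


def classify(vm):
    """Return 'poor', 'conditional', 'good', or 'excellent' for a VM."""
    # Hard blockers come first: GPU use, game servers, full-OS appliances.
    if vm.uses_gpu or vm.is_game_server or vm.needs_full_os:
        return "poor"
    # Media servers depend on the transcoding method, so flag them for review.
    if vm.is_media_server:
        return "conditional"
    if vm.is_database:
        return "good"
    if vm.runs_docker:
        return "excellent"
    # Anything unrecognized is left for manual review (an assumption).
    return "conditional"
```

Treat the result as a starting point for review, not as a replacement for the batch analysis output.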
### Step 1.2: Prioritize Migrations
**Recommended Order:**
**Week 1: Low-Risk Test**
1. `docker-unused` - Low impact, test environment
**Week 2-3: Docker Hosts**
2. `docker-home-servers`
3. `docker-pittsburgh`
4. `docker-vpn`
**Week 4-5: Higher Memory Targets**
5. `docker-sba` (89% memory - big win!)
6. `docker-home`
7. `docker-7days`
**Week 6: Application Servers**
8. `discord-bots`
9. `databases-bots` (careful - production!)
**Week 7+: Conditional**
10. `Plex` (if CPU transcoding confirmed)
### Step 1.3: Create Migration Calendar
```
Week 1 (Test Phase):
- Monday: Create Docker LXC template
- Wednesday: Migrate docker-unused (test)
- Friday: Validate test migration
Week 2 (Production Start):
- Tuesday: Migrate docker-home-servers
- Thursday: Migrate docker-pittsburgh
- Weekend: Monitor both
Week 3 (VPN + SBA):
- Tuesday: Migrate docker-vpn
- Thursday: Migrate docker-sba (high memory VM)
- Weekend: Monitor and optimize
[Continue weekly schedule...]
```
---
## Phase 2: Template Creation
### Step 2.1: Create Docker LXC Template
**Automated Approach:**
```bash
cd ~/.claude/skills/proxmox/scripts
python3 create_docker_lxc_template.py --id 9001 --name docker-lxc-template
```
**Manual Approach (if needed):**
1. **Download Ubuntu LXC Template:**
- Proxmox UI → Datacenter → Node → local → CT Templates
- Click "Templates" button
- Download: `ubuntu-22.04-standard`
2. **Create Container:**
```bash
pct create 9001 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--hostname docker-lxc-template \
--memory 2048 \
--cores 2 \
--rootfs local-lvm:20 \
--net0 name=eth0,bridge=vmbr0,ip=dhcp \
--nameserver 8.8.8.8 \
--password temporary123 \
--unprivileged 0 \
--features nesting=1,keyctl=1
```
3. **Start and Configure:**
```bash
pct start 9001
sleep 10
pct enter 9001
# Inside container:
apt update && apt upgrade -y
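# Note: docker-compose-plugin is published in Docker's apt repository;
# add that repository first if apt cannot locate the package.
apt install -y docker.io docker-compose-plugin curl wget git vim htop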
systemctl enable docker
systemctl start docker
docker run hello-world # Verify Docker works
apt clean
history -c
exit
```
4. **Convert to Template:**
```bash
pct stop 9001
pct template 9001
```
**Verification:**
```bash
pct config 9001 | grep template
# Should show: template: 1
```
---
## Phase 3: Single VM Migration (Detailed)
### Example: Migrating `docker-unused` (VM 117) to LXC
#### Step 3.1: Pre-Migration Analysis
```bash
python3 migrate_vm_to_lxc.py analyze --vmid 117
```
Review output for:
- Migration suitability
- Resource estimates
- Warnings
#### Step 3.2: Generate Migration Plan
```bash
python3 migrate_vm_to_lxc.py plan --vmid 117 --ctid 217 --ip 10.10.0.217
```
This creates `migration-plan-vm117-to-ct217.json` with step-by-step instructions.
#### Step 3.3: Create VM Snapshot
```bash
# Via Python
python3 -c "from proxmox_client import ProxmoxClient; c=ProxmoxClient(); \
c.create_snapshot(117, 'pre-lxc-migration-2025-01-11', 'Before LXC migration')"
# Or via Proxmox UI
# VM 117 → Snapshots → Take Snapshot
```
#### Step 3.4: Backup Docker Configurations
```bash
# SSH into VM
ssh cal@10.10.0.X # Replace with actual VM IP
# Create backup archive
cd ~
tar czf docker-backup-$(date +%Y%m%d).tar.gz \
~/docker \
~/services \
~/*/docker-compose.yml \
~/.env* \
2>/dev/null
# Document running containers
docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Status}}" > container-list.txt
docker compose ls > compose-projects.txt
# Exit VM
exit
# Copy backup to local machine
scp cal@10.10.0.X:~/docker-backup-*.tar.gz ./vm117-backup.tar.gz
scp cal@10.10.0.X:~/container-list.txt ./vm117-containers.txt
scp cal@10.10.0.X:~/compose-projects.txt ./vm117-compose-projects.txt
```
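Before stopping the VM, it is worth confirming that the copied archive actually contains the Compose files. The following is a minimal sketch using only the Python standard library; the function name `summarize_backup` and the list of accepted Compose file names are assumptions for illustration.

```python
import tarfile


def summarize_backup(archive_path):
    """Return (file_count, compose_files) for a .tar.gz backup archive."""
    compose_names = ("docker-compose.yml", "docker-compose.yaml",
                     "compose.yml", "compose.yaml")
    file_count = 0
    compose_files = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # Skip directories, symlinks, and other entries
            file_count += 1
            # Compare only the final path component against known names.
            if member.name.rsplit("/", 1)[-1] in compose_names:
                compose_files.append(member.name)
    return file_count, sorted(compose_files)
```

For example, `summarize_backup("vm117-backup.tar.gz")` returns the number of regular files and the paths of any Compose files found. An empty Compose list usually means the paths in the `tar` command did not match the VM's directory layout.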
#### Step 3.5: Stop VM
```bash
# Graceful shutdown
ssh cal@10.10.0.X 'sudo shutdown -h now'
# Or via Proxmox API
python3 -c "from proxmox_client import ProxmoxClient; c=ProxmoxClient(); \
c.shutdown_vm(117, timeout=120)"
# Verify stopped
qm status 117
```
#### Step 3.6: Create LXC from Template
```bash
# Clone template
pct clone 9001 217 \
--hostname docker-unused-lxc \
--full \
--storage local-lvm
# Configure resources
pct set 217 \
--memory 6800 \
--cores 8 \
--net0 name=eth0,bridge=vmbr0,ip=10.10.0.217/24,gw=10.10.0.1 \
--nameserver 10.10.0.1 \
--searchdomain local \
--onboot 1
# Verify configuration
pct config 217
```
#### Step 3.7: Start LXC and Validate
```bash
# Start container
pct start 217
# Check status
pct status 217
# Test SSH access
sleep 10
ssh root@10.10.0.217 'echo "LXC is accessible"'
# Verify Docker
ssh root@10.10.0.217 'docker --version && docker ps'
```
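A fixed `sleep 10` can be too short on a busy host. One alternative is to poll the SSH port until it accepts connections. This is a sketch under the assumption that SSH listens on port 22 inside the container; `wait_for_port` is a hypothetical helper, not part of the Proxmox skill.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=2.0):
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("10.10.0.217", 22)` returns `True` as soon as the port is reachable, or `False` after a minute of failed attempts. It only proves that the port is open; the SSH login itself still needs to be tested.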
#### Step 3.8: Restore Docker Configurations
```bash
# Copy backup to LXC
scp vm117-backup.tar.gz root@10.10.0.217:/root/
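# Extract on LXC. The archive stores paths as home/cal/..., so strip the
# first two components to place the files under /root (assumes the VM
# user's home directory was /home/cal)
ssh root@10.10.0.217 'tar xzf /root/vm117-backup.tar.gz -C /root --strip-components=2'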
# Verify files
ssh root@10.10.0.217 'ls -la ~/docker ~/services'
```
#### Step 3.9: Start Docker Containers
```bash
# Start all Docker Compose projects
ssh root@10.10.0.217 'cd ~/docker && docker compose up -d'
# Verify containers started
ssh root@10.10.0.217 'docker ps -a'
# Check logs for errors
ssh root@10.10.0.217 'cd ~/docker && docker compose logs --tail=50'
```
#### Step 3.10: Validate Services
Run through **migration_checklist.md** Phase 7: Service Validation
Key checks:
```bash
# From LXC
docker ps # All containers "Up"
docker compose logs # No errors
curl http://localhost:<port> # Services responding
docker stats --no-stream # Resource usage reasonable
```
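The container list saved from the VM in Step 3.4 can be compared against the same `docker ps` table taken on the LXC. The sketch below assumes both inputs use the `table {{.Names}}\t{{.Image}}\t{{.Status}}` format from that step; the function names are hypothetical.

```python
def parse_container_table(text):
    """Map container name -> status from `docker ps --format "table ..."` output."""
    containers = {}
    for line in text.splitlines():
        # Names and images contain no spaces, so the remainder is the status.
        parts = line.split(None, 2)
        if len(parts) < 3 or parts[0] == "NAMES":
            continue  # Skip the header row and blank or malformed lines
        name, _image, status = parts
        containers[name] = status.strip()
    return containers


def compare_containers(before_text, after_text):
    """Return (missing, not_up): names absent after migration, and names not 'Up'."""
    before = parse_container_table(before_text)
    after = parse_container_table(after_text)
    missing = sorted(set(before) - set(after))
    not_up = sorted(name for name, status in after.items()
                    if not status.startswith("Up"))
    return missing, not_up
```

Both lists should be empty before moving on, unless a container was already stopped on the VM.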
#### Step 3.11: Update External References
If other systems reference the VM's services by IP or hostname:
- [ ] Update DNS entries (if manual DNS)
- [ ] Update reverse proxy (NPM/Traefik)
- [ ] Update firewall rules (if IP-based)
- [ ] Update monitoring dashboards
- [ ] Update documentation
#### Step 3.12: Monitor for 24-48 Hours
```bash
# Check resource usage
watch -n 60 'pct exec 217 -- docker stats --no-stream'
# Check container status
watch -n 300 'pct exec 217 -- docker ps'
# Monitor Proxmox metrics
# UI → Container 217 → Summary (watch graphs)
```
#### Step 3.13: Success Criteria
After 24-48 hours of stable operation:
- ✅ All containers running continuously
- ✅ No unexpected restarts
- ✅ Services accessible and functional
- ✅ Resource usage acceptable (memory < 80%)
- ✅ No errors in logs
- ✅ Performance equal to or better than VM
**If successful:** Keep VM stopped for 1-2 weeks, then delete.
**If issues:** Rollback to VM (see Rollback section).
---
## Phase 4: Batch Migration
Once you are comfortable with the single-VM migration process, move on to batches:
### Step 4.1: Prepare Batch
Create migration spreadsheet:
| VM ID | VM Name | CT ID | IP | Priority | Date Planned | Status |
|-------|---------|-------|-----|----------|-------------|---------|
| 117 | docker-unused | 217 | 10.10.0.217 | Test | 2025-01-15 | Complete |
| 116 | docker-home-servers | 216 | 10.10.0.216 | High | 2025-01-22 | Pending |
| 114 | docker-pittsburgh | 214 | 10.10.0.214 | High | 2025-01-24 | Pending |
| ... | ... | ... | ... | ... | ... | ... |
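
The rows above follow a pattern: the container ID is the VM ID plus 100, and the last octet of the IP matches the container ID. A small helper can apply that convention and catch IDs that fall outside the planned range. This is a sketch that assumes the `10.10.0.0/24` subnet and the 200-299 range used in this guide; `plan_target` is a hypothetical name.

```python
import ipaddress


def plan_target(vmid, offset=100, subnet="10.10.0.0/24"):
    """Return (ctid, ip) for a VM, following the VMID + 100 convention."""
    ctid = vmid + offset
    if not 200 <= ctid <= 299:
        raise ValueError(f"CT ID {ctid} is outside the planned 200-299 range")
    network = ipaddress.ip_network(subnet)
    # Use the CT ID as the host part of the address.
    address = network.network_address + ctid
    if address not in network or address == network.broadcast_address:
        raise ValueError(f"CT ID {ctid} does not map to a usable host in {subnet}")
    return ctid, str(address)
```

For example, `plan_target(117)` yields container 217 at `10.10.0.217`, matching the first row of the table. IDs above 254 raise an error because they do not fit in a /24.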
### Step 4.2: Automate Common Steps
Create migration script wrapper:
```bash
#!/bin/bash
# migrate-docker-vm.sh
# Usage: ./migrate-docker-vm.sh <VMID> <CTID> <VM_IP> <LXC_IP>
set -euo pipefail  # Abort on the first failed step (e.g. before stopping the VM)
if [ "$#" -ne 4 ]; then
    echo "Usage: $0 <VMID> <CTID> <VM_IP> <LXC_IP>" >&2
    exit 1
fi
VMID=$1
CTID=$2
VM_IP=$3
LXC_IP=$4
BACKUP="vm${VMID}-backup.tar.gz"
echo "Migrating VM $VMID to LXC $CTID"
# 1. Snapshot
echo "Creating snapshot..."
python3 -c "from proxmox_client import ProxmoxClient; c=ProxmoxClient(); \
c.create_snapshot($VMID, 'pre-migration', 'Before LXC migration')"
# 2. Backup Docker (paths are relative to the VM user's home; adjust per VM)
echo "Backing up Docker configs..."
ssh "cal@$VM_IP" "cd ~ && tar czf docker-backup.tar.gz docker/ services/"
scp "cal@$VM_IP:~/docker-backup.tar.gz" "./$BACKUP"
# 3. Stop VM
echo "Stopping VM..."
python3 -c "from proxmox_client import ProxmoxClient; c=ProxmoxClient(); \
c.shutdown_vm($VMID, timeout=120)"
# 4. Create LXC
echo "Creating LXC..."
pct clone 9001 "$CTID" --hostname "vm${VMID}-lxc" --full
pct set "$CTID" --net0 "name=eth0,bridge=vmbr0,ip=${LXC_IP}/24,gw=10.10.0.1"
pct start "$CTID"
sleep 10
# 5. Restore configs into root's home so that ~/docker resolves in step 6
echo "Restoring Docker configs..."
scp "./$BACKUP" "root@$LXC_IP:/root/"
ssh "root@$LXC_IP" "cd ~ && tar xzf /root/$BACKUP"
# 6. Start containers
echo "Starting Docker containers..."
ssh "root@$LXC_IP" "cd ~/docker && docker compose up -d"
echo "Migration complete! Validate with checklist."
```
### Step 4.3: Execute Migrations Weekly
```bash
# Week 1
./migrate-docker-vm.sh 117 217 10.10.0.X 10.10.0.217
# Week 2
./migrate-docker-vm.sh 116 216 10.10.0.X 10.10.0.216
./migrate-docker-vm.sh 114 214 10.10.0.X 10.10.0.214
# etc.
```
---
## Phase 5: Optimization & Cleanup
### Step 5.1: Resource Optimization
After 1-2 weeks of monitoring, optimize allocations:
```bash
# Check actual usage
pct exec 217 -- free -h
pct exec 217 -- docker stats --no-stream
# If overprovisioned, reduce:
pct set 217 --memory 6144 # Reduce from 6800 to 6144
# If underprovisioned, increase:
pct set 217 --memory 8192 # Increase to 8192
```
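One way to choose the new allocation is to size it so that the observed peak stays at or below roughly 80% of the limit, in line with the memory threshold in the success criteria. The sketch below is an assumption about how to do that arithmetic; the 256 MB rounding step and the name `recommend_memory_mb` are illustrative.

```python
import math


def recommend_memory_mb(peak_used_mb, max_utilization=0.80, step_mb=256):
    """Return an allocation (MB) that keeps the peak near or under the target ratio."""
    if peak_used_mb <= 0:
        raise ValueError("peak_used_mb must be positive")
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    needed = peak_used_mb / max_utilization
    # Round up to the next multiple of step_mb for a tidy allocation.
    return int(math.ceil(needed / step_mb) * step_mb)
```

For example, a container that peaked at 4700 MB would be given 5888 MB, which can then be applied with `pct set 217 --memory 5888`.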
### Step 5.2: VM Cleanup
After 2-4 weeks of stable LXC operation:
```bash
# Archive VM to backup storage first (recommended)
vzdump 117 --mode stop --storage backup-nas
# Then delete the VM (this also removes its disks and snapshots)
qm destroy 117
```
### Step 5.3: Documentation Updates
- [ ] Update network diagram with LXC IDs
- [ ] Update IP address spreadsheet
- [ ] Update service runbooks
- [ ] Update backup documentation
- [ ] Update monitoring dashboards
- [ ] Share lessons learned with team
---
## Rollback Procedures
### Scenario 1: LXC Container Issues During Migration
```bash
# 1. Stop problematic LXC
pct stop 217
# 2. Start original VM
qm start 117
# 3. Validate VM services
ssh cal@10.10.0.X 'docker ps'
# 4. Update network (if changed)
# Update DNS/proxy back to VM IP
# 5. Delete failed LXC (optional)
pct destroy 217
# 6. Document issues for analysis
```
### Scenario 2: Performance Problems Post-Migration
```bash
# 1. Increase LXC resources
pct set 217 --memory 12288 # Raise memory (6800 -> 12288 MB)
pct set 217 --cores 16 # Double cores
pct reboot 217
# 2. If still problems, rollback to VM
pct stop 217
qm start 117
# 3. Analyze root cause before retry
```
### Scenario 3: Data Corruption / Loss
```bash
# 1. IMMEDIATELY stop LXC
pct stop 217
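# 2. Roll the VM back to the pre-migration snapshot, then start it
# (use the snapshot name created in Step 3.3)
qm rollback 117 pre-lxc-migration-2025-01-11
qm start 117
# VM should boot to pre-migration state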
# 3. Assess data loss
# Compare VM data to LXC data
# 4. Recover from backups if needed
# 5. Document incident thoroughly
```
---
## Common Issues & Solutions
### Issue: Docker won't start in LXC
**Symptoms:** `systemctl status docker` fails, Docker commands error
**Solutions:**
```bash
# Check nesting is enabled
pct config 217 | grep features
# Should show: features: keyctl=1,nesting=1
# If not, enable:
pct set 217 --features nesting=1,keyctl=1
pct reboot 217
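# Check container is privileged
pct config 217 | grep unprivileged
# Should show "unprivileged: 0" or print nothing
# If it shows "unprivileged: 1": this flag is read-only after creation and
# cannot be changed with pct set. Recreate the container from the privileged
# template (pct clone 9001 ...) or restore it from a backup as privileged.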
```
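The same checks can be scripted against the text that `pct config` prints. This is a minimal sketch: it assumes the output is a list of `key: value` lines, and it treats a missing `unprivileged` key as a privileged container, which is an assumption worth verifying on your Proxmox version. The function names are hypothetical.

```python
def parse_pct_config(text):
    """Parse `pct config` output ("key: value" lines) into a dict."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as net0 contain colons.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def docker_readiness_problems(config):
    """Return a list of human-readable problems; an empty list means none found."""
    features = dict(
        item.split("=", 1)
        for item in config.get("features", "").split(",")
        if "=" in item
    )
    problems = []
    if features.get("nesting") != "1":
        problems.append("nesting is not enabled")
    if features.get("keyctl") != "1":
        problems.append("keyctl is not enabled")
    # Assumption: an absent key means the container is privileged.
    if config.get("unprivileged", "0") == "1":
        problems.append("container is unprivileged")
    return problems
```

Feeding it the captured output of `pct config 217` gives a quick pass/fail list before digging into Docker's own logs.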
### Issue: Network not accessible
**Symptoms:** Can't SSH to LXC, containers can't reach internet
**Solutions:**
```bash
# Check LXC network config
pct config 217 | grep net0
# Verify from inside LXC
pct enter 217
ip addr show # Check IP assigned
ip route show # Check gateway
ping 10.10.0.1 # Test gateway
ping 8.8.8.8 # Test internet
exit
# Reconfigure network if needed
pct set 217 --net0 name=eth0,bridge=vmbr0,ip=10.10.0.217/24,gw=10.10.0.1
pct reboot 217
```
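A common cause of an unreachable container is a gateway that sits outside the subnet given in `ip=`. The standard `ipaddress` module can check the pair before `pct set` is run. This is a sketch; `check_net0` and its messages are hypothetical.

```python
import ipaddress


def check_net0(ip_cidr, gateway):
    """Return a list of problems with an ip=/gw= pair; empty means it looks sane."""
    iface = ipaddress.ip_interface(ip_cidr)
    gw = ipaddress.ip_address(gateway)
    problems = []
    if gw not in iface.network:
        problems.append(f"gateway {gw} is outside {iface.network}")
    if iface.ip == gw:
        problems.append("container IP is the same as the gateway")
    if iface.ip in (iface.network.network_address,
                    iface.network.broadcast_address):
        problems.append("container IP is a network or broadcast address")
    return problems
```

For the values used in this guide, `check_net0("10.10.0.217/24", "10.10.0.1")` returns an empty list.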
### Issue: Containers don't start
**Symptoms:** `docker compose up` fails, containers immediately exit
**Solutions:**
```bash
# Check Docker Compose file syntax
pct exec 217 -- docker compose -f ~/docker/docker-compose.yml config
# Check for volume mount issues
pct exec 217 -- ls -la /path/to/volumes
# Check container logs
pct exec 217 -- docker compose logs service-name
# Common fixes:
# - Create missing directories
# - Fix file permissions
# - Update volume paths in docker-compose.yml
# - Check environment variables in .env
```
### Issue: High memory usage
**Symptoms:** LXC using more memory than expected, OOM kills
**Solutions:**
```bash
# Check actual usage
pct exec 217 -- free -h
pct exec 217 -- docker stats --no-stream
# Increase allocation
pct set 217 --memory 12288 # Increase RAM
pct reboot 217
# Or reduce container memory limits
# Edit docker-compose.yml:
#   services:
#     app:
#       mem_limit: 2g
```
### Issue: Poor performance
**Symptoms:** Slow response times, high CPU wait
**Solutions:**
```bash
# Check resource allocation
pct config 217
# Increase CPU cores
pct set 217 --cores 16
# Check I/O
pct exec 217 -- iostat -x 1 10
# If I/O bound, consider:
# - Faster storage (NVMe vs HDD)
# - Different storage backend
# - Optimize application queries
```
---
## Advanced Topics
### Custom LXC Configurations
**Bind Mounts** (Mount host directories into LXC):
```bash
# Add this line to /etc/pve/lxc/217.conf:
#   mp0: /mnt/nas/media,mp=/mnt/media,backup=0
# Restart container
pct reboot 217
```
**Resource Limits:**
```bash
# CPU limits
pct set 217 --cpulimit 8 # Limit to 8 cores worth of CPU
pct set 217 --cpuunits 2048 # Relative CPU weight
# Memory limits
pct set 217 --memory 8192 --swap 2048
```
### LXC Templates for Different Purposes
Create specialized templates:
- **docker-lxc-template** (9001) - General Docker host
- **python-lxc-template** (9002) - Python apps without Docker
- **nodejs-lxc-template** (9003) - Node.js apps
- **database-lxc-template** (9004) - PostgreSQL/MySQL
### Monitoring & Alerting
Setup Proxmox API monitoring:
```python
from proxmox_client import ProxmoxClient
client = ProxmoxClient()
containers = client.get_all_containers_status()
for ct in containers:
    if ct['status'] == 'running':
        mem_pct = (ct['mem'] / ct['maxmem']) * 100
        if mem_pct > 80:
            print(f"⚠️ Container {ct['vmid']} high memory: {mem_pct:.1f}%")
            # Send alert (Discord, email, etc.)
```
---
## Migration Metrics & Tracking
### Track These Metrics
**Pre-Migration (VM):**
- VM memory allocation
- Actual memory usage (average, peak)
- CPU usage (average, peak)
- Disk I/O (IOPS, throughput)
- Boot time
- Application response time
**Post-Migration (LXC):**
- Same metrics as above
- Compare to VM baseline
- Track improvements/regressions
**Overall Progress:**
- VMs migrated vs planned
- Total RAM freed up
- Migration success rate
- Average migration time
- Issues encountered
### Sample Tracking Spreadsheet
| VM | Migration Date | RAM Before | RAM After | Savings | Status | Notes |
|----|----------------|------------|-----------|---------|--------|-------|
| 117 | 2025-01-15 | 8192 MB | 6800 MB | 1392 MB | Success | Smooth |
| 116 | 2025-01-22 | 8192 MB | 7168 MB | 1024 MB | Success | - |
| 115 | 2025-01-29 | 8192 MB | 7680 MB | 512 MB | 🟡 Issues | High CPU |
| ... | ... | ... | ... | ... | ... | ... |
| **Total** | | **73 GB** | **58 GB** | **15 GB** | **85%** | |
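
The totals row can be derived from the per-VM rows instead of being maintained by hand. The sketch below assumes each row is a tuple of VM ID, RAM before, RAM after, and a success flag. Counting the "Issues" row as not successful is an assumption, and `summarize` is a hypothetical helper.

```python
def summarize(rows):
    """Aggregate (vmid, ram_before_mb, ram_after_mb, succeeded) tuples."""
    if not rows:
        raise ValueError("rows must not be empty")
    before = sum(row[1] for row in rows)
    after = sum(row[2] for row in rows)
    successes = sum(1 for row in rows if row[3])
    return {
        "before_mb": before,
        "after_mb": after,
        "saved_mb": before - after,
        "saved_pct": (before - after) / before * 100,
        "success_rate": successes / len(rows) * 100,
    }


# The three sample rows from the table above.
sample_rows = [
    (117, 8192, 6800, True),
    (116, 8192, 7168, True),
    (115, 8192, 7680, False),  # "Issues" counted as not successful (assumption)
]
sample_summary = summarize(sample_rows)
```

For the three sample rows this gives 2928 MB saved, about 11.9% of the original allocation.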
---
## Migration Completion
### Success Criteria
Migration program is complete when:
- All suitable VMs migrated (8+ Docker hosts)
- 25-40GB RAM savings achieved
- All services stable for 2+ weeks
- No outstanding issues
- Documentation updated
- Team trained on LXC operations
- Old VMs cleaned up
### Final Review
- [ ] Migration metrics spreadsheet complete
- [ ] Lessons learned documented
- [ ] Best practices identified
- [ ] Template refinements made
- [ ] Monitoring dashboards updated
- [ ] Backup procedures validated
- [ ] Disaster recovery plan updated
### Celebrate! 🎉
You've successfully modernized your infrastructure:
- **Reduced overhead** by 25-40GB RAM
- **Improved performance** with near-native speeds
- **Faster operations** with 2-5 second container starts
- **Same functionality** with better efficiency
---
## Appendix
### Useful Commands Reference
```bash
# List all containers
pct list
# Container status
pct status 217
# Container config
pct config 217
# Enter container (console)
pct enter 217
# Execute command in container
pct exec 217 -- docker ps
# Start/stop/reboot container
pct start 217
pct stop 217
pct reboot 217
# Container resource usage
pct status 217 --verbose
# Snapshot operations
pct snapshot 217 snap1
pct listsnapshot 217
pct rollback 217 snap1
pct delsnapshot 217 snap1
# Clone container
pct clone 217 218 --hostname new-host --full
# Delete container
pct destroy 217
```
### Python API Quick Reference
```python
from proxmox_client import ProxmoxClient
client = ProxmoxClient()
# List containers
containers = client.list_containers()
# Get container details
ct = client.get_container(217)
# Start/stop container
client.start_container(217)
client.stop_container(217)
# Create container
client.create_container(
    vmid=217,
    ostemplate="local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst",
    hostname="test-lxc",
    memory=2048,
    cores=2
)
# Configure for Docker
client.configure_container_for_docker(217)
# Snapshot operations
client.create_container_snapshot(217, "backup1", "Test snapshot")
client.list_container_snapshots(217)
client.rollback_container_snapshot(217, "backup1")
```
---
**Guide Version:** 1.0
**Last Updated:** 2025-01-11
**For:** Cal's Home Lab Proxmox Infrastructure
**Maintained By:** Jarvis PAI System