Create detailed two-phase upgrade strategy for Proxmox hypervisor:
- Phase 1: 7.1 → 8.4 (Debian Bullseye → Bookworm)
- Phase 2: 8.4 → 9.1 (Debian Bookworm → Trixie)

Plan includes:
- Pre-upgrade preparation and backup procedures
- Step-by-step upgrade execution for both phases
- Service validation and dependency order
- Rollback procedures for failure scenarios
- Risk assessment with mitigation strategies
- Timeline: 3-4 weeks total, ~4 hours downtime

Critical considerations:
- 8 LXC containers + 17 VMs to maintain
- Production services (Discord bots, databases, Gitea, n8n)
- Home Assistant dual network requirements
- LXC systemd compatibility checks for PVE 9

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Proxmox VE Upgrade Plan: 7.1-7 → 9.1
Executive Summary
- Current State: Proxmox VE 7.1-7 (kernel 5.13.19-2-pve)
- Target State: Proxmox VE 9.1 (latest)
- Upgrade Path: Two-phase upgrade (7→8→9); a direct upgrade is not supported
- Total Timeline: 3-4 weeks (including stabilization periods)
- Total Downtime: ~4 hours (about 2 hours per phase)
Infrastructure Overview
Production Services (8 LXC + 17 VMs):
- Critical: Paper Dynasty/Major Domo (VMs 115, 110), Gitea (LXC 225), n8n (LXC 210), Home Assistant (VM 109)
- Important: Media services (Plex 107, Tdarr 113, arr-stack 221), OpenClaw (224), Databases (112)
- Lower Priority: Game servers, development containers
Key Constraints:
- Home Assistant VM 109 requires dual network (vmbr1 for Matter support)
- All production Discord bots must minimize downtime
- Gitea mirrored to GitHub provides backup
- TrueNAS backup mount at 10.10.0.35
Phase 1: Proxmox 7.1 → 8.4 Upgrade
Pre-Upgrade Preparation (1-2 days)
1. Comprehensive Backups
Priority 1 - Production Services:
# Backup critical services to TrueNAS
vzdump 210 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # n8n
vzdump 115 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # docker-sba
vzdump 112 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # databases
vzdump 110 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # discord-bots
vzdump 225 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # gitea
vzdump 109 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd # homeassistant
Priority 2 - All Remaining VMs/LXCs:
vzdump --all --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd
Backup Proxmox Configuration:
tar -czf /mnt/truenas/proxmox/pve-config-$(date +%Y%m%d).tar.gz /etc/pve/
cp /etc/network/interfaces /mnt/truenas/proxmox/interfaces.backup
Expected: 2-4 hours, ~500GB-1TB storage required
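Before moving on, it may help to confirm that every critical guest actually has an archive in the dump directory. The following is a minimal sketch, not part of the original plan: `missing_backups` is a hypothetical helper name, and it assumes vzdump's default file naming (`vzdump-<lxc|qemu>-<ID>-<timestamp>.<ext>`).

```shell
# Print every guest ID that has no vzdump archive in the given directory.
# Usage: missing_backups <dump-dir> <ID>...
missing_backups() {
    dir=$1; shift
    for id in "$@"; do
        found=no
        # The glob stays unexpanded when nothing matches, so test with -e
        for f in "$dir"/vzdump-*-"$id"-*; do
            [ -e "$f" ] && found=yes && break
        done
        [ "$found" = yes ] || echo "$id"
    done
}
```

For the Priority 1 set this could be called as `missing_backups /mnt/truenas/proxmox 210 115 112 110 225 109`; empty output means an archive exists for each ID. It does not verify that the archives are restorable.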
2. Pre-Upgrade Validation
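# Update to latest PVE 7.4 first (the pve7to8 checker ships with current 7.4 packages)
apt update && apt dist-upgrade -y
# Verify minimum version
pveversion # Must show 7.4-15 or higher
# Run Proxmox 7-to-8 checker and resolve every warning or failure it reports
pve7to8 --full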
# Document current state
pvesh get /cluster/resources --type vm --output-format yaml > /mnt/truenas/proxmox/vm-inventory-pre-upgrade.yaml
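The minimum-version check can be automated rather than read by eye. This is a sketch under assumptions: the function names are hypothetical, `pveversion` is assumed to print a line of the form `pve-manager/<version>/<hash> (running kernel: ...)`, and GNU `sort -V` is assumed to be available for version ordering.

```shell
# Extract the version field, e.g. "7.4-17", from pveversion output.
pve_manager_version() {
    printf '%s\n' "$1" | sed -n 's|^pve-manager/\([^/]*\)/.*|\1|p'
}

# Succeed when version $1 is greater than or equal to minimum $2.
version_at_least() {
    # The smaller of the two versions sorts first; it must be the minimum
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

On the host this could be used as `version_at_least "$(pve_manager_version "$(pveversion)")" 7.4-15 || echo "Update to the latest 7.4 first"`.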
3. Maintenance Window Planning
- Recommended Timing: Overnight or early morning on a weekend
- Estimated Downtime: 1.5-2.5 hours
- Notifications Required: Discord bot users, game server players
Upgrade Execution (2-4 hours including downtime)
1. Update to Latest PVE 7.4
apt update && apt dist-upgrade -y
pveversion # Verify 7.4-XX
reboot
2. Configure PVE 8 Repositories
# Backup current config
cp /etc/apt/sources.list /etc/apt/sources.list.pve7-backup
cp -a /etc/apt/sources.list.d/ /etc/apt/sources.list.d.pve7-backup/
# Update repositories (Bullseye → Bookworm)
sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null || true
apt update
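A leftover entry for the old suite is an easy mistake at this step, so a quick scan before the distribution upgrade can be worthwhile. The sketch below is illustrative only; `stale_suite_entries` is a hypothetical helper that prints active (uncommented) `deb` lines still naming the old suite.

```shell
# Print active APT source lines that still mention the old suite.
# Exit status is 0 when stale entries were found, non-zero otherwise.
# Usage: stale_suite_entries <old-suite> <file>...
stale_suite_entries() {
    old_suite=$1; shift
    # Commented lines start with '#', so they do not match the anchored pattern
    grep -H -E "^[[:space:]]*deb(-src)?[[:space:]].*[[:space:]]$old_suite" "$@" 2>/dev/null
}
```

For Phase 1 this might be run as `stale_suite_entries bullseye /etc/apt/sources.list /etc/apt/sources.list.d/*.list`; any output should be fixed before running the upgrade.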
3. Execute Distribution Upgrade
apt dist-upgrade
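# Duration: 15-45 minutes
# /etc/issue prompt: either answer is safe (the file is regenerated at boot and is cosmetic)
# Keep current versions of configs you customized; review the diff before accepting a maintainer version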
reboot
4. Verify PVE 8 Installation
pveversion # Should show pve-manager/8.4-X
uname -r # Should show 6.8.X-X-pve
# Verify services
systemctl status pve-cluster pvedaemon pveproxy pvestatd
pvesm status
Post-Upgrade Validation
Start Services in Dependency Order:
# Databases first
pvesh create /nodes/proxmox/qemu/112/status/start
# Infrastructure
pvesh create /nodes/proxmox/lxc/225/status/start # gitea
pvesh create /nodes/proxmox/lxc/210/status/start # n8n
# Applications
pvesh create /nodes/proxmox/qemu/115/status/start # docker-sba (Paper Dynasty)
pvesh create /nodes/proxmox/qemu/110/status/start # discord-bots
pvesh create /nodes/proxmox/lxc/224/status/start # openclaw
# Media & Others
pvesh create /nodes/proxmox/qemu/109/status/start # homeassistant
pvesh create /nodes/proxmox/qemu/107/status/start # plex
pvesh create /nodes/proxmox/lxc/221/status/start # arr-stack
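The start sequence above can be wrapped in a small loop that stops at the first failure, so a broken database VM does not get buried under later start attempts. This is a sketch, not the plan's own tooling: `start_in_order` and the `NODE` variable are hypothetical names, and the node name `proxmox` is taken from the commands above.

```shell
# Node name as used in the pvesh paths above; override by exporting NODE.
NODE=${NODE:-proxmox}

# Start guests in the order given. Each argument is "<qemu|lxc>:<ID>".
start_in_order() {
    for guest in "$@"; do
        type=${guest%%:*}   # text before the colon
        id=${guest#*:}      # text after the colon
        echo "Starting $type $id"
        pvesh create "/nodes/$NODE/$type/$id/status/start" || {
            echo "Failed to start $type $id; stopping here" >&2
            return 1
        }
    done
}
```

A call such as `start_in_order qemu:112 lxc:225 lxc:210 qemu:115 qemu:110` mirrors the first three tiers. The loop only confirms that the start request was accepted; it does not wait for services inside the guest to become ready.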
Service Validation Checklist:
- Discord bots responding in Discord
- Database connections working
- n8n workflows executing
- Gitea accessible at git.manticorum.com
- Home Assistant automations running
- Media servers streaming (Plex/Jellyfin)
- Web UI accessible and functional
Stabilization Period
Wait 1-2 weeks before PVE 9 upgrade
Monitor for:
- VM/LXC stability
- Performance issues
- Service uptime
- Error logs
Phase 2: Proxmox 8.4 → 9.1 Upgrade
Pre-Upgrade Preparation (1 day)
1. LXC Compatibility Check (CRITICAL)
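# Verify systemd version in each LXC (must be > 230); containers must be running for pct exec
for ct in 108 210 211 221 222 223 224 225; do
  echo "=== LXC $ct ==="
  # systemctl is on PATH in the container; the systemd binary itself usually is not
  pct exec "$ct" -- systemctl --version | head -1
done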
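The version line printed by the loop can also be evaluated automatically. The following sketch assumes the line has the usual form `systemd <major> (<package version>)`; the function names are hypothetical, and the threshold of 230 is the one stated above.

```shell
# Extract the major version from a systemd version line,
# e.g. "systemd 245 (245.4-4ubuntu3.24)" yields 245.
systemd_major() {
    printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" { print $2 }'
}

# Succeed when the reported version is newer than 230.
# An empty or unparseable line counts as a failure.
systemd_ok_for_pve9() {
    major=$(systemd_major "$1")
    [ -n "$major" ] && [ "$major" -gt 230 ]
}
```

Inside the loop this could look like `systemd_ok_for_pve9 "$(pct exec "$ct" -- systemctl --version | head -1)" || echo "LXC $ct needs attention"`. Containers without systemd will be flagged as well and need a manual look.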
Action Required: If any LXC shows systemd < 230:
pct enter <CTID>
apt update && apt dist-upgrade -y
do-release-upgrade # Upgrade Ubuntu to compatible version
Expected: All Ubuntu 20.04+ LXCs should be compatible (systemd 245+)
2. Fresh Backup Set
vzdump --all --mode snapshot --dumpdir /mnt/truenas/proxmox/pve9-upgrade --compress zstd
tar -czf /mnt/truenas/proxmox/pve8-config-$(date +%Y%m%d).tar.gz /etc/pve/
3. Run PVE 8-to-9 Checker
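# Bring PVE 8 to the latest 8.4 packages first so the checker is current
apt update && apt dist-upgrade -y
pve8to9 --full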
Upgrade Execution (2-4 hours including downtime)
1. Configure PVE 9 Repositories
# Backup PVE 8 config
cp /etc/apt/sources.list /etc/apt/sources.list.pve8-backup
cp -a /etc/apt/sources.list.d/ /etc/apt/sources.list.d.pve8-backup/
# Update repositories (Bookworm → Trixie)
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
echo "deb http://download.proxmox.com/debian/pve trixie pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null || true
apt update
2. Execute Distribution Upgrade
apt dist-upgrade
# Duration: 20-60 minutes
reboot
3. Verify PVE 9 Installation
pveversion # Should show pve-manager/9.1-X
uname -r # Should show 6.17.X-X-pve on PVE 9.1 (6.14.X-X-pve is the PVE 9.0 kernel)
# Verify cgroupv2 (PVE 9 requirement)
mount | grep cgroup2
# Verify services
systemctl status pve-cluster pvedaemon pveproxy pvestatd
pvesm status
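As an alternative to reading `mount` output, the filesystem type of `/sys/fs/cgroup` identifies the cgroup mode directly. This is a small sketch with a hypothetical helper name; it assumes the common convention that `cgroup2fs` means the unified v2 hierarchy and `tmpfs` means a legacy or hybrid v1 layout.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup mode label.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1-or-hybrid" ;;
        *)         echo "unknown" ;;
    esac
}
```

On the host, `cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"` should print `v2` after a successful PVE 9 upgrade.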
Post-Upgrade Validation
Start and validate services using the same procedure as for the PVE 8 upgrade.
Additional PVE 9 Checks:
- Web UI with cleared browser cache (Ctrl+Shift+R)
- Memory reporting (PVE 9 includes overhead in VM memory)
- Storage performance validation
Rollback Procedures
If PVE 8 Upgrade Fails
During dist-upgrade:
apt --fix-broken install
dpkg --configure -a
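# If unrecoverable and no Bookworm packages were installed yet, restore the PVE 7 repositories:
cp /etc/apt/sources.list.pve7-backup /etc/apt/sources.list
rm -f /etc/apt/sources.list.d/*.list # drop repository files created during the upgrade
cp -a /etc/apt/sources.list.d.pve7-backup/* /etc/apt/sources.list.d/
apt update
# apt cannot downgrade a partially upgraded system back to PVE 7; in that case use Complete Reinstallation below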
After reboot to unstable system:
- Boot to previous kernel via GRUB → Advanced options
- Rollback repositories as above
If PVE 9 Upgrade Fails
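# Only helps if no Trixie packages were installed yet; apt cannot downgrade a partially upgraded system
cp /etc/apt/sources.list.pve8-backup /etc/apt/sources.list
rm -f /etc/apt/sources.list.d/*.list # drop repository files created during the upgrade
cp -a /etc/apt/sources.list.d.pve8-backup/* /etc/apt/sources.list.d/
apt update
# If packages were already upgraded: boot the previous kernel via GRUB, or use Complete Reinstallation below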
If VM/LXC Won't Start
Restore from backup:
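# LXC (add --force to overwrite an existing container; the glob must match exactly one archive)
pct restore <CTID> /mnt/truenas/proxmox/vzdump-lxc-<CTID>-*.tar.zst --storage local-lvm
# VM (add --force to overwrite an existing VM; the glob must match exactly one archive)
qmrestore /mnt/truenas/proxmox/vzdump-qemu-<VMID>-*.vma.zst <VMID>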
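When several backup runs exist for the same guest, the wildcard in the restore commands matches more than one file. A helper that picks the newest archive avoids that. The sketch below is an assumption-laden illustration: `latest_backup` is a hypothetical name, and it relies on vzdump embedding a `YYYY_MM_DD-HH_MM_SS` timestamp in the file name so that text order equals chronological order.

```shell
# Print the newest vzdump archive for a guest, or nothing if none exists.
# Usage: latest_backup <dump-dir> <lxc|qemu> <ID>
latest_backup() {
    # Skip the .log and .notes files that vzdump writes next to each archive
    ls -1 "$1"/vzdump-"$2"-"$3"-* 2>/dev/null \
        | grep -v -e '\.log$' -e '\.notes$' \
        | sort \
        | tail -n 1
}
```

A restore could then be written as `pct restore 210 "$(latest_backup /mnt/truenas/proxmox lxc 210)" --storage local-lvm`, after checking that the helper printed a path.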
Complete Reinstallation (Last Resort)
- Reinstall Proxmox VE 9 from ISO
- Restore configs from the archives created earlier (/mnt/truenas/proxmox/pve-config-*.tar.gz or pve8-config-*.tar.gz)
- Restore VMs/LXCs from backups
- Reconfigure networking if needed
Risk Assessment
| Component | Risk | Impact | Mitigation |
|---|---|---|---|
| Production Bots (115, 110) | HIGH | Service downtime | Backup instance ready, notify users |
| Databases (112) | HIGH | Data loss | Multiple backups, test restore |
| LXC systemd compatibility | MEDIUM | Container won't start | Pre-verify versions, upgrade OS if needed |
| Network config | MEDIUM | Connectivity loss | Document config, console access |
| n8n workflows (210) | MEDIUM | Automation failures | Export workflow configs |
Low Risk: Game servers, templates, unused services
Post-Upgrade Tasks
1. Update Documentation
- Record upgrade completion in /mnt/NV2/Development/claude-home/vm-management/
- Update Proxmox version references
- Document issues encountered
2. Performance Validation
pvesh get /cluster/resources
3. Long-Term Monitoring
- Daily health checks
- Resource utilization trends
- Plan next upgrade (PVE 9.x updates)
Timeline Summary
| Phase | Duration | Downtime | Activity |
|---|---|---|---|
| Pre-PVE8 Prep | 1-2 days | None | Backups, validation |
| PVE 7→8 Upgrade | 2-4 hours | 1.5-2.5 hours | Repository update, upgrade |
| PVE 8 Stabilization | 1-2 weeks | None | Monitor, validate |
| Pre-PVE9 Prep | 1 day | None | LXC validation, backups |
| PVE 8→9 Upgrade | 2-4 hours | 1.5-2.5 hours | Repository update, upgrade |
| Post-Upgrade | 1-2 days | None | Documentation, optimization |
| TOTAL | 3-4 weeks | ~4 hours | Full upgrade with stabilization |
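The ~4 hour total is the midpoint of the two per-phase downtime windows added together. As a small worked example, with `sum_ranges` being a hypothetical helper and the ranges taken from the table above:

```shell
# Sum downtime ranges given as "min-max" hours and print the total range.
sum_ranges() {
    printf '%s\n' "$@" \
        | awk -F- '{ lo += $1; hi += $2 } END { printf "%.1f-%.1f\n", lo, hi }'
}
```

Running `sum_ranges 1.5-2.5 1.5-2.5` gives a total window of 3.0 to 5.0 hours, so planning for up to 5 hours of downtime leaves room for the upper bound.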
Critical Files
- /etc/pve/qemu-server/*.conf - VM configurations (backup critical)
- /etc/pve/lxc/*.conf - LXC configurations (backup critical)
- /etc/network/interfaces - Network config (document before changes)
- /etc/apt/sources.list - Repository config (will be modified)
- /etc/apt/sources.list.d/pve-*.list - Proxmox repos (will be modified)
Verification Checklist
After each upgrade phase:
- Proxmox version correct (pveversion)
- Kernel version updated (uname -r)
- All services running (systemctl status pve-cluster pvedaemon pveproxy pvestatd)
- Storage accessible (pvesm status)
- Network functional (ip addr, ip route)
- All VMs/LXCs visible in UI
- Critical VMs/LXCs started successfully
- Discord bots responding
- Databases accessible
- n8n workflows running
- Gitea accessible
- Home Assistant functional
- Media streaming working
- Web UI functional (clear cache first)
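The host-level items in this checklist lend themselves to a small pass/fail runner. This is a minimal sketch rather than a complete health check; `check` and `FAILED` are hypothetical names, and the application-level items (bots, Gitea, Home Assistant) still need manual or service-specific checks.

```shell
# Number of failed checks so far.
FAILED=0

# Run a command quietly and print PASS or FAIL with a label.
# Usage: check <label> <command> [args...]
check() {
    label=$1; shift
    if "$@" > /dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        FAILED=$((FAILED + 1))
    fi
}
```

Example calls would be `check "storage accessible" pvesm status` and, once per unit, `check "pveproxy active" systemctl is-active --quiet pveproxy`. Checking units one at a time matters because `systemctl is-active` with several units succeeds as soon as any one of them is active.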