Add comprehensive Proxmox VE 7.1 → 9.1 upgrade plan

Create detailed two-phase upgrade strategy for Proxmox hypervisor:
- Phase 1: 7.1 → 8.4 (Debian Bullseye → Bookworm)
- Phase 2: 8.4 → 9.1 (Debian Bookworm → Trixie)

Plan includes:
- Pre-upgrade preparation and backup procedures
- Step-by-step upgrade execution for both phases
- Service validation and dependency order
- Rollback procedures for failure scenarios
- Risk assessment with mitigation strategies
- Timeline: 3-4 weeks total, ~4 hours downtime

Critical considerations:
- 8 LXC containers + 17 VMs to maintain
- Production services (Discord bots, databases, Gitea, n8n)
- Home Assistant dual network requirements
- LXC systemd compatibility checks for PVE 9

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
parent 76dc82ce7c
commit 7eadacc6db

vm-management/proxmox-upgrades/proxmox-7-to-9-upgrade-plan.md (new file, 376 lines)

# Proxmox VE Upgrade Plan: 7.1-7 → 9.1

## Executive Summary

**Current State**: Proxmox VE 7.1-7 (kernel 5.13.19-2-pve)
**Target State**: Proxmox VE 9.1 (latest)
**Upgrade Path**: Two-phase upgrade (7→8→9) - direct upgrade not supported
**Total Timeline**: 3-4 weeks (including stabilization periods)
**Total Downtime**: ~4 hours (2 hours per phase)

## Infrastructure Overview

**Production Services** (8 LXC + 17 VMs):
- **Critical**: Paper Dynasty/Major Domo (VMs 115, 110), Gitea (LXC 225), n8n (LXC 210), Home Assistant (VM 109)
- **Important**: Media services (Plex 107, Tdarr 113, arr-stack 221), OpenClaw (224), Databases (112)
- **Lower Priority**: Game servers, development containers

**Key Constraints**:
- Home Assistant VM 109 requires dual network (vmbr1 for Matter support)
- Downtime for all production Discord bots must be kept to a minimum
- Gitea is mirrored to GitHub, which serves as a backup of the repositories
- TrueNAS backup mount at 10.10.0.35

---

## Phase 1: Proxmox 7.1 → 8.4 Upgrade

### Pre-Upgrade Preparation (1-2 days)

#### 1. Comprehensive Backups

**Priority 1 - Production Services**:
```bash
# Backup critical services to TrueNAS
vzdump 210 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # n8n
vzdump 115 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # docker-sba
vzdump 112 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # databases
vzdump 110 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # discord-bots
vzdump 225 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # gitea
vzdump 109 --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd  # homeassistant
```

**Priority 2 - All Remaining VMs/LXCs**:
```bash
vzdump --all --mode snapshot --dumpdir /mnt/truenas/proxmox --compress zstd
```

**Backup Proxmox Configuration**:
```bash
tar -czf /mnt/truenas/proxmox/pve-config-$(date +%Y%m%d).tar.gz /etc/pve/
cp /etc/network/interfaces /mnt/truenas/proxmox/interfaces.backup
```

**Expected**: 2-4 hours, ~500GB-1TB storage required

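Before starting the backup run, it may be worth confirming that the backup target actually has room for the estimated 500GB-1TB. The following is a minimal sketch, assuming `df` and `awk` are available on the host; `has_free_space` is a hypothetical helper written for this plan, not a Proxmox tool.

```bash
#!/usr/bin/env bash
# has_free_space DIR REQUIRED_GB
# Succeeds if the filesystem holding DIR has at least REQUIRED_GB (GiB) available.
# Returns 2 if the free space cannot be determined (e.g. DIR does not exist).
has_free_space() {
    local dir=$1 required_gb=$2 avail_kb
    # df -Pk prints POSIX-format output in 1 KiB blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] || return 2
    [ "$avail_kb" -ge $((required_gb * 1024 * 1024)) ]
}
```

For example, `has_free_space /mnt/truenas/proxmox 1000 || echo "backup target not ready"` would flag a missing mount or less than roughly 1 TB of free space before any `vzdump` job is started.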
#### 2. Pre-Upgrade Validation

```bash
# Update to the latest PVE 7.4 first; the pve7to8 checker ships with the 7.4 packages
apt update && apt dist-upgrade -y

# Verify minimum version
pveversion  # Must show 7.4-15 or higher

# Run Proxmox 7-to-8 checker and resolve every warning and failure it reports
pve7to8 --full

# Document current state
pvesh get /cluster/resources --type vm --output-format yaml > /mnt/truenas/proxmox/vm-inventory-pre-upgrade.yaml
```

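The "7.4-15 or higher" requirement can be checked by script instead of by eye. This is a sketch under two assumptions: that `pveversion` prints a line of the form `pve-manager/<version>/<hash> (running kernel: ...)`, and that GNU `sort -V` is available. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# pve_version_from LINE
# Extracts the version field, e.g. "7.4-17" from
# "pve-manager/7.4-17/513c62be (running kernel: 5.15.108-1-pve)" (assumed format).
pve_version_from() {
    printf '%s\n' "$1" | cut -d/ -f2
}

# version_at_least HAVE WANT
# Succeeds if HAVE >= WANT under version ordering (GNU sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

On the host this could be used as `version_at_least "$(pve_version_from "$(pveversion)")" 7.4-15 || echo "update to the latest 7.4 first"`.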
#### 3. Maintenance Window Planning

**Recommended Timing**: Overnight or early morning on a weekend
**Estimated Downtime**: 1.5-2.5 hours
**Notifications Required**: Discord bot users, game server players

### Upgrade Execution (2-4 hours including downtime)

#### 1. Update to Latest PVE 7.4
```bash
apt update && apt dist-upgrade -y
pveversion  # Verify 7.4-XX
reboot
```

#### 2. Configure PVE 8 Repositories
```bash
# Backup current config
cp /etc/apt/sources.list /etc/apt/sources.list.pve7-backup
cp -a /etc/apt/sources.list.d/ /etc/apt/sources.list.d.pve7-backup/

# Update repositories (Bullseye → Bookworm)
sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null || true

apt update
```

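After rewriting the repository files, a quick scan for leftover entries of the old suite can catch a file that the `sed` commands did not touch. This is a minimal sketch; `list_stale_entries` is a hypothetical helper, and the pattern only looks at one-line `deb` / `deb-src` entries, not deb822 `.sources` files.

```bash
#!/usr/bin/env bash
# list_stale_entries SUITE FILE...
# Prints active (uncommented) deb/deb-src lines that still name SUITE,
# prefixed with file name and line number. Exit status is 0 if any were found.
list_stale_entries() {
    local suite=$1
    shift
    grep -Hn "^[[:space:]]*deb.*[[:space:]]${suite}" "$@" 2>/dev/null
}
```

For Phase 1 this might be run as `list_stale_entries bullseye /etc/apt/sources.list /etc/apt/sources.list.d/*.list && echo "fix these before dist-upgrade"`; the same helper would apply to `bookworm` in Phase 2.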
#### 3. Execute Distribution Upgrade
```bash
apt dist-upgrade
# Duration: 15-45 minutes
# Accept new versions of /etc/issue
# Keep current versions of customized configs

reboot
```

#### 4. Verify PVE 8 Installation
```bash
pveversion  # Should show pve-manager/8.4-X
uname -r    # Should show 6.8.X-X-pve

# Verify services
systemctl status pve-cluster pvedaemon pveproxy pvestatd
pvesm status
```

### Post-Upgrade Validation

**Start Services in Dependency Order**:
```bash
# Databases first
pvesh create /nodes/proxmox/qemu/112/status/start

# Infrastructure
pvesh create /nodes/proxmox/lxc/225/status/start  # gitea
pvesh create /nodes/proxmox/lxc/210/status/start  # n8n

# Applications
pvesh create /nodes/proxmox/qemu/115/status/start  # docker-sba (Paper Dynasty)
pvesh create /nodes/proxmox/qemu/110/status/start  # discord-bots
pvesh create /nodes/proxmox/lxc/224/status/start   # openclaw

# Media & Others
pvesh create /nodes/proxmox/qemu/109/status/start  # homeassistant
pvesh create /nodes/proxmox/qemu/107/status/start  # plex
pvesh create /nodes/proxmox/lxc/221/status/start   # arr-stack
```

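The start order above can also be kept as data and replayed with a pause between guests, so that databases are up before the services that depend on them. This is a sketch only: the node name `proxmox` comes from the commands above, while `PVE_NODE`, `START_DELAY`, `DRY_RUN`, `start_path`, and `start_all` are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Ordered start plan taken from the list above, as "type:id" pairs (databases first).
START_ORDER="qemu:112 lxc:225 lxc:210 qemu:115 qemu:110 lxc:224 qemu:109 qemu:107 lxc:221"
PVE_NODE="${PVE_NODE:-proxmox}"

# start_path TYPE:ID prints the API path used to start that guest.
start_path() {
    local kind=${1%%:*} id=${1##*:}
    printf '/nodes/%s/%s/%s/status/start\n' "$PVE_NODE" "$kind" "$id"
}

# start_all starts every guest in order. With DRY_RUN=1 it only prints the commands.
start_all() {
    local guest
    for guest in $START_ORDER; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "pvesh create $(start_path "$guest")"
        else
            # Pause after each successful start to let the guest finish booting.
            pvesh create "$(start_path "$guest")" && sleep "${START_DELAY:-30}"
        fi
    done
}
```

Running `DRY_RUN=1 start_all` first prints the nine `pvesh create` commands without touching any guest, which makes it easy to review the order before the maintenance window.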
**Service Validation Checklist**:
- [ ] Discord bots responding in Discord
- [ ] Database connections working
- [ ] n8n workflows executing
- [ ] Gitea accessible at git.manticorum.com
- [ ] Home Assistant automations running
- [ ] Media servers streaming (Plex/Jellyfin)
- [ ] Web UI accessible and functional

### Stabilization Period

**Wait 1-2 weeks before PVE 9 upgrade**

Monitor for:
- VM/LXC stability
- Performance issues
- Service uptime
- Error logs

---

## Phase 2: Proxmox 8.4 → 9.1 Upgrade

### Pre-Upgrade Preparation (1 day)

#### 1. LXC Compatibility Check (CRITICAL)

```bash
# Verify systemd version in each LXC (must be > 230)
# Containers must be running for pct exec to work
for ct in 108 210 211 221 222 223 224 225; do
    echo "=== LXC $ct ==="
    pct exec "$ct" -- systemctl --version | head -1
done
```

**Action Required**: If any LXC shows systemd 230 or older:
```bash
pct enter <CTID>
apt update && apt dist-upgrade -y
do-release-upgrade  # Ubuntu only: upgrade to a release with a compatible systemd
```

**Expected**: All Ubuntu 20.04+ LXCs should be compatible (systemd 245+)

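The comparison against 230 can be automated so that an incompatible container is flagged explicitly. This is a minimal sketch, assuming the first line of `systemctl --version` looks like `systemd 245 (245.4-4ubuntu3.23)`; `systemd_major` and `systemd_ok` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# systemd_major TEXT
# Prints the major version from the first line of `systemctl --version` output,
# e.g. 245 for "systemd 245 (245.4-4ubuntu3.23)". Prints nothing if unparsable.
systemd_major() {
    printf '%s\n' "$1" | awk 'NR==1 && $1=="systemd" {print $2+0}'
}

# systemd_ok TEXT succeeds when the reported version is newer than 230.
systemd_ok() {
    local major
    major=$(systemd_major "$1")
    [ -n "$major" ] && [ "$major" -gt 230 ]
}
```

Inside the loop above this could be used as `systemd_ok "$(pct exec "$ct" -- systemctl --version)" || echo "LXC $ct needs an OS upgrade"`. A container without systemd at all would also be reported, so such cases need a manual look.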
#### 2. Fresh Backup Set
```bash
vzdump --all --mode snapshot --dumpdir /mnt/truenas/proxmox/pve9-upgrade --compress zstd
tar -czf /mnt/truenas/proxmox/pve8-config-$(date +%Y%m%d).tar.gz /etc/pve/
```

#### 3. Run PVE 8-to-9 Checker
```bash
pve8to9 --full
```

### Upgrade Execution (2-4 hours including downtime)

#### 1. Configure PVE 9 Repositories
```bash
# Backup PVE 8 config
cp /etc/apt/sources.list /etc/apt/sources.list.pve8-backup
cp -a /etc/apt/sources.list.d/ /etc/apt/sources.list.d.pve8-backup/

# Update repositories (Bookworm → Trixie)
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
echo "deb http://download.proxmox.com/debian/pve trixie pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null || true

apt update
```

#### 2. Execute Distribution Upgrade
```bash
apt dist-upgrade
# Duration: 20-60 minutes

reboot
```

#### 3. Verify PVE 9 Installation
```bash
pveversion  # Should show pve-manager/9.1-X
uname -r    # Should show 6.14.X-X-pve or newer (9.1 may default to a newer kernel series than 9.0)

# Verify cgroupv2 (PVE 9 requirement)
mount | grep cgroup2

# Verify services
systemctl status pve-cluster pvedaemon pveproxy pvestatd
pvesm status
```

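`mount | grep cgroup2` shows that a cgroup v2 hierarchy exists, but not whether any legacy v1 controllers are still mounted next to it. The sketch below checks both from a mount table in `/proc/mounts` format; `uses_cgroup2_only` is a hypothetical helper, and the optional file argument exists only so the check can be tried against a saved copy.

```bash
#!/usr/bin/env bash
# uses_cgroup2_only [MOUNTS_FILE]
# Succeeds if the mount table (default: /proc/mounts) contains a cgroup2 mount
# and no legacy "cgroup" (v1) mounts. Column 3 of each line is the filesystem type.
uses_cgroup2_only() {
    awk '$3 == "cgroup2" { v2 = 1 }
         $3 == "cgroup"  { v1 = 1 }
         END { exit !(v2 && !v1) }' "${1:-/proc/mounts}"
}
```

On the upgraded host, `uses_cgroup2_only && echo "cgroup v2 only"` would confirm the expected state.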
### Post-Upgrade Validation

**Start and validate services** using same procedure as PVE 8 upgrade.

**Additional PVE 9 Checks**:
- Web UI with cleared browser cache (Ctrl+Shift+R)
- Memory reporting (PVE 9 includes overhead in VM memory)
- Storage performance validation

---

## Rollback Procedures

### If PVE 8 Upgrade Fails

**During dist-upgrade**:
```bash
# First try to complete the interrupted upgrade
apt --fix-broken install
dpkg --configure -a

# If unrecoverable, restore the PVE 7 repository configuration:
cp /etc/apt/sources.list.pve7-backup /etc/apt/sources.list
cp -a /etc/apt/sources.list.d.pve7-backup/* /etc/apt/sources.list.d/
apt update
# Note: restoring the repositories does not downgrade packages that were already
# upgraded, and apt does not support downgrading across a Debian release.
# If the system cannot be repaired, fall back to "Complete Reinstallation" below.
```

**After reboot to unstable system**:
- Boot to previous kernel via GRUB → Advanced options
- Roll back repositories as above

### If PVE 9 Upgrade Fails

```bash
cp /etc/apt/sources.list.pve8-backup /etc/apt/sources.list
cp -a /etc/apt/sources.list.d.pve8-backup/* /etc/apt/sources.list.d/
apt update && apt dist-upgrade
# Note: this does not downgrade packages already upgraded to Trixie versions.
# If the system stays broken, fall back to "Complete Reinstallation" below.
reboot
```

### If VM/LXC Won't Start

**Restore from backup**:
```bash
# Replace the * with the timestamp of the archive to restore (the glob must match one file);
# add --force to overwrite a guest that still exists under the same ID

# LXC
pct restore <CTID> /mnt/truenas/proxmox/vzdump-lxc-<CTID>-*.tar.zst --storage local-lvm

# VM
qmrestore /mnt/truenas/proxmox/vzdump-qemu-<VMID>-*.vma.zst <VMID>
```

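Because several backup runs write to the same directory, the `*` in the restore commands can match more than one archive. This sketch picks the newest archive for a given guest; it assumes vzdump's zero-padded timestamp naming (for example `vzdump-lxc-225-2025_01_01-02_00_00.tar.zst`), and `latest_backup` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# latest_backup DIR KIND ID
# Prints the newest vzdump archive for guest ID in DIR. KIND is "qemu" or "lxc".
# Prints nothing if no archive exists; returns 2 for an unknown KIND.
latest_backup() {
    local dir=$1 kind=$2 id=$3 ext
    case $kind in
        qemu) ext=vma.zst ;;
        lxc)  ext=tar.zst ;;
        *)    return 2 ;;
    esac
    # Timestamps in the file names are zero-padded, so a plain sort is chronological.
    find "$dir" -maxdepth 1 -type f -name "vzdump-${kind}-${id}-*.${ext}" | sort | tail -n 1
}
```

A restore could then look like `pct restore 225 "$(latest_backup /mnt/truenas/proxmox lxc 225)" --storage local-lvm`, after checking that the printed path is the archive that is actually wanted.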
### Complete Reinstallation (Last Resort)

1. Reinstall Proxmox VE 9 from ISO
2. Restore configs from `/mnt/truenas/proxmox/pve-config-*.tar.gz` (or `pve8-config-*.tar.gz` once Phase 1 is complete)
3. Restore VMs/LXCs from backups
4. Reconfigure networking if needed, using `/mnt/truenas/proxmox/interfaces.backup` as reference

---

## Risk Assessment

| Component | Risk | Impact | Mitigation |
|-----------|------|--------|-----------|
| Production Bots (115, 110) | HIGH | Service downtime | Backup instance ready, notify users |
| Databases (112) | HIGH | Data loss | Multiple backups, test restore |
| LXC systemd compatibility | MEDIUM | Container won't start | Pre-verify versions, upgrade OS if needed |
| Network config | MEDIUM | Connectivity loss | Document config, console access |
| n8n workflows (210) | MEDIUM | Automation failures | Export workflow configs |

**Low Risk**: Game servers, templates, unused services

---

## Post-Upgrade Tasks

### 1. Update Documentation
- Record upgrade completion in `/mnt/NV2/Development/claude-home/vm-management/`
- Update Proxmox version references
- Document issues encountered

### 2. Performance Validation
```bash
pvesh get /cluster/resources
```

### 3. Long-Term Monitoring
- Daily health checks
- Resource utilization trends
- Plan next upgrade (PVE 9.x updates)

---

## Timeline Summary

| Phase | Duration | Downtime | Activity |
|-------|----------|----------|----------|
| Pre-PVE8 Prep | 1-2 days | None | Backups, validation |
| PVE 7→8 Upgrade | 2-4 hours | 1.5-2.5 hours | Repository update, upgrade |
| PVE 8 Stabilization | 1-2 weeks | None | Monitor, validate |
| Pre-PVE9 Prep | 1 day | None | LXC validation, backups |
| PVE 8→9 Upgrade | 2-4 hours | 1.5-2.5 hours | Repository update, upgrade |
| Post-Upgrade | 1-2 days | None | Documentation, optimization |
| **TOTAL** | **3-4 weeks** | **~4 hours** | Full upgrade with stabilization |

---

## Critical Files

- `/etc/pve/qemu-server/*.conf` - VM configurations (backup critical)
- `/etc/pve/lxc/*.conf` - LXC configurations (backup critical)
- `/etc/network/interfaces` - Network config (document before changes)
- `/etc/apt/sources.list` - Repository config (will be modified)
- `/etc/apt/sources.list.d/pve-*.list` - Proxmox repos (will be modified)

---

## Verification Checklist

After each upgrade phase:

- [ ] Proxmox version correct (`pveversion`)
- [ ] Kernel version updated (`uname -r`)
- [ ] All services running (`systemctl status pve-cluster pvedaemon pveproxy pvestatd`)
- [ ] Storage accessible (`pvesm status`)
- [ ] Network functional (`ip addr`, `ip route`)
- [ ] All VMs/LXCs visible in UI
- [ ] Critical VMs/LXCs started successfully
- [ ] Discord bots responding
- [ ] Databases accessible
- [ ] n8n workflows running
- [ ] Gitea accessible
- [ ] Home Assistant functional
- [ ] Media streaming working
- [ ] Web UI functional (clear cache first)

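The host-level items in this checklist lend themselves to a small pass/fail runner, so that the same checks are executed identically after each phase. This is a sketch only: `check` and `FAILED` are hypothetical names, and the example wiring in the comments is one possible mapping of checklist items to commands.

```bash
#!/usr/bin/env bash
FAILED=0

# check LABEL COMMAND...
# Runs COMMAND quietly, prints PASS or FAIL with the label, and counts failures.
check() {
    local label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        FAILED=$((FAILED + 1))
    fi
}

# Example wiring (run on the Proxmox node):
#   for unit in pve-cluster pvedaemon pveproxy pvestatd; do
#       check "$unit active" systemctl is-active --quiet "$unit"
#   done
#   check "Storage accessible"    pvesm status
#   check "Default route present" sh -c 'ip route | grep -q "^default"'
#   [ "$FAILED" -eq 0 ] || echo "$FAILED check(s) failed"
```

Application-level items such as Discord bots or Home Assistant automations still need a manual look, since the host cannot judge them from exit codes alone.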
---

## Sources

- [Proxmox VE: Upgrade from 7 to 8](https://pve.proxmox.com/wiki/Upgrade_from_7_to_8)
- [Proxmox VE: Upgrade from 8 to 9](https://pve.proxmox.com/wiki/Upgrade_from_8_to_9)
- [Proxmox VE: Backup and Restore](https://pve.proxmox.com/wiki/Backup_and_Restore)