claude-home/vm-management/vm-decommission-runbook.md
Cal Corum a97f443f60
docs: sync KB — vm-decommission-runbook.md
2026-04-02 22:00:04 -05:00

title: VM Decommission Runbook
description: Step-by-step procedure for safely decommissioning a Proxmox VM (dependency checks, destruction, and repo cleanup).
type: runbook
domain: vm-management
tags: proxmox, decommission, infrastructure, cleanup

VM Decommission Runbook

Procedure for safely removing a stopped Proxmox VM and reclaiming its disk space. Derived from the VM 105 (docker-vpn) decommission (2026-04-02, issue #20).

Prerequisites

  • VM must already be stopped on Proxmox
  • Services previously running on the VM must be confirmed migrated or no longer needed
  • SSH access to Proxmox host (ssh proxmox)

Phase 1 — Dependency Verification

Run all five checks before destroying anything. It is safe to proceed only when checks 1.1, 1.2, 1.3, and 1.5 find no references to the VM and check 1.4 confirms the backup status you expect.

1.1 Pi-hole DNS

Check both the primary and secondary Pi-hole for DNS records pointing to the VM's IP. The commands below use the pihole SSH alias; repeat them against the other Pi-hole:

ssh pihole "grep '<VM_IP>' /etc/pihole/custom.list || echo 'No DNS entries'"
ssh pihole "pihole -q <VM_HOSTNAME>"
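A plain grep treats the dots in an IP as wildcards and also matches substrings (for example, 10.10.0.5 inside 10.10.0.50), so an exact-field match can reduce false positives. The following is a minimal sketch, assuming custom.list uses the usual one-entry-per-line `IP hostname` layout; the function name `dns_entries_for_ip` is hypothetical and not part of Pi-hole.

```shell
# Print custom.list lines whose first field is exactly the given IP.
# Reads the file contents on stdin; prints nothing when no entry matches.
# ASSUMPTION: each line has the form "<IP> <hostname>".
dns_entries_for_ip() {
    awk -v ip="$1" '$1 == ip'
}

# Example, using the same ssh alias as the commands above:
#   ssh pihole "cat /etc/pihole/custom.list" | dns_entries_for_ip '<VM_IP>'
```

Empty output from the function means no local DNS record points at that IP on the Pi-hole that was queried.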

1.2 Nginx Proxy Manager (NPM)

Check NPM for any proxy hosts with the VM's IP as an upstream:

  • NPM UI: https://npm.manticorum.com → Proxy Hosts → search for VM IP
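  • Or via API: ssh npm-pihole "curl -s http://localhost:81/api/nginx/proxy-hosts" | grep '<VM_IP>' (the NPM API normally requires an Authorization: Bearer token obtained from /api/tokens; without one the call returns an error instead of the host list, so an empty grep result does not prove that no proxy host exists)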

1.3 Proxmox Firewall Rules

ssh proxmox "cat /etc/pve/firewall/<VMID>.fw 2>/dev/null || echo 'No firewall rules'"

1.4 Backup Existence

ssh proxmox "ls -la /var/lib/vz/dump/ | grep 'vzdump-qemu-<VMID>-'"

1.5 VPN / Tunnel References

Check if any WireGuard or VPN configs on other hosts reference this VM:

ssh proxmox "grep -r '<VM_IP>' /etc/wireguard/ 2>/dev/null || echo 'No WireGuard refs'"

Also check SSH config and any automation scripts in the claude-home repo:

grep -r '<VM_IP>\|<VM_HOSTNAME>' ~/Development/claude-home/
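The repo search above can be tightened so that an IP matches only as a whole token and its dots are taken literally. This is a sketch under stated assumptions: the function name `find_vm_refs` is hypothetical, and it relies on the `-r`, `-w`, and `--exclude-dir` options found in GNU grep.

```shell
# Search a directory tree for the VM's IP or hostname.
#   $1 = IP, $2 = hostname, $3 = directory to search
# -F  treats patterns as fixed strings (dots are not wildcards)
# -w  matches whole words only, so 10.10.0.5 does not hit 10.10.0.50
# -n  prints line numbers; .git is skipped to avoid noise from history
find_vm_refs() {
    grep -rnFw --exclude-dir=.git -e "$1" -e "$2" "$3" \
        || echo 'No references found'
}

# Example:
#   find_vm_refs '<VM_IP>' '<VM_HOSTNAME>' ~/Development/claude-home/
```

Matches in historical or planning docs are expected and can be left alone; matches in SSH config, inventory files, or scripts need cleanup in Phase 4.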

Phase 2 — Safety Measures

2.1 Disable Auto-Start

Prevent the VM from starting on Proxmox reboot while you work:

ssh proxmox "qm set <VMID> --onboot 0"

2.2 Record Disk Space (Before)

ssh proxmox "lvs | grep pve"

Save this output for comparison after destruction.

2.3 Optional: Take a Final Backup

If the VM might contain anything worth preserving:

ssh proxmox "vzdump <VMID> --mode snapshot --storage home-truenas --compress zstd"

Skip if the VM has been stopped for a long time and all services are confirmed migrated.

Phase 3 — Destroy

ssh proxmox "qm destroy <VMID> --purge"

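qm destroy removes the VM config and the disks referenced in it. The --purge flag additionally removes the VMID from related configuration such as backup jobs, replication jobs, and HA. Verify: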

ssh proxmox "qm list | grep <VMID>"          # Should return nothing
ssh proxmox "lvs | grep vm-<VMID>-disk"       # Should return nothing
ssh proxmox "lvs | grep pve"                  # Compare with Phase 2.2
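If the lvs output from Phase 2.2 and from this step are both saved to files, the comparison can be done mechanically rather than by eye. A minimal sketch, assuming the default lvs layout where the LV name is the first column; the function name `removed_lvs` and the file names in the example are hypothetical.

```shell
# List LV names present in the "before" capture but absent from "after".
#   $1 = file holding lvs output saved in Phase 2.2
#   $2 = file holding lvs output saved after the destroy
removed_lvs() {
    awk -v after="$2" '
        BEGIN {
            # Load LV names (first column) from the "after" capture.
            while ((getline line < after) > 0) {
                split(line, f, " ")
                seen[f[1]] = 1
            }
        }
        # Print non-empty "before" rows whose LV name no longer exists.
        NF && !($1 in seen) { print $1 }
    ' "$1"
}

# Example:
#   ssh proxmox "lvs | grep pve" > lvs-before.txt   # Phase 2.2
#   ssh proxmox "lvs | grep pve" > lvs-after.txt    # after qm destroy
#   removed_lvs lvs-before.txt lvs-after.txt
```

The expected result is exactly the VM's own disks (vm-<VMID>-disk-*); anything else in the list warrants a closer look before continuing.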

Phase 4 — Repo Cleanup

Update these files in the claude-home repo:

| File | Action |
|------|--------|
| ~/.ssh/config | Comment out Host block, add # DECOMMISSIONED: <name> (<IP>) - <reason> |
| server-configs/proxmox/qemu/<VMID>.conf | Delete the file |
| Migration results (if applicable) | Check off decommission tasks |
| vm-management/proxmox-upgrades/proxmox-7-to-9-upgrade-plan.md | Move from Stopped/Investigate to Decommissioned |
| networking/examples/ssh-homelab-setup.md | Comment out or remove entry |
| networking/examples/server_inventory.yaml | Comment out or remove entry |

Leave historical and planning docs (migration plans, wave results) as-is; they document what was done at the time.

Phase 5 — Commit and PR

Branch naming: chore/<ISSUE_NUMBER>-decommission-<vm-name>

Commit message format:

chore: decommission VM <VMID> (<name>) — reclaim <SIZE> disk (#<ISSUE>)

Closes #<ISSUE>

This is typically a docs-only PR (only .md and config files), which the auto-merge-docs workflow auto-approves.
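The branch and commit conventions above can be generated rather than typed by hand. This is only a sketch: the helper names `decom_branch` and `decom_commit_msg` are hypothetical, and the format strings are copied verbatim from the conventions in this section.

```shell
# Build the branch name: chore/<ISSUE_NUMBER>-decommission-<vm-name>
#   $1 = issue number, $2 = VM name
decom_branch() {
    printf 'chore/%s-decommission-%s\n' "$1" "$2"
}

# Build the commit message, including the "Closes" trailer.
#   $1 = VMID, $2 = VM name, $3 = reclaimed size, $4 = issue number
decom_commit_msg() {
    printf 'chore: decommission VM %s (%s) — reclaim %s disk (#%s)\n\nCloses #%s\n' \
        "$1" "$2" "$3" "$4" "$4"
}

# Example, using the VM 105 (docker-vpn) decommission from issue #20:
#   git checkout -b "$(decom_branch 20 docker-vpn)"
#   git commit -m "$(decom_commit_msg 105 docker-vpn '<SIZE>' 20)"
```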

Checklist Template

Copy this for each decommission:

### VM <VMID> (<name>) Decommission

**Pre-deletion verification:**
- [ ] Pi-hole DNS — no records
- [ ] NPM upstreams — no proxy hosts
- [ ] Proxmox firewall — no rules
- [ ] Backup status — verified
- [ ] VPN/tunnel references — none

**Execution:**
- [ ] Disabled onboot
- [ ] Recorded disk space before
- [ ] Took backup (or confirmed skip)
- [ ] Destroyed VM with --purge
- [ ] Verified disk space reclaimed

**Cleanup:**
- [ ] SSH config updated
- [ ] VM config file deleted from repo
- [ ] Migration docs updated
- [ ] Upgrade plan updated
- [ ] Example files updated
- [ ] Committed, pushed, PR created