VM 116: Resolve watchstate duplicate and clean up remaining containers #31


cal · 2026-04-03T01:13:50Z

cal commented

2026-04-03 01:13:50 +00:00

Context

During the infra audit immediate fixes, we discovered that VM 116 (docker-home-servers) had a watchstate container in a restart loop: Caddy (embedded in watchstate) was crashing with `decoding intermediate certificate PEM: no PEM block found`. This was the primary driver of VM 116's elevated load after avahi was masked.

Watchstate was stopped (not removed) and 3 dead containers were removed. VM 116 now only runs Jellyfin.
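
For reference, a restart loop like this one can be spotted without tailing logs. The following is a minimal sketch, not the check that was actually run during the audit: the helper name `in_restart_loop` and the threshold of 3 restarts are assumptions chosen for illustration.

```shell
# Hypothetical helper: succeed if a container looks stuck in a restart loop.
# Assumption: more than 3 recorded restarts counts as "looping"; tune as needed.
in_restart_loop() {
  # docker inspect prints e.g. "restarting 57" (state, then restart count).
  info=$(docker inspect --format '{{.State.Status}} {{.RestartCount}}' "$1") || return 2
  status=${info%% *}   # text before the first space
  count=${info##* }    # text after the last space
  [ "$status" = "restarting" ] || [ "$count" -gt 3 ]
}
```

Usage would be something like `in_restart_loop watchstate && docker logs --tail 20 watchstate`, so the log tail is only pulled for containers that are actually cycling.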

Current State (post-cleanup)

jellyfin — running, healthy
watchstate — stopped (restart loop due to corrupt Caddy PKI cert)
3 dead containers removed: freetube (19 months), pihole (never started), xenodochial_agnesi (4 years)

Key Question: Is this watchstate instance needed?

Manticore also runs a watchstate container that is healthy and active. This VM 116 instance may be a stale duplicate from before services were migrated to manticore.

Tasks

Confirm manticore's watchstate is the canonical instance: `ssh manticore "docker logs --tail 20 watchstate"`, then verify it's syncing Jellyfin state
If manticore's instance is authoritative, remove the VM 116 watchstate container and its volumes: `docker rm watchstate && docker volume prune` (note that `docker volume prune` acts on every unused volume on the host, and on Docker 23.0 and later it skips named volumes unless `--all` is given, so a named watchstate volume may need an explicit `docker volume rm`)
If VM 116's instance was needed, fix the Caddy cert: likely delete `/data/caddy/pki/` inside the container volume and restart
Clean up unused Docker images on VM 116: `docker image prune -a -f`
Reassess VM 116's purpose — if only Jellyfin remains, consider whether Jellyfin should move to manticore (which already runs it) and VM 116 could be decommissioned entirely
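
The removal task above could be scripted so that only the volumes mounted by the target container are deleted, instead of pruning host-wide. This is a sketch under assumptions: the function name `remove_container_with_volumes` is hypothetical, and it assumes the container's volumes are not shared with anything that should be kept.

```shell
# Hypothetical helper: remove a stopped container plus the volumes it mounted.
# Assumption: none of those volumes are shared with a container we want to keep.
remove_container_with_volumes() {
  name=$1
  # Refuse to touch a container that is still running.
  state=$(docker inspect --format '{{.State.Status}}' "$name") || return 2
  if [ "$state" = "running" ]; then
    echo "refusing: $name is running" >&2
    return 1
  fi
  # Record volume-type mounts (bind mounts are skipped) before the container is gone.
  vols=$(docker inspect --format \
    '{{range .Mounts}}{{if eq .Type "volume"}}{{.Name}} {{end}}{{end}}' "$name")
  docker rm "$name" || return 2
  # Remove each recorded volume; docker rejects any that are still in use.
  for v in $vols; do
    docker volume rm "$v"
  done
}
```

Run on VM 116 as `remove_container_with_volumes watchstate`, and only after manticore's instance has been confirmed as authoritative.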

Related

VM 110 (discord-bots) is also now empty after container cleanup, making it another decommission candidate
Both VMs together represent 8 vCPUs + 16 GB RAM that could be reclaimed

Labels

infra-audit, proxmox


cal added the infra-audit and proxmox labels 2026-04-03 01:14:06 +00:00

Claude added the ai-working label 2026-04-03 16:00:54 +00:00

cal referenced this issue from a commit

2026-04-03 16:04:40 +00:00

fix: clean up VM 116 watchstate duplicate and document decommission candidacy (#31)

Claude referenced a pull request that will close this issue

2026-04-03 16:04:51 +00:00

fix: clean up VM 116 watchstate duplicate and document decommission candidacy (#31) #41

Claude removed the ai-working label 2026-04-03 16:04:56 +00:00

Claude commented

2026-04-03 16:05:01 +00:00

Completed infrastructure cleanup and opened PR #41.

What was done:

Confirmed manticore's watchstate (v1.0.2) is the canonical instance — actively syncing Jellyfin state
Removed the stopped/broken watchstate container from VM 116 (stale duplicate with corrupt Caddy PKI cert)
Pruned 5 orphan images (watchstate, freetube, pihole, hello-world) → 3.36 GB reclaimed
VM 116 now runs only Jellyfin (container up 8 days, healthy)
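
The end state above can be re-checked later with a small guard. This is an illustrative sketch rather than the command used for this cleanup, and the helper name `only_running` is an assumption.

```shell
# Hypothetical helper: succeed only if the running containers are exactly
# the space-separated, alphabetically sorted names given in $1.
only_running() {
  # List running container names, one per line, then join them with spaces.
  running=$(docker ps --format '{{.Names}}' | sort | tr '\n' ' ')
  [ "$running" = "$1 " ]
}
```

On VM 116 this would be called as `only_running jellyfin`; `docker system df` can be run alongside it to see how much image space remains reclaimable.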

Docs updated: server-configs/hosts.yml (added VM 116 as decommission candidate) and vm-management/proxmox-upgrades/proxmox-7-to-9-upgrade-plan.md (status updated from Stopped/Investigate → Decommission Candidate).

Next: VM 116 full decommission (and VM 110) can be separate issues following the decommission runbook.


Claude added the ai-pr-opened label 2026-04-03 16:05:04 +00:00

cal referenced this issue from a commit

2026-04-03 20:01:16 +00:00

fix: clean up VM 116 watchstate duplicate and document decommission candidacy (#31)

cal closed this issue

2026-04-03 20:01:28 +00:00

cal referenced this issue from a commit

2026-04-03 20:01:29 +00:00

Merge pull request 'fix: clean up VM 116 watchstate duplicate and document decommission candidacy (#31)' (#41) from issue/31-vm-116-resolve-watchstate-duplicate-and-clean-up-r into main
