---
title: "Ansible Controller LXC Setup"
description: "Complete setup guide for LXC 304 (ansible-controller) at 10.10.0.232: automated OS/Docker updates with Proxmox snapshot rollback across all VMs, LXCs, and physical servers."
type: guide
domain: vm-management
tags: [ansible, proxmox, lxc, automation, updates, snapshots, rollback, systemd]
---

# Ansible Controller LXC Setup

Centralized Ansible controller for automated infrastructure updates with Proxmox snapshot-based rollback.

## LXC Details

- **VMID**: 304
- **Hostname**: ansible-controller
- **IP**: 10.10.0.232
- **SSH alias**: `ansible-controller` or `ansible`
- **OS**: Ubuntu 24.04
- **Resources**: 2 cores, 2GB RAM, 16GB disk
- **Ansible version**: 2.20.4 (from PPA)
- **Collections**: community.general, community.docker (bundled)
- **User**: `cal` runs playbooks, SSH key at `/home/cal/.ssh/homelab_rsa`

## Directory Layout

```
/opt/ansible/
├── ansible.cfg              # Main config (pipelining, forks=5)
├── inventory/
│   └── hosts.yml            # Full infrastructure inventory
├── playbooks/
│   ├── update-all.yml       # Full cycle: snapshot → OS → Docker → health → cleanup
│   ├── os-update-only.yml   # OS packages only (lighter)
│   ├── rollback.yml         # Roll back any host to a snapshot
│   └── check-status.yml     # Read-only health/status check
├── run-update.sh            # Runner script with logging
├── roles/                   # (empty, for future use)
└── logs/                    # Update run logs (12-week retention)
```

## Managed Hosts (15 total)

### Proxmox Host

| Host | IP | User |
|------|----|------|
| pve-node | 10.10.0.11 | root |

### VMs

| Host | IP | VMID | User | Python |
|------|-----|------|------|--------|
| docker-home | 10.10.0.16 | 106 | cal | 3.9 |
| discord-bots | 10.10.0.33 | 110 | cal | 3.9 |
| databases-bots | 10.10.0.42 | 112 | cal | 3.9 |
| docker-sba | 10.10.0.88 | 115 | cal | 3.9 |
| docker-home-servers | 10.10.0.124 | 116 | cal | 3.9 |

### LXCs

| Host | IP | VMID | User | Python |
|------|-----|------|------|--------|
| docker-n8n-lxc | 10.10.0.210 | 210 | root | 3.9 |
| arr-stack | 10.10.0.221 | 221 | root | 3.9 |
| memos | 10.10.0.222 | 222 | root | 3.9 |
| foundry-lxc | 10.10.0.223 | 223 | root | 3.9 |
| gitea | 10.10.0.225 | 225 | root | 3.9 |
| uptime-kuma | 10.10.0.227 | 227 | root | 3.10 |
| claude-discord-coordinator | 10.10.0.230 | 301 | root | 3.12 |
| claude-runner | 10.10.0.148 | 302 | root | 3.12 |

### Physical

| Host | IP | User | Python |
|------|----|------|--------|
| ubuntu-manticore | 10.10.0.226 | cal | 3.12 |

### Excluded

- **Home Assistant** (VM 109): self-managed via HA Supervisor
- **Palworld** (LXC 230): deleted 2026-03-25, which resolved the IP collision with LXC 301

## Usage

SSH to the controller and run as `cal`:

```bash
ssh ansible
export ANSIBLE_CONFIG=/opt/ansible/ansible.cfg

# Check status of everything
ansible-playbook /opt/ansible/playbooks/check-status.yml

# Full update cycle (snapshot → update → health check → cleanup)
ansible-playbook /opt/ansible/playbooks/update-all.yml

# Update a specific group or a single host
ansible-playbook /opt/ansible/playbooks/update-all.yml --limit lxcs
ansible-playbook /opt/ansible/playbooks/update-all.yml --limit docker-home

# Dry run
ansible-playbook /opt/ansible/playbooks/update-all.yml --check

# OS updates only (no Docker)
ansible-playbook /opt/ansible/playbooks/os-update-only.yml

# Skip snapshots
ansible-playbook /opt/ansible/playbooks/update-all.yml -e skip_snapshot=true

# Roll back a host to latest snapshot
ansible-playbook /opt/ansible/playbooks/rollback.yml --limit gitea

# Roll back to specific snapshot
ansible-playbook /opt/ansible/playbooks/rollback.yml --limit gitea -e snapshot=pre-update-2026-03-25
```

## Update Pipeline (update-all.yml)

1. **Snapshot**: Creates a `pre-update-YYYY-MM-DD` snapshot on each Proxmox guest via `pvesh`
2. **OS Update**: Refreshes the apt package cache, runs a safe upgrade, then autoremoves unused packages (serial: 3)
3. **Docker Update**: Finds compose files, pulls images, restarts changed stacks (serial: 1)
4. **Health Check**: SSH ping, disk space warning (>89% used), exited container report
5. **Snapshot Cleanup**: Keeps the last 3 `pre-update-*` snapshots per host

## Scheduled Runs

A systemd timer runs every **Sunday at 3:00 AM UTC** with up to 10 min of jitter. `Persistent=true` ensures missed runs execute on next boot.

```bash
# Check timer status
ssh ansible "systemctl status ansible-update.timer"

# View last run
ssh ansible "systemctl status ansible-update.service"

# View logs
ssh ansible "ls -lt /opt/ansible/logs/ | head -5"
ssh ansible "journalctl -u ansible-update.service --no-pager -n 50"
```

## Inventory Groups

- `proxmox_host`: just pve-node
- `vms`: all QEMU VMs
- `lxcs`: all LXC containers
- `physical`: bare-metal servers (manticore)
- `docker_hosts`: any host running Docker compose stacks
- `proxmox_guests`: union of vms + lxcs (snapshotable)

## Adding a New Host

1. Add an entry to `/opt/ansible/inventory/hosts.yml` under the appropriate group
2. Include `ansible_host` and `ansible_user`, plus `proxmox_vmid` and `proxmox_type` for Proxmox guests
3. Set `ansible_python_interpreter` if the host's default Python is older than 3.9
4. Ensure the SSH key (`/home/cal/.ssh/homelab_rsa`) is authorized on the target
5. For VMs: ensure NOPASSWD sudo for the `cal` user
6. Test: `ansible <new-host> -m ping` (replace `<new-host>` with the inventory name of the host)

## Setup Prerequisites Fixed During Initial Deployment

- **Python 3.9** installed via deadsnakes PPA on all Ubuntu 20.04 hosts (Ansible 2.20 requires ≥3.9)
- **NOPASSWD sudo** set via `/etc/sudoers.d/cal` on all VMs and manticore
- **qemu-guest-agent** enabled on VM 112 (databases-bots)
- **VM 116 disk** expanded from 31GB→315GB (was 100% full), DNS fixed (missing resolv.conf)
- **IP collision** between LXC 230 (palworld) and LXC 301 (claude-discord-coordinator) resolved by deleting palworld
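
The Python ≥3.9 prerequisite above is easy to get wrong with a plain string comparison, since `3.10` sorts before `3.9` lexically. A minimal sketch of a numeric check follows; the `version_ok` helper is hypothetical and not part of the controller's scripts, and the sample versions are the ones listed in the host tables plus `3.8` as a failing case.

```bash
# Hypothetical helper (not part of /opt/ansible): succeeds when a
# "major.minor[.patch]" version string meets the 3.9 minimum.
# The body runs in a subshell so its variables do not leak.
version_ok() (
  major=${1%%.*}     # text before the first dot
  rest=${1#*.}       # text after the first dot
  minor=${rest%%.*}  # minor component, ignoring any patch level
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]; }
)

# Compare numerically so that 3.10 is correctly treated as newer than 3.9
for v in 3.8 3.9 3.10 3.12; do
  if version_ok "$v"; then
    echo "$v ok"
  else
    echo "$v too old"
  fi
done
```

On a real target, the version string could be collected over SSH (for example from `python3 --version`) before adding the host to the inventory; how that is wired in is left open here.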