claude-home/vm-management/CONTEXT.md
Cal Corum 93d6093d45
docs: add Ansible controller LXC setup guide and update VM context
New KB doc covering LXC 304 (ansible-controller) at 10.10.0.232 with
full inventory, update playbooks, snapshot rollback, and systemd timer.
Updated CONTEXT.md to reference the new controller.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 22:26:55 -05:00


---
title: "VM Management Overview"
description: "Technology context for Proxmox VM management including host details (PVE 8.4.16), IaC patterns with cloud-init, security architecture, resource sizing standards, and lifecycle workflows."
type: context
domain: vm-management
tags: [proxmox, vm, cloud-init, docker, ssh, infrastructure-as-code]
---
# Virtual Machine Management - Technology Context
## Overview
Virtual machine management for home lab environments, with a focus on automated provisioning, infrastructure as code, and security-first configuration. This context covers VM lifecycle management, Proxmox integration, and standardized deployment patterns.
## Proxmox Host
- **Version**: PVE 8.4.16 (upgraded from 7.4-20 on 2026-02-19)
- **Kernel**: 6.8.12-18-pve
- **IP**: 10.10.0.11
- **SSH**: `ssh -i ~/.ssh/homelab_rsa root@10.10.0.11`
- **Storage**: local (100GB dir), local-lvm (2.3TB thin), home-truenas (17TB CIFS at 10.10.0.35)
- **Networking**: vmbr0 (10.10.0.x/24 via eno1), vmbr1 (10.0.0.x/24 via eno2, Matter/IoT)
- **Ansible Controller**: LXC 304 at 10.10.0.232 — automated updates with snapshot rollback, weekly systemd timer (Sun 3 AM). See `ansible-controller-setup.md`
- **Upgrade plan**: Phase 2 (PVE 8→9) pending — see `proxmox-upgrades/proxmox-7-to-9-upgrade-plan.md`
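
The version listed above can also be checked from a script, for example to confirm whether the pending 8→9 phase has been applied. This is a minimal sketch, assuming `pveversion` prints a line shaped like `pve-manager/<version>/<hash> (running kernel: ...)`; the function name `pve_major` is hypothetical and not part of this repo.

```bash
# Extract the major PVE version from a `pveversion`-style line.
# Assumed input shape: pve-manager/8.4.16/<hash> (running kernel: ...)
pve_major() {
    # Keep only the digits between "pve-manager/" and the first dot.
    printf '%s\n' "$1" | sed -n 's|^pve-manager/\([0-9][0-9]*\)\..*|\1|p'
}
```

On the host this would be called as `pve_major "$(pveversion)"`. Input that does not match the assumed shape produces no output.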
## Architecture Patterns
### Infrastructure as Code (IaC) Approach
**Pattern**: Declarative VM configuration with repeatable deployments
```yaml
#cloud-config
# Cloud-init template pattern (the "#cloud-config" header must be the first line)
users:
  - name: cal
    groups: [sudo, docker]
    ssh_authorized_keys:
      - ssh-rsa AAAAB3... primary-key
      - ssh-rsa AAAAB3... emergency-key
packages:
  - docker.io
  - docker-compose
runcmd:
  - systemctl enable docker
  - usermod -aG docker cal
```
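
cloud-init only treats user data as cloud-config when the very first line is `#cloud-config`, so a template that starts with another comment is silently ignored. This is a small pre-flight sketch that checks the header before a template is attached to a VM. The function name and the `user-data.yaml` file name are hypothetical.

```bash
# Return 0 when the first line of the given user-data file is "#cloud-config".
has_cloud_config_header() {
    first_line=$(head -n 1 "$1") || return 1
    [ "$first_line" = "#cloud-config" ]
}
```

Usage would look like `has_cloud_config_header user-data.yaml || echo "missing #cloud-config header"`. Recent cloud-init releases also provide `cloud-init schema --config-file <file>` for fuller validation, if it is available on the machine doing the check.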
### Template-Based Deployment Strategy
**Pattern**: Standardized VM templates with cloud-init automation
- **Base Templates**: Ubuntu Server with cloud-init support
- **Resource Allocation**: Standardized sizing (2 vCPU / 4 GB RAM / 20 GB disk baseline)
- **Network Configuration**: Predefined VLAN assignments (10.10.0.x internal)
- **Security Hardening**: SSH keys only, password auth disabled
## Provisioning Strategies
### Cloud-Init Deployment (Recommended for New VMs)
**Purpose**: Fully automated VM provisioning from first boot
**Implementation**:
1. Create VM in Proxmox with cloud-init support
2. Apply standardized cloud-init template
3. VM configures itself automatically on first boot
4. No manual intervention required
**Benefits**:
- Zero-touch deployment
- Consistent configuration
- Security hardening from first boot
- Immediate productivity
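
The four implementation steps could map onto Proxmox `qm` commands roughly as follows. This is a dry-run sketch that only prints the commands. The template VMID `9000`, the gateway `10.10.0.1`, the `scsi0` disk name, and the key file `/root/keys.pub` are all assumptions for illustration, not values taken from this environment.

```bash
# Print (not run) the qm commands for cloning a cloud-init template.
# Assumptions: template VMID 9000, gateway 10.10.0.1, disk on scsi0,
# and a public key file at /root/keys.pub are all hypothetical.
plan_vm() {
    vmid=$1 name=$2 ip=$3
    echo "qm clone 9000 $vmid --name $name --full"
    echo "qm set $vmid --cores 2 --memory 4096"
    echo "qm resize $vmid scsi0 20G"
    echo "qm set $vmid --ciuser cal --sshkeys /root/keys.pub"
    echo "qm set $vmid --ipconfig0 ip=$ip/24,gw=10.10.0.1"
    echo "qm start $vmid"
}

# Review the plan first; pipe it to sh on the Proxmox host to apply it.
plan_vm 120 docker-test 10.10.0.120
```

The sketch uses Proxmox's built-in cloud-init options only. The `packages` and `runcmd` sections of a full template would need a custom user-data snippet attached with `--cicustom`, which is left out here.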
### Post-Install Scripting (Existing VMs)
**Purpose**: Standardize existing VM configurations
**Implementation**:
```bash
./vm-post-install.sh <vm-ip> [username]
# Automated: updates, SSH keys, Docker, hardening
```
**Use Cases**:
- Legacy VM standardization
- Imported VM configuration
- Recovery and remediation
- Incremental improvements
## Security Architecture
### SSH Key-Based Authentication
**Pattern**: Dual key deployment for security and redundancy
```bash
# Primary access key
~/.ssh/homelab_rsa # Daily operations
# Emergency access key
~/.ssh/emergency_homelab_rsa # Backup/recovery access
```
**Security Controls**:
- Password authentication completely disabled
- Root login prohibited
- SSH keys managed centrally
- Automatic key deployment
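
The first two controls can be verified against the effective daemon configuration rather than the config file. This sketch assumes the input comes from `sshd -T`, which prints one lowercase `keyword value` pair per line. The function name `ssh_hardened` is hypothetical.

```bash
# Read effective sshd settings (as printed by `sshd -T`) on stdin and
# succeed only if password auth and root login are both disabled.
ssh_hardened() {
    cfg=$(cat)
    printf '%s\n' "$cfg" | grep -qix 'passwordauthentication no' || return 1
    printf '%s\n' "$cfg" | grep -qix 'permitrootlogin no' || return 1
}
```

On a VM this would be run as `sudo sshd -T | ssh_hardened && echo "ssh hardened"`. A stricter or looser root-login policy (for example `prohibit-password`) would need the second pattern adjusted.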
### User Privilege Management
**Pattern**: Least privilege with sudo elevation
```yaml
# User configuration
username: cal
groups: [sudo, docker] # Minimal required groups
shell: /bin/bash
sudo: ALL=(ALL) NOPASSWD:ALL # Operational convenience
```
**Access Controls**:
- Non-root user accounts only
- Sudo required for administrative tasks
- Docker group for container management
- SSH key authentication mandatory
### Network Security
**Pattern**: Network segmentation and access control
- **Internal Network**: 10.10.0.x/24 for VM communication
- **Management Access**: SSH (port 22) only
- **Service Isolation**: Application-specific port exposure
- **Firewall Ready**: iptables/ufw configuration prepared
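
One possible `ufw` baseline for the points above, shown as a dry-run sketch that only prints the rules. Restricting SSH to the 10.10.0.0/24 subnet is an assumption layered on top of "SSH (port 22) only", and the function name `ufw_plan` is hypothetical.

```bash
# Print ufw commands that allow SSH from the internal network plus any
# extra service ports given as arguments (e.g. 80/tcp 443/tcp).
ufw_plan() {
    echo "ufw default deny incoming"
    echo "ufw default allow outgoing"
    echo "ufw allow from 10.10.0.0/24 to any port 22 proto tcp"
    for port in "$@"; do
        echo "ufw allow $port"
    done
    echo "ufw --force enable"
}

# Example: a VM that also exposes HTTP and HTTPS.
ufw_plan 80/tcp 443/tcp
```

Printing the rules first makes it easy to review them before piping the output to `sudo sh` on the VM.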
## Lifecycle Management Patterns
### VM Creation Workflow
1. **Template Selection**: Choose appropriate base image
2. **Resource Allocation**: Size based on workload requirements
3. **Network Assignment**: VLAN and IP address planning
4. **Cloud-Init Configuration**: Apply standardized template
5. **Automated Provisioning**: Zero-touch deployment
6. **Verification**: Automated connectivity and configuration tests
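
The verification step usually has to wait for the VM to finish booting before SSH answers. Below is a generic retry helper as a sketch. The name `retry` and its argument order are illustrative choices, not an existing script in this repo.

```bash
# Run a command until it succeeds, up to $1 attempts, sleeping $2 seconds
# between attempts. Returns 0 on first success, 1 if every attempt fails.
retry() {
    attempts=$1 delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        "$@" && return 0
        n=$((n + 1))
        [ "$n" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

A connectivity test could then be `retry 30 5 ssh -i ~/.ssh/homelab_rsa -o BatchMode=yes -o ConnectTimeout=5 cal@<vm-ip> true`. Once SSH is reachable, `cloud-init status --wait` on the VM blocks until first-boot provisioning has finished.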
### Configuration Management
**Pattern**: Standardized system configuration
```yaml
# Essential packages
packages: [
"curl", "wget", "git", "vim", "htop", "unzip",
"docker.io", "docker-compose-plugin"
]
# System services
runcmd:
- systemctl enable docker
- systemctl enable ssh
- systemctl enable unattended-upgrades
```
### Maintenance Automation
**Pattern**: Automated updates and maintenance
- **Security Updates**: Automatic installation enabled
- **Package Management**: Standardized package selection
- **Service Management**: Consistent service configuration
- **Log Management**: Centralized logging ready
## Resource Management
### Sizing Standards
**Pattern**: Standardized VM resource allocation
```yaml
# Basic workload (web services, small databases)
vcpus: 2
memory: 4096 # 4GB
disk: 20 # 20GB
# Medium workload (application servers, medium databases)
vcpus: 4
memory: 8192 # 8GB
disk: 40 # 40GB
# Heavy workload (transcoding, large databases)
vcpus: 6
memory: 16384 # 16GB
disk: 100 # 100GB
```
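
The sizing table can be turned into a lookup so provisioning scripts do not hard-code numbers. This is a sketch only. The tier names `basic`, `medium`, and `heavy` are taken from the comments above, and the function name `size_for` is hypothetical.

```bash
# Map a workload tier to "vcpus memory_mb disk_gb" using the sizing table.
size_for() {
    case $1 in
        basic)  echo "2 4096 20" ;;
        medium) echo "4 8192 40" ;;
        heavy)  echo "6 16384 100" ;;
        *)      echo "unknown tier: $1" >&2; return 1 ;;
    esac
}
```

A caller can split the result with `set -- $(size_for medium)` and then pass `$1` and `$2` to `qm set <vmid> --cores ... --memory ...`.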
### Storage Strategy
**Pattern**: Application-appropriate storage allocation
- **System Disk**: OS and applications (20-40GB)
- **Data Volumes**: Application data (variable)
- **Backup Storage**: Network-attached for persistence
- **Cache Storage**: Local fast storage for performance
### Network Planning
**Pattern**: Structured network addressing
```yaml
# Network segments
management: 10.10.0.x/24 # VM management and SSH access
services: 10.10.1.x/24 # Application services
storage: 10.10.2.x/24 # Storage and backup traffic
dmz: 10.10.10.x/24 # External-facing services
```
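
To show how the addressing plan can be used in scripts, the sketch below classifies an IPv4 address by segment. It matches on the dotted prefix only and does not validate the last octet. The function name `segment_of` is hypothetical.

```bash
# Classify an IPv4 address into one of the planned network segments.
segment_of() {
    case $1 in
        10.10.0.*)  echo management ;;
        10.10.1.*)  echo services ;;
        10.10.2.*)  echo storage ;;
        10.10.10.*) echo dmz ;;
        *)          echo unknown; return 1 ;;
    esac
}
```

For example, `segment_of 10.10.0.232` classifies the Ansible controller's address as part of the management segment.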
## Monitoring and Operations
### Health Monitoring
**Pattern**: Automated system health checks
```yaml
# Resource monitoring
cpu_usage: <80%
memory_usage: <90%
disk_usage: <85%
network_connectivity: verified
# Service monitoring
ssh_service: active
docker_service: active
unattended_upgrades: active
```
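
The thresholds above can be checked with a few lines of shell. This sketch covers only the disk threshold and uses the POSIX `df -P` output format. The function names `below` and `disk_pct` are hypothetical.

```bash
# Succeed when an integer percentage is strictly below its limit.
below() {
    [ "$1" -lt "$2" ]
}

# Root filesystem usage as an integer percentage (POSIX df output format).
disk_pct() {
    df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

if below "$(disk_pct)" 85; then
    echo "disk_usage: ok"
else
    echo "disk_usage: above 85%"
fi
```

The service checks can follow the same pattern with `systemctl is-active --quiet docker` and `systemctl is-active --quiet ssh`, which exit non-zero when the unit is not running.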
### Backup Strategies
**Pattern**: Multi-tier backup approach
- **VM Snapshots**: Point-in-time recovery (Proxmox)
- **Application Data**: Specific application backup procedures
- **Configuration Backup**: Cloud-init templates and scripts
- **SSH Keys**: Centralized key management backup
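
For the snapshot tier, a dated naming scheme keeps snapshots easy to find and roll back. The sketch below prints the `qm snapshot` and `qm rollback` commands instead of running them. The character filter is a conservative assumption about what Proxmox accepts in snapshot names, and both function names are hypothetical.

```bash
# Build a snapshot name: a label plus a YYYYMMDD date stamp (today by default).
snap_name() {
    label=$1
    stamp=${2:-$(date +%Y%m%d)}
    # Keep only letters, digits, underscore and hyphen.
    printf '%s-%s\n' "$label" "$stamp" | tr -cd 'A-Za-z0-9_\n-'
}

# Print the snapshot and matching rollback commands for a VM.
snapshot_plan() {
    name=$(snap_name "$2" "$3")
    echo "qm snapshot $1 $name"
    echo "qm rollback $1 $name"
}

snapshot_plan 120 pre-update 20260325
```

Containers such as the Ansible controller use `pct snapshot` and `pct rollback` in place of the `qm` commands.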
### Performance Tuning
**Pattern**: Workload-optimized configuration
```yaml
# CPU optimization
cpu_type: host # Performance over compatibility
numa: enabled # NUMA awareness for multi-socket
# Memory optimization
ballooning: enabled # Dynamic memory allocation
hugepages: disabled # Unless specifically needed
# Storage optimization
cache: writethrough # Balance performance and safety
io_thread: enabled # Improve I/O performance
```
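
The settings above are descriptive names, not literal Proxmox option names. One way they might translate into `qm set` options is sketched below as a dry run that only prints the commands. The balloon target of 2048 MiB and the volume name `local-lvm:vm-120-disk-0` are assumptions, and `iothread=1` is paired with the `virtio-scsi-single` controller.

```bash
# Print qm commands that apply the tuning settings to one VM.
# Assumptions: balloon target of 2048 MiB and the given volume on scsi0.
tuning_plan() {
    vmid=$1 volume=$2
    echo "qm set $vmid --cpu host --numa 1"
    echo "qm set $vmid --balloon 2048"
    echo "qm set $vmid --scsihw virtio-scsi-single"
    echo "qm set $vmid --scsi0 $volume,cache=writethrough,iothread=1"
}

tuning_plan 120 local-lvm:vm-120-disk-0
```

Compare the printed commands with `qm config <vmid>` before applying them, since re-specifying a disk line replaces its existing options.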
## Integration Patterns
### Container Platform Integration
**Pattern**: Docker-ready VM deployment
```yaml
# Automated Docker setup
- docker.io installation
- docker-compose plugin
- User added to docker group
- Service auto-start enabled
- Container runtime verified
```
### SSH Infrastructure Integration
**Pattern**: Centralized SSH key management
```yaml
# Key deployment automation
primary_key: ~/.ssh/homelab_rsa.pub
emergency_key: ~/.ssh/emergency_homelab_rsa.pub
backup_system: automated
rotation_policy: annual
```
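
The annual rotation policy can be checked mechanically. The sketch below compares two Unix timestamps. Using the key file's modification time as its creation date is an assumption, and the function names are hypothetical.

```bash
# Whole days between two Unix timestamps (seconds since the epoch).
days_between() {
    echo $(( ($2 - $1) / 86400 ))
}

# Succeed when a key created at $1 is older than 365 days at time $2.
rotation_due() {
    [ "$(days_between "$1" "$2")" -gt 365 ]
}
```

With GNU coreutils this could be called as `rotation_due "$(stat -c %Y ~/.ssh/homelab_rsa.pub)" "$(date +%s)" && echo "rotate homelab_rsa"`.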
### Network Services Integration
**Pattern**: Ready for service deployment
- **Reverse Proxy**: Nginx/Traefik ready configuration
- **DNS**: Local DNS registration prepared
- **Certificates**: Let's Encrypt integration ready
- **Monitoring**: Prometheus/Grafana agent ready
## Common Implementation Workflows
### New VM Deployment
1. **Create VM** in Proxmox with cloud-init support
2. **Configure resources** based on workload requirements
3. **Apply cloud-init template** with standardized configuration
4. **Start VM** and wait for automated provisioning
5. **Verify deployment** via SSH key authentication
6. **Deploy applications** using container or package management
### Existing VM Standardization
1. **Assess current configuration** and identify gaps
2. **Run post-install script** for automated updates
3. **Verify SSH key deployment** and confirm that password authentication is disabled
4. **Test Docker installation** and user permissions
5. **Update documentation** with new configuration
6. **Schedule regular maintenance** and monitoring
### VM Migration and Recovery
1. **Create VM snapshot** before changes
2. **Export VM configuration** and cloud-init template
3. **Test recovery procedure** in staging environment
4. **Document recovery steps** and verification procedures
5. **Implement backup automation** for critical VMs
## Best Practices
### Security Hardening
1. **SSH Keys Only**: Disable password authentication completely
2. **Emergency Access**: Deploy backup SSH keys for recovery
3. **User Separation**: Non-root users with sudo privileges
4. **Automatic Updates**: Enable security update automation
5. **Network Isolation**: Use VLANs and firewall rules
### Operational Excellence
1. **Infrastructure as Code**: Use cloud-init for reproducible deployments
2. **Standardization**: Consistent VM sizing and configuration
3. **Automation**: Minimize manual configuration steps
4. **Documentation**: Maintain deployment templates and procedures
5. **Testing**: Verify deployments before production use
### Performance Optimization
1. **Resource Right-Sizing**: Match resources to workload requirements
2. **Storage Strategy**: Use appropriate storage tiers
3. **Network Optimization**: Plan network topology for performance
4. **Monitoring**: Implement resource usage monitoring
5. **Capacity Planning**: Plan for growth and scaling

This technology context summarizes the patterns used to provision, secure, and maintain virtual machines on the Proxmox host, based on cloud-init IaC and SSH-key-only access.