claude-home/docker/CONTEXT.md

# Docker Container Technology - Technology Context

## Overview
Docker containerization for home lab environments, with a focus on performance optimization, GPU acceleration, and distributed workloads. This context covers container architecture patterns, security practices, and production deployment strategies.

## Architecture Patterns

### Container Design Principles
1. **Single Responsibility**: One service per container
2. **Immutable Infrastructure**: Treat containers as replaceable units
3. **Resource Isolation**: Use container limits and cgroups
4. **Security First**: Run as non-root, minimal attack surface
5. **Configuration Management**: Environment variables and external configs

### Multi-Stage Build Pattern
**Purpose**: Minimize production image size and attack surface
```dockerfile
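# Build stage: install production dependencies only
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Production stage: same Alpine base so native modules stay compatible
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
# Requires a .dockerignore that excludes node_modules
COPY . .
# UID 1000 is the built-in "node" user in the official image
USER 1000
EXPOSE 3000
CMD ["node", "server.js"]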
```

### Distributed Application Architecture
**Pattern**: Server-Node separation with specialized workloads

```
┌─────────────────┐    ┌──────────────────────────────────┐
│   Control Plane │    │        Worker Nodes              │
│                 │    │  ┌─────────┐ ┌─────────┐         │
│  - Web Interface│◄──►│  │ Node 1  │ │ Node 2  │ ...     │
│  - Job Queue    │    │  │ GPU+CPU │ │ GPU+CPU │         │
│  - Coordination │    │  │Local SSD│ │Local SSD│         │
│                 │    │  └─────────┘ └─────────┘         │
└─────────────────┘    └──────────────────────────────────┘
         │                              │
         └──────── Shared Storage ──────┘
              (NAS/SAN for persistence)
```

## Container Runtime Platforms

### Docker vs Podman Comparison
**Docker**: Traditional daemon-based approach
- Daemon runs as root by default (a rootless mode is available)
- Centralized container management
- Established ecosystem and tooling

**Podman** (Recommended for GPU workloads):
- Daemonless architecture
- GPU access through CDI device names (`nvidia.com/gpu=all`) with the NVIDIA Container Toolkit
- Rootless containers for enhanced security
- Direct systemd integration
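
Scripts that must work on hosts with either runtime can detect which one is installed. The following is a minimal sketch; the function name `pick_runtime` is hypothetical, and preferring Podman simply mirrors the recommendation above.

```bash
# Print the first container runtime found on PATH, preferring Podman.
# Prints "none" and returns 1 if neither is installed.
pick_runtime() {
    for rt in podman docker; do
        if command -v "$rt" >/dev/null 2>&1; then
            echo "$rt"
            return 0
        fi
    done
    echo "none"
    return 1
}

# Usage: run the same command with whichever runtime is present.
# RUNTIME=$(pick_runtime) && "$RUNTIME" ps
```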

### GPU Acceleration Support
**NVIDIA Container Toolkit Integration**:
```bash
# Podman GPU configuration (recommended)
podman run -d --name gpu-workload \
    --device nvidia.com/gpu=all \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    -e NVIDIA_VISIBLE_DEVICES=all \
    myapp:latest

# Docker GPU configuration
docker run -d --name gpu-workload \
    --gpus all \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    myapp:latest
```
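
The two commands differ only in how the GPU is requested, so a wrapper script can select the flag by runtime. This is a sketch with a hypothetical helper name (`gpu_flags`). It assumes the NVIDIA Container Toolkit is installed and, for Podman, that a CDI spec has already been generated (typically with `nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml`).

```bash
# Print the GPU flags for the given runtime, mirroring the two commands above.
gpu_flags() {
    case "$1" in
        podman) echo "--device nvidia.com/gpu=all" ;;   # CDI device name
        docker) echo "--gpus all" ;;
        *) echo "unknown runtime: $1" >&2; return 1 ;;
    esac
}

# Usage (word splitting of the flags is intentional):
# podman run --rm $(gpu_flags podman) myapp:latest nvidia-smi
```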

## Performance Optimization Patterns

### Hybrid Storage Strategy
**Pattern**: Balance performance and persistence for different data types

```yaml
volumes:
  # Local storage (SSD/NVMe) - High Performance
  - ./app/data:/app/data              # Database - frequent I/O
  - ./app/configs:/app/configs        # Config - startup performance
  - ./app/logs:/app/logs              # Logs - continuous writing
  - ./cache:/cache                    # Work directories - temp processing

  # Network storage (NAS) - Persistence & Backup
  - /mnt/nas/backups:/app/backups     # Backups - infrequent access
  - /mnt/nas/media:/media:ro          # Source data - read-only
```

**Benefits**:
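- **Local Operations**: Database I/O on local SSD/NVMe avoids network round trips, so latency is typically far lower than on NAS (the exact gain depends on the workload and should be measured)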
- **Network Reliability**: Critical data protected on redundant storage
- **Cost Optimization**: Expensive fast storage only where needed
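
A rough way to check the difference on a given host is to time a synced sequential write on each mount. The sketch below assumes GNU or BusyBox `dd` on Linux; `measure_write` is a hypothetical helper, and sequential zero writes only approximate real database I/O (filesystems with compression will report inflated numbers).

```bash
# Write size_mb megabytes of zeros into dir and print dd's throughput summary.
# conv=fsync forces data to disk so the page cache does not inflate the result.
measure_write() {
    local dir="$1" size_mb="${2:-256}"
    local tmp="$dir/.write-test.$$"
    # dd prints its summary on stderr; keep only the last line.
    dd if=/dev/zero of="$tmp" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$tmp"
}

# Usage: compare a local path with a NAS mount.
# measure_write ./app/data
# measure_write /mnt/nas/backups
```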

### Cache Optimization Hierarchy
```bash
# Performance tiers for different workload types
/dev/shm/cache/          # RAM disk - fastest, volatile, limited size
/mnt/nvme/cache/         # NVMe SSD - 3-7GB/s, persistent, recommended
/mnt/ssd/cache/          # SATA SSD - 500MB/s, good balance
/mnt/nas/cache/          # Network - 100MB/s, legacy compatibility
```
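
A startup script can walk these tiers in order and use the fastest one that is actually available. This is a minimal sketch; `pick_cache_dir` is a hypothetical helper and the paths are the ones listed above.

```bash
# Print the first candidate directory that exists and is writable.
# Candidates are given fastest-first; returns 1 if none qualifies.
pick_cache_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Usage with the tiers listed above:
# CACHE_DIR=$(pick_cache_dir /dev/shm/cache /mnt/nvme/cache /mnt/ssd/cache /mnt/nas/cache)
```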

### Resource Management
**Container Limits** (prevent resource exhaustion):
```yaml
deploy:
  resources:
    limits:
      memory: 8G
      cpus: '6'
    reservations:
      memory: 4G
      cpus: '2'
```
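
When a container is started with `podman run` or `docker run` instead of Compose, roughly the same limits can be passed as flags. The sketch below uses a hypothetical helper, `limit_flags`, with defaults taken from the YAML above. It leaves out the CPU reservation because this sketch assumes there is no directly equivalent `run` flag.

```bash
# Print run flags equivalent to the Compose limits above.
# Arguments: memory limit, CPU limit, memory reservation (all optional).
limit_flags() {
    local mem="${1:-8g}" cpus="${2:-6}" mem_res="${3:-4g}"
    echo "--memory $mem --cpus $cpus --memory-reservation $mem_res"
}

# Usage (word splitting of the flags is intentional):
# podman run -d $(limit_flags) myapp:latest
```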

**Networking Optimization**:
```yaml
# Option A: host networking for performance-critical applications
# (port mappings are ignored in this mode)
network_mode: host

# Option B: bridge networking with port mapping (default)
network_mode: bridge
ports:
  - "8080:8080"
```

## Security Patterns

### Container Hardening
```dockerfile
# Use minimal base images
FROM alpine:3.18

# Run as non-root user
RUN addgroup -g 1000 appuser && \
    adduser -u 1000 -G appuser -s /bin/sh -D appuser
USER 1000

# Set secure permissions
COPY --chown=appuser:appuser . /app
```

### Environment Security
```bash
# Secrets management (avoid environment variables for secrets)
podman secret create db_password password.txt
podman run --secret db_password myapp:latest

# Network isolation
podman network create --driver bridge isolated-net
podman run --network isolated-net myapp:latest
```
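
Inside the container, a secret created this way appears as a file, by default under `/run/secrets/<name>`. An entrypoint script might read it as sketched below. `read_secret` is a hypothetical helper, and the optional directory argument exists only so the function can be exercised outside a container.

```bash
# Print the contents of a mounted secret file.
# Second argument overrides the secrets directory (default: /run/secrets).
read_secret() {
    local name="$1" dir="${2:-/run/secrets}"
    if [ ! -r "$dir/$name" ]; then
        echo "secret not found: $name" >&2
        return 1
    fi
    cat "$dir/$name"
}

# Usage inside the container:
# DB_PASSWORD=$(read_secret db_password)
```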

### Image Security
1. **Vulnerability Scanning**: Regular image scans with tools like Trivy
2. **Version Pinning**: Use specific tags, avoid `latest`
3. **Minimal Images**: Distroless or Alpine base images
4. **Layer Optimization**: Minimize layers, combine RUN commands

## Development Workflows

### Local Development Pattern
```yaml
# docker-compose.dev.yml
version: "3.8"
services:
  app:
    build: .
    volumes:
      - .:/app              # Code hot-reload
      - /app/node_modules   # Preserve dependencies
    environment:
      - NODE_ENV=development
    ports:
      - "3000:3000"
```

### Production Deployment Pattern
```bash
# Production container with health checks
podman run -d --name production-app \
    --restart unless-stopped \
    --health-cmd="curl -f http://localhost:3000/health || exit 1" \
    --health-interval=30s \
    --health-timeout=10s \
    --health-retries=3 \
    -p 3000:3000 \
    myapp:v1.2.3
```

## Monitoring and Observability

### Health Check Implementation
```dockerfile
# Application health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1
```
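
Deployment scripts often need to block until a container reports healthy. The sketch below polls an arbitrary status command, so it does not depend on one runtime. `wait_healthy` is a hypothetical helper, and the inspect template in the usage comment is an assumption, since the health field name has differed between runtime versions.

```bash
# Poll a status command until it prints "healthy".
# Usage: wait_healthy <retries> <delay_seconds> <command...>
wait_healthy() {
    local retries="$1" delay="$2" status="" i=0
    shift 2
    while [ "$i" -lt "$retries" ]; do
        # A failing status command counts as "not healthy yet".
        status=$("$@" 2>/dev/null) || status=""
        if [ "$status" = "healthy" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not healthy after $retries checks (last status: ${status:-unknown})" >&2
    return 1
}

# Example (template is an assumption; verify it against your runtime version):
# wait_healthy 10 3 podman inspect --format '{{.State.Health.Status}}' production-app
```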

### Log Management
```bash
# File-based logging with size-based rotation (k8s-file driver)
podman run -d --name app \
    --log-driver k8s-file \
    --log-opt max-size=10m \
    myapp:latest

# Alternative: --log-driver journald hands retention and rotation to journald

# Centralized logging
podman logs -f app | logger -t myapp
```

### Resource Monitoring
```bash
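# One-shot snapshot of container metrics (omit --no-stream for a live view)
podman stats --no-stream app

# Current memory usage in bytes (cgroup v2; on cgroup v1 read
# /sys/fs/cgroup/memory/memory.usage_in_bytes instead)
podman exec app cat /sys/fs/cgroup/memory.current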
```
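
The memory accounting file lives at a different path under cgroup v1 and cgroup v2, so a script that runs inside containers on mixed hosts can check both. This is a sketch: `cgroup_mem_bytes` is a hypothetical helper, and the optional root argument is there only to make it testable.

```bash
# Print current memory usage in bytes from the cgroup filesystem.
# The optional argument overrides the cgroup root (default: /sys/fs/cgroup).
cgroup_mem_bytes() {
    local root="${1:-/sys/fs/cgroup}"
    if [ -r "$root/memory.current" ]; then
        cat "$root/memory.current"                    # cgroup v2
    elif [ -r "$root/memory/memory.usage_in_bytes" ]; then
        cat "$root/memory/memory.usage_in_bytes"      # cgroup v1
    else
        echo "no memory controller found under $root" >&2
        return 1
    fi
}

# Usage inside a container:
# cgroup_mem_bytes
```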

## Common Implementation Patterns

### Database Containers
```yaml
# Persistent database with backup strategy
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - postgres_data:/var/lib/postgresql/data  # Persistent data
      - ./backups:/backups                      # Backup mount
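    secrets:
      - db_password

# Top-level declarations required by the references above
volumes:
  postgres_data:

secrets:
  db_password:
    file: ./db_password.txt   # assumed path; keep this file out of version control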
```
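
The `/backups` mount only helps if dumps are written to it and old ones are rotated out. Below is a minimal retention sketch. `prune_backups` and the timestamped file naming are assumptions, not part of the Compose file, and the dump command in the comment assumes the service, user, and database names shown above.

```bash
# Keep the newest <keep> dumps in <dir> and delete the rest.
# Relies on timestamped names (e.g. myapp-20240131-020000.sql.gz) so that
# lexical order matches chronological order.
prune_backups() {
    local dir="$1" keep="$2" total excess
    total=$(ls -1 "$dir"/*.sql.gz 2>/dev/null | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    # Oldest files sort first, so delete from the head of the list.
    ls -1 "$dir"/*.sql.gz | sort | head -n "$excess" | while IFS= read -r f; do
        rm -f -- "$f"
    done
}

# Example dump step followed by rotation:
# docker compose exec -T postgres pg_dump -U appuser myapp | gzip \
#     > "./backups/myapp-$(date +%Y%m%d-%H%M%S).sql.gz"
# prune_backups ./backups 7
```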

### Web Application Containers
```yaml
# Multi-tier web application
services:
  frontend:
    image: nginx:alpine
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "80:80"
      - "443:443"
    depends_on:
      - backend

  backend:
    build: ./api
    environment:
      - DATABASE_URL=postgresql://appuser@postgres/myapp
    depends_on:
      - postgres   # the service from the Database Containers example
```

### GPU-Accelerated Workloads
```bash
# GPU transcoding/processing container
podman run -d --name gpu-processor \
    --device nvidia.com/gpu=all \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,video \
    -v "/fast-storage:/cache" \
    -v "/media:/input:ro" \
    -v "/output:/output" \
    gpu-app:latest
```

## Best Practices

### Production Deployment
1. **Use specific image tags**: Never use `latest` in production
2. **Implement health checks**: Application and infrastructure monitoring
3. **Resource limits**: Prevent resource exhaustion
4. **Backup strategy**: Regular backups of persistent data
5. **Security scanning**: Regular vulnerability assessments

### Development Guidelines
1. **Multi-stage builds**: Separate build and runtime environments
2. **Environment parity**: Keep dev/staging/prod similar
3. **Configuration externalization**: Use environment variables and secrets
4. **Dependency management**: Pin versions, use lock files
5. **Testing strategy**: Unit, integration, and container tests

### Operational Excellence
1. **Log aggregation**: Centralized logging strategy
2. **Metrics collection**: Application and infrastructure metrics
3. **Alerting**: Proactive monitoring and alerting
4. **Documentation**: Container documentation and runbooks
5. **Disaster recovery**: Backup and recovery procedures

## Migration Patterns

### Legacy Application Containerization
1. **Assessment**: Identify dependencies and requirements
2. **Dockerfile creation**: Start with appropriate base image
3. **Configuration externalization**: Move configs to environment variables
4. **Data persistence**: Identify and volume mount data directories
5. **Testing**: Validate functionality in containerized environment

### Platform Migration (Docker to Podman)
```bash
# Export Docker container configuration
docker inspect mycontainer > container-config.json

# Convert to Podman run command
podman run -d --name mycontainer \
    --memory 4g \
    --cpus 2 \
    -v /host/path:/container/path \
    myimage:tag
```
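
The resource flags in the Podman command can be derived from the inspected configuration rather than retyped. The sketch below converts the `HostConfig.Memory` (bytes) and `HostConfig.NanoCpus` values reported by `docker inspect` into run flags. `inspect_to_flags` is a hypothetical helper, and containers limited through other mechanisms (for example CPU quota and period) will report 0 for these fields.

```bash
# Convert HostConfig values from `docker inspect` into run flags.
# Arguments: memory limit in bytes, CPU limit in nano-CPUs (0 means unlimited).
inspect_to_flags() {
    local mem_bytes="$1" nano_cpus="$2" flags=""
    if [ "$mem_bytes" -gt 0 ]; then
        flags="--memory ${mem_bytes}b"
    fi
    if [ "$nano_cpus" -gt 0 ]; then
        # 1 CPU = 1,000,000,000 nano-CPUs
        flags="$flags --cpus $(awk -v n="$nano_cpus" 'BEGIN { printf "%g", n / 1e9 }')"
    fi
    # Unquoted on purpose: trims the leading space when only --cpus is set.
    echo $flags
}

# Usage:
# inspect_to_flags $(docker inspect \
#     --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}' mycontainer)
```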

This technology context provides comprehensive guidance for implementing Docker containerization strategies in home lab and production environments.