claude-home/networking/pihole-ha-setup.md
Cal Corum 6c8d199359 Add Pi-hole HA documentation and networking updates
Add dual Pi-hole high availability setup guide, deployment notes, and
disk optimization docs. Update NPM + Pi-hole sync script and docs.
Add UniFi DNS firewall troubleshooting and networking scripts CONTEXT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 22:19:56 -06:00


# Pi-hole High Availability Setup
## Architecture Overview
This homelab uses a dual Pi-hole setup for DNS high availability and ad blocking across the network.
```
┌─────────────────────────────────────────────────────────────┐
│                      UniFi DHCP Server                      │
│     DNS1: 10.10.0.16            DNS2: 10.10.0.226           │
└────────────┬────────────────────────────┬───────────────────┘
             │                            │
             ▼                            ▼
    ┌────────────────┐          ┌────────────────┐
    │  npm-pihole    │          │  ubuntu-       │
    │  10.10.0.16    │◄────────►│  manticore     │
    │                │  Orbital │  10.10.0.226   │
    │  - NPM         │   Sync   │                │
    │  - Pi-hole 1   │          │  - Jellyfin    │
    │    (Primary)   │          │  - Tdarr       │
    └────────────────┘          │  - Pi-hole 2   │
             ▲                  │    (Secondary) │
             │                  └────────────────┘
    ┌────────────────┐
    │  NPM DNS Sync  │
    │  (hourly cron) │
    │                │
    │  Syncs proxy   │
    │  hosts to both │
    │  Pi-holes      │
    └────────────────┘
```
## Components
### Primary Pi-hole (npm-pihole)
- **Host**: npm-pihole LXC (10.10.0.16)
- **Web UI**: http://10.10.0.16/admin
- **Role**: Primary DNS server, receives NPM proxy host sync
- **Upstream DNS**: Google DNS (8.8.8.8, 8.8.4.4)
### Secondary Pi-hole (ubuntu-manticore)
- **Host**: ubuntu-manticore physical server (10.10.0.226)
- **Web UI**: http://10.10.0.226:8053/admin
- **Role**: Secondary DNS server, failover
- **Upstream DNS**: Google DNS (8.8.8.8, 8.8.4.4)
- **Port**: 8053 (web UI) to avoid conflict with Jellyfin on 8096
### Orbital Sync
- **Host**: ubuntu-manticore (co-located with secondary Pi-hole)
- **Function**: Synchronizes blocklists, whitelists, and custom DNS entries from primary to secondary
- **Sync Interval**: 5 minutes
- **Method**: Pi-hole Teleporter API (official backup/restore)
### NPM DNS Sync
- **Host**: npm-pihole (cron job)
- **Function**: Syncs NPM proxy host entries to both Pi-holes' custom.list
- **Schedule**: Hourly
- **Script**: `server-configs/networking/scripts/npm-pihole-sync.sh`
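As a rough illustration of what the NPM sync produces: `custom.list` holds one `IP hostname` pair per line, so each proxy host becomes a line pointing at the NPM address. The helper below is a hypothetical sketch, not the contents of `npm-pihole-sync.sh`; the function name and the use of 10.10.0.16 as the target IP are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit custom.list lines ("IP hostname") for a set of
# proxy hosts. Assumes every proxy host should resolve to the same address.
make_custom_list() {
  local target_ip=$1; shift
  local host
  for host in "$@"; do
    printf '%s %s\n' "$target_ip" "$host"
  done | sort -u   # de-duplicate so repeated runs produce the same output
}

# Example: make_custom_list 10.10.0.16 git.manticorum.com test.manticorum.com
```

Sorting and de-duplicating the output keeps the generated list stable between runs, which makes it easier to diff against what is already on each Pi-hole.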
## Failover Behavior
### How Client DNS Failover Works
1. **Normal operation**: Clients query DNS1 (10.10.0.16 - primary)
2. **Primary failure**: If primary doesn't respond, client automatically queries DNS2 (10.10.0.226 - secondary)
3. **Primary recovery**: Client preference returns to DNS1 when it's available again
### Failover Timing
- **Detection**: 2-5 seconds (client OS dependent)
- **Fallback**: Immediate query to secondary DNS
- **Impact**: Users typically see no interruption
### Load Distribution
- Most clients prefer DNS1 (primary) by default
- Some clients may round-robin between DNS1 and DNS2
- Either server can answer any query, so load is shared regardless of which one a client picks
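The ordered failover described above can be approximated from a shell for testing. This is only a sketch of the idea: real client resolvers differ by OS, and `LOOKUP_CMD` is a hypothetical hook added so the logic can be exercised without network access.

```bash
#!/usr/bin/env bash
# Sketch of ordered resolver failover, similar in spirit to a client trying
# DNS1 and then DNS2. LOOKUP_CMD is a hypothetical override for testing.

# Query one server with a short timeout and a single attempt.
dig_lookup() {
  dig +short +time=2 +tries=1 "@$1" "$2" A
}

# Try each server in order; print the first answer line and which server gave it.
resolve_with_failover() {
  local name=$1; shift
  local server answer
  for server in "$@"; do
    # dig writes timeout notices as ";;" lines on stdout, so filter them out.
    answer=$("${LOOKUP_CMD:-dig_lookup}" "$server" "$name" 2>/dev/null | grep -v '^;' | head -n 1)
    if [ -n "$answer" ]; then
      printf '%s via %s\n' "$answer" "$server"
      return 0
    fi
  done
  return 1
}

# Example: resolve_with_failover google.com 10.10.0.16 10.10.0.226
```

With the primary stopped, the example call should report an answer via 10.10.0.226 after roughly the 2-second timeout set by `+time=2`.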
## Benefits Over Previous Setup
### Before (Single Pi-hole + Cloudflare Fallback)
- ❌ Single point of failure (Pi-hole down = DNS down)
- ❌ iOS devices preferred public DNS (1.1.1.1), bypassing local DNS overrides
- ❌ 403 errors accessing internal services (git.manticorum.com) due to NPM ACL restrictions
- ❌ No ad blocking when fallback DNS was used
### After (Dual Pi-hole HA)
- ✅ True high availability across separate physical hosts
- ✅ DNS survives single host failure
- ✅ All devices use Pi-hole for consistent ad blocking
- ✅ Local DNS overrides work on all devices (iOS, Android, desktop)
- ✅ No 403 errors on internal services
- ✅ Automatic synchronization of blocklists and custom DNS entries
## Deployment Locations
### Configuration Files
```
server-configs/
├── ubuntu-manticore/
│   └── docker-compose/
│       ├── pihole/
│       │   ├── docker-compose.yml
│       │   ├── .env.example
│       │   ├── config/       # Pi-hole persistent data
│       │   └── dnsmasq/      # dnsmasq configuration
│       └── orbital-sync/
│           ├── docker-compose.yml
│           └── .env.example
└── networking/
    └── scripts/
        └── npm-pihole-sync.sh   # Enhanced for dual Pi-hole support
```
### Runtime Locations
```
ubuntu-manticore:
  ~/docker/pihole/           # Secondary Pi-hole
  ~/docker/orbital-sync/     # Synchronization service
npm-pihole:
  /path/to/pihole/           # Primary Pi-hole (existing)
  /path/to/npm-sync-cron/    # NPM → Pi-hole sync script
```
## Initial Setup Steps
### 1. Deploy Secondary Pi-hole (ubuntu-manticore)
```bash
# SSH to ubuntu-manticore
ssh ubuntu-manticore
# Create directory structure
mkdir -p ~/docker/pihole ~/docker/orbital-sync
# Copy docker-compose.yml from repository
# (Assume server-configs is synced to host)
# Create .env file with strong password
cd ~/docker/pihole
echo "WEBPASSWORD=$(openssl rand -base64 32)" > .env
echo "TZ=America/Chicago" >> .env
# Start Pi-hole
docker compose up -d
# Monitor startup
docker logs pihole -f
```
**Note on Pi-hole v6 Upgrades**: If upgrading from v5 to v6, blocklists are not automatically migrated. The v5 database is backed up as `gravity.db.v5.backup`. To restore blocklists, access the web UI and manually add them via Settings → Adlists (multiple lists can be added by comma-separating URLs).
### 2. Configure Secondary Pi-hole
```bash
# Access web UI: http://10.10.0.226:8053/admin
# Login with password from .env file
# Settings → DNS:
# - Upstream DNS: Google DNS (8.8.8.8, 8.8.4.4)
# - Enable DNSSEC
# - Interface listening behavior: Listen on all interfaces
# Settings → Privacy:
# - Query logging: Enabled
# - Privacy level: Show everything (for troubleshooting)
# Test DNS resolution
dig @10.10.0.226 google.com
dig @10.10.0.226 doubleclick.net # Should be blocked
```
### 3. Generate App Passwords (Pi-hole v6)
**Important**: Pi-hole v6 uses app passwords instead of API tokens for authentication.
```bash
# Primary Pi-hole (10.10.0.16:81)
# 1. Login to http://10.10.0.16:81/admin
# 2. Navigate to: Settings → Web Interface / API → Advanced Settings
# 3. Click "Configure app password"
# 4. Copy the generated app password
# 5. Store in: ~/.claude/secrets/pihole1_app_password
# Secondary Pi-hole (10.10.0.226:8053)
# 1. Login to http://10.10.0.226:8053/admin
# 2. Navigate to: Settings → Web Interface / API → Advanced Settings
# 3. Click "Configure app password"
# 4. Copy the generated app password
# 5. Store in: ~/.claude/secrets/pihole2_app_password
```
### 4. Deploy Orbital Sync
```bash
# SSH to ubuntu-manticore
cd ~/docker/orbital-sync
# Create .env file with app passwords from step 3
cat > .env << EOF
PRIMARY_HOST_PASSWORD=$(cat ~/.claude/secrets/pihole1_app_password)
SECONDARY_HOST_PASSWORD=$(cat ~/.claude/secrets/pihole2_app_password)
EOF
# Start Orbital Sync
docker compose up -d
# Monitor initial sync
docker logs orbital-sync -f
# Expected output on success:
# "✓ Signed in to http://10.10.0.16:81/admin"
# "✓ Signed in to http://127.0.0.1:8053/admin"
# "✓ Sync completed successfully"
```
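If either secrets file is missing when the `.env` heredoc runs, the corresponding variable is written with an empty value and Orbital Sync will fail to sign in. A small pre-flight check along these lines can catch that before starting the container; the function name is hypothetical and the check only looks for non-empty values, not valid ones.

```bash
#!/usr/bin/env bash
# Sketch: fail early if an expected key is missing or empty in a .env file.
check_env_keys() {
  local env_file=$1; shift
  local key status=0
  for key in "$@"; do
    # Match "KEY=<at least one character>" at the start of a line.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  return "$status"
}

# Example:
# check_env_keys ~/docker/orbital-sync/.env PRIMARY_HOST_PASSWORD SECONDARY_HOST_PASSWORD
```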
### 5. Update NPM DNS Sync Script
The script at `server-configs/networking/scripts/npm-pihole-sync.sh` has been enhanced to sync to both Pi-holes:
```bash
# Test the updated script
ssh npm-pihole "/path/to/npm-pihole-sync.sh --dry-run"
# Verify both Pi-holes receive entries
ssh npm-pihole "docker exec pihole cat /etc/pihole/custom.list | grep git.manticorum.com"
ssh ubuntu-manticore "docker exec pihole cat /etc/pihole/custom.list | grep git.manticorum.com"
```
### 6. Update UniFi DHCP Configuration
```
1. Access UniFi Network Controller
2. Navigate to: Settings → Networks → LAN → DHCP
3. Set DHCP DNS Server: Manual
4. DNS Server 1: 10.10.0.16 (primary Pi-hole)
5. DNS Server 2: 10.10.0.226 (secondary Pi-hole)
6. Remove any public DNS servers (1.1.1.1, etc.)
7. Save and apply
```
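**Note**: Clients pick up the new DNS servers at their next DHCP lease renewal (typically within 24 hours). To renew manually:
- Windows: `ipconfig /release && ipconfig /renew`
- Linux: `sudo dhclient -r && sudo dhclient` (on systems that use dhclient), or reconnect to the network
- macOS: `sudo ipconfig set en0 DHCP` (assuming the active interface is `en0`), or reconnect to WiFi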
- iOS/Android: Forget network and reconnect
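After a renewal, a Linux or macOS client can be checked for leftover public resolvers by inspecting its resolver file. The sketch below assumes a `resolv.conf`-style file; on hosts running a local stub resolver (for example systemd-resolved) the file may list only a loopback address, so treat the result as a hint. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: print any nameserver in a resolv.conf-style file that is not one
# of the two Pi-holes. No output means only the Pi-holes are configured.
unexpected_nameservers() {
  local resolv_file=$1
  awk '$1 == "nameserver" && $2 != "10.10.0.16" && $2 != "10.10.0.226" { print $2 }' "$resolv_file"
}

# Example: unexpected_nameservers /etc/resolv.conf
```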
## Testing Procedures
### DNS Resolution Tests
```bash
# Test both Pi-holes respond
dig @10.10.0.16 google.com
dig @10.10.0.226 google.com
# Test ad blocking works on both
dig @10.10.0.16 doubleclick.net
dig @10.10.0.226 doubleclick.net
# Test custom DNS entries (from NPM sync)
dig @10.10.0.16 git.manticorum.com
dig @10.10.0.226 git.manticorum.com
```
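To make the ad-blocking checks easier to read at a glance, the `dig` answers can be classified. This sketch assumes Pi-hole's default blocking mode, which answers blocked names with `0.0.0.0` (or `::` for AAAA); other blocking modes would need different patterns. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: classify a single `dig +short` answer line.
classify_answer() {
  case "$1" in
    "")          echo "no-answer" ;;
    0.0.0.0|::)  echo "blocked" ;;
    *)           echo "resolved" ;;
  esac
}

# Example loop over both Pi-holes (requires dig and network access):
# for server in 10.10.0.16 10.10.0.226; do
#   answer=$(dig +short +time=2 +tries=1 "@$server" doubleclick.net A | grep -v '^;' | head -n 1)
#   echo "$server: $(classify_answer "$answer")"
# done
```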
### Failover Tests
```bash
# Test 1: Primary Pi-hole failure
ssh npm-pihole "docker stop pihole"
dig google.com # Should still resolve via secondary
ssh npm-pihole "docker start pihole"
# Test 2: Secondary Pi-hole failure
ssh ubuntu-manticore "docker stop pihole"
dig google.com # Should still resolve via primary
ssh ubuntu-manticore "docker start pihole"
# Test 3: iOS device access to internal services
# From iPhone, access: https://git.manticorum.com
# Expected: 200 OK (no 403 errors)
# NPM logs should show local IP (10.10.0.x) not public IP
```
### Orbital Sync Validation
```bash
# Add test blocklist to primary Pi-hole
# Web UI → Adlists → Add: https://example.com/blocklist.txt
# Wait 5 minutes for sync
# Check secondary Pi-hole
# Web UI → Adlists → Should see same blocklist
# Check sync logs
ssh ubuntu-manticore "docker logs orbital-sync --tail 50"
```
### NPM DNS Sync Validation
```bash
# Add new NPM proxy host (e.g., test.manticorum.com)
# Wait for hourly cron sync
# Verify both Pi-holes have the entry
ssh npm-pihole "docker exec pihole cat /etc/pihole/custom.list | grep test.manticorum.com"
ssh ubuntu-manticore "docker exec pihole cat /etc/pihole/custom.list | grep test.manticorum.com"
# Test DNS resolution
dig test.manticorum.com
```
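A plain `grep test.manticorum.com` also matches longer names such as `mytest.manticorum.com`, and treats each `.` as a wildcard. Where an exact match matters, a field comparison is stricter. This is a sketch with a hypothetical function name; it assumes the `IP hostname` line format of `custom.list`.

```bash
#!/usr/bin/env bash
# Sketch: exact hostname match against custom.list-style input on stdin.
# Exits 0 if some line has the hostname as its second field, 1 otherwise.
has_custom_entry() {
  local host=$1
  awk -v host="$host" '$2 == host { found = 1 } END { exit !found }'
}

# Example:
# ssh ubuntu-manticore "docker exec pihole cat /etc/pihole/custom.list" \
#   | has_custom_entry test.manticorum.com && echo "present on secondary"
```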
## Monitoring
### Health Checks
```bash
# Check Pi-hole containers are running
ssh npm-pihole "docker ps | grep pihole"
ssh ubuntu-manticore "docker ps | grep pihole"
# Check Orbital Sync is running
ssh ubuntu-manticore "docker ps | grep orbital-sync"
# Check DNS response times
time dig @10.10.0.16 google.com
time dig @10.10.0.226 google.com
```
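`time dig` measures the whole process, including startup, whereas `dig` itself reports the DNS round trip on its `;; Query time:` line. A small filter can pull that number out for comparison between the two servers; the function name here is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: extract the reported latency (in ms) from full dig output on stdin.
# dig prints a line of the form ";; Query time: 12 msec".
query_time_ms() {
  awk '/^;; Query time:/ { print $4; exit }'
}

# Example:
# dig @10.10.0.16 google.com | query_time_ms
# dig @10.10.0.226 google.com | query_time_ms
```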
### Resource Usage
```bash
# Pi-hole typically uses <1% CPU and ~150MB RAM
ssh ubuntu-manticore "docker stats pihole --no-stream"
# Verify no impact on Jellyfin/Tdarr
ssh ubuntu-manticore "docker stats jellyfin tdarr --no-stream"
```
### Query Logs
- **Primary**: http://10.10.0.16/admin → Query Log
- **Secondary**: http://10.10.0.226:8053/admin → Query Log
- Look for balanced query distribution across both servers
## Troubleshooting
See `networking/troubleshooting.md` for detailed Pi-hole HA troubleshooting scenarios.
### Common Issues
**Issue**: Secondary Pi-hole not receiving queries
- Check UniFi DHCP settings (DNS2 should be 10.10.0.226)
- Force DHCP lease renewal on test client
- Verify Pi-hole is listening on port 53: `netstat -tulpn | grep :53`
**Issue**: Orbital Sync not syncing
- Check container logs: `docker logs orbital-sync`
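- Verify the app passwords are correct in `.env` (Pi-hole v6 uses app passwords, not API tokens)
- Test API access manually against the same base URL Orbital Sync signs in to: `curl -s -X POST http://10.10.0.16:81/api/auth -d '{"password":"<app_password>"}'` (a valid password returns a session `sid`)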
**Issue**: NPM domains not appearing in secondary Pi-hole
- Check npm-pihole-sync.sh script logs
- Verify SSH access from npm-pihole to ubuntu-manticore
- Manually trigger sync: `ssh npm-pihole "/path/to/npm-pihole-sync.sh"`
**Issue**: iOS devices still getting 403 on internal services
- Verify UniFi DHCP no longer has public DNS (1.1.1.1)
- Force iOS device to renew DHCP (forget network and reconnect)
- Check iOS DNS settings: Settings → WiFi → (i) → DNS (should show 10.10.0.16)
- Test DNS resolution for the device: use a DNS lookup app on iOS, or run `nslookup git.manticorum.com` from another client on the same network
## Maintenance
### Updating Pi-hole
```bash
# Primary Pi-hole
ssh npm-pihole "cd /path/to/pihole && docker compose pull && docker compose up -d"
# Secondary Pi-hole
ssh ubuntu-manticore "cd ~/docker/pihole && docker compose pull && docker compose up -d"
# Orbital Sync
ssh ubuntu-manticore "cd ~/docker/orbital-sync && docker compose pull && docker compose up -d"
```
### Backup and Recovery
```bash
# Pi-hole Teleporter backups (automatic via Orbital Sync)
# Manual backup from web UI: Settings → Teleporter → Backup
# Docker volume backup
# Archive relative to the home directory (-C ~) so the restore below puts files back in place
ssh ubuntu-manticore "tar -czf ~/pihole-backup-$(date +%Y%m%d).tar.gz -C ~ docker/pihole/config"
# Restore
ssh ubuntu-manticore "tar -xzf ~/pihole-backup-YYYYMMDD.tar.gz -C ~/"
```
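Archiving with a path relative to a base directory, and restoring with the same base, keeps the round trip predictable. The sketch below wraps that pattern in two hypothetical helpers; it assumes the user running it can read the Pi-hole config directory (container-owned files may need `sudo`).

```bash
#!/usr/bin/env bash
# Sketch: archive a directory relative to a base directory so that restoring
# with the same -C base puts files back where they came from.
backup_relative() {
  local base=$1 rel=$2 archive=$3
  tar -czf "$archive" -C "$base" "$rel"
}

# Extract an archive created by backup_relative under the given base.
restore_relative() {
  local base=$1 archive=$2
  tar -xzf "$archive" -C "$base"
}

# Example (paths as used in this guide):
# backup_relative "$HOME" docker/pihole/config "$HOME/pihole-backup-$(date +%Y%m%d).tar.gz"
# restore_relative "$HOME" "$HOME/pihole-backup-YYYYMMDD.tar.gz"
```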
## Performance Characteristics
### Expected Behavior
- **Query response time**: <50ms on LAN
- **CPU usage**: <1% per Pi-hole instance
- **RAM usage**: ~150MB per Pi-hole instance
- **Sync latency**: 5 minutes (Orbital Sync interval)
- **NPM sync latency**: Up to 1 hour (cron schedule)
### Capacity
- Both Pi-holes can easily handle 1000+ queries/minute
- No impact on ubuntu-manticore's Jellyfin/Tdarr GPU operations
- Orbital Sync overhead is negligible (<10MB RAM, <0.1% CPU)
## Related Documentation
- **NPM + Pi-hole Integration**: `server-configs/networking/nginx-proxy-manager-pihole.md`
- **Network Troubleshooting**: `networking/troubleshooting.md`
- **ubuntu-manticore Setup**: `media-servers/jellyfin-ubuntu-manticore.md`
- **Orbital Sync Documentation**: https://github.com/mattwebbio/orbital-sync