Networking Infrastructure Troubleshooting Guide
SSH Connection Issues
SSH Authentication Failures
Symptoms: Permission denied, connection refused, timeout
Diagnosis:
# Verbose SSH debugging
ssh -vvv user@host
# Test each authentication method in isolation
ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password user@host # Password only
ssh -o PasswordAuthentication=no -i ~/.ssh/homelab_rsa user@host # Key only
# Check local key files
ls -la ~/.ssh/
ssh-keygen -lf ~/.ssh/homelab_rsa.pub
Solutions:
# Re-deploy SSH keys
ssh-copy-id -i ~/.ssh/homelab_rsa.pub user@host
ssh-copy-id -i ~/.ssh/emergency_homelab_rsa.pub user@host
# Fix key permissions
chmod 600 ~/.ssh/homelab_rsa
chmod 644 ~/.ssh/homelab_rsa.pub
chmod 700 ~/.ssh
# Verify remote authorized_keys
ssh user@host 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
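The permission fixes above can be checked before retrying the connection. The following is a minimal sketch, assuming GNU `stat` (for `-c '%a'`); `check_ssh_perms` and `ssh_mode_ok` are hypothetical helper names, not OpenSSH tools. It reports any file whose mode differs from the values set with `chmod` above.

```shell
# ssh_mode_ok <path> <expected-mode>: hypothetical helper (assumes GNU stat).
# Prints a warning and fails when the octal mode differs from the expected one.
ssh_mode_ok() {
    local mode
    mode=$(stat -c '%a' "$1") || return 1
    if [ "$mode" != "$2" ]; then
        echo "WARN: $1 is $mode, expected $2"
        return 1
    fi
}

# check_ssh_perms [dir]: hypothetical helper, defaults to ~/.ssh.
# Expects directory 700, private keys and authorized_keys 600, public keys 644.
check_ssh_perms() {
    local dir="${1:-$HOME/.ssh}" rc=0 pub
    ssh_mode_ok "$dir" 700 || rc=1
    for pub in "$dir"/*.pub; do
        [ -f "$pub" ] || continue    # glob did not match any public key
        ssh_mode_ok "$pub" 644 || rc=1
        # The private key is assumed to share the name without the .pub suffix
        if [ -f "${pub%.pub}" ]; then
            ssh_mode_ok "${pub%.pub}" 600 || rc=1
        fi
    done
    if [ -f "$dir/authorized_keys" ]; then
        ssh_mode_ok "$dir/authorized_keys" 600 || rc=1
    fi
    return "$rc"
}
```

Running `check_ssh_perms` with no argument checks `~/.ssh`; a non-zero exit status means at least one warning was printed.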
SSH Service Issues
Symptoms: Connection refused, service not running
Diagnosis:
# Check SSH service status
systemctl status sshd
ss -tlnp | grep :22
# Test port connectivity
nc -zv host 22
nmap -p 22 host
Solutions:
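# Restart SSH service (the unit is named "ssh" on Debian/Ubuntu, "sshd" on RHEL/Fedora/Arch)
sudo systemctl restart sshd
sudo systemctl enable sshd # On Debian/Ubuntu use: sudo systemctl enable ssh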
# Check firewall
sudo ufw status
sudo ufw allow ssh
# Verify SSH configuration
sudo sshd -T | grep -E "(passwordauth|pubkeyauth|permitroot)"
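The effective settings printed by `sshd -T` can also be checked automatically. This is a small sketch, assuming the lowercase `keyword value` output format of `sshd -T`; `audit_sshd_settings` is a hypothetical helper name, and the three rules reflect the hardening targets used in this guide rather than a complete audit.

```shell
# audit_sshd_settings: hypothetical helper; reads `sshd -T` output on stdin
# and warns about settings that differ from this guide's hardening targets.
# Exit status is 1 when at least one warning was printed.
audit_sshd_settings() {
    awk '
        $1 == "passwordauthentication" && $2 != "no"  { print "WARN: password authentication is enabled"; bad = 1 }
        $1 == "permitrootlogin"        && $2 != "no"  { print "WARN: root login is permitted (" $2 ")"; bad = 1 }
        $1 == "pubkeyauthentication"   && $2 != "yes" { print "WARN: public key authentication is disabled"; bad = 1 }
        END { exit bad }
    '
}

# Usage:
#   sudo sshd -T | audit_sshd_settings
```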
Network Connectivity Problems
Basic Network Troubleshooting
Symptoms: Cannot reach hosts, timeouts, routing issues
Diagnosis:
# Basic connectivity tests
ping host
traceroute host
mtr host
# Check local network configuration
ip addr show
ip route show
cat /etc/resolv.conf
Solutions:
# Restart networking
sudo systemctl restart networking
sudo netplan apply # Ubuntu
# Reset network interface
sudo ip link set eth0 down
sudo ip link set eth0 up
# Restore the default gateway if `ip route show` lists none
sudo ip route add default via 10.10.0.1
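Adding a default route fails if one already exists, so it helps to check first. A minimal sketch, assuming the usual `default via <gateway> dev <interface>` line format of `ip route show`; `default_gateway` is a hypothetical helper name.

```shell
# default_gateway: hypothetical helper; reads `ip route show` output on stdin,
# prints the first default gateway address, and fails if there is none.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; found = 1; exit }
         END { exit !found }'
}

# Usage (10.10.0.1 is the gateway address used in this guide):
#   ip route show | default_gateway || sudo ip route add default via 10.10.0.1
```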
DNS Resolution Issues
Symptoms: Cannot resolve hostnames, slow resolution
Diagnosis:
# Test DNS resolution
nslookup google.com
dig google.com
host google.com
# Check DNS servers
resolvectl status # systemd-resolve --status on older systemd releases
cat /etc/resolv.conf
Solutions:
# Temporary DNS fix (overwrites the file; systemd-resolved or DHCP may regenerate it)
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
# Restart DNS services
sudo systemctl restart systemd-resolved
# Flush DNS cache
sudo resolvectl flush-caches # systemd-resolve --flush-caches on older systemd releases
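On hosts running systemd-resolved, `/etc/resolv.conf` often lists only the local stub at 127.0.0.53, which hides the real upstream servers. The sketch below lists the configured nameservers and marks the stub; `resolv_nameservers` is a hypothetical helper name.

```shell
# resolv_nameservers [file]: hypothetical helper; prints each nameserver in a
# resolv.conf-style file and notes when it is the systemd-resolved stub.
resolv_nameservers() {
    awk '$1 == "nameserver" {
            note = ($2 == "127.0.0.53") ? " (systemd-resolved stub; see resolvectl status)" : ""
            print $2 note
         }' "${1:-/etc/resolv.conf}"
}

# Usage:
#   resolv_nameservers
```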
Reverse Proxy and Load Balancer Issues
Nginx Configuration Problems
Symptoms: 502 Bad Gateway, 503 Service Unavailable, SSL errors
Diagnosis:
# Check Nginx status and logs
systemctl status nginx
sudo tail -f /var/log/nginx/error.log
sudo tail -f /var/log/nginx/access.log
# Test Nginx configuration
sudo nginx -t
sudo nginx -T # Show full configuration
Solutions:
# Reload Nginx configuration
sudo nginx -s reload
# Check upstream servers
curl -I http://backend-server:port
telnet backend-server port
# Fix common configuration issues
sudo nano /etc/nginx/sites-available/default
# Check proxy_pass URLs, upstream definitions
SSL/TLS Certificate Issues
Symptoms: Certificate warnings, expired certificates, connection errors
Diagnosis:
# Check certificate validity
openssl s_client -connect host:443 -servername host </dev/null # </dev/null closes the session after the handshake
openssl x509 -in /etc/ssl/certs/cert.pem -text -noout
# Check certificate expiry
openssl x509 -in /etc/ssl/certs/cert.pem -noout -dates
Solutions:
# Renew Let's Encrypt certificates
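sudo certbot renew --dry-run # Test renewal without replacing certificates
sudo certbot renew --force-renewal # Renew regardless of expiry; counts against rate limits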
# Generate self-signed certificate
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout /etc/ssl/private/selfsigned.key \
-out /etc/ssl/certs/selfsigned.crt
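The expiry check can be turned into a yes/no test with `openssl x509 -checkend`, which exits 0 when the certificate is still valid the given number of seconds from now. A minimal sketch; `cert_expires_within` is a hypothetical helper name and the 30-day window in the usage line is only an example.

```shell
# cert_expires_within <cert.pem> <days>: hypothetical helper.
# Returns 0 when the certificate expires (or has expired) within <days> days,
# 1 when it stays valid beyond that window, 2 when the file cannot be read.
cert_expires_within() {
    local cert="$1" days="$2"
    [ -r "$cert" ] || return 2
    # -checkend N exits 0 if the certificate is still valid N seconds from now
    if openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null; then
        return 1
    fi
    return 0
}

# Usage:
#   cert_expires_within /etc/ssl/certs/cert.pem 30 && sudo certbot renew
```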
Network Storage Issues
CIFS/SMB Mount Problems
Symptoms: Mount failures, connection timeouts, permission errors
Diagnosis:
# Test SMB connectivity
smbclient -L //nas-server -U username
testparm # Validate smb.conf (run on the Samba server, not the client)
# Check mount status
mount | grep cifs
df -h | grep cifs
Solutions:
# Remount with verbose output (a password on the command line is visible in shell history; prefer credentials=)
sudo mount -v -t cifs //server/share /mnt/point -o username=user,password=pass,vers=3.0
# Fix mount options in /etc/fstab
//server/share /mnt/point cifs credentials=/etc/cifs/credentials,uid=1000,gid=1000,iocharset=utf8,file_mode=0644,dir_mode=0755,cache=strict,_netdev 0 0
# Inspect the credentials file (prints the password to the terminal)
sudo cat /etc/cifs/credentials
# Should contain: username=, password=, domain=
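The credentials file can be validated without printing the password. This is a sketch under stated assumptions: GNU `stat` is available, and mode 600 is the expected permission (an assumption of this example; any mode that keeps other users out would do). `check_cifs_credentials` is a hypothetical helper name.

```shell
# check_cifs_credentials <file>: hypothetical helper (assumes GNU stat).
# Verifies that username= and password= lines exist and that the file mode
# is 600, without echoing any of the values.
check_cifs_credentials() {
    local file="$1" rc=0 key mode
    [ -r "$file" ] || { echo "ERROR: cannot read $file"; return 2; }
    for key in username password; do
        if ! grep -q "^${key}=" "$file"; then
            echo "WARN: missing ${key}= line"
            rc=1
        fi
    done
    mode=$(stat -c '%a' "$file")
    if [ "$mode" != "600" ]; then
        echo "WARN: mode is $mode, expected 600"
        rc=1
    fi
    return "$rc"
}

# Usage (run as root, since the file should not be readable by other users):
#   check_cifs_credentials /etc/cifs/credentials
```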
NFS Mount Issues
Symptoms: Stale file handles, mount hangs, permission denied
Diagnosis:
# Check NFS services
systemctl status nfs-client.target
showmount -e nfs-server
# Test NFS connectivity
rpcinfo -p nfs-server
Solutions:
# Restart NFS services
sudo systemctl restart nfs-client.target
# Remount NFS shares
sudo umount /mnt/nfs-share
sudo mount -t nfs server:/path /mnt/nfs-share
# Fix stale file handles
sudo umount -f /mnt/nfs-share
sudo mount /mnt/nfs-share
Firewall and Security Issues
Port Access Problems
Symptoms: Connection refused, filtered ports, blocked services
Diagnosis:
# Check firewall status
sudo ufw status verbose
sudo iptables -L -n -v
# Test port accessibility
nc -zv host port
nmap -p port host
Solutions:
# Open required ports
sudo ufw allow ssh
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow from 10.10.0.0/24
# Reset firewall if needed (removes all rules; re-allow SSH before enabling to avoid a lockout)
sudo ufw --force reset
sudo ufw allow ssh
sudo ufw enable
Network Security Issues
Symptoms: Unauthorized access, suspicious traffic, security alerts
Diagnosis:
# Check active connections
ss -tuln
netstat -tuln
# Review logs for security events
sudo tail -f /var/log/auth.log
sudo tail -f /var/log/syslog | grep -i security
Solutions:
# Block suspicious IPs
sudo ufw deny from suspicious-ip
# Update SSH security
sudo nano /etc/ssh/sshd_config
# Set: PasswordAuthentication no, PermitRootLogin no
sudo sshd -t # Validate the configuration before restarting
sudo systemctl restart sshd
Service Discovery and DNS Issues
Local DNS Problems
Symptoms: Services unreachable by hostname, DNS timeouts
Diagnosis:
# Test local DNS resolution
nslookup service.homelab.local
dig @10.10.0.16 service.homelab.local
# Check DNS server status
systemctl status bind9 # or named
Solutions:
# Add to /etc/hosts as temporary fix
echo "10.10.0.100 service.homelab.local" | sudo tee -a /etc/hosts
# Restart DNS services
sudo systemctl restart bind9
sudo systemctl restart systemd-resolved
Container Networking Issues
Symptoms: Containers cannot communicate, service discovery fails
Diagnosis:
# Check Docker networks
docker network ls
docker network inspect bridge
# Test container connectivity (name resolution works only on user-defined networks, not the default bridge)
docker exec container1 ping container2
docker exec container1 nslookup container2
Solutions:
# Create custom network
docker network create --driver bridge app-network
docker run --network app-network container
# Fix DNS in containers
docker run --dns 8.8.8.8 container
Performance Issues
Network Latency Problems
Symptoms: Slow response times, timeouts, poor performance
Diagnosis:
# Measure network latency
ping -c 100 host
mtr --report host
# Check network interface stats
ip -s link show
cat /proc/net/dev
Solutions:
# Optimize network settings
echo 'net.core.rmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# Check for network congestion
iftop
nethogs
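Appending with `tee -a` adds a duplicate line every time the tuning commands are re-run. The sketch below replaces an existing entry instead; it assumes GNU `sed` (for `-i`) and simple numeric values, and `set_conf_value` is a hypothetical helper name.

```shell
# set_conf_value <key> <value> <file>: hypothetical helper (assumes GNU sed).
# Sets "key = value" in a sysctl-style file, replacing an existing entry
# instead of appending a duplicate.
set_conf_value() {
    local key="$1" value="$2" file="$3" pattern
    # Escape dots so the key matches literally in the regular expressions
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        echo "${key} = ${value}" >> "$file"
    fi
}

# Usage (in a root shell, then apply with: sysctl -p):
#   set_conf_value net.core.rmem_max 134217728 /etc/sysctl.conf
#   set_conf_value net.core.wmem_max 134217728 /etc/sysctl.conf
```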
Bandwidth Issues
Symptoms: Slow transfers, network congestion, dropped packets
Diagnosis:
# Test bandwidth
iperf3 -s # Server
iperf3 -c server-ip # Client
# Check interface utilization
vnstat -i eth0
Solutions:
# Implement QoS if needed
sudo tc qdisc add dev eth0 root fq_codel
# Optimize buffer sizes (check the supported maximums first with: ethtool -g eth0)
sudo ethtool -G eth0 rx 4096 tx 4096
Emergency Recovery Procedures
Network Emergency Recovery
Complete network failure recovery:
# Reset all network configuration (run from a local console; this drops SSH sessions)
sudo systemctl stop networking
sudo ip addr flush dev eth0
sudo ip route flush table main
sudo systemctl start networking
# Manual network configuration
sudo ip addr add 10.10.0.100/24 dev eth0
sudo ip route add default via 10.10.0.1
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
SSH Emergency Access
When locked out of systems:
# Use emergency SSH key
ssh -i ~/.ssh/emergency_homelab_rsa user@host
# Via console access (if available)
# Use hypervisor console or physical access
# Reset SSH to allow password auth temporarily
sudo sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
sudo systemctl restart sshd
Service Recovery
Critical service restoration:
# Restart all network services
sudo systemctl restart networking
sudo systemctl restart nginx
sudo systemctl restart sshd
# Emergency firewall disable
sudo ufw disable # CAUTION: Only for troubleshooting
# Service-specific recovery
sudo systemctl restart docker
sudo systemctl restart systemd-resolved
Monitoring and Prevention
Network Health Monitoring
#!/bin/bash
# network-monitor.sh
CRITICAL_HOSTS="10.10.0.1 10.10.0.16 nas.homelab.local"
CRITICAL_SERVICES="https://homelab.local http://proxmox.homelab.local:8006"
# Log an alert for every host that does not answer one ping within 5 seconds
for host in $CRITICAL_HOSTS; do
    if ! ping -c1 -W5 "$host" >/dev/null 2>&1; then
        echo "ALERT: $host unreachable" | logger -t network-monitor
    fi
done
# Log an alert for every URL that fails or returns an HTTP error within 10 seconds
for service in $CRITICAL_SERVICES; do
    if ! curl -sSf --max-time 10 "$service" >/dev/null 2>&1; then
        echo "ALERT: $service unavailable" | logger -t network-monitor
    fi
done
Automated Recovery Scripts
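#!/bin/bash
# network-recovery.sh
# Restart networking when the connectivity probe (one ping to 8.8.8.8) fails
if ! ping -c1 -W5 8.8.8.8 >/dev/null 2>&1; then
    echo "Network down, attempting recovery..."
    sudo systemctl restart networking
    sleep 10
    # Re-test after giving the interface time to come back up
    if ping -c1 -W5 8.8.8.8 >/dev/null 2>&1; then
        echo "Network recovered"
    else
        echo "Manual intervention required"
    fi
fi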
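A single lost ping can trigger an unnecessary restart, so the probe can be retried before acting. The following is a minimal sketch of a generic retry wrapper; `retry` is a hypothetical helper name, and the attempt count and delay in the usage line are illustrative values only.

```shell
# retry <attempts> <delay-seconds> <command...>: hypothetical helper.
# Runs the command until it succeeds or the attempt limit is reached,
# sleeping between failed attempts.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # every attempt failed
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage:
#   retry 3 10 ping -c1 -W5 8.8.8.8 || sudo systemctl restart networking
```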
Quick Reference Commands
Network Diagnostics
# Connectivity tests
ping host
traceroute host
mtr host
nc -zv host port
# Service checks
systemctl status networking
systemctl status nginx
systemctl status sshd
# Network configuration
ip addr show
ip route show
ss -tuln
Emergency Commands
# Network restart
sudo systemctl restart networking
# SSH emergency access
ssh -i ~/.ssh/emergency_homelab_rsa user@host
# Firewall quick disable (emergency only)
sudo ufw disable
# DNS quick fix
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
This guide covers diagnosis and recovery steps for common networking issues in home lab environments.