claude-home/reference/networking/ssh-troubleshooting.md
Cal Corum 704bad1547 CLAUDE: Add comprehensive SSH key management documentation
2025-08-08 21:02:46 -05:00


SSH Troubleshooting Reference

Common Configuration Issues

UseKeychain Compatibility Error

Error: Bad configuration option: usekeychain

Cause: UseKeychain yes is macOS-specific and not supported on Linux

Solution: Remove or comment out the line from SSH config

# UseKeychain yes  # macOS only - remove on Linux
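
When the same config is shared between macOS and Linux, an alternative to deleting the line is to place `IgnoreUnknown UseKeychain` above it, so Linux clients skip the option instead of failing. The sketch below flags configs that set UseKeychain without such a guard. The function name is hypothetical, and it does not verify that the guard appears before the option, which OpenSSH requires.

```shell
# Hypothetical helper: succeeds when a config file sets UseKeychain
# and has no IgnoreUnknown line that mentions it.
needs_usekeychain_guard() {
    config=$1
    # No active (uncommented) UseKeychain line: nothing to guard.
    grep -Eiq '^[[:space:]]*UseKeychain([[:space:]]|=)' "$config" || return 1
    # Succeed only if no IgnoreUnknown line covers UseKeychain.
    ! grep -Eiq '^[[:space:]]*IgnoreUnknown([[:space:]]|=).*UseKeychain' "$config"
}

# Example (not run here):
#   needs_usekeychain_guard ~/.ssh/config && echo "add: IgnoreUnknown UseKeychain"
```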

Port Forwarding Conflicts

Error: bind [127.0.0.1]:8080: Address already in use

Cause: Local port already in use by another service

Solutions:

  1. Remove LocalForward line from SSH config
  2. Change to different port: LocalForward 8081 localhost:80
  3. Find the conflicting service: sudo ss -tulpn | grep :8080 (or sudo netstat -tulpn | grep :8080 where net-tools is still installed)
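
If the forward is still needed, picking the next unused local port can be scripted. This is a minimal sketch: `first_free_port` is a hypothetical helper that only compares numbers, and the `ss` pipeline in the comment is one assumed way to produce the list of listening ports.

```shell
# Hypothetical helper: read listening ports (one per line) on stdin and
# print the first port at or above the given start that is not in the list.
first_free_port() {
    port=$1
    used=$(cat)
    while printf '%s\n' "$used" | grep -qx -- "$port"; do
        port=$((port + 1))
    done
    printf '%s\n' "$port"
}

# Example (not run here): feed it the TCP listeners reported by ss.
#   ss -Htln | awk '{print $4}' | sed 's/.*://' | first_free_port 8080
```

The printed port can then replace 8080 in the LocalForward line.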

Host Key Verification Loops

Issue: Asked to verify host key on every connection

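Cause: The SSH config sets UserKnownHostsFile /dev/null, so accepted host keys are never saved and every connection looks like a first contact

Solution: Remove the UserKnownHostsFile /dev/null override and set StrictHostKeyChecking to accept-new (OpenSSH 7.6 or later), which saves keys for new hosts and still rejects changed ones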

# Instead of:
StrictHostKeyChecking no
UserKnownHostsFile /dev/null

# Use:
StrictHostKeyChecking accept-new
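
To locate the offending setting in a long config, a small search like the one below may help. It is a sketch with a hypothetical function name, and it only inspects the file it is given, not any files pulled in through Include.

```shell
# Hypothetical helper: print "line-number:line" for every active
# UserKnownHostsFile directive that points at /dev/null.
find_discarded_known_hosts() {
    grep -niE '^[[:space:]]*UserKnownHostsFile[[:space:]=]+/dev/null' "$1"
}

# Example (not run here):
#   find_discarded_known_hosts ~/.ssh/config
```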

Key Deployment Issues

ssh-copy-id False Warnings

Warning: All keys were skipped because they already exist on the remote system

Issue: Warning appears even when keys aren't actually deployed

Solution: Force deployment with the -f flag, which skips the check for existing keys (and can therefore append a duplicate entry)

ssh-copy-id -f -i ~/.ssh/emergency_homelab_rsa.pub cal@10.10.0.42

Permission Denied After Key Deployment

Error: Permission denied (publickey)

Troubleshooting Steps:

  1. Check key permissions locally:

    ls -la ~/.ssh/
    # ~/.ssh should be 700, private keys 600, public keys 644
    
  2. Check authorized_keys on remote server:

    ssh user@server "ls -la ~/.ssh/authorized_keys"
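    # Should be 600 and owned by the login user; with StrictModes enabled,
    # sshd also rejects a group- or world-writable ~/.ssh or home directory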
    
  3. Verify key is actually deployed:

    ssh user@server "cat ~/.ssh/authorized_keys"
    
  4. Test specific key file:

    ssh -i ~/.ssh/specific_key user@server
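
The permission checks in steps 1 and 2 can be sketched as a small helper that compares a path's mode against the expected value. `file_mode` and `check_mode` are hypothetical names, and the `stat` fallback assumes either GNU or BSD `stat` is available.

```shell
# Print the octal permission bits of a path (GNU stat first, then BSD stat).
file_mode() {
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Hypothetical helper: report whether a path has the expected mode.
check_mode() {
    path=$1
    want=$2
    have=$(file_mode "$path") || return 2
    if [ "$have" = "$want" ]; then
        echo "ok   $path ($have)"
    else
        echo "FIX  $path is $have, expected $want"
        return 1
    fi
}

# Example (not run here):
#   check_mode ~/.ssh 700
#   check_mode ~/.ssh/homelab_rsa 600
#   check_mode ~/.ssh/authorized_keys 600
```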
    

Key Authentication Not Working

Debug connection issues:

# Verbose SSH connection for debugging
ssh -v user@server

# Super verbose for detailed debugging
ssh -vvv user@server

# Test specific identity file
ssh -i ~/.ssh/homelab_rsa -v cal@10.10.0.42
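
Verbose output is long, so it can help to filter it down to the authentication exchange. The sketch below assumes the debug message wording used by recent OpenSSH clients, which may differ between versions; `auth_debug_summary` is a hypothetical name.

```shell
# Hypothetical helper: keep only the lines of `ssh -v` output that show
# which keys were offered and how the server responded.
auth_debug_summary() {
    grep -E 'Offering .*public key|Server accepts key|Authentications that can continue|Permission denied'
}

# Example (not run here); ssh writes debug output to stderr:
#   ssh -v -o BatchMode=yes user@server true 2>&1 | auth_debug_summary
```

If no "Offering" line appears for the expected key file, the client never presented it, which points at the local config rather than the server.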

Server-Side Issues

SSH Service Not Running

Check SSH service status:

# The unit is named sshd on RHEL/CentOS; on Debian/Ubuntu it is ssh
sudo systemctl status sshd
sudo systemctl start sshd
sudo systemctl enable sshd

Firewall Blocking SSH

Check firewall rules:

# Ubuntu/Debian
sudo ufw status
sudo ufw allow ssh

# CentOS/RHEL
sudo firewall-cmd --list-services
sudo firewall-cmd --add-service=ssh --permanent
sudo firewall-cmd --reload

Wrong SSH Port

Check SSH configuration:

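sudo grep -i "^Port" /etc/ssh/sshd_config
# No output means the default port 22 is in use
# Effective value, including Include files: sudo sshd -T | grep -i '^port'
# Update SSH client config accordingly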
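
A commented-out Port line means the default applies, which the plain grep does not make obvious. The sketch below prints every active Port value and falls back to 22. `configured_ports` is a hypothetical helper that reads a single file only and expects the whitespace-separated form.

```shell
# Hypothetical helper: print each active Port value in an sshd_config-style
# file, or 22 when no Port directive is set.
configured_ports() {
    awk 'tolower($1) == "port" { print $2; found = 1 }
         END { if (!found) print 22 }' "$1"
}

# Example (not run here):
#   sudo sh -c 'cat /etc/ssh/sshd_config' > /tmp/sshd_config.copy
#   configured_ports /tmp/sshd_config.copy
```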

Emergency Access Procedures

Primary Keys Lost/Corrupted

  1. Use emergency keys:

    ssh -i ~/.ssh/emergency_homelab_rsa cal@10.10.0.16
    
  2. Restore from NAS backup:

    cp /mnt/NV2/ssh-keys/backup-*/homelab_rsa* ~/.ssh/
    chmod 600 ~/.ssh/homelab_rsa
    chmod 644 ~/.ssh/homelab_rsa.pub
    
  3. Generate new keys if needed:

    ssh-keygen -t rsa -b 4096 -f ~/.ssh/new_homelab_rsa
    ssh-copy-id -i ~/.ssh/new_homelab_rsa.pub user@server
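
When several backup directories exist, the wildcard in step 2 copies from all of them and the last one wins. A sketch for selecting a single directory follows. It assumes the `backup-*` suffix sorts chronologically (for example a date stamp), which the source does not confirm, and `latest_backup` is a hypothetical name.

```shell
# Hypothetical helper: print the backup-* directory under the given root
# that sorts last. Assumes directory names sort in chronological order.
latest_backup() {
    latest=
    for dir in "$1"/backup-*; do
        [ -d "$dir" ] && latest=$dir
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Example (not run here):
#   src=$(latest_backup /mnt/NV2/ssh-keys) && cp "$src"/homelab_rsa* ~/.ssh/
```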
    

Complete SSH Access Lost

  1. Physical/console access (home servers)
  2. Cloud provider web console (cloud servers)
  3. Recovery mode if available
  4. Manual authorized_keys editing:

    # On the server via console:
    mkdir -p ~/.ssh && chmod 700 ~/.ssh
    echo "your-public-key-here" >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
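
Repeating the echo during a stressful recovery appends the same key more than once. A minimal idempotent variant is sketched below; `add_key_once` is a hypothetical helper, and the optional second argument exists mainly so it can be tried against a scratch file.

```shell
# Hypothetical helper: append a public key line to authorized_keys only if
# the exact line is not already present, fixing permissions along the way.
add_key_once() {
    key=$1
    file=${2:-$HOME/.ssh/authorized_keys}
    dir=$(dirname "$file")
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$file" && chmod 600 "$file" || return 1
    grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
}

# Example (not run here):
#   add_key_once "your-public-key-here"
```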
    

Network Connectivity Issues

Connection Timeouts

Check network connectivity:

# Basic connectivity test
ping 10.10.0.42

# Check if SSH port is open
telnet 10.10.0.42 22
# Or using nc
nc -zv 10.10.0.42 22

DNS Resolution Issues

Bypass DNS with IP addresses:

# Instead of hostname
ssh server.local

# Use IP directly
ssh 10.10.0.42

VPN/Network Routing

Check routing to server:

traceroute 10.10.0.42
ip route | grep 10.10.0.0

Configuration Validation

SSH Config Syntax Check

# Parse the config for a host without connecting; syntax errors are reported on stderr
ssh -G -F ~/.ssh/config pihole > /dev/null

Key Fingerprint Verification

# Local key fingerprint
ssh-keygen -lf ~/.ssh/homelab_rsa.pub

# Remote server's authorized keys fingerprints
ssh user@server "ssh-keygen -lf ~/.ssh/authorized_keys"
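
Comparing the two outputs by eye is error-prone when the server holds several keys. The sketch below checks whether one fingerprint appears in a list, assuming the usual `ssh-keygen -l` layout with the fingerprint in the second column; `has_fingerprint` is a hypothetical name.

```shell
# Hypothetical helper: read `ssh-keygen -l` output on stdin and succeed
# if any line carries the given fingerprint in its second field.
has_fingerprint() {
    awk -v want="$1" '$2 == want { found = 1 } END { exit !found }'
}

# Example (not run here):
#   local_fp=$(ssh-keygen -lf ~/.ssh/homelab_rsa.pub | awk '{print $2}')
#   ssh user@server 'ssh-keygen -lf ~/.ssh/authorized_keys' \
#       | has_fingerprint "$local_fp" && echo "key is deployed"
```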

Connection Test Script

#!/bin/bash
# Test all configured SSH hosts
for host in strat-database pihole docker-home akamai vultr; do
    echo "Testing $host..."
    if ssh -o ConnectTimeout=5 -o BatchMode=yes "$host" 'echo "OK"' 2>/dev/null; then
        echo "✅ $host: Connected successfully"
    else
        echo "❌ $host: Connection failed"
    fi
done
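
The host list in the script above is hard-coded and will drift from the SSH config over time. One possible way to derive it is sketched here. `list_config_hosts` is a hypothetical helper that reads a single file, skips wildcard and negated patterns, and does not follow Include directives.

```shell
# Hypothetical helper: print every concrete Host alias in an ssh config,
# skipping patterns that contain *, ? or !.
list_config_hosts() {
    awk 'tolower($1) == "host" {
        for (i = 2; i <= NF; i++)
            if ($i !~ /[*?!]/) print $i
    }' "$1"
}

# Example (not run here):
#   for host in $(list_config_hosts ~/.ssh/config); do
#       ssh -o ConnectTimeout=5 -o BatchMode=yes "$host" true && echo "$host ok"
#   done
```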

Maintenance Commands

Clean Up Known Hosts

# Remove specific host key
ssh-keygen -R 10.10.0.42

# Remove hostname and IP
ssh-keygen -R server.local
ssh-keygen -R 10.10.0.42

Key Rotation Process

# Generate new key
ssh-keygen -t rsa -b 4096 -f ~/.ssh/homelab_rsa_new

# Deploy new key alongside old one
ssh-copy-id -i ~/.ssh/homelab_rsa_new.pub user@server

# Test new key works
ssh -i ~/.ssh/homelab_rsa_new user@server

# Update SSH config to use new key
# Remove old public key from server authorized_keys
# Archive old key pair
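
The step of removing the old public key has no command above. A minimal sketch follows, to be run on the server with a copy of the old .pub file at hand. `remove_key` is a hypothetical helper that matches on the base64 key body, so it also catches entries with option prefixes or a different comment.

```shell
# Hypothetical helper: drop every authorized_keys line that contains the
# key body found in the given public key file.
remove_key() {
    keys_file=$1
    # Field 2 of a .pub file is the base64 key body.
    blob=$(awk 'NR == 1 { print $2 }' "$2")
    [ -n "$blob" ] || return 1
    tmp=$(mktemp) || return 1
    # grep exits 1 when every line was removed, which is still a valid result.
    grep -vF -- "$blob" "$keys_file" > "$tmp" || true
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$keys_file"
    rm -f "$tmp"
}

# Example (not run here), after confirming the new key works:
#   remove_key ~/.ssh/authorized_keys /tmp/homelab_rsa.pub
```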

Server-Specific Troubleshooting

Home Lab Servers (10.10.0.x)

  • Physical access available for recovery
  • Container hosts may need different user contexts
  • Shared credentials historically used (security risk)

Cloud Servers

  • Provider console access as fallback
  • Root login is typically the provider default (create non-root users instead)
  • Different security contexts than home network

Related Documentation

  • Patterns: patterns/networking/ssh-key-management.md
  • Complete setup: examples/networking/ssh-homelab-setup.md