claude-home/docker/examples/docker-iptables-troubleshooting-session.md
Cal Corum 4b7eca8a46
2026-03-12 09:00:44 -05:00


---
title: "Docker iptables/nftables Troubleshooting"
description: "Detailed troubleshooting session for Docker daemon failing to start due to iptables/nftables backend conflicts on Nobara (Fedora-based), including NAT chain creation errors and legacy backend workarounds."
type: troubleshooting
domain: docker
tags: [docker, iptables, nftables, fedora, nobara, networking, nat, firewall]
---
# Docker iptables/nftables Backend Troubleshooting Session
## Session Context
- **Date**: August 8, 2025
- **System**: Nobara PC (Fedora-based gaming distro)
- **User**: cal
- **Working Directory**: `/mnt/NV2/Development/claude-home`
- **Goal**: Get Docker working to run Tdarr Node container
## System Information
```bash
# OS Details
uname -a
# Linux nobara-pc 6.15.5-200.nobara.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Jul 6 11:56:20 UTC 2025 x86_64 GNU/Linux
# Hardware
# AMD Ryzen 7 7800X3D 8-Core Processor
# 62GB RAM
# NVIDIA GeForce RTX 4080 SUPER
# Distribution
# Nobara (Fedora 42-based)
```
## Problem Summary
The Docker daemon fails to start with a persistent error:
```
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to register "bridge" driver: failed to create NAT chain DOCKER: COMMAND_FAILED: INVALID_IPV: 'ipv4' is not a valid backend or is unavailable
```
## Root Cause Analysis
### Initial Discovery
1. **Missing iptables**: Docker couldn't find the `iptables` command in PATH
2. **Backend conflict**: The system uses nftables, but Docker appeared to expect iptables-legacy
3. **Package inconsistency**: The `iptables-nft` package was installed, but its binary was initially missing
### Key Findings
- `dnf list installed | grep -i iptables` initially returned nothing
- `firewalld` and `nftables` services were both inactive
- `iptables-nft` package was installed but `/usr/bin/iptables` didn't exist
- After reinstall, iptables worked but used nftables backend
- NAT table incompatible: ``iptables v1.8.11 (nf_tables): table `nat' is incompatible, use 'nft' tool.``
## Troubleshooting Steps Performed
### Step 1: Package Investigation
```bash
# Check installed iptables packages
dnf list installed | grep -i iptables
# Result: No matching packages (surprising!)
# Check service status
systemctl status nftables # inactive (dead)
firewall-cmd --get-backend-type # firewalld not running
# Check if iptables binary exists
which iptables # not found
/usr/bin/iptables --version # No such file or directory
```
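A finding like "package installed but binary missing" can also be checked with `rpm -V`, which prints a `missing` line for each packaged file that is absent from disk. The sketch below filters that output; `missing_files` is a hypothetical helper name. Symlinks managed by `alternatives` may not be covered by this check, so an empty result is inconclusive.

```bash
# List files that `rpm -V <package>` reports as missing (output read from stdin).
# missing_files is a hypothetical helper, not part of rpm.
missing_files() {
    awk '$1 == "missing" { print $NF }'
}

# Usage: rpm -V iptables-nft | missing_files
```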
### Step 2: Package Reinstallation
```bash
# Reinstall iptables-nft package
sudo dnf reinstall -y iptables-nft
# Verify installation
rpm -ql iptables-nft | grep bin
# Shows /usr/bin/iptables should exist
# Test after reinstall
iptables --version
# Result: iptables v1.8.11 (nf_tables) - SUCCESS!
```
### Step 3: Backend Compatibility Testing
```bash
# Test NAT table access
sudo iptables -t nat -L
# Error: iptables v1.8.11 (nf_tables): table `nat' is incompatible, use 'nft' tool.
```
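The backend an `iptables` binary targets appears in parentheses in its version string, as the outputs in Steps 2 and 3 show. A minimal sketch of classifying that string, assuming the legacy build prints `(legacy)` in the same position; `iptables_backend` is a hypothetical helper name:

```bash
# Classify an `iptables --version` string by its backend suffix.
# iptables_backend is a hypothetical helper, not part of any package.
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo "nft" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}

# Usage: iptables_backend "$(iptables --version)"
```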
### Step 4: Legacy Backend Installation
```bash
# Install iptables-legacy
sudo dnf install -y iptables-legacy iptables-legacy-libs
# Set up alternatives system
sudo alternatives --install /usr/bin/iptables iptables /usr/bin/iptables-legacy 10
sudo alternatives --install /usr/bin/ip6tables ip6tables /usr/bin/ip6tables-legacy 10
# Test NAT table with legacy backend
sudo iptables -t nat -L
# SUCCESS: Shows empty NAT chains
```
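Registering a path with `alternatives --install` does not by itself guarantee that it becomes the active choice, so it may be worth confirming what `/usr/bin/iptables` actually resolves to (`alternatives --display iptables` shows the same information). The sketch below follows the symlink chain and inspects only the final file name; `resolved_backend` is a hypothetical helper, and the assumption is that the final target's name contains `legacy` or `nft`.

```bash
# Print "legacy", "nft", or "unknown" for the file a command path resolves to.
# resolved_backend is a hypothetical helper; it only inspects the file name.
resolved_backend() {
    local target
    target=$(readlink -f "$1") || return 1
    case "$(basename "$target")" in
        *legacy*) echo "legacy" ;;
        *nft*)    echo "nft" ;;
        *)        echo "unknown" ;;
    esac
}

# Usage: resolved_backend "$(command -v iptables)"
```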
### Step 5: Docker Restart Attempts
```bash
# Remove NVIDIA daemon.json config (potential conflict)
sudo rm -f /etc/docker/daemon.json
# Load NAT kernel module explicitly
sudo modprobe iptable_nat
# Try starting firewalld (in case Docker needs it)
sudo systemctl enable --now firewalld
# Multiple restart attempts
sudo systemctl start docker
# ALL FAILED with same NAT chain error
```
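Since the removed `daemon.json` held the NVIDIA configuration, moving it aside rather than deleting it would keep that configuration recoverable. A minimal sketch; `backup_aside` is a hypothetical helper and the timestamp suffix is an arbitrary choice:

```bash
# Move a file aside with a timestamped suffix instead of deleting it.
# backup_aside is a hypothetical helper; prints the backup path on success.
backup_aside() {
    local src=$1
    [ -e "$src" ] || return 0          # nothing to back up
    local dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$src" "$dest" && echo "$dest"
}

# Usage (as root): backup_aside /etc/docker/daemon.json
```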
## Current State
- ✅ iptables-legacy installed and configured
- ✅ NAT table accessible via `iptables -t nat -L`
- ✅ All required kernel modules should be available
- ❌ Docker still fails with NAT chain creation error
- ❌ Same error persists despite backend switch
## Analysis of Persistent Issue
### Potential Causes
1. **Kernel State Contamination**: nftables rules/chains may still be active in kernel memory
2. **Module Loading Order**: iptables vs nftables modules loaded in conflicting order
3. **Docker Caching**: Docker may be caching the old backend detection
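4. **Firewall Integration**: Docker + firewalld interaction on Fedora/Nobara (`COMMAND_FAILED` and `INVALID_IPV` are firewalld error codes, so the failing call may be passing through firewalld rather than `iptables` directly)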
5. **System-Level Backend Selection**: Some system-wide iptables backend lock
### Evidence Supporting Kernel State Theory
- Error message is identical across all restart attempts
- iptables command works fine manually
- NAT table shows properly but Docker can't create chains
- Issue persists despite configuration changes
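One way to probe the kernel state theory is to check whether both backends hold rules at the same time. The sketch below counts rule lines in `iptables-save` style output; `count_rules` is a hypothetical helper, and it assumes the `iptables-legacy-save` and `iptables-nft-save` commands are both present on this system.

```bash
# Count rule lines ("-A ...") in iptables-save style output read from stdin.
# count_rules is a hypothetical helper; prints 0 when no rules are present.
count_rules() {
    grep -c '^-A' || true
}

# Usage (assumes both save commands are installed):
#   sudo iptables-legacy-save | count_rules
#   sudo iptables-nft-save    | count_rules
```

Non-zero counts from both commands would suggest that rules exist in both backends at once.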
## Next Session Action Plan
### Immediate Steps After System Reboot
1. **Verify Backend Status**:
```bash
iptables --version # Should show legacy
sudo iptables -t nat -L # Should show clean NAT table
```
2. **Check Kernel Modules**:
```bash
lsmod | grep -E "(iptable|nf_|ip_tables)"
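# `modprobe -l` is no longer supported by kmod; search the module tree instead
find "/lib/modules/$(uname -r)" -name '*.ko*' | grep -E "(iptable|nf_table)"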
```
3. **Test Docker Start**:
```bash
sudo systemctl start docker
docker --version
```
### If Issue Persists After Reboot
#### Alternative Approach 1: Docker Configuration Override
```bash
# Create daemon.json to disable iptables management
sudo mkdir -p /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "iptables": false,
  "bridge": "none"
}
EOF
sudo systemctl start docker
```
#### Alternative Approach 2: Podman as Docker Alternative
```bash
# Install podman as Docker drop-in replacement
sudo dnf install -y podman podman-docker
# Test with Tdarr container
podman run --rm ghcr.io/haveagitgat/tdarr_node:latest --help
```
#### Alternative Approach 3: Docker Desktop
```bash
# Consider Docker Desktop for Linux (handles networking differently)
# May bypass system iptables issues entirely
```
#### Alternative Approach 4: Deep System Cleanup
```bash
# Nuclear option: Remove all networking packages and reinstall
# Quote the glob so the shell passes it to dnf instead of expanding it against local files
sudo dnf remove -y 'iptables*' nftables firewalld
sudo dnf install -y iptables-legacy iptables-nft firewalld
sudo dnf reinstall -y docker-ce
```
### Diagnostic Commands for Next Session
```bash
# Full network state capture
ip addr show
ip route show
sudo iptables-save > /tmp/iptables-state.txt
sudo nft list ruleset > /tmp/nft-state.txt
# Docker troubleshooting
# Run the daemon in the foreground for 30 seconds, then examine the log
sudo timeout 30 dockerd --debug --log-level=debug > /tmp/docker-debug.log 2>&1
# System journal deep dive
journalctl -u docker.service --since="1 hour ago" -o verbose > /tmp/docker-journal.log
```
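The debug log can be long, so a filter for the failing entries may help. This sketch assumes dockerd's default text log format, where each entry carries `level=<name>`, plus the plain `failed to start daemon` line seen in the error above; `daemon_errors` is a hypothetical helper name.

```bash
# Print only error/fatal entries and startup failures from a dockerd log (stdin).
# daemon_errors is a hypothetical helper; assumes the default text log format.
daemon_errors() {
    grep -E 'level=(error|fatal)|failed to start daemon' || true
}

# Usage: daemon_errors < /tmp/docker-debug.log
```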
## Known Working Configuration Target
### Expected Working State
- **iptables**: Legacy backend active
- **Docker**: Running with NAT chain creation successful
- **Network**: Docker bridge network functional
- **Containers**: Can start and access network
### Tdarr Node Test Command
```bash
cd ~/docker/tdarr-node
# Update IP in compose file first:
# serverIP=<TDARR_SERVER_IP>
docker-compose -f tdarr-node-basic.yml up -d
```
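The IP update mentioned in the comment above could be scripted rather than edited by hand. A minimal sketch, assuming the compose file contains a `serverIP=<value>` entry on a single line and that GNU `sed` is available; `set_server_ip` is a hypothetical helper and `192.0.2.10` is a documentation-only address standing in for the real server IP.

```bash
# Replace the value of serverIP=... in a compose file, in place.
# set_server_ip is a hypothetical helper; assumes GNU sed and a one-line entry.
set_server_ip() {
    local file=$1 ip=$2
    sed -i "s/serverIP=[^[:space:]\"']*/serverIP=$ip/" "$file"
}

# Usage (192.0.2.10 is a placeholder address):
#   set_server_ip tdarr-node-basic.yml 192.0.2.10
```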
## Related Documentation Created
- `/patterns/docker/gpu-acceleration.md` - GPU troubleshooting patterns
- `/reference/docker/nvidia-troubleshooting.md` - NVIDIA container toolkit
- `/examples/docker/tdarr-node-local/` - Working configurations
## System Context Notes
- This is a gaming-focused Nobara distribution
- May have different default networking than standard Fedora
- NVIDIA drivers already working (nvidia-smi functional)
- System has run other Docker containers successfully in the past
- Recent NVIDIA container toolkit installation may have triggered the issue
## Success Criteria for Next Session
1. ✅ Docker service starts without errors
2. ✅ `docker ps` command works
3. ✅ Simple container can run: `docker run --rm hello-world`
4. ✅ Tdarr node container can start (even if it can't connect to the server yet)
5. ✅ Network connectivity from containers works
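
These criteria can be run as a quick checklist. The sketch below wraps each command and prints a pass/fail line; `check` is a hypothetical helper, and the usage lines simply reuse the commands named in the list above.

```bash
# Run a command quietly and print PASS/FAIL with a label.
# check is a hypothetical helper; it returns non-zero when the command fails.
check() {
    local label=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $label"
    else
        echo "FAIL: $label"
        return 1
    fi
}

# Usage, mirroring the criteria above:
#   check "docker service active" systemctl is-active --quiet docker
#   check "docker ps works"       docker ps
#   check "hello-world runs"      docker run --rm hello-world
```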
## Escalation Options
If standard troubleshooting fails:
1. **Nobara Community**: Check Nobara Discord/forums for similar issues
2. **Docker Desktop**: Use different Docker implementation
3. **Podman Migration**: Switch to podman as Docker replacement
4. **System Reinstall**: Fresh OS install (nuclear option)
5. **Container Alternatives**: LXC/systemd containers instead of Docker
## Files to Check Next Session
- `/etc/docker/daemon.json` - Docker configuration
- `/var/log/docker.log` - Docker service logs (may not exist on systemd systems; `journalctl -u docker.service` covers the same ground)
- `~/.docker/config.json` - User Docker config
- `/proc/sys/net/ipv4/ip_forward` - Whether IP forwarding is enabled (`1` means enabled)
- `/etc/systemd/system/docker.service.d/` - Service overrides
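
A quick pass over these paths can show which of them exist before opening any. A minimal sketch; `report_paths` is a hypothetical helper name.

```bash
# Report whether each given path is a directory, a file, or missing.
# report_paths is a hypothetical helper.
report_paths() {
    local p kind
    for p in "$@"; do
        if [ -d "$p" ]; then
            kind="dir"
        elif [ -e "$p" ]; then
            kind="file"
        else
            kind="missing"
        fi
        printf '%-8s %s\n' "$kind" "$p"
    done
}

# Usage:
#   report_paths /etc/docker/daemon.json /var/log/docker.log \
#       "$HOME/.docker/config.json" /etc/systemd/system/docker.service.d
#   cat /proc/sys/net/ipv4/ip_forward   # 1 means forwarding is enabled
```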
---
*End of troubleshooting session log*