claude-home/productivity/n8n/troubleshooting.md
Cal Corum 4b7eca8a46
docs: add YAML frontmatter to all 151 markdown files
Adds title, description, type, domain, and tags frontmatter to every
doc for improved KB semantic search. The description field is prepended
to every search chunk, and domain/type/tags enable filtered queries.

Type values: context, guide, runbook, reference, troubleshooting
Domain values match directory structure (networking, docker, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 09:00:44 -05:00


---
title: "n8n Troubleshooting Guide"
description: "Troubleshooting guide for n8n at n8n.manticorum.com covering container startup failures, PostgreSQL issues, SSL/access problems, webhook debugging, credential decryption errors, performance tuning, and emergency recovery procedures."
type: troubleshooting
domain: productivity
tags: [n8n, troubleshooting, docker, postgresql, webhook, ssl, nginx-proxy-manager]
---
# n8n Troubleshooting Guide
Common issues and solutions for the n8n deployment at n8n.manticorum.com.
## Quick Diagnostics
**First steps when n8n isn't working:**
```bash
# Check container status
ssh root@10.10.0.210 "docker ps --filter name=n8n"
# Check logs for errors
ssh root@10.10.0.210 "cd /opt/n8n && docker compose logs --tail=50 n8n"
# Check if service is responding
curl -I http://10.10.0.210:5678
# Check NPM proxy status (if external access fails)
# Access NPM UI and check proxy host status
```
---
## Container Issues
### n8n Container Won't Start
**Symptoms:**
- Container exits immediately after starting
- `docker ps` shows no n8n container
- Error in logs: "database connection failed"
**Diagnosis:**
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose logs n8n | tail -30"
```
**Solutions:**
1. **Check PostgreSQL is healthy:**
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose ps postgres"
# Status should show "healthy"
```
2. **Verify database credentials in .env:**
```bash
ssh root@10.10.0.210 "cat /opt/n8n/.env | grep POSTGRES"
```
3. **Restart both services:**
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose down && docker compose up -d"
```
4. **Check database connectivity:**
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose exec postgres psql -U n8n -d n8n -c 'SELECT 1;'"
```
### PostgreSQL Container Issues
**Symptoms:**
- n8n fails to connect to database
- PostgreSQL container shows "unhealthy" status
**Diagnosis:**
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose logs postgres | tail -50"
```
**Common Causes:**
1. **Corrupted database:**
```bash
# Check that PostgreSQL accepts connections (pg_isready does not verify data integrity)
ssh root@10.10.0.210 "cd /opt/n8n && docker compose exec postgres pg_isready -U n8n"
```
2. **Disk space full:**
```bash
ssh root@10.10.0.210 "df -h /"
# Should have >10GB free
```
3. **Permission issues:**
```bash
ssh root@10.10.0.210 "docker volume inspect n8n_postgres_data"
```
**Recovery:**
```bash
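# Restore from backup
# WARNING: removing the volume deletes the current database; confirm the backup file exists first
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose down
docker volume rm n8n_postgres_data
docker compose up -d postgres
# Wait until PostgreSQL accepts connections
until docker compose exec -T postgres pg_isready -U n8n; do sleep 2; done
docker compose exec -T postgres psql -U n8n n8n < /root/n8n-backup-YYYYMMDD.sql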
docker compose up -d n8n
"
```
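The recovery above depends on PostgreSQL being ready before the dump is piped in. A minimal sketch of a reusable retry helper for that kind of wait follows; `wait_until` is a hypothetical name, not part of the deployment, and the attempt count and delay are illustrative.

```bash
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        # Run the command; stop as soon as it succeeds
        "$@" && return 0
        sleep "$delay"
    done
    return 1
}
```

Run on the n8n host from `/opt/n8n`, a call such as `wait_until 30 2 docker compose exec -T postgres pg_isready -U n8n` would give PostgreSQL up to about a minute to come up before the restore continues.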
---
## Access Issues
### Can't Access n8n.manticorum.com
**Symptoms:**
- Browser shows "Connection timed out" or "Can't reach this page"
- Works on http://10.10.0.210:5678 but not via domain
**Diagnosis Steps:**
1. **Check DNS resolution:**
```bash
nslookup n8n.manticorum.com
# Should return your public IP
```
2. **Test internal access:**
```bash
curl -I http://10.10.0.210:5678
# Should return HTTP 200
```
3. **Check NPM proxy host:**
- Login to NPM UI
- Verify proxy host for n8n.manticorum.com exists
- Check if status shows "online"
4. **Test NPM connectivity:**
```bash
# From NPM host
curl -I http://10.10.0.210:5678
```
**Solutions:**
1. **DNS not configured:**
- Add A record: `n8n.manticorum.com` → `[your-public-IP]`
- Wait for DNS propagation (up to 48 hours)
2. **NPM proxy host misconfigured:**
- Domain: `n8n.manticorum.com`
- Scheme: `http` (not https)
- Forward Host: `10.10.0.210`
- Forward Port: `5678`
- ✅ Enable WebSockets Support
3. **Firewall blocking:**
- Ensure ports 80 and 443 open on firewall
- Check Proxmox firewall rules
- Check LXC firewall if enabled
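
The diagnosis steps above boil down to comparing the direct address with the public domain. A rough sketch of that comparison follows; the function names `http_code` and `triage_access` are hypothetical, and the messages are only suggestions for where to look next.

```bash
#!/bin/bash
# Hypothetical helper: print the HTTP status for a URL ("000" if unreachable).
http_code() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1" || true
}

# Hypothetical helper: compare direct access with access through the proxy.
triage_access() {
    local direct proxied
    direct=$(http_code "http://10.10.0.210:5678")
    proxied=$(http_code "https://n8n.manticorum.com")
    if [ "$direct" = "000" ]; then
        echo "n8n itself is unreachable: check the container first"
    elif [ "$proxied" = "000" ]; then
        echo "n8n is up but the domain is unreachable: check DNS, firewall, and NPM"
    elif [ "$proxied" != "$direct" ]; then
        echo "proxy returns $proxied but n8n returns $direct: check the NPM proxy host"
    else
        echo "both paths return $direct"
    fi
}
```

Calling `triage_access` from a machine that can reach both addresses narrows the problem to the container, the network path, or the proxy host configuration.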
### SSL Certificate Issues
**Symptoms:**
- Browser shows "Your connection is not private"
- Certificate error in browser
- NPM shows "Certificate request failed"
**Diagnosis:**
```bash
# Test SSL (closing stdin makes s_client exit after the handshake)
openssl s_client -connect n8n.manticorum.com:443 -servername n8n.manticorum.com </dev/null
# Check certificate expiry
echo | openssl s_client -connect n8n.manticorum.com:443 -servername n8n.manticorum.com 2>/dev/null | openssl x509 -noout -dates
```
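To turn the expiry date into a number that is easier to act on, the `notAfter=` line can be converted to days remaining. This is a sketch that assumes GNU `date` (for `-d`); `cert_days_left` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: days until a certificate expires.
# $1: a line such as "notAfter=Jun  1 12:00:00 2026 GMT" (from `openssl x509 -noout -enddate`)
# $2: optional "now" as epoch seconds (defaults to the current time)
cert_days_left() {
    local not_after=${1#notAfter=}
    local now=${2:-$(date +%s)}
    local expiry
    # GNU date parses the OpenSSL date format directly
    expiry=$(date -d "$not_after" +%s) || return 1
    echo $(( (expiry - now) / 86400 ))
}
```

For example, `cert_days_left "$(echo | openssl s_client -connect n8n.manticorum.com:443 -servername n8n.manticorum.com 2>/dev/null | openssl x509 -noout -enddate)"` prints the remaining days; a negative value means the certificate has already expired.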
**Solutions:**
1. **Request new certificate in NPM:**
- Edit proxy host
- SSL tab → Request new SSL Certificate
- Ensure email is correct
- Check Let's Encrypt rate limits (5 duplicate certificates per week for the same set of hostnames)
2. **HTTP validation failing:**
- Verify domain points to correct IP
- Ensure port 80 is accessible (Let's Encrypt uses HTTP validation)
3. **Use DNS challenge instead:**
- If port 80 is blocked, use DNS challenge method in NPM
- Requires API credentials for your DNS provider
### Login/Authentication Issues
**Symptoms:**
- Can access n8n but login fails
- "Invalid credentials" error
- Basic auth popup keeps appearing
**Diagnosis:**
```bash
# Check current credentials (N8N_BASIC_AUTH_* variables are only honored by n8n releases before 1.0)
ssh root@10.10.0.210 "cat /opt/n8n/.env | grep BASIC_AUTH"
```
**Solutions:**
1. **Reset admin password:**
```bash
ssh root@10.10.0.210 "
cd /opt/n8n
# Generate new password
NEW_PASS=\$(openssl rand -base64 16 | tr -d '/+=')
echo \"New password: \$NEW_PASS\"
# Update .env
sed -i \"s/N8N_BASIC_AUTH_PASSWORD=.*/N8N_BASIC_AUTH_PASSWORD=\$NEW_PASS/\" .env
# Restart
docker compose restart n8n
"
```
2. **Clear browser cache:**
- Browser may cache old credentials
- Try incognito/private window
- Clear site data for n8n.manticorum.com
3. **Disable basic auth temporarily:**
```bash
ssh root@10.10.0.210 "
cd /opt/n8n
sed -i 's/N8N_BASIC_AUTH_ACTIVE=true/N8N_BASIC_AUTH_ACTIVE=false/' .env
docker compose restart n8n
"
```
**Warning:** Only do this for troubleshooting, and re-enable basic auth immediately afterwards!
---
## Workflow Issues
### Webhooks Not Working
**Symptoms:**
- External services can't trigger workflows
- Webhook URL returns 404 or timeout
- Test webhooks work but production ones don't
**Diagnosis:**
```bash
# Test webhook URL
curl -X POST https://n8n.manticorum.com/webhook/test
# Check n8n logs for incoming requests
ssh root@10.10.0.210 "cd /opt/n8n && docker compose logs -f n8n | grep -i webhook"
```
**Common Causes:**
1. **Incorrect WEBHOOK_URL in configuration:**
```bash
ssh root@10.10.0.210 "cat /opt/n8n/.env | grep WEBHOOK_URL"
# Should be: https://n8n.manticorum.com/
```
2. **Workflow not activated:**
- Check workflow is toggled "Active" in n8n UI
- Look for green indicator on workflow
3. **NPM WebSocket support not enabled:**
- Edit proxy host in NPM
- Details tab → ✅ WebSockets Support
4. **Firewall blocking webhooks:**
- Ensure external services can reach your public IP on port 443
**Solutions:**
```bash
# Update webhook URL
ssh root@10.10.0.210 "
cd /opt/n8n
sed -i 's|WEBHOOK_URL=.*|WEBHOOK_URL=https://n8n.manticorum.com/|' .env
docker compose restart n8n
"
# Test after restart
curl -X POST https://n8n.manticorum.com/webhook/test
```
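When test webhooks work but production ones do not, it helps to remember that n8n serves the two under different path prefixes: `/webhook-test/` while the editor is listening for a test event, and `/webhook/` once the workflow is active. A small sketch that builds both URLs follows; `webhook_url` is a hypothetical helper and `my-hook` stands in for whatever path is set in the Webhook node.

```bash
#!/bin/bash
# Hypothetical helper: build the test or production URL for a webhook path.
# $1: mode (test|production); $2: path configured in the Webhook node
webhook_url() {
    local base="https://n8n.manticorum.com"
    case "$1" in
        test)       echo "$base/webhook-test/$2" ;;
        production) echo "$base/webhook/$2" ;;
        *)          echo "usage: webhook_url test|production <path>" >&2; return 2 ;;
    esac
}
```

For instance, `curl -X POST "$(webhook_url production my-hook)"` exercises the production URL, which only responds while the workflow is toggled "Active".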
### Executions Failing or Timing Out
**Symptoms:**
- Workflows start but never complete
- Timeout errors in execution logs
- Memory errors
**Diagnosis:**
```bash
# Check resource usage
ssh root@10.10.0.210 "docker stats --no-stream n8n"
# Check execution logs
# Access n8n UI → Executions → View failed execution
```
**Solutions:**
1. **Increase timeout in NPM:**
- NPM proxy host → Advanced tab
- Add: `proxy_read_timeout 300;`
2. **Increase LXC resources:**
```bash
# On Proxmox host
ssh root@10.10.0.11 "
pct set 210 --memory 16384 # Increase to 16GB
pct set 210 --cores 8 # Increase to 8 cores
pct reboot 210
"
```
3. **Optimize workflow:**
- Break large workflows into smaller ones
- Use pagination for API calls
- Add delay nodes between heavy operations
4. **Check external service timeouts:**
- API you're calling may be slow
- Increase timeout in HTTP Request nodes
### Database/Credential Issues
**Symptoms:**
- "Error loading credentials" in workflow
- Saved credentials not appearing
- "Credentials could not be decrypted"
**Critical Error - Encryption Key Changed:**
If you see "could not be decrypted," the encryption key was changed or is incorrect.
**This is UNRECOVERABLE without the original key!**
```bash
# Check current encryption key
ssh root@10.10.0.210 "cat /opt/n8n/.env | grep N8N_ENCRYPTION_KEY"
# If you have the old key, restore it:
ssh root@10.10.0.210 "
cd /opt/n8n
sed -i 's/N8N_ENCRYPTION_KEY=.*/N8N_ENCRYPTION_KEY=YOUR_OLD_KEY/' .env
docker compose restart n8n
"
```
**Prevention:**
- Backup `.env` file regularly
- Store encryption key in password manager
- Never regenerate encryption key after initial setup
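
One way to confirm that a backed-up `.env` still holds the same encryption key, without printing the key itself, is to compare short hashes. This is a sketch that assumes `sha256sum` is available; `key_fingerprint` and `same_key` are hypothetical helper names.

```bash
#!/bin/bash
# Hypothetical helper: print a short fingerprint of the encryption key in an env file.
key_fingerprint() {
    local key
    key=$(grep -m1 '^N8N_ENCRYPTION_KEY=' "$1" | cut -d= -f2-)
    # Fail if the file has no key, so two empty files never compare equal
    [ -n "$key" ] || return 1
    printf '%s' "$key" | sha256sum | cut -c1-12
}

# Hypothetical helper: succeed only if both env files contain the same key.
same_key() {
    local a b
    a=$(key_fingerprint "$1") && b=$(key_fingerprint "$2") && [ "$a" = "$b" ]
}
```

Running `same_key /opt/n8n/.env /root/n8n-env-backup.env && echo "key unchanged"` (the backup path here is illustrative) gives a quick check before and after any change to the configuration.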
---
## Performance Issues
### n8n Running Slow
**Symptoms:**
- UI sluggish or unresponsive
- Workflows take longer than expected
- High CPU/memory usage
**Diagnosis:**
```bash
# Check resource usage
ssh root@10.10.0.210 "
docker stats --no-stream n8n n8n-postgres
df -h /
free -h
"
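# Check PostgreSQL performance (requires the pg_stat_statements extension;
# before PostgreSQL 13 the columns are named total_time and mean_time)
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose exec postgres psql -U n8n -d n8n -c '
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;'
"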
```
**Solutions:**
1. **Clean up old executions:**
```bash
# In n8n UI: Settings → Executions
# Set: "Delete executions older than X days"
```
2. **Optimize database:**
```bash
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose exec postgres psql -U n8n -d n8n -c 'VACUUM ANALYZE;'
"
```
3. **Increase LXC resources** (see above)
4. **Check disk I/O:**
```bash
ssh root@10.10.0.210 "iostat -x 1 5"
# If %util is consistently >80%, consider faster storage
```
### Database Growing Too Large
**Symptoms:**
- Disk space warning
- n8n slowing down over time
- Backup files becoming huge
**Diagnosis:**
```bash
# Check database size
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose exec postgres psql -U n8n -d n8n -c '
SELECT pg_size_pretty(pg_database_size(current_database()));'
"
# Check table sizes
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose exec postgres psql -U n8n -d n8n -c '
SELECT tablename, pg_size_pretty(pg_total_relation_size(tablename::text))
FROM pg_tables
WHERE schemaname = '\''public'\''
ORDER BY pg_total_relation_size(tablename::text) DESC;'
"
```
**Solutions:**
1. **Configure execution pruning:**
- Settings → Executions
- Enable: "Delete executions older than 7 days"
- Set: "Max execution data to save"
2. **Manual cleanup:**
```bash
ssh root@10.10.0.210 "
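cd /opt/n8n
docker compose exec postgres psql -U n8n -d n8n -c '
DELETE FROM execution_entity
WHERE \"startedAt\" < NOW() - INTERVAL '\''30 days'\'';'
# VACUUM cannot run inside a transaction block, so it gets its own psql call
docker compose exec postgres psql -U n8n -d n8n -c 'VACUUM FULL;'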
"
```
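n8n also reads execution pruning settings from environment variables. A minimal sketch of `.env` entries follows, assuming the compose file passes `.env` through to the n8n container; the values are illustrative and not taken from this deployment.

```bash
# Assumed additions to /opt/n8n/.env (values are examples only)
# Enable automatic pruning of stored execution data
EXECUTIONS_DATA_PRUNE=true
# Maximum age of execution data, in hours (168 hours = 7 days)
EXECUTIONS_DATA_MAX_AGE=168
# Upper bound on the number of executions kept, regardless of age
EXECUTIONS_DATA_PRUNE_MAX_COUNT=10000
```

A `docker compose up -d` after editing `.env` recreates the container so the new values take effect.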
---
## Emergency Procedures
### Complete Service Restart
```bash
ssh root@10.10.0.210 "cd /opt/n8n && docker compose down && docker compose up -d"
```
### Emergency Backup Before Changes
```bash
ssh root@10.10.0.210 "
cd /opt/n8n
# Create emergency backup
docker compose exec -T postgres pg_dump -U n8n n8n > /root/n8n-emergency-$(date +%Y%m%d-%H%M%S).sql
# Copy .env
cp .env /root/n8n-env-emergency-$(date +%Y%m%d-%H%M%S).env
"
```
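An emergency backup is only useful if the dump actually finished. Plain-format `pg_dump` output ends with a `PostgreSQL database dump complete` comment, so a quick sanity check can look for it. This is a sketch; `backup_looks_complete` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: succeed if a plain-format pg_dump file looks complete.
# $1: path to the .sql dump
backup_looks_complete() {
    # The file must be non-empty and end with pg_dump's completion marker
    [ -s "$1" ] && tail -n 10 "$1" | grep -q 'PostgreSQL database dump complete'
}
```

For example, `backup_looks_complete /root/n8n-emergency-YYYYMMDD-HHMMSS.sql || echo "backup is empty or truncated"` flags a dump that was cut short, for instance by a full disk.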
### Complete Reset (DESTRUCTIVE)
**Only if all else fails and you're okay losing workflows:**
```bash
ssh root@10.10.0.210 "
cd /opt/n8n
docker compose down
docker volume rm n8n_data n8n_postgres_data
docker compose up -d
"
```
**Note:** This deletes everything. Restore from backup immediately after!
---
## Prevention & Monitoring
### Regular Maintenance
**Weekly:**
- Check disk space: `df -h /`
- Review failed executions in n8n UI
- Check log for errors: `docker compose logs --since 168h` (durations take units up to hours, so 7 days is written as `168h`)
**Monthly:**
- Backup database and .env file
- Update n8n: `docker compose pull && docker compose up -d`
- Vacuum database: `VACUUM ANALYZE;`
- Review execution data retention settings
**Quarterly:**
- Test disaster recovery procedure
- Review and archive old workflows
- Audit credentials and remove unused ones
- Check for security updates
### Monitoring Setup
**Basic health check script:**
```bash
#!/bin/bash
# /opt/monitoring/check-n8n.sh
# --max-time keeps a hung connection from stalling the cron job
STATUS=$(curl -s -o /dev/null --max-time 10 -w "%{http_code}" http://10.10.0.210:5678)
if [ "$STATUS" != "200" ]; then
    echo "❌ n8n is down! Status: $STATUS"
    # Send alert (Discord, email, etc.)
else
    echo "✅ n8n is healthy"
fi
```
**Add to cron:**
```bash
*/5 * * * * /opt/monitoring/check-n8n.sh >> /var/log/n8n-health.log 2>&1
```
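The "Send alert" placeholder in the health check could be filled with a Discord webhook call, since Discord webhooks accept a JSON body with a `content` field. This is a sketch under assumptions: `DISCORD_WEBHOOK_URL` is a variable you would set yourself, `alert_payload` and `send_alert` are hypothetical names, and the escaping covers only quotes and backslashes.

```bash
#!/bin/bash
# Hypothetical helper: build a minimal Discord webhook JSON payload.
alert_payload() {
    local msg=$1
    # Escape backslashes first, then double quotes, so the JSON stays valid
    msg=${msg//\\/\\\\}
    msg=${msg//\"/\\\"}
    printf '{"content":"%s"}' "$msg"
}

# Hypothetical helper: post the message to the webhook in DISCORD_WEBHOOK_URL.
send_alert() {
    curl -s --max-time 10 \
        -H 'Content-Type: application/json' \
        -d "$(alert_payload "$1")" \
        "${DISCORD_WEBHOOK_URL:?set DISCORD_WEBHOOK_URL first}"
}
```

Inside the health check, `send_alert "n8n is down! Status: $STATUS"` would replace the placeholder comment.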
---
## Getting Help
### Log Collection for Support
```bash
# Collect all relevant logs
ssh root@10.10.0.210 "
cd /opt/n8n
mkdir -p /tmp/n8n-debug
docker compose logs --tail=200 > /tmp/n8n-debug/docker-logs.txt
docker compose ps > /tmp/n8n-debug/container-status.txt
# Redact passwords, tokens, and the encryption key before sharing
sed -E 's/(PASSWORD|KEY|SECRET|TOKEN)=.*/\1=***/' .env > /tmp/n8n-debug/env-redacted.txt
df -h > /tmp/n8n-debug/disk-space.txt
free -h > /tmp/n8n-debug/memory.txt
docker stats --no-stream > /tmp/n8n-debug/container-stats.txt
tar -czf /root/n8n-debug-$(date +%Y%m%d-%H%M%S).tar.gz /tmp/n8n-debug/
"
```
### Resources
- **n8n Community Forum:** https://community.n8n.io/
- **Official Docs:** https://docs.n8n.io/
- **GitHub Issues:** https://github.com/n8n-io/n8n/issues
- **Discord:** https://discord.gg/n8n
### When to Escalate
Escalate to n8n community/support if:
- Database corruption suspected
- Consistent crashes with no clear cause
- Performance issues persist after optimization
- Security concerns
- Bug suspected in n8n itself
Always provide:
- n8n version: `docker inspect n8n | grep Image`
- Error messages from logs
- Steps to reproduce
- What you've already tried