claude-home/monitoring/recovered-lxc300/server-diagnostics/SKILL.md
Cal Corum 28abde7c9f chore: add recovered CT 302 configs, archive tdarr scripts, clean up repo
- Add recovered LXC 300/302 server-diagnostics configs as reference
  (headless Claude permission patterns, health check client)
- Archive decommissioned tdarr monitoring scripts
- Gitignore rpg-art/ directory
- Delete stray temp files and swarm-test/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 00:41:41 -06:00

---
name: server-diagnostics
description: |
  Automated server troubleshooting for Docker containers and system health.
  Provides SSH-based diagnostics, log reading, metrics collection, and low-risk
  remediation. USE WHEN N8N triggers troubleshooting, container issues detected,
  or system health checks needed.
---
# Server Diagnostics - Automated Troubleshooting
## When to Activate This Skill
- N8N triggers with error context
- "diagnose container X", "check docker status"
- "read logs from server", "check disk usage"
- "troubleshoot server issue"
- Any automated health check response
## Quick Start
### Check All Containers
```bash
python ~/.claude/skills/server-diagnostics/client.py docker-status paper-dynasty
```
### Quick Health Check (Docker + System Metrics)
```bash
python ~/.claude/skills/server-diagnostics/client.py health paper-dynasty
```
### Get Container Logs
```bash
python ~/.claude/skills/server-diagnostics/client.py docker-logs paper-dynasty paper-dynasty_discord-app_1 --lines 200
```
### Restart a Container
```bash
python ~/.claude/skills/server-diagnostics/client.py docker-restart paper-dynasty paper-dynasty_discord-app_1
```
### System Metrics
```bash
python ~/.claude/skills/server-diagnostics/client.py metrics paper-dynasty --type all
python ~/.claude/skills/server-diagnostics/client.py metrics paper-dynasty --type disk
```
### Run Diagnostic Command
```bash
python ~/.claude/skills/server-diagnostics/client.py diagnostic paper-dynasty disk_usage
python ~/.claude/skills/server-diagnostics/client.py diagnostic paper-dynasty memory_usage
```
## Troubleshooting Workflow
When an issue is reported:
1. **Quick Health Check** - Get overview of containers and system state
   ```bash
   python ~/.claude/skills/server-diagnostics/client.py health paper-dynasty
   ```
2. **Check MemoryGraph** - Recall similar issues
   ```bash
   python ~/.claude/skills/memorygraph/client.py recall "docker container error"
   ```
3. **Get Container Logs** - Look for errors
   ```bash
   python ~/.claude/skills/server-diagnostics/client.py docker-logs paper-dynasty <container> --lines 500 --filter error
   ```
4. **Remediate if Safe** - Restart if allowed
   ```bash
   python ~/.claude/skills/server-diagnostics/client.py docker-restart paper-dynasty <container>
   ```
5. **Store Solution** - Save to MemoryGraph if resolved
   ```bash
   python ~/.claude/skills/memorygraph/client.py store \
     --type solution \
     --title "Fixed <container> issue" \
     --content "Description of problem and solution" \
     --tags "docker,paper-dynasty,troubleshooting" \
     --importance 0.7
   ```
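The evidence-gathering steps (1 and 3) can be chained from Python. This is a minimal sketch, not part of the skill: `run_client` and `collect_evidence` are hypothetical names, and it assumes `client.py` prints the JSON object described under Output Format on stdout. The restart step is deliberately left to the caller.

```python
import json
import subprocess
from pathlib import Path

# Path taken from the Quick Start examples.
CLIENT = Path.home() / ".claude/skills/server-diagnostics/client.py"


def run_client(*args):
    """Run client.py with the given arguments and parse its JSON stdout.

    Assumption: the client always prints one JSON object; json.loads
    raises if stdout is empty or malformed.
    """
    proc = subprocess.run(
        ["python", str(CLIENT), *args], capture_output=True, text=True
    )
    return json.loads(proc.stdout)


def collect_evidence(server, container, run=run_client):
    """Gather a health overview and filtered error logs for one container."""
    health = run("health", server)
    logs = run(
        "docker-logs", server, container, "--lines", "500", "--filter", "error"
    )
    return {"health": health, "logs": logs}
```

Passing `run` as a parameter lets the flow be exercised with a stub instead of a live SSH target.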
## Server Inventory
| Server | IP | SSH User | Description |
|--------|-----|----------|-------------|
| paper-dynasty | 10.10.0.88 | cal | Paper Dynasty Discord bots and services |
## Monitored Containers
| Container | Critical | Restart Allowed | Description |
|-----------|----------|-----------------|-------------|
| paper-dynasty_discord-app_1 | Yes | Yes | Paper Dynasty Discord bot |
| paper-dynasty_db_1 | Yes | Yes | PostgreSQL database |
| paper-dynasty_adminer_1 | No | Yes | Database admin UI |
| sba-website_sba-web_1 | Yes | Yes | SBA website |
| sba-ghost_sba-ghost_1 | No | Yes | Ghost CMS |
## Available Diagnostic Commands
- `disk_usage` - `df -h`
- `memory_usage` - `free -h`
- `cpu_usage` - `top -bn1 | head -20`
- `cpu_load` - `uptime`
- `process_list` - `ps aux --sort=-%mem | head -20`
- `network_status` - `ss -tuln`
- `docker_ps` - `docker ps -a` (formatted)
- `docker_stats` - `docker stats --no-stream`
- `journal_errors` - `journalctl -p err -n 50`
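Because callers pass a name rather than a raw shell string, the list above works as an allowlist. A minimal sketch of that lookup, assuming a plain dictionary: `DIAGNOSTICS` and `resolve_diagnostic` are hypothetical names, and the skill's real mapping may be stored and structured differently.

```python
# Hypothetical name-to-command table mirroring the list above.
DIAGNOSTICS = {
    "disk_usage": "df -h",
    "memory_usage": "free -h",
    "cpu_usage": "top -bn1 | head -20",
    "cpu_load": "uptime",
    "process_list": "ps aux --sort=-%mem | head -20",
    "network_status": "ss -tuln",
    "docker_ps": "docker ps -a",  # the skill's format string is not shown here
    "docker_stats": "docker stats --no-stream",
    "journal_errors": "journalctl -p err -n 50",
}


def resolve_diagnostic(name):
    """Return the shell command for a known diagnostic name, else raise."""
    try:
        return DIAGNOSTICS[name]
    except KeyError:
        raise ValueError(f"unknown diagnostic: {name!r}") from None
```

Anything not in the table is refused before a command string is ever built.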
## Security Constraints
### DENIED Patterns (Will Be Rejected)
- `rm -rf`, `rm -r /`
- `dd if=`, `mkfs`
- `shutdown`, `reboot`
- `systemctl stop`
- `chmod 777`
- `wget|sh`, `curl|sh`
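One way such a deny-list could be checked is with regular expressions. This is a sketch under assumptions, not the skill's actual matcher: `DENIED` and `is_denied` are hypothetical names, and the exact expressions are one interpretation of the patterns above.

```python
import re

# Hypothetical regex rendering of the denied patterns listed above.
DENIED = [
    r"\brm\s+-rf\b",
    r"\brm\s+-r\s+/",
    r"\bdd\s+if=",
    r"\bmkfs\b",
    r"\bshutdown\b",
    r"\breboot\b",
    r"\bsystemctl\s+stop\b",
    r"\bchmod\s+777\b",
    r"\b(wget|curl)\b.*\|\s*sh\b",  # download piped straight into a shell
]
_DENIED_RE = [re.compile(pattern) for pattern in DENIED]


def is_denied(command):
    """Return True if the command matches any denied pattern."""
    return any(rx.search(command) for rx in _DENIED_RE)
```

Pattern matching like this is best treated as a backstop; it catches obvious cases but cannot anticipate every destructive command.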
### Container Restart Rules
- Only containers listed in `config.yaml` with `restart_allowed: true` can be restarted
- Restarting the N8N container is NEVER allowed (N8N is what triggers this skill)
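These two rules can be sketched as a single check. Everything about the data shape here is an assumption: the layout of `config.yaml` is not shown in this document, so the sketch takes an already-parsed dictionary of the form `{"containers": {name: {"restart_allowed": bool}}}`, and it identifies the N8N container by the substring `n8n` in its name. `can_restart` is a hypothetical name.

```python
def can_restart(container, config):
    """Return True only if the container is explicitly allowed to restart."""
    # Assumption: the N8N container has "n8n" somewhere in its name.
    if "n8n" in container.lower():
        return False
    entry = config.get("containers", {}).get(container)
    # Containers missing from the config are refused by default.
    return bool(entry and entry.get("restart_allowed") is True)
```

Refusing unknown containers by default keeps the check fail-closed when the config is incomplete.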
## MemoryGraph Integration
Before troubleshooting, check for known solutions:
```bash
python ~/.claude/skills/memorygraph/client.py recall "docker paper-dynasty"
```
After resolving, store the pattern:
```bash
python ~/.claude/skills/memorygraph/client.py store \
  --type solution \
  --title "Brief description" \
  --content "Full explanation..." \
  --tags "docker,paper-dynasty,fix" \
  --importance 0.7
```
## Common Issues and Solutions
### Container Not Running
1. Check logs for crash reason
2. Check disk space and memory
3. Attempt restart if allowed
4. Escalate if restart fails
### High Memory Usage
1. Check which container is consuming the most memory
2. Review the `docker_stats` diagnostic output
3. Check for memory leaks in logs
4. Consider container restart
### Disk Space Low
1. Run the `disk_usage` diagnostic
2. Check `docker system df`
3. Consider log rotation
4. Alert user for cleanup
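The first step yields `df -h` text, which can be scanned for nearly full filesystems. A minimal sketch, assuming the usual six-column `df -h` layout; `full_filesystems` is a hypothetical helper, and the 90% default threshold is an assumption rather than a value taken from the skill.

```python
def full_filesystems(df_output, threshold=90):
    """Return (mount point, use%) pairs at or above threshold from `df -h` text."""
    hits = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Expect: Filesystem Size Used Avail Use% Mounted-on
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        used = int(fields[4].rstrip("%"))
        if used >= threshold:
            # Rejoin in case the mount point contains spaces.
            hits.append((" ".join(fields[5:]), used))
    return hits
```

An empty result means no filesystem crossed the threshold, so the issue is more likely inside Docker's own storage.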
## Output Format
All commands return JSON. The `data` field holds parsed data when applicable (shown empty here):
```json
{
  "success": true,
  "stdout": "...",
  "stderr": "...",
  "returncode": 0,
  "data": {}
}
```
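A caller might load this structure into a typed object before acting on it. This is a sketch only: `CommandResult` and `parse_result` are hypothetical names, and it assumes `data` may be absent when there is nothing to parse.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CommandResult:
    success: bool
    stdout: str
    stderr: str
    returncode: int
    data: Optional[Any] = None  # parsed data, if the command provides any


def parse_result(raw):
    """Parse the client's JSON output into a CommandResult."""
    payload = json.loads(raw)
    return CommandResult(
        success=payload["success"],
        stdout=payload.get("stdout", ""),
        stderr=payload.get("stderr", ""),
        returncode=payload["returncode"],
        data=payload.get("data"),
    )
```

Checking `success` and `returncode` together guards against a command that ran but reported a failure.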