docs: add CT 302 SSH alias and git auth details to server-diagnostics
Documents the claude-runner SSH alias and the HTTPS token auth method, and notes that SSH git remotes don't work from CT 302.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

parent b75a09e86e
commit ed16fee9f7

monitoring/server-diagnostics/CONTEXT.md (normal file, 91 lines)
@@ -0,0 +1,91 @@
# Server Diagnostics — Deployment & Architecture

## Overview

Automated server health monitoring running on CT 302 (claude-runner, 10.10.0.148).
Two-tier system: Python health checks handle 99% of issues autonomously; Claude
is only invoked for complex failures that scripts can't resolve.

## Architecture

```
┌──────────────────────┐     ┌──────────────────────────────────┐
│  N8N (LXC 210)       │     │  CT 302 — claude-runner          │
│  10.10.0.210         │     │  10.10.0.148                     │
│                      │     │                                  │
│  ┌─────────────────┐ │ SSH │  ┌──────────────────────────┐    │
│  │ Cron: */15 min  │─┼─────┼─→│ health_check.py          │    │
│  │                 │ │     │  │ (exit 0/1/2)             │    │
│  │ Branch on exit: │ │     │  └──────────────────────────┘    │
│  │  0 → stop       │ │     │                                  │
│  │  1 → stop       │ │     │  ┌──────────────────────────┐    │
│  │  2 → invoke     │─┼─────┼─→│ claude --print           │    │
│  │    Claude       │ │     │  │ + client.py              │    │
│  └─────────────────┘ │     │  └──────────────────────────┘    │
│                      │     │                                  │
│  ┌─────────────────┐ │     │  SSH keys:                       │
│  │ Uptime Kuma     │ │     │  - homelab_rsa (→ target servers)│
│  │ webhook trigger │ │     │  - n8n_runner_key (← N8N)        │
│  └─────────────────┘ │     └──────────────────────────────────┘
└──────────────────────┘
                               │ SSH to target servers
                               ▼
┌────────────────┐  ┌────────────────┐  ┌────────────────┐
│ arr-stack      │  │ gitea          │  │ uptime-kuma    │
│ 10.10.0.221    │  │ 10.10.0.225    │  │ 10.10.0.227    │
│ Docker: sonarr │  │ systemd: gitea │  │ Docker: kuma   │
│ radarr, etc.   │  │ Docker: runner │  │                │
└────────────────┘  └────────────────┘  └────────────────┘
```
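The exit-code branching shown in the diagram can be summarized in a few lines of Python. This is only an illustrative sketch: the actual branching lives in the N8N workflow, and the names `Action` and `next_action` are hypothetical rather than taken from the repository. Rejecting codes other than 0, 1, and 2 is also an assumption, since the diagram defines only those three.

```python
from enum import Enum


class Action(Enum):
    """What the scheduler does after health_check.py exits (hypothetical names)."""

    STOP = "stop"
    INVOKE_CLAUDE = "invoke_claude"


def next_action(exit_code: int) -> Action:
    """Map a health_check.py exit code to the next workflow step.

    0 = healthy, 1 = auto-remediated, 2 = escalation.
    """
    if exit_code in (0, 1):
        return Action.STOP
    if exit_code == 2:
        return Action.INVOKE_CLAUDE
    # Only 0/1/2 are documented; failing loudly on anything else is an
    # assumption of this sketch, not documented behavior.
    raise ValueError(f"unexpected exit code: {exit_code}")
```

Keeping the mapping this small is what makes the first tier cheap: only one of the three documented outcomes reaches `claude --print`.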

## Cost Model

- **Exit 0** (healthy): $0 — pure Python, no API call
- **Exit 1** (auto-remediated): $0 — Python restarts the container and posts to a Discord webhook
- **Exit 2** (escalation): ~$0.10-0.15 per run — Claude Sonnet invoked via `claude --print`

At 96 checks/day (every 15 min), typical cost is near $0 unless something
actually breaks and can't be auto-fixed.
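The arithmetic behind these figures is easy to check. The sketch below assumes the per-escalation range quoted above (10 to 15 cents) and works in integer cents to avoid floating-point rounding; `daily_cost_cents` is a hypothetical helper, not part of the repository.

```python
CHECK_INTERVAL_MIN = 15
CHECKS_PER_DAY = 24 * 60 // CHECK_INTERVAL_MIN  # one check every 15 minutes

# Per-escalation cost range in cents, from the "~$0.10-0.15" figure.
ESCALATION_COST_CENTS = (10, 15)


def daily_cost_cents(escalations: int) -> tuple[int, int]:
    """Return the (low, high) estimated daily cost in cents.

    Only exit-2 runs cost anything; exit 0 and exit 1 runs are free.
    """
    if not 0 <= escalations <= CHECKS_PER_DAY:
        raise ValueError("escalations must be between 0 and CHECKS_PER_DAY")
    low, high = ESCALATION_COST_CENTS
    return escalations * low, escalations * high
```

Under these assumptions, a day with two escalations comes to roughly $0.20 to $0.30, and the theoretical worst case of all 96 checks escalating would be about $9.60 to $14.40.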

## Repository

- **Gitea:** `cal/claude-runner-monitoring`
- **Deployed to:** `/root/.claude` on CT 302
- **SSH alias:** `claude-runner` (root@10.10.0.148, defined in `~/.ssh/config`)
- **Update method:** `ssh claude-runner "cd /root/.claude && git pull"`

### Git Auth on CT 302

CT 302 pushes to Gitea over HTTPS using a token in an `Authorization` header (Gitea rejects embedded-credential URLs). The token is stored locally in `~/.claude/secrets/claude_runner_monitoring_gitea_token` and configured on CT 302 via:

```
git config http.https://git.manticorum.com/.extraHeader 'Authorization: token <token>'
```
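When scripting this step, building the command as an argument list avoids shell-quoting problems with the header value. The following is a minimal sketch, assuming the token file holds the bare token on a single line; `read_token` and `git_config_args` are hypothetical helper names, while the path and config key are the ones given above.

```python
from pathlib import Path

# Values taken from the section above.
TOKEN_PATH = Path("~/.claude/secrets/claude_runner_monitoring_gitea_token")
CONFIG_KEY = "http.https://git.manticorum.com/.extraHeader"


def read_token(path: Path = TOKEN_PATH) -> str:
    """Read the token file and strip surrounding whitespace."""
    return path.expanduser().read_text(encoding="utf-8").strip()


def git_config_args(token: str) -> list[str]:
    """Build the argv for the `git config` command shown above."""
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("token must be non-empty and contain no whitespace")
    return ["git", "config", CONFIG_KEY, f"Authorization: token {token}"]
```

The resulting list can be passed to `subprocess.run` from inside the repository checkout. The sketch never prints or logs the token.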

CT 302 does **not** have an SSH key registered with Gitea, so SSH git remotes won't work.
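A quick way to catch a misconfigured remote is to inspect its URL before pulling. The sketch below uses only the standard library; the helper names are hypothetical, and the example URLs in the usage note are assumptions built from the host and repository name mentioned in this document.

```python
from urllib.parse import urlsplit


def is_https_remote(url: str) -> bool:
    """Return True if a git remote URL uses the https scheme."""
    return urlsplit(url).scheme == "https"


def has_embedded_credentials(url: str) -> bool:
    """Return True if the URL carries user info (user[:password]@host)."""
    parts = urlsplit(url)
    return parts.username is not None or parts.password is not None
```

For example, `is_https_remote("git@git.manticorum.com:cal/claude-runner-monitoring.git")` is false because scp-style SSH remotes have no URL scheme, and an `https://user:token@...` URL would be flagged by `has_embedded_credentials`.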

## Files

| File | Purpose |
|------|---------|
| `CLAUDE.md` | Runner-specific instructions for Claude |
| `settings.json` | Locked-down permissions (read-only + restart only) |
| `skills/server-diagnostics/health_check.py` | Tier 1: automated health checks |
| `skills/server-diagnostics/client.py` | Tier 2: Claude's diagnostic toolkit |
| `skills/server-diagnostics/notifier.py` | Discord webhook notifications |
| `skills/server-diagnostics/config.yaml` | Server inventory + security rules |
| `skills/server-diagnostics/SKILL.md` | Skill reference |
| `skills/server-diagnostics/CLAUDE.md` | Remediation methodology |
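After a `git pull`, it can be useful to confirm that every file in the table is actually present. The sketch below is a hypothetical check, not part of the repository; the paths come from the table and are resolved relative to whatever root directory is passed in.

```python
from pathlib import Path

# Paths from the table above, relative to the deployed repository root.
EXPECTED_FILES = (
    "CLAUDE.md",
    "settings.json",
    "skills/server-diagnostics/health_check.py",
    "skills/server-diagnostics/client.py",
    "skills/server-diagnostics/notifier.py",
    "skills/server-diagnostics/config.yaml",
    "skills/server-diagnostics/SKILL.md",
    "skills/server-diagnostics/CLAUDE.md",
)


def missing_files(root: Path) -> list[str]:
    """Return the expected files that do not exist under `root`."""
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

On CT 302, `missing_files(Path("/root/.claude"))` would be expected to return an empty list for a complete deployment.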

## Adding a New Server

1. Add an entry to `skills/server-diagnostics/config.yaml` under `servers:` with hostname, containers, etc.
2. Confirm that CT 302 can reach the server over SSH (run on CT 302): `ssh -i /root/.ssh/homelab_rsa root@<ip> hostname`
3. Commit and push to Gitea, then pull on CT 302: `ssh claude-runner "cd /root/.claude && git pull"`
4. Add Uptime Kuma monitors if desired
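The connectivity check in step 2 can be scripted when several servers are added at once. This is a minimal sketch, assuming the command is run on CT 302 where the key lives; `ssh_probe_args` is a hypothetical helper, and validating the address with `ipaddress` is an addition of this sketch rather than something the original procedure requires.

```python
import ipaddress

# Key path from step 2 above.
KEY_PATH = "/root/.ssh/homelab_rsa"


def ssh_probe_args(ip: str, user: str = "root") -> list[str]:
    """Build the argv for the SSH connectivity check in step 2.

    Raises ValueError if `ip` is not a valid IP address.
    """
    ipaddress.ip_address(ip)  # validation only; the result is not used
    return ["ssh", "-i", KEY_PATH, f"{user}@{ip}", "hostname"]
```

Passing the result to `subprocess.run(..., check=True)` raises an error when the login fails, which makes the check easy to run in a loop over new addresses.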

## Related

- [monitoring/CONTEXT.md](../CONTEXT.md) — Overall monitoring architecture
- [productivity/n8n/CONTEXT.md](../../productivity/n8n/CONTEXT.md) — N8N deployment
- Uptime Kuma status page: https://status.manticorum.com