diff --git a/CLAUDE.md b/CLAUDE.md
index 1a8826b..09af6c5 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -52,8 +52,9 @@ When working with specific technologies, automatically load their dedicated cont
  - Load: `monitoring/CONTEXT.md` (technology overview and patterns)
  - Load: `monitoring/troubleshooting.md` (error handling and debugging)
  - If working in `/monitoring/scripts/`: Load `monitoring/scripts/CONTEXT.md` (script-specific documentation)
- - Note: Uptime Kuma centralized service monitoring on LXC 227 (10.10.0.227)
+ - Note: Uptime Kuma on LXC 227 (10.10.0.227) - 20 active monitors with Discord alerts
  - Note: Status page at https://status.manticorum.com (internal: http://10.10.0.227:3001)
+ - Note: Uptime Kuma API via `uptime-kuma-api` Python library, creds in `~/.claude/secrets/kuma_web_password`
  - Note: Windows desktop monitoring with Discord notifications available
  - Note: Comprehensive Tdarr API monitoring with dataclass-based status tracking
  - Note: Jellyfin GPU monitoring with auto-restart (`jellyfin_gpu_monitor.py`)
diff --git a/monitoring/CONTEXT.md b/monitoring/CONTEXT.md
index 10d14df..97fc17c 100644
--- a/monitoring/CONTEXT.md
+++ b/monitoring/CONTEXT.md
@@ -53,9 +53,9 @@ curl -X POST "$DISCORD_WEBHOOK" \
 
 **Key Features**:
 - HTTP/HTTPS, TCP, DNS, Docker, and ping monitoring
--Built-in Discord notification support
--Public/private status pages
--Multi-protocol health checks with configurable intervals
+- Discord notification integration (default alert channel for all monitors)
+- Public status page at https://status.manticorum.com
+- Multi-protocol health checks at 60-second intervals with 3 retries
 - Certificate expiration monitoring
 
 **Infrastructure**:
@@ -63,12 +63,42 @@ curl -X POST "$DISCORD_WEBHOOK" \
 - Docker with AppArmor unconfined (required for Docker-in-LXC)
 - Data persisted via Docker named volume (`uptime-kuma-data`)
 - Compose config: `server-configs/uptime-kuma/docker-compose/uptime-kuma/`
+- SSH alias: `uptime-kuma`
+- Admin credentials: username `cal`, password in `~/.claude/secrets/kuma_web_password`
 
-**Recommended Monitors**:
--- All Docker hosts: Jellyfin, Tdarr, n8n, Gitea, Foundry, Pi-holes, NPM, Discord bots
--- Databases: strat-database PostgreSQL instances
--- External: Akamai services, SBA website
--- Infrastructure: Proxmox API, Home Assistant
+**Active Monitors (20)**:
+
+| Tag | Monitor | Type | Target |
+|-----|---------|------|--------|
+| Infrastructure | Proxmox VE | HTTP | https://10.10.0.11:8006 |
+| Infrastructure | Home Assistant | HTTP | http://10.0.0.28:8123 |
+| DNS | Pi-hole Primary DNS | DNS | 10.10.0.16:53 |
+| DNS | Pi-hole Secondary DNS | DNS | 10.10.0.226:53 |
+| Media | Jellyfin | HTTP | http://10.10.0.226:8096 |
+| Media | Tdarr | HTTP | http://10.10.0.226:8265 |
+| Media | Sonarr | HTTP | http://10.10.0.221:8989 |
+| Media | Radarr | HTTP | http://10.10.0.221:7878 |
+| Media | Jellyseerr | HTTP | http://10.10.0.221:5055 |
+| DevOps | Gitea | HTTP | http://10.10.0.225:3000 |
+| DevOps | n8n | HTTP | http://10.10.0.210:5678 |
+| Networking | NPM Local (Admin) | HTTP | http://10.10.0.16:81 |
+| Networking | Pi-hole Primary Web | HTTP | http://10.10.0.16:81/admin |
+| Networking | Pi-hole Secondary Web | HTTP | http://10.10.0.226:8053/admin |
+| Gaming | Foundry VTT | HTTP | http://10.10.0.223:30000 |
+| AI | OpenClaw Gateway | HTTP | http://10.10.0.224:18789 |
+| Bots | discord-bots VM | Ping | 10.10.0.33 |
+| Bots | sba-bots VM | Ping | 10.10.0.88 |
+| Database | PostgreSQL (strat-database) | TCP | 10.10.0.42:5432 |
+| External | Akamai NPM | HTTP | http://172.237.147.99 |
+
+**Notifications**:
+- Discord webhook: "Discord - Homelab Alerts" (default, applied to all monitors)
+- Alerts on service down (after 3 retries at 30s intervals) and on recovery
+
+**API Access**:
+- Python library: `uptime-kuma-api` (pip installed)
+- Connection: `UptimeKumaApi("http://10.10.0.227:3001")`
+- Used for programmatic monitor/notification management
 
 ### Network and Service Monitoring
 **Purpose**: Monitor critical infrastructure availability
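
The **API Access** notes added in this diff give only the constructor call. As a rough sketch of how the connection might be used to list monitors, the following assumes the `login`, `get_monitors`, and `disconnect` methods of the `uptime-kuma-api` library. The helper names `read_secret`, `summarize_monitors`, `fetch_monitors`, and `main` are hypothetical and not part of the repository. The URL, username, and secrets path are the ones stated in the diff.

```python
from pathlib import Path


def read_secret(path: str) -> str:
    """Return a secrets file's contents with surrounding whitespace stripped."""
    return Path(path).expanduser().read_text().strip()


def summarize_monitors(monitors: list) -> dict:
    """Count monitors by their 'type' field (hypothetical helper)."""
    counts = {}
    for monitor in monitors:
        kind = monitor.get("type", "unknown")
        # The library may return an enum member; fall back to the raw value.
        kind = str(getattr(kind, "value", kind))
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def fetch_monitors(url: str, username: str, password: str) -> list:
    """Log in, fetch all monitors, and always close the connection."""
    # Imported lazily so the helpers above work without the library installed.
    from uptime_kuma_api import UptimeKumaApi

    api = UptimeKumaApi(url)
    try:
        api.login(username, password)
        return api.get_monitors()
    finally:
        api.disconnect()


def main() -> None:
    # Values taken from the notes in the diff; adjust for other environments.
    password = read_secret("~/.claude/secrets/kuma_web_password")
    monitors = fetch_monitors("http://10.10.0.227:3001", "cal", password)
    print(f"{len(monitors)} monitors by type: {summarize_monitors(monitors)}")
```

Calling `main()` from a host that can reach 10.10.0.227 should print a per-type count, which can be compared against the 20 monitors in the table. Reading the password from the secrets file keeps it out of scripts and shell history.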