From 3b2e031f45f00116e59da85587acc87c369554af Mon Sep 17 00:00:00 2001
From: Cal Corum
Date: Sat, 7 Feb 2026 23:02:49 -0600
Subject: [PATCH] Update monitoring docs with Uptime Kuma monitors and Discord alerts

Document all 20 active monitors with targets and tags, Discord
notification configuration, and API access details for programmatic
management via uptime-kuma-api.

Co-Authored-By: Claude Opus 4.6
---
 CLAUDE.md             |  3 ++-
 monitoring/CONTEXT.md | 46 +++++++++++++++++++++++++++++++++++--------
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 1a8826b..09af6c5 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -52,8 +52,9 @@ When working with specific technologies, automatically load their dedicated cont
   - Load: `monitoring/CONTEXT.md` (technology overview and patterns)
   - Load: `monitoring/troubleshooting.md` (error handling and debugging)
   - If working in `/monitoring/scripts/`: Load `monitoring/scripts/CONTEXT.md` (script-specific documentation)
-  - Note: Uptime Kuma centralized service monitoring on LXC 227 (10.10.0.227)
+  - Note: Uptime Kuma on LXC 227 (10.10.0.227) - 20 active monitors with Discord alerts
   - Note: Status page at https://status.manticorum.com (internal: http://10.10.0.227:3001)
+  - Note: Uptime Kuma API via `uptime-kuma-api` Python library, creds in `~/.claude/secrets/kuma_web_password`
   - Note: Windows desktop monitoring with Discord notifications available
   - Note: Comprehensive Tdarr API monitoring with dataclass-based status tracking
   - Note: Jellyfin GPU monitoring with auto-restart (`jellyfin_gpu_monitor.py`)
diff --git a/monitoring/CONTEXT.md b/monitoring/CONTEXT.md
index 10d14df..97fc17c 100644
--- a/monitoring/CONTEXT.md
+++ b/monitoring/CONTEXT.md
@@ -53,9 +53,9 @@ curl -X POST "$DISCORD_WEBHOOK" \
 
 **Key Features**:
 - HTTP/HTTPS, TCP, DNS, Docker, and ping monitoring
-- Built-in Discord notification support
-- Public/private status pages
-- Multi-protocol health checks with configurable intervals
+- Discord notification integration (default alert channel for all monitors)
+- Public status page at https://status.manticorum.com
+- Multi-protocol health checks at 60-second intervals with 3 retries
 - Certificate expiration monitoring
 
 **Infrastructure**:
@@ -63,12 +63,42 @@ curl -X POST "$DISCORD_WEBHOOK" \
 - Docker with AppArmor unconfined (required for Docker-in-LXC)
 - Data persisted via Docker named volume (`uptime-kuma-data`)
 - Compose config: `server-configs/uptime-kuma/docker-compose/uptime-kuma/`
+- SSH alias: `uptime-kuma`
+- Admin credentials: username `cal`, password in `~/.claude/secrets/kuma_web_password`
 
-**Recommended Monitors**:
-- All Docker hosts: Jellyfin, Tdarr, n8n, Gitea, Foundry, Pi-holes, NPM, Discord bots
-- Databases: strat-database PostgreSQL instances
-- External: Akamai services, SBA website
-- Infrastructure: Proxmox API, Home Assistant
+**Active Monitors (20)**:
+
+| Tag | Monitor | Type | Target |
+|-----|---------|------|--------|
+| Infrastructure | Proxmox VE | HTTP | https://10.10.0.11:8006 |
+| Infrastructure | Home Assistant | HTTP | http://10.0.0.28:8123 |
+| DNS | Pi-hole Primary DNS | DNS | 10.10.0.16:53 |
+| DNS | Pi-hole Secondary DNS | DNS | 10.10.0.226:53 |
+| Media | Jellyfin | HTTP | http://10.10.0.226:8096 |
+| Media | Tdarr | HTTP | http://10.10.0.226:8265 |
+| Media | Sonarr | HTTP | http://10.10.0.221:8989 |
+| Media | Radarr | HTTP | http://10.10.0.221:7878 |
+| Media | Jellyseerr | HTTP | http://10.10.0.221:5055 |
+| DevOps | Gitea | HTTP | http://10.10.0.225:3000 |
+| DevOps | n8n | HTTP | http://10.10.0.210:5678 |
+| Networking | NPM Local (Admin) | HTTP | http://10.10.0.16:81 |
+| Networking | Pi-hole Primary Web | HTTP | http://10.10.0.16:81/admin |
+| Networking | Pi-hole Secondary Web | HTTP | http://10.10.0.226:8053/admin |
+| Gaming | Foundry VTT | HTTP | http://10.10.0.223:30000 |
+| AI | OpenClaw Gateway | HTTP | http://10.10.0.224:18789 |
+| Bots | discord-bots VM | Ping | 10.10.0.33 |
+| Bots | sba-bots VM | Ping | 10.10.0.88 |
+| Database | PostgreSQL (strat-database) | TCP | 10.10.0.42:5432 |
+| External | Akamai NPM | HTTP | http://172.237.147.99 |
+
+**Notifications**:
+- Discord webhook: "Discord - Homelab Alerts" (default, applied to all monitors)
+- Alerts on service down (after 3 retries at 30s intervals) and on recovery
+
+**API Access**:
+- Python library: `uptime-kuma-api` (pip installed)
+- Connection: `UptimeKumaApi("http://10.10.0.227:3001")`
+- Used for programmatic monitor/notification management
 
 ### Network and Service Monitoring
 **Purpose**: Monitor critical infrastructure availability