store: Health check escalation logic: only critical-severity triggers exit 2

2026-02-19 21:59:38 -06:00 · 2026-02-19 21:59:38 -06:00 · 5ae40699d9
commit 5ae40699d9
parent 2624930a08
1 changed files with 12 additions and 0 deletions
--- a/graph/solutions/health-check-escalation-logic-only-critical-severity-trigger-ad5a91.md
+++ b/graph/solutions/health-check-escalation-logic-only-critical-severity-trigger-ad5a91.md
@ -0,0 +1,12 @@
+---
+id: ad5a911d-c7fa-4b7e-a488-17560c4067b3
+type: solution
+title: "Health check escalation logic: only critical-severity triggers exit 2"
+tags: [monitoring, claude-runner-monitoring, python, fix]
+importance: 0.8
+confidence: 0.8
+created: "2026-02-20T03:59:38.027169+00:00"
+updated: "2026-02-20T03:59:38.027169+00:00"
+---
+
+In health_check.py, the original logic put ALL non-auto-remediable issues into the escalations list, triggering exit 2 (Claude invocation) even for minor warnings like load_high. Fixed by only adding issues with `severity == "critical"` to escalations. Warnings are still included in the JSON output but don't trigger Claude escalation. This prevents unnecessary API costs from load spikes on LXCs sharing a Proxmox host (gitea and uptime-kuma both report the host's load, which often exceeds 2-core thresholds).