store: Architecture: N8N Server Health Monitor uses Master Loop + Sub-workflow pattern

Cal Corum 2026-03-01 09:58:58 -06:00
parent c0a62f3669
commit 3b3487a43f

@@ -0,0 +1,38 @@
---
id: 67898e52-470a-470e-b149-43fef0047ae9
type: decision
title: "Architecture: N8N Server Health Monitor uses Master Loop + Sub-workflow pattern"
tags: [n8n, server-diagnostics, workflow-architecture, decision, claude-home]
importance: 0.65
confidence: 0.8
created: "2026-03-01T15:58:58.161120+00:00"
updated: "2026-03-01T15:58:58.161120+00:00"
---
# N8N Server Health Monitor Architecture
## Workflow IDs
- **Master Loop**: `p7XmW23SgCs3hEkY` — runs every 5 minutes
- **Sub-workflow**: `BhzYmWr6NcIDoioy` — per-server health check
## Flow
1. Master Loop SSHes to CT 300, fetches server list
2. Splits list into items, calls sub-workflow per server
3. Sub-workflow runs `health_check.py --server {key}`, parses exit code:
   - `0` = healthy
   - `1` = remediated
   - `2` = escalation → runs `remediate.sh`
4. Master loop aggregates all results
5. Discord webhook fires **only if escalations found** (`has_escalations=true`)
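
The exit-code handling and aggregation in steps 3 to 5 can be sketched in Python as below. This is an illustration only, not the n8n node logic itself: `CheckResult`, `aggregate`, and the summary keys other than `has_escalations` are hypothetical names, and treating unknown exit codes as escalations is an assumption.

```python
from dataclasses import dataclass

# Exit-code meanings as listed in the flow above.
STATUS_BY_EXIT_CODE = {0: "healthy", 1: "remediated", 2: "escalation"}


@dataclass
class CheckResult:
    """One sub-workflow outcome (hypothetical shape)."""
    server: str
    exit_code: int

    @property
    def status(self) -> str:
        # Assumption: any exit code outside 0/1/2 is treated as an escalation.
        return STATUS_BY_EXIT_CODE.get(self.exit_code, "escalation")


def aggregate(results):
    """Collect per-server results into one summary for the notify step."""
    escalated = [r.server for r in results if r.status == "escalation"]
    return {
        "total": len(results),
        "escalated": escalated,
        "has_escalations": bool(escalated),
    }
```

With this shape, the Master Loop only needs to inspect `has_escalations` to decide whether the Discord step runs.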
## Rationale for Split Design
- Sub-workflow cleanly isolates per-server logic: 10/10 successful runs, 0 errors
- Master loop handles orchestration and notification separately
- Conditional Discord notify avoids alert spam on healthy runs
## Notes
- `onError` updates to the sub-workflow's SSH nodes were applied without corrupting the workflow (safe change)
- The escalation-only Discord path was the source of the silent Master Loop failure bug (the URL was missing)
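
One way to keep a missing URL from hiding on the escalation-only branch is to validate it on every run, before checking for escalations. The Python sketch below shows that idea under assumptions: `build_notify_request` and the summary keys are hypothetical, the message text is made up, and the function only builds the request (sending it is left to the caller). It is not the fix that was applied in n8n.

```python
import json
import urllib.request
from typing import Optional


def build_notify_request(webhook_url: str, summary: dict) -> Optional[urllib.request.Request]:
    """Return a POST request for the Discord webhook, or None when nothing escalated."""
    # Validate configuration first so a missing URL fails on every run,
    # not only on the rare run that reaches the escalation branch.
    if not webhook_url:
        raise ValueError("Discord webhook URL is not configured")
    if not summary.get("has_escalations"):
        return None  # healthy run: stay quiet
    servers = ", ".join(summary.get("escalated", []))
    # Discord webhooks accept a JSON body with a "content" field.
    body = json.dumps({"content": f"Escalations on: {servers}"}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Checking the URL before the `has_escalations` test trades a little noise on misconfigured healthy runs for an error that shows up immediately rather than during an incident.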
## Tags
n8n, server-diagnostics, workflow-architecture