feat: add session resumption and Agent SDK evaluation

- runner.sh: opt-in session persistence via session_resumable and resume_last_session settings; fix read_setting to normalize booleans
- issue-poller.sh: capture and log session_id from worker invocations, include in result JSON
- pr-reviewer-dispatcher.sh: capture and log session_id from reviews
- n8n workflow: add --append-system-prompt to initial SSH node, add Follow Up Diagnostics node using --resume for deeper investigation, update Discord Alert with remediation details
- Add Agent SDK evaluation doc (CLI vs Python/TS SDK comparison)
- Update CONTEXT.md with session resumption documentation

Closes #3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge pull request 'fix: document per-core load threshold policy for health monitoring (#22)' (#42) from issue/22-tune-n8n-alert-thresholds-to-per-core-load-metrics into main
2026-04-03 19:59:44 +00:00 · 2026-04-03 18:36:14 +00:00 · 2026-04-03 13:35:23 -05:00 · 2026-04-03 12:00:22 -05:00
2 changed files with 69 additions and 0 deletions
--- a/monitoring/server-diagnostics/CONTEXT.md
+++ b/monitoring/server-diagnostics/CONTEXT.md
@@ -92,6 +92,42 @@ CT 302 does **not** have an SSH key registered with Gitea, so SSH git remotes wo
 3. Commit to Gitea, pull on CT 302
 4. Add Uptime Kuma monitors if desired

+## Health Check Thresholds
+
+Thresholds are evaluated in `health_check.py`. All load thresholds use **per-core** metrics
+to avoid false positives from LXC containers (which see the Proxmox host's aggregate load).
+
+### Load Average
+
+| Metric | Value | Rationale |
+|--------|-------|-----------|
+| `LOAD_WARN_PER_CORE` | `0.7` | Elevated — investigate if sustained |
+| `LOAD_CRIT_PER_CORE` | `1.0` | Saturated — CPU is a bottleneck |
+| Sample window | 5-minute | Filters transient spikes (not 1-minute) |
+
+**Formula**: `load_per_core = load_5m / nproc`
+
+**Why per-core?** Proxmox LXC containers see the host's aggregate load average via the
+shared kernel. A 32-core Proxmox host at load 9 is at 0.28/core (healthy), but a naive
+absolute threshold of 2× the core count (8 for a 4-core LXC) would trigger at load 9.
+Dividing `load_5m` by `nproc`, the host's visible core count, gives the correct ratio.
+
+**Validation examples**:
+- Proxmox host: load 9 / 32 cores = 0.28/core → no alert ✓
+- VM 116 at 0.75/core → warning ✓ (above 0.7 threshold)
+- VM at 1.1/core → critical ✓
+
+### Other Thresholds
+
+| Check | Threshold | Notes |
+|-------|-----------|-------|
+| Zombie processes | 5 | Single zombies are transient noise; alert only if ≥ 5 |
+| Swap usage | 30% of total swap | Percentage-based to handle varied swap sizes across hosts |
+| Disk warning | 85% | |
+| Disk critical | 95% | |
+| Memory | 90% | |
+| Uptime alert | Non-urgent Discord post | Not a page-level alert |
+
 ## Related

 - [monitoring/CONTEXT.md](../CONTEXT.md) — Overall monitoring architecture
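
The per-core policy documented in the hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `health_check.py`: only the two threshold constants and the formula `load_5m / nproc` come from the diff. The function names `load_per_core`, `classify_load`, and `current_load_status` are hypothetical, and treating a value exactly at a threshold as triggering (`>=`) is an assumption the diff does not settle.

```python
import os

# Threshold values taken from the "Load Average" table in the diff.
LOAD_WARN_PER_CORE = 0.7
LOAD_CRIT_PER_CORE = 1.0


def load_per_core(load_5m: float, nproc: int) -> float:
    """Apply the documented formula: load_5m / nproc."""
    if nproc <= 0:
        raise ValueError("nproc must be positive")
    return load_5m / nproc


def classify_load(load_5m: float, nproc: int) -> str:
    """Hypothetical helper: map per-core load to 'ok', 'warning', or 'critical'.

    Assumption: a value exactly equal to a threshold triggers it.
    """
    ratio = load_per_core(load_5m, nproc)
    if ratio >= LOAD_CRIT_PER_CORE:
        return "critical"
    if ratio >= LOAD_WARN_PER_CORE:
        return "warning"
    return "ok"


def current_load_status() -> str:
    """Classify the live 5-minute load average of this machine (Unix only)."""
    _, load_5m, _ = os.getloadavg()  # (1-minute, 5-minute, 15-minute)
    return classify_load(load_5m, os.cpu_count() or 1)
```

With the validation examples from the diff, `classify_load(9, 32)` falls in the `ok` band, a 4-core machine at load 3.0 (0.75/core) is a `warning`, and the same machine at load 4.4 (1.1/core) is `critical`.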
--- a/workstation/troubleshooting.md
+++ b/workstation/troubleshooting.md
@@ -0,0 +1,33 @@
+---
+title: "Workstation Troubleshooting"
+description: "Troubleshooting notes for Nobara/KDE Wayland workstation issues."
+type: troubleshooting
+domain: workstation
+tags: [troubleshooting, wayland, kde]
+---
+
+# Workstation Troubleshooting
+
+## Discord screen sharing shows no windows on KDE Wayland (2026-04-03)
+
+**Severity:** Medium — cannot share screen via Discord desktop app
+
+**Problem:** Clicking "Share Your Screen" in the Discord desktop app (v0.0.131, Electron 37) opens Discord's picker, but it lists zero windows or screens. The same happens in both the desktop app and the web app whenever Discord's own picker is used. Both native Wayland and XWayland modes are affected.
+
+**Root Cause:** Discord's built-in screen picker uses Electron's `desktopCapturer.getSources()` which relies on X11 window enumeration. On KDE Wayland:
+- In native Wayland mode: no X11 windows exist, so the picker is empty
+- In forced X11/XWayland mode (`ELECTRON_OZONE_PLATFORM_HINT=x11`): Discord can only see other XWayland windows (itself, Android emulator), not native Wayland apps
+- Discord ignores `--use-fake-ui-for-media-stream` and other Chromium flags that should force portal usage
+- The `discord-flags.conf` file is **not read** by the Nobara/RPM Discord package — flags must go in the `.desktop` file `Exec=` line
+
+**Fix:** Use **Discord web app in Firefox** for screen sharing. Firefox natively delegates to the XDG Desktop Portal via PipeWire, which shows the KDE screen picker with all windows. The desktop app's own picker remains broken on Wayland as of v0.0.131.
+
+Configuration applied (for general Discord Wayland support):
+- `~/.local/share/applications/discord.desktop` — overrides system `.desktop` with Wayland flags
+- `~/.config/discord-flags.conf` — created but not read by this Discord build
+
+**Lesson:**
+- Discord desktop on Linux Wayland cannot do screen sharing through its own picker — always use the web app in Firefox for this
+- Electron's `desktopCapturer` API is fundamentally X11-only; the PipeWire/portal path requires the app to use `getDisplayMedia()` instead, which Discord's desktop app does not do
+- `discord-flags.conf` is unreliable across distros — always verify flags landed in `/proc/<pid>/cmdline`
+- Vesktop (community client) is an alternative that properly implements portal-based screen sharing, if the web app is insufficient
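
The advice in the troubleshooting note to verify flags in `/proc/<pid>/cmdline` can be sketched as below. This is a hedged illustration rather than a tool from the repository: the helper names `parse_cmdline`, `missing_flags`, and `check_pid` are hypothetical, and the flags in the usage note are examples only, since the diff does not list which flags were put on the `Exec=` line. It assumes the process has not rewritten its own argument vector, which some programs do.

```python
from pathlib import Path


def parse_cmdline(raw: bytes) -> list[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\0") if arg]


def missing_flags(raw: bytes, expected: list[str]) -> list[str]:
    """Return the expected flags that do not appear as exact arguments."""
    args = parse_cmdline(raw)
    return [flag for flag in expected if flag not in args]


def check_pid(pid: int, expected: list[str]) -> list[str]:
    """Read a live process's command line (Linux only) and report missing flags."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    return missing_flags(raw, expected)
```

For example, `check_pid(pid, ["--ozone-platform-hint=auto"])` returns an empty list when the flag reached the running process, and a list containing the flag when a config file such as `discord-flags.conf` was silently ignored.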
Author	SHA1	Message	Date
Cal Corum	e321e7bd47	feat: add session resumption and Agent SDK evaluation	2026-04-03 19:59:44 +00:00
cal	4e33e1cae3	Merge pull request 'fix: document per-core load threshold policy for health monitoring (#22)' (#42) from issue/22-tune-n8n-alert-thresholds-to-per-core-load-metrics into main	2026-04-03 18:36:14 +00:00
Cal Corum	193ae68f96	docs: document per-core load threshold policy for server health monitoring (#22) Closes #22 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-03 13:35:23 -05:00
Cal Corum	7c9c96eb52	docs: sync KB — troubleshooting.md	2026-04-03 12:00:22 -05:00