claude-home/development/kb-rag-system.md
Cal Corum 1ca0458a66
docs: sync KB — kb-rag-system.md
2026-03-17 22:44:09 -05:00

title: Knowledge Base RAG System
description: Semantic search system (md-kb-rag) over claude-home docs using vector embeddings. Covers Docker stack architecture on manticore, Qdrant + nomic-embed pipeline, MCP integration, Gitea webhook auto-sync, and troubleshooting.
type: guide
domain: development
tags: kb-rag, mcp, qdrant, embeddings, docker, gitea, semantic-search, manticore

Knowledge Base RAG System (md-kb-rag)

Overview

Semantic search over the entire claude-home documentation repo using vector embeddings. Runs on ubuntu-manticore (10.10.0.226) as a Docker stack, exposed as an MCP server to Claude Code for the claude-home project.

  • App: st0nefish/md-kb-rag (Rust)
  • Host: manticore (10.10.0.226)
  • Stack location: ~/docker/md-kb-rag/ on manticore
  • MCP endpoint: http://10.10.0.226:8001/mcp
  • Webhook endpoint: http://10.10.0.226:8001/hooks/reindex
  • Auto-sync: Gitea Actions workflow triggers on push to main (.md files only)

Architecture

Three containers in a single Docker Compose stack:

| Container | Image | Role | Port |
|---|---|---|---|
| md-kb-rag-kb-rag-1 | ghcr.io/st0nefish/md-kb-rag:latest | MCP server + indexer | 8001 |
| md-kb-rag-qdrant-1 | qdrant/qdrant:v1.17.0 | Vector database | 6333/6334 (localhost) |
| md-kb-rag-embeddings-1 | ghcr.io/ggml-org/llama.cpp:server-cuda | GPU embedding server | 8080 (internal) |

Embedding Model

  • Model: nomic-embed-text-v2-moe (Q8_0 quantization)
  • Vector size: 768 dimensions
  • Context size: 8192 tokens
  • GPU accelerated: NVIDIA CUDA with flash attention

Data Flow

push .md to main → Gitea Action → POST /hooks/reindex (HMAC-signed)
  → kb-rag: git pull → incremental index → nomic-embed generates vectors
  → stored in Qdrant → MCP search tool queries Qdrant → returns ranked chunks to Claude
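
The last step of the flow, returning ranked chunks, can be illustrated with a minimal sketch. It assumes cosine similarity as the distance metric (this guide does not state which metric the Qdrant collection uses) and replaces Qdrant with a hypothetical in-memory list of (path, vector) pairs. The function names are illustrative and are not part of md-kb-rag.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, limit=10):
    """Return (score, path) pairs sorted best-first.

    `chunks` is a list of (path, vector) pairs, a hypothetical
    in-memory stand-in for the vectors stored in Qdrant.
    """
    scored = [(cosine_similarity(query_vec, vec), path) for path, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

In the real stack the vectors have 768 dimensions and Qdrant performs the search with its own index, so this brute-force loop only shows what "ranked by score" means.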

MCP Integration

Claude Code Config

Registered as a user-level MCP server in ~/.claude.json under the top-level mcpServers key:

{
  "kb-search": {
    "type": "url",
    "url": "http://10.10.0.226:8001/mcp",
    "headers": {
      "Authorization": "Bearer <MCP_BEARER_TOKEN from .env>"
    }
  }
}

See workstation/claude-code-config.md for details on MCP server configuration.

Available MCP Tools

search

Semantic search across all indexed documents. Returns ranked chunks with scores.

query: "natural language search query"
limit: 10          # max results (default 10, max 50)
domain: null       # optional filter
type: null         # optional filter
tags: []           # optional tag filter
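
For reference, the parameters above map onto an MCP `tools/call` request. The following is a minimal sketch that builds such a request body, assuming the tool is registered under the name `search`. The helper name `build_search_call` and the client-side limit check are illustrative; how the server itself handles an out-of-range limit is not documented here.

```python
DEFAULT_LIMIT = 10  # documented default
MAX_LIMIT = 50      # documented maximum


def build_search_call(query, limit=DEFAULT_LIMIT, domain=None,
                      doc_type=None, tags=None, request_id=1):
    """Build a JSON-RPC 2.0 body for an MCP tools/call request."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")

    # Optional filters are only sent when they are set.
    arguments = {"query": query, "limit": limit}
    if domain is not None:
        arguments["domain"] = domain
    if doc_type is not None:
        arguments["type"] = doc_type
    if tags:
        arguments["tags"] = list(tags)

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search", "arguments": arguments},
    }
```

Claude Code builds and sends these requests itself; the sketch is only useful when probing the endpoint by hand.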

get_document

Retrieve full raw content of a document by file path (as returned by search results).

path: "/data/productivity/google-workspace-cli.md"

Auto-Sync Pipeline

The KB data lives at ~/docker/md-kb-rag/data/repo/ on manticore as a proper git clone of http://10.10.0.225:3000/cal/claude-home.git. Syncing is fully automated.

How It Works

  1. Push .md files to main branch on Gitea
  2. Gitea Actions workflow (.gitea/workflows/kb-reindex.yml) fires
  3. Workflow sends HMAC-SHA256 signed POST to http://10.10.0.226:8001/hooks/reindex
  4. md-kb-rag receives webhook → runs git fetch + git merge --ff-only (using GIT_PULL_TOKEN) → runs incremental reindex
  5. Only changed files are re-embedded (content hash comparison via SQLite state DB)
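
Step 5 can be sketched as follows. This is a minimal illustration of content-hash comparison against a SQLite state table, not the md-kb-rag implementation: the table name `file_state`, its columns, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import sqlite3


def content_hash(text):
    """SHA-256 hex digest of a file's content (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_state_db(path=":memory:"):
    """Open the state DB and create the hypothetical file_state table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS file_state ("
        "path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )
    return db


def files_to_reembed(db, files):
    """Return the sorted paths whose content is new or changed.

    `files` maps path -> content. The stored hash is updated for every
    path returned, so a second call with the same input returns [].
    """
    changed = []
    for path, text in sorted(files.items()):
        digest = content_hash(text)
        row = db.execute(
            "SELECT hash FROM file_state WHERE path = ?", (path,)
        ).fetchone()
        if row is None or row[0] != digest:
            changed.append(path)
            db.execute(
                "INSERT OR REPLACE INTO file_state (path, hash) VALUES (?, ?)",
                (path, digest),
            )
    db.commit()
    return changed
```

Only the paths returned by `files_to_reembed` would be sent to the embedding server, which is why an incremental run is fast when few files changed.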

Webhook Authentication

  • Provider: Gitea (native format)
  • Header: x-gitea-signature containing hex-encoded HMAC-SHA256 of the request body
  • Secret: stored as WEBHOOK_SECRET in .env on manticore and as KB_WEBHOOK_SECRET Gitea repo secret
  • Body must include {"ref": "refs/heads/main"} to match the configured branch
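
The checks above can be sketched on the receiving side as follows. This is a minimal illustration using Python's standard `hmac` module, not the md-kb-rag handler; the function names are hypothetical, and what the real server does on a mismatch (status code, logging) is not covered here.

```python
import hashlib
import hmac
import json


def sign_body(secret, body):
    """Hex-encoded HMAC-SHA256 of the raw request body (bytes)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_webhook(secret, body, signature_header, branch="main"):
    """Return True if the signature matches and the ref is the configured branch."""
    expected = sign_body(secret, body)
    received = signature_header.strip().lower()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8")):
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False  # body is not valid JSON
    return isinstance(payload, dict) and payload.get("ref") == f"refs/heads/{branch}"
```

The signature is computed over the exact bytes sent, so the body must not be re-serialized or reformatted between signing and posting.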

Gitea Actions Workflow

# .gitea/workflows/kb-reindex.yml
name: Reindex Knowledge Base
on:
  push:
    branches: [main]
    paths: ['**/*.md']
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger KB re-index
        env:
          WEBHOOK_SECRET: ${{ secrets.KB_WEBHOOK_SECRET }}
        run: |
          BODY='{"ref":"refs/heads/main"}'
          SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" | awk '{print $2}')
          curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
            -H "Content-Type: application/json" \
            -H "x-gitea-signature: $SIG" \
            -d "$BODY"

Manual Re-indexing

# Incremental (only changed/new files — fast, use this normally)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index"

# Full re-index (clears state DB, re-embeds everything — slow)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"

Manual Webhook Test

BODY='{"ref":"refs/heads/main"}'
SECRET='<webhook-secret>'
SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
  -H "Content-Type: application/json" \
  -H "x-gitea-signature: $SIG" \
  -d "$BODY"

Operations

Health Check

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"

Status (file list + Qdrant point count)

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag status"

Validate Markdown (without indexing)

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag validate"

View Logs

ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 50"

Restart Stack

ssh manticore "cd ~/docker/md-kb-rag && docker compose restart"

Adding New Documentation

  1. Create/edit the markdown file in /mnt/NV2/Development/claude-home/
  2. Update the relevant CONTEXT.md with a summary and link
  3. Commit and push to main — the pipeline handles the rest
  4. Verify with a kb-search MCP search query to confirm the new content is findable

Environment Variables (on manticore)

File: ~/docker/md-kb-rag/.env

| Variable | Purpose |
|---|---|
| MODEL_PATH | Path to embedding model directory |
| MODEL_FILE | Embedding model filename (nomic-embed-text-v2-moe.Q8_0.gguf) |
| KB_PATH | Path to knowledge base repo (./data/repo) |
| MCP_PORT | MCP server port (8001) |
| MCP_BEARER_TOKEN | Auth token for MCP endpoint |
| WEBHOOK_SECRET | HMAC secret for webhook auth (shared with Gitea repo secret) |
| GIT_PULL_TOKEN | Gitea token for authenticated git fetch during webhook reindex |
| RUST_LOG | Log level (info) |
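
A quick way to spot an incomplete .env is to compare it against the variables listed above. The sketch below is a simplified parser written for illustration: it handles plain `KEY=value` lines and surrounding quotes only, and treating every listed variable as expected is an assumption, since this guide does not say which ones md-kb-rag strictly requires.

```python
# Variables listed in this guide; whether each is mandatory is an assumption.
EXPECTED_VARS = (
    "MODEL_PATH", "MODEL_FILE", "KB_PATH", "MCP_PORT",
    "MCP_BEARER_TOKEN", "WEBHOOK_SECRET", "GIT_PULL_TOKEN", "RUST_LOG",
)


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(text, expected=EXPECTED_VARS):
    """Return expected variable names that are absent or empty."""
    values = parse_env(text)
    return [name for name in expected if not values.get(name)]
```

Passing the contents of ~/docker/md-kb-rag/.env to `missing_vars` returns an empty list when every listed variable has a value.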

Troubleshooting

Search returns no results

  • Check health: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"
  • Verify files are synced: ssh manticore "ls ~/docker/md-kb-rag/data/repo/path/to/file.md"
  • Re-index: run an incremental index first; use --full if the state DB is out of sync

MCP connection refused

  • Check container is running: ssh manticore "docker ps | grep kb-rag"
  • Check port is listening: ssh manticore "curl -s http://localhost:8001/health"
  • Restart: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart kb-rag"

Embedding server OOM / crash

  • The llama.cpp embedding server uses GPU memory. Check with ssh manticore "nvidia-smi"
  • Restart embeddings container: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart embeddings"

Stale index after deleting files

  • Incremental indexing doesn't remove orphaned vectors for deleted files
  • Run --full re-index to clean up: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"
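
To see which entries a full re-index would clean up, the indexed paths can be compared with the files still on disk. This is a minimal sketch with hypothetical helper names; how to obtain the list of indexed paths (for example from the `status` output) is left as an assumption.

```python
from pathlib import Path


def markdown_files(repo_root):
    """Relative POSIX paths of all .md files under repo_root."""
    root = Path(repo_root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*.md")}


def find_orphans(indexed_paths, repo_root):
    """Paths that are indexed but no longer exist in the repo checkout."""
    return sorted(set(indexed_paths) - markdown_files(repo_root))
```

An incremental run only visits files that exist, so a deleted file never produces a "changed" event and its vectors stay in Qdrant until a full re-index.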

Webhook returns 500 "Git pull failed"

  • Check container logs: ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 20"
  • "dubious ownership": The .gitconfig with safe.directory = /data isn't mounted or GIT_CONFIG_GLOBAL env var is missing
  • "Permission denied": Container must run as user: "1000:1000" to match repo file ownership
  • "FETCH_HEAD" error: Same as permission denied — uid mismatch
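
The uid mismatch behind the "Permission denied" and "FETCH_HEAD" errors can be checked with a short script. This is a minimal sketch, assuming it runs on a POSIX host with read access to the repo directory; the function name is illustrative.

```python
import os


def ownership_mismatches(repo_path, expected_uid=1000):
    """Return directories and files under repo_path not owned by expected_uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(repo_path):
        # Check the directory itself plus every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

Run against ~/docker/md-kb-rag/data/repo on manticore, an empty result means every path is owned by uid 1000, matching the container's `user: "1000:1000"` setting.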

MCP session disconnects after container restart

  • Run /mcp in Claude Code to reconnect
  • This happens because the Streamable HTTP session is invalidated when the container restarts

Docker Compose Notes

The kb-rag service has these non-obvious requirements:

  • user: "1000:1000" — must match the uid/gid that owns data/repo/ for git pull to work
  • config.yaml mount — provides source.git_url and branch so the webhook handler knows to run git pull
  • .gitconfig mount + GIT_CONFIG_GLOBAL env var — git needs safe.directory = /data since the volume owner differs from the container's default user

Changelog

2026-03-17 — Image Update + Config Fixes

Image pull: Updated ghcr.io/st0nefish/md-kb-rag:latest (8 upstream commits since initial deploy on 2026-03-11).

Key upstream changes applied:

  • GIT_PULL_TOKEN support — Webhook-triggered reindex now uses explicit git fetch + git merge --ff-only with a token injected into the HTTPS URL. Previously the git pull inside Docker was silently failing (no SSH client, dubious ownership errors).
  • Auto-clone on startup — Setting source.git_url allows the container to shallow-clone the repo into an empty volume on first boot. Not adopted (we use a bind-mount), but available.
  • EMBEDDING_API_KEY support — Optional env var for authenticated embedding providers. Not needed for local llama.cpp.
  • Custom MCP instructions — New mcp.instructions config field sets the server-level instructions block sent to MCP clients. Server auto-appends discovered filter metadata (domains, types, tags).
  • Bug fixes — Webhook rate limiter gap, globset deny-all fallback, RwLock panic in MCP startup, HTTP 429/503 retry logic for embedding API.

Config changes made:

  • Added GIT_PULL_TOKEN env var to .env (Gitea token with repo read access)
  • Added GIT_PULL_TOKEN=${GIT_PULL_TOKEN:-} to docker-compose.yml environment section
  • Added mcp.instructions to config.yaml with proactive search trigger keywords matching the claude-home topic areas

Env vars table update:

| Variable | Purpose |
|---|---|
| GIT_PULL_TOKEN | Gitea token for authenticated git fetch during webhook reindex |

Result: Webhook reindex pipeline now works end-to-end (push → Gitea Action → webhook → git fetch with auth → incremental reindex). Verified with live push test.