---
title: Knowledge Base RAG System
description: "Semantic search system (md-kb-rag) over claude-home docs using vector embeddings. Covers Docker stack architecture on manticore, Qdrant + nomic-embed pipeline, MCP integration, Gitea webhook auto-sync, and troubleshooting."
type: guide
domain: development
tags:
  - kb-rag
  - mcp
  - qdrant
  - embeddings
  - docker
  - gitea
  - semantic-search
  - manticore
---

Knowledge Base RAG System (md-kb-rag)

Overview

Semantic search over the entire claude-home documentation repo using vector embeddings. Runs on ubuntu-manticore (10.10.0.226) as a Docker stack, exposed as an MCP server to Claude Code for the claude-home project.

  • App: st0nefish/md-kb-rag (Rust)
  • Host: manticore (10.10.0.226)
  • Stack location: ~/docker/md-kb-rag/ on manticore
  • MCP endpoint: http://10.10.0.226:8001/mcp
  • Webhook endpoint: http://10.10.0.226:8001/hooks/reindex
  • Auto-sync: Gitea Actions workflow triggers on push to main (.md files only)

Architecture

Three containers in a single Docker Compose stack:

| Container | Image | Role | Port |
| --- | --- | --- | --- |
| md-kb-rag-kb-rag-1 | ghcr.io/st0nefish/md-kb-rag:latest | MCP server + indexer | 8001 |
| md-kb-rag-qdrant-1 | qdrant/qdrant:v1.17.0 | Vector database | 6333/6334 (localhost) |
| md-kb-rag-embeddings-1 | ghcr.io/ggml-org/llama.cpp:server-cuda | GPU embedding server | 8080 (internal) |

Embedding Model

  • Model: nomic-embed-text-v2-moe (Q8_0 quantization)
  • Vector size: 768 dimensions
  • Context size: 8192 tokens
  • GPU accelerated: NVIDIA CUDA with flash attention

Data Flow

push .md to main → Gitea Action → POST /hooks/reindex (HMAC-signed)
  → kb-rag: git pull → incremental index → nomic-embed generates vectors
  → stored in Qdrant → MCP search tool queries Qdrant → returns ranked chunks to Claude

MCP Integration

Claude Code Config

Registered as a user-level MCP server in ~/.claude.json under the top-level mcpServers key:

{
  "kb-search": {
    "type": "url",
    "url": "http://10.10.0.226:8001/mcp",
    "headers": {
      "Authorization": "Bearer <MCP_BEARER_TOKEN from .env>"
    }
  }
}

See workstation/claude-code-config.md for details on MCP server configuration.

Available MCP Tools

search

Semantic search across all indexed documents. Returns ranked chunks with scores.

query: "natural language search query"
limit: 10          # max results (default 10, max 50)
domain: null       # optional filter
type: null         # optional filter
tags: []           # optional tag filter
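For reference, an MCP tool invocation is a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the request body for the parameters listed above. It assumes the tool is registered under the name `search` and that the server accepts exactly these argument names. `build_search_request` is a hypothetical helper, not part of md-kb-rag, and the `initialize` handshake a real MCP client performs first is omitted.

```python
import json


def build_search_request(query, limit=10, domain=None, doc_type=None, tags=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` body for the search tool (tool name assumed)."""
    arguments = {
        "query": query,
        # Clamp to the documented range: default 10, max 50.
        "limit": max(1, min(limit, 50)),
    }
    # Optional filters are only sent when they are set.
    if domain is not None:
        arguments["domain"] = domain
    if doc_type is not None:
        arguments["type"] = doc_type
    if tags:
        arguments["tags"] = list(tags)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_search_request("gitea webhook setup", limit=5, domain="docker")
    print(json.dumps(request, indent=2))
```

In normal use Claude Code builds this request itself. The sketch is mainly useful when debugging the endpoint by hand.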

get_document

Retrieve full raw content of a document by file path (as returned by search results).

path: "/data/productivity/google-workspace-cli.md"

Auto-Sync Pipeline

The KB data lives at ~/docker/md-kb-rag/data/repo/ on manticore as a proper git clone of http://10.10.0.225:3000/cal/claude-home.git. Syncing is fully automated.

How It Works

  1. Push .md files to main branch on Gitea
  2. Gitea Actions workflow (.gitea/workflows/kb-reindex.yml) fires
  3. Workflow sends HMAC-SHA256 signed POST to http://10.10.0.226:8001/hooks/reindex
  4. md-kb-rag receives webhook → runs git pull --ff-only → runs incremental reindex
  5. Only changed files are re-embedded (content hash comparison via SQLite state DB)
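The change detection in step 5 can be illustrated with a short sketch. This is not md-kb-rag's actual code (the app is written in Rust). The table name `file_state`, its columns, and the use of SHA-256 are assumptions chosen for illustration; the source only states that content hashes are compared via a SQLite state DB.

```python
import hashlib
import sqlite3
from pathlib import Path


def changed_files(repo_root, db):
    """Return relative paths of .md files whose content hash differs from the stored one."""
    # Hypothetical schema: one row per file, keyed by its path relative to the repo root.
    db.execute(
        "CREATE TABLE IF NOT EXISTS file_state (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    changed = []
    for md in sorted(Path(repo_root).rglob("*.md")):
        rel = md.relative_to(repo_root).as_posix()
        digest = hashlib.sha256(md.read_bytes()).hexdigest()
        row = db.execute("SELECT sha256 FROM file_state WHERE path = ?", (rel,)).fetchone()
        if row is None or row[0] != digest:
            changed.append(rel)
            # A real indexer would re-embed the file here before recording the new hash.
            db.execute(
                "INSERT OR REPLACE INTO file_state (path, sha256) VALUES (?, ?)",
                (rel, digest),
            )
    db.commit()
    return changed
```

Running it twice against an unchanged tree returns an empty list the second time, which is why incremental runs are fast.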

Webhook Authentication

  • Provider: Gitea (native format)
  • Header: x-gitea-signature containing hex-encoded HMAC-SHA256 of the request body
  • Secret: stored as WEBHOOK_SECRET in .env on manticore and as KB_WEBHOOK_SECRET Gitea repo secret
  • Body must include {"ref": "refs/heads/main"} to match the configured branch
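The checks above can be sketched from the receiver's side. This is an illustrative Python version, not the handler md-kb-rag actually runs. `verify_gitea_webhook` and its parameters are hypothetical names. It recomputes the HMAC-SHA256 of the raw body, compares it to the `x-gitea-signature` value in constant time, and then confirms that `ref` matches the configured branch.

```python
import hashlib
import hmac
import json


def verify_gitea_webhook(body: bytes, signature_hex: str, secret: str, branch: str = "main") -> bool:
    """Return True if the signature matches the body and the ref targets `branch`."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # Compare as bytes in constant time to avoid leaking the match position.
    if not hmac.compare_digest(expected.encode(), received.encode()):
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("ref") == f"refs/heads/{branch}"
```

The signature covers the exact bytes of the request body, so the sender must sign the same string it posts. Re-serializing the JSON between signing and sending would change the digest.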

Gitea Actions Workflow

# .gitea/workflows/kb-reindex.yml
name: Reindex Knowledge Base
on:
  push:
    branches: [main]
    paths: ['**/*.md']
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger KB re-index
        env:
          WEBHOOK_SECRET: ${{ secrets.KB_WEBHOOK_SECRET }}
        run: |
          BODY='{"ref":"refs/heads/main"}'
          SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" | awk '{print $2}')
          curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
            -H "Content-Type: application/json" \
            -H "x-gitea-signature: $SIG" \
            -d "$BODY"

Manual Re-indexing

# Incremental (only changed/new files — fast, use this normally)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index"

# Full re-index (clears state DB, re-embeds everything — slow)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"

Manual Webhook Test

BODY='{"ref":"refs/heads/main"}'
SECRET='<webhook-secret>'
SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
  -H "Content-Type: application/json" \
  -H "x-gitea-signature: $SIG" \
  -d "$BODY"

Operations

Health Check

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"

Status (file list + Qdrant point count)

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag status"

Validate Markdown (without indexing)

ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag validate"

View Logs

ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 50"

Restart Stack

ssh manticore "cd ~/docker/md-kb-rag && docker compose restart"

Adding New Documentation

  1. Create/edit the markdown file in /mnt/NV2/Development/claude-home/
  2. Update the relevant CONTEXT.md with a summary and link
  3. Commit and push to main — the pipeline handles the rest
  4. Verify with a kb-search MCP search query to confirm the new content is findable

Environment Variables (on manticore)

File: ~/docker/md-kb-rag/.env

| Variable | Purpose |
| --- | --- |
| MODEL_PATH | Path to embedding model directory |
| MODEL_FILE | Embedding model filename (nomic-embed-text-v2-moe.Q8_0.gguf) |
| KB_PATH | Path to knowledge base repo (./data/repo) |
| MCP_PORT | MCP server port (8001) |
| MCP_BEARER_TOKEN | Auth token for MCP endpoint |
| WEBHOOK_SECRET | HMAC secret for webhook auth (shared with Gitea repo secret) |
| RUST_LOG | Log level (info) |

Troubleshooting

Search returns no results

  • Check health: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"
  • Verify files are synced: ssh manticore "ls ~/docker/md-kb-rag/data/repo/path/to/file.md"
  • Re-index: run md-kb-rag index (see Manual Re-indexing); add --full if the state DB is out of sync

MCP connection refused

  • Check container is running: ssh manticore "docker ps | grep kb-rag"
  • Check port is listening: ssh manticore "curl -s http://localhost:8001/health"
  • Restart: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart kb-rag"

Embedding server OOM / crash

  • The llama.cpp embedding server runs on the GPU. Check GPU memory usage with ssh manticore "nvidia-smi"
  • Restart embeddings container: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart embeddings"

Stale index after deleting files

  • Incremental indexing doesn't remove orphaned vectors for deleted files
  • Run --full re-index to clean up: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"
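Before paying for a full re-index, it can help to confirm that orphans actually exist. The sketch below assumes you already have a list of indexed paths relative to the repo root, for example copied from the `status` output. The format of that output is not documented here, so parsing it is left out. `find_orphans` is a hypothetical helper.

```python
from pathlib import Path


def find_orphans(indexed_paths, repo_root):
    """Return indexed paths (relative to repo_root) that no longer exist on disk."""
    root = Path(repo_root)
    return sorted(p for p in set(indexed_paths) if not (root / p).is_file())
```

A non-empty result means the index still holds entries for deleted files and `--full` is worth running.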

Webhook returns 500 "Git pull failed"

  • Check container logs: ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 20"
  • "dubious ownership": The .gitconfig with safe.directory = /data isn't mounted, or the GIT_CONFIG_GLOBAL env var is missing
  • "Permission denied": Container must run as user: "1000:1000" to match repo file ownership
  • "FETCH_HEAD" error: Same cause as "Permission denied" (uid mismatch between the container user and the repo owner)

MCP session disconnects after container restart

  • Run /mcp in Claude Code to reconnect
  • This happens because the Streamable HTTP session is invalidated when the container restarts

Docker Compose Notes

The kb-rag service has these non-obvious requirements:

  • user: "1000:1000" — must match the uid/gid that owns data/repo/ for git pull to work
  • config.yaml mount — provides source.git_url and branch so the webhook handler knows to run git pull
  • .gitconfig mount + GIT_CONFIG_GLOBAL env var — git needs safe.directory = /data since the volume owner differs from the container's default user
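The uid/gid requirement can be checked from the host before touching the compose file. This is a minimal sketch, assuming it runs on manticore against the repo path given earlier. `repo_owner` and `owner_matches` are hypothetical helpers, and the 1000:1000 default simply mirrors the `user:` value above.

```python
import os


def repo_owner(path):
    """Return the (uid, gid) that owns `path`."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def owner_matches(path, uid=1000, gid=1000):
    """Return True if `path` is owned by the uid/gid the container runs as."""
    return repo_owner(path) == (uid, gid)


if __name__ == "__main__":
    repo = os.path.expanduser("~/docker/md-kb-rag/data/repo")
    if os.path.exists(repo):
        print(repo_owner(repo), owner_matches(repo))
    else:
        print(f"{repo} not found; run this on the host that holds the stack")
```

A mismatch here is the usual cause of the "Permission denied" and "FETCH_HEAD" errors described under Troubleshooting.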