| title | description | type | domain | tags |
|---|---|---|---|---|
| Knowledge Base RAG System | Semantic search system (md-kb-rag) over claude-home docs using vector embeddings. Covers Docker stack architecture on manticore, Qdrant + nomic-embed pipeline, MCP integration, Gitea webhook auto-sync, and troubleshooting. | guide | development | |
Knowledge Base RAG System (md-kb-rag)
Overview
Semantic search over the entire claude-home documentation repo using vector embeddings. Runs on ubuntu-manticore (10.10.0.226) as a Docker stack, exposed as an MCP server to Claude Code for the claude-home project.
- App: st0nefish/md-kb-rag (Rust)
- Host: manticore (10.10.0.226)
- Stack location: ~/docker/md-kb-rag/ on manticore
- MCP endpoint: http://10.10.0.226:8001/mcp
- Webhook endpoint: http://10.10.0.226:8001/hooks/reindex
- Auto-sync: Gitea Actions workflow triggers on push to main (.md files only)
Architecture
Three containers in a single Docker Compose stack:
| Container | Image | Role | Port |
|---|---|---|---|
| md-kb-rag-kb-rag-1 | ghcr.io/st0nefish/md-kb-rag:latest | MCP server + indexer | 8001 |
| md-kb-rag-qdrant-1 | qdrant/qdrant:v1.17.0 | Vector database | 6333/6334 (localhost) |
| md-kb-rag-embeddings-1 | ghcr.io/ggml-org/llama.cpp:server-cuda | GPU embedding server | 8080 (internal) |
Embedding Model
- Model: nomic-embed-text-v2-moe (Q8_0 quantization)
- Vector size: 768 dimensions
- Context size: 8192 tokens
- GPU accelerated: NVIDIA CUDA with flash attention
Data Flow
push .md to main → Gitea Action → POST /hooks/reindex (HMAC-signed)
→ kb-rag: git pull → incremental index → nomic-embed generates vectors
→ stored in Qdrant → MCP search tool queries Qdrant → returns ranked chunks to Claude
MCP Integration
Claude Code Config
Registered as a user-level MCP server in ~/.claude.json under the top-level mcpServers key:
{
  "kb-search": {
    "type": "url",
    "url": "http://10.10.0.226:8001/mcp",
    "headers": {
      "Authorization": "Bearer <MCP_BEARER_TOKEN from .env>"
    }
  }
}
See workstation/claude-code-config.md for details on MCP server configuration.
Available MCP Tools
search
Semantic search across all indexed documents. Returns ranked chunks with scores.
query: "natural language search query"
limit: 10 # max results (default 10, max 50)
domain: null # optional filter
type: null # optional filter
tags: [] # optional tag filter
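MCP clients such as Claude Code build the tool call themselves, but when debugging it can help to see what a search invocation looks like on the wire. The sketch below assumes the standard MCP JSON-RPC `tools/call` envelope and the `search` tool with the `query` and `limit` arguments listed above. The helper name `search_request` is hypothetical, and the session initialization that the Streamable HTTP transport requires before any tool call is deliberately left out.

```shell
# Hypothetical sketch: build a JSON-RPC "tools/call" body for the search tool.
# Naive on purpose: the query must not contain double quotes or backslashes,
# because no JSON escaping is performed.
# $1 = request id, $2 = query text, $3 = max results
search_request() {
  printf '{"jsonrpc":"2.0","id":%d,"method":"tools/call","params":{"name":"search","arguments":{"query":"%s","limit":%d}}}\n' \
    "$1" "$2" "$3"
}

search_request 1 "qdrant port mapping" 5
```

The optional `domain`, `type`, and `tags` filters would go into the same `arguments` object.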
get_document
Retrieve full raw content of a document by file path (as returned by search results).
path: "/data/productivity/google-workspace-cli.md"
Auto-Sync Pipeline
The KB data lives at ~/docker/md-kb-rag/data/repo/ on manticore as a proper git clone of http://10.10.0.225:3000/cal/claude-home.git. Syncing is fully automated.
How It Works
- Push .md files to main branch on Gitea
- Gitea Actions workflow (.gitea/workflows/kb-reindex.yml) fires
- Workflow sends HMAC-SHA256 signed POST to http://10.10.0.226:8001/hooks/reindex
- md-kb-rag receives webhook → runs git fetch + git merge --ff-only (using GIT_PULL_TOKEN) → runs incremental reindex
- Only changed files are re-embedded (content hash comparison via SQLite state DB)
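The content-hash comparison in the last step can be illustrated with a small sketch. It is not the md-kb-rag implementation: a plain-text manifest stands in for the SQLite state DB, SHA-256 is an assumed hash (the doc does not say which one the tool uses), and the function names `changed_md_files` and `update_manifest` are hypothetical. It relies on `sha256sum` from GNU coreutils and does not handle file names containing newlines.

```shell
# Hypothetical sketch of content-hash change detection.
# A plain-text manifest ("<sha256>  <path>" per line) stands in for the
# SQLite state DB; the real hash algorithm and schema are not documented here.

# Print every .md file under $1 whose current hash is not recorded in $2.
changed_md_files() (
  root=$1
  manifest=$2
  find "$root" -type f -name '*.md' | sort | while IFS= read -r file; do
    hash=$(sha256sum "$file" | awk '{print $1}')
    # -F: literal match, -x: whole line; a miss means the file is new or edited.
    if ! grep -Fxq "$hash  $file" "$manifest" 2>/dev/null; then
      printf '%s\n' "$file"
    fi
  done
)

# Record the current hashes so the next run skips unchanged files.
update_manifest() {
  find "$1" -type f -name '*.md' -exec sha256sum {} + > "$2"
}
```

Calling `changed_md_files` before `update_manifest` yields the list of files that would need re-embedding; files whose hash is already recorded are skipped.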
Webhook Authentication
- Provider: Gitea (native format)
- Header:
x-gitea-signaturecontaining hex-encoded HMAC-SHA256 of the request body - Secret: stored as
WEBHOOK_SECRETin.envon manticore and asKB_WEBHOOK_SECRETGitea repo secret - Body must include
{"ref": "refs/heads/main"}to match the configured branch
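The signing scheme above can be checked from both sides with a short sketch. This is an illustration only, not how md-kb-rag verifies requests internally (that code is Rust and not shown here). The function names `sign_body` and `verify_signature` and the secret value are placeholders, and `openssl` is assumed to be installed.

```shell
# Hypothetical sketch: sign a request body and verify a received signature.
# Not the md-kb-rag implementation; function names are illustrative.

# Print the hex-encoded HMAC-SHA256 of body $2 under secret $1.
sign_body() {
  # printf '%s' avoids a trailing newline, which would change the digest.
  printf '%s' "$2" | openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# Succeed (exit 0) only if signature $3 matches body $2 under secret $1.
verify_signature() {
  [ "$(sign_body "$1" "$2")" = "$3" ]
}

BODY='{"ref":"refs/heads/main"}'
SECRET='example-secret'   # placeholder, not the real WEBHOOK_SECRET
SIG=$(sign_body "$SECRET" "$BODY")

if verify_signature "$SECRET" "$BODY" "$SIG"; then
  echo "signature ok"
else
  echo "signature mismatch"
fi
```

The signature covers the exact bytes of the body, so any change to the payload, even whitespace, invalidates it. A plain string comparison is not constant-time, which is fine for a local check like this; a server would normally use a constant-time comparison.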
Gitea Actions Workflow
# .gitea/workflows/kb-reindex.yml
name: Reindex Knowledge Base
on:
  push:
    branches: [main]
    paths: ['**/*.md']
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger KB re-index
        env:
          WEBHOOK_SECRET: ${{ secrets.KB_WEBHOOK_SECRET }}
        run: |
          BODY='{"ref":"refs/heads/main"}'
          SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" | awk '{print $2}')
          curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
            -H "Content-Type: application/json" \
            -H "x-gitea-signature: $SIG" \
            -d "$BODY"
Manual Re-indexing
# Incremental (only changed/new files — fast, use this normally)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index"
# Full re-index (clears state DB, re-embeds everything — slow)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"
Manual Webhook Test
BODY='{"ref":"refs/heads/main"}'
SECRET='<webhook-secret>'
SIG=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
curl -sf -X POST http://10.10.0.226:8001/hooks/reindex \
-H "Content-Type: application/json" \
-H "x-gitea-signature: $SIG" \
-d "$BODY"
Operations
Health Check
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"
Status (file list + Qdrant point count)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag status"
Validate Markdown (without indexing)
ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag validate"
View Logs
ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 50"
Restart Stack
ssh manticore "cd ~/docker/md-kb-rag && docker compose restart"
Adding New Documentation
- Create/edit the markdown file in /mnt/NV2/Development/claude-home/
- Update the relevant CONTEXT.md with a summary and link
- Commit and push to main; the pipeline handles the rest
- Verify with a kb-search MCP search query to confirm the new content is findable
Environment Variables (on manticore)
File: ~/docker/md-kb-rag/.env
| Variable | Purpose |
|---|---|
| MODEL_PATH | Path to embedding model directory |
| MODEL_FILE | Embedding model filename (nomic-embed-text-v2-moe.Q8_0.gguf) |
| KB_PATH | Path to knowledge base repo (./data/repo) |
| MCP_PORT | MCP server port (8001) |
| MCP_BEARER_TOKEN | Auth token for MCP endpoint |
| WEBHOOK_SECRET | HMAC secret for webhook auth (shared with Gitea repo secret) |
| GIT_PULL_TOKEN | Gitea token for authenticated git fetch during webhook reindex |
| RUST_LOG | Log level (info) |
Troubleshooting
Search returns no results
- Check health: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag health"
- Verify files are synced: ssh manticore "ls ~/docker/md-kb-rag/data/repo/path/to/file.md"
- Re-index: may need --full if state DB is out of sync
MCP connection refused
- Check container is running: ssh manticore "docker ps | grep kb-rag"
- Check port is listening: ssh manticore "curl -s http://localhost:8001/health"
- Restart: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart kb-rag"
Embedding server OOM / crash
- The llama.cpp embedding server uses GPU memory. Check with ssh manticore "nvidia-smi"
- Restart embeddings container: ssh manticore "cd ~/docker/md-kb-rag && docker compose restart embeddings"
Stale index after deleting files
- Incremental indexing doesn't remove orphaned vectors for deleted files
- Run --full re-index to clean up: ssh manticore "docker exec md-kb-rag-kb-rag-1 md-kb-rag index --full"
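To see which entries a full re-index would clean up, the indexed paths can be compared against the files still on disk. The sketch below is an assumption-heavy illustration: the function name `orphaned_paths` is hypothetical, it expects a text file with one indexed path per line relative to the repo root, and this doc does not specify how to export such a list from md-kb-rag.

```shell
# Hypothetical sketch: list indexed paths that no longer exist on disk.
# $1 = file with one indexed path per line (relative to the repo root),
# $2 = repo root. How the indexed paths are exported is left open.
orphaned_paths() (
  indexed_sorted=$(mktemp)
  on_disk_sorted=$(mktemp)
  sort -u "$1" > "$indexed_sorted"
  (cd "$2" && find . -type f -name '*.md' | sed 's|^\./||' | sort -u) > "$on_disk_sorted"
  # comm -23 keeps lines that appear only in the first (indexed) list.
  comm -23 "$indexed_sorted" "$on_disk_sorted"
  rm -f "$indexed_sorted" "$on_disk_sorted"
)
```

Any path it prints is indexed but missing from the repo, which is the situation where orphaned vectors remain until a `--full` re-index.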
Webhook returns 500 "Git pull failed"
- Check container logs: ssh manticore "docker logs md-kb-rag-kb-rag-1 --tail 20"
- "dubious ownership": The .gitconfig with safe.directory = /data isn't mounted or GIT_CONFIG_GLOBAL env var is missing
- "Permission denied": Container must run as user: "1000:1000" to match repo file ownership
- "FETCH_HEAD" error: Same as permission denied (uid mismatch)
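Since two of these errors come down to a uid/gid mismatch, a quick ownership check on the host can confirm or rule that out. This is a minimal sketch assuming GNU `stat` (Linux); the function name `owner_matches` and the `REPO_DIR` variable are hypothetical, and the default path is the repo location given earlier in this doc.

```shell
# Hypothetical sketch: check that a directory is owned by the expected uid:gid.
# stat -c is the GNU coreutils form (Linux); BSD/macOS stat uses different flags.
# $1 = path to check, $2 = expected "uid:gid"
owner_matches() {
  [ "$(stat -c '%u:%g' "$1" 2>/dev/null)" = "$2" ]
}

REPO_DIR="${REPO_DIR:-$HOME/docker/md-kb-rag/data/repo}"
if owner_matches "$REPO_DIR" "1000:1000"; then
  echo "ownership ok"
else
  echo "ownership mismatch (or path missing): expected 1000:1000"
fi
```

This only inspects the top-level directory; files inside it, including those under .git, can still have a different owner.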
MCP session disconnects after container restart
- Run /mcp in Claude Code to reconnect
- This happens because the Streamable HTTP session is invalidated when the container restarts
Docker Compose Notes
The kb-rag service has these non-obvious requirements:
- user: "1000:1000": must match the uid/gid that owns data/repo/ for git pull to work
- config.yaml mount: provides source.git_url and branch so the webhook handler knows to run git pull
- .gitconfig mount + GIT_CONFIG_GLOBAL env var: git needs safe.directory = /data since the volume owner differs from the container's default user
Changelog
2026-03-17 — Image Update + Config Fixes
Image pull: Updated ghcr.io/st0nefish/md-kb-rag:latest (8 upstream commits since initial deploy on 2026-03-11).
Key upstream changes applied:
- GIT_PULL_TOKEN support: Webhook-triggered reindex now uses explicit git fetch + git merge --ff-only with a token injected into the HTTPS URL. Previously the git pull inside Docker was silently failing (no SSH client, dubious ownership errors).
- Auto-clone on startup: Setting source.git_url allows the container to shallow-clone the repo into an empty volume on first boot. Not adopted (we use a bind-mount), but available.
- EMBEDDING_API_KEY support: Optional env var for authenticated embedding providers. Not needed for local llama.cpp.
- Custom MCP instructions: New mcp.instructions config field sets the server-level instructions block sent to MCP clients. Server auto-appends discovered filter metadata (domains, types, tags).
- Bug fixes: Webhook rate limiter gap, globset deny-all fallback, RwLock panic in MCP startup, HTTP 429/503 retry logic for embedding API.
Config changes made:
- Added GIT_PULL_TOKEN env var to .env (Gitea token with repo read access)
- Added GIT_PULL_TOKEN=${GIT_PULL_TOKEN:-} to docker-compose.yml environment section
- Added mcp.instructions to config.yaml with proactive search trigger keywords matching the claude-home topic areas
Env vars table update:
| Variable | Purpose |
|---|---|
| GIT_PULL_TOKEN | Gitea token for authenticated git fetch during webhook reindex |
Result: Webhook reindex pipeline now works end-to-end (push → Gitea Action → webhook → git fetch with auth → incremental reindex). Verified with live push test.