Strat-O-Matic rules Q&A chatbot with Discord integration
Cal Corum c3218f70c4 refactor: hexagonal architecture with ports & adapters, DI, and test-first development
Domain layer (zero framework imports):
- domain/models.py: pure dataclasses (RuleDocument, RuleSearchResult,
  Conversation, ChatMessage, LLMResponse, ChatResult)
- domain/ports.py: ABC interfaces (RuleRepository, LLMPort,
  ConversationStore, IssueTracker)
- domain/services.py: ChatService orchestrates Q&A flow using only ports
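
A minimal sketch of how the domain layer above might fit together, assuming nothing beyond the names listed in this commit message. The class names (RuleRepository, LLMPort, ChatService, RuleSearchResult, LLMResponse) come from the message; every field, method name, and signature is a hypothetical stand-in, and the real ports are likely async.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RuleSearchResult:
    # Field names are assumptions, not the project's actual model.
    rule_id: str
    content: str
    similarity: float


@dataclass(frozen=True)
class LLMResponse:
    answer: str
    cited_rules: list[str] = field(default_factory=list)
    confidence: float = 0.0


class RuleRepository(ABC):
    """Port: anything that can find rules relevant to a question."""

    @abstractmethod
    def search(self, query: str, top_k: int) -> list[RuleSearchResult]: ...


class LLMPort(ABC):
    """Port: anything that can answer a question given candidate rules."""

    @abstractmethod
    def generate(self, question: str, rules: list[RuleSearchResult]) -> LLMResponse: ...


class ChatService:
    """Orchestrates the Q&A flow using only the ports, never a concrete adapter."""

    def __init__(self, rules: RuleRepository, llm: LLMPort, top_k: int = 10) -> None:
        self._rules = rules
        self._llm = llm
        self._top_k = top_k

    def answer(self, question: str) -> LLMResponse:
        hits = self._rules.search(question, self._top_k)
        return self._llm.generate(question, hits)
```

Because ChatService only sees the abstract ports, a test can pass in-memory fakes while production wiring passes the ChromaDB and OpenRouter adapters.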

Outbound adapters (implement domain ports):
- adapters/outbound/openrouter.py: OpenRouterLLM with persistent httpx
  client, robust JSON parsing, regex citation fallback
- adapters/outbound/sqlite_convos.py: SQLiteConversationStore with
  async_sessionmaker, timezone-aware datetimes, cleanup support
- adapters/outbound/gitea_issues.py: GiteaIssueTracker with markdown
  injection protection (fenced code blocks)
- adapters/outbound/chroma_rules.py: ChromaRuleRepository with clamped
  similarity scores
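
Two of the safeguards named above can be sketched in a few lines. This is an illustration only: both function names are hypothetical, the similarity formula assumes a distance where similarity = 1 - distance, and the fencing strategy is one common way to neutralize user-supplied Markdown, not necessarily what gitea_issues.py does.

```python
import re


def clamp_similarity(distance: float) -> float:
    """Map a distance to a similarity score clamped to [0.0, 1.0].

    Assumes similarity = 1 - distance; out-of-range values are clamped
    rather than passed through.
    """
    return max(0.0, min(1.0, 1.0 - distance))


def fence_untrusted(text: str) -> str:
    """Wrap untrusted text in a code fence it cannot close early.

    The fence is one backtick longer than the longest backtick run in the
    text, with a minimum length of three.
    """
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * max(3, longest + 1)
    return f"{fence}\n{text}\n{fence}"
```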

Inbound adapter:
- adapters/inbound/api.py: thin FastAPI router with input validation
  (max_length constraints), proper HTTP status codes (503 for missing LLM)

Configuration & wiring:
- config/settings.py: Pydantic v2 SettingsConfigDict (no module-level singleton)
- config/container.py: create_app() factory with lifespan-managed DI
- main.py: minimal entry point

Test infrastructure (90 tests, all passing):
- tests/fakes/: in-memory implementations of all 4 ports
- tests/domain/: 26 tests for models and ChatService
- tests/adapters/: 64 tests for all adapters using fakes/mocks
- No real API calls, no model downloads, no disk I/O in fast tests
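
As a rough illustration of the fakes approach, here is what an in-memory implementation of one port could look like. ConversationStore is named in this commit message; its methods here (add_message, history) are assumptions made for the sketch.

```python
from abc import ABC, abstractmethod


class ConversationStore(ABC):
    """Stand-in for the port in domain/ports.py; method names are assumed."""

    @abstractmethod
    def add_message(self, conversation_id: str, role: str, content: str) -> None: ...

    @abstractmethod
    def history(self, conversation_id: str) -> list[tuple[str, str]]: ...


class InMemoryConversationStore(ConversationStore):
    """Fake adapter: keeps everything in a dict, so tests need no database."""

    def __init__(self) -> None:
        self._messages: dict[str, list[tuple[str, str]]] = {}

    def add_message(self, conversation_id: str, role: str, content: str) -> None:
        self._messages.setdefault(conversation_id, []).append((role, content))

    def history(self, conversation_id: str) -> list[tuple[str, str]]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._messages.get(conversation_id, []))


def test_history_is_kept_per_conversation() -> None:
    store = InMemoryConversationStore()
    store.add_message("c1", "user", "Can a runner steal?")
    store.add_message("c1", "assistant", "Yes, see Rule 5.2.1(b).")
    assert store.history("c1") == [
        ("user", "Can a runner steal?"),
        ("assistant", "Yes, see Rule 5.2.1(b)."),
    ]
    assert store.history("unknown") == []
```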

Also fixes: aiosqlite version constraint (>=0.19.0), adds hatch build
targets for new package layout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 15:51:16 -05:00
adapters refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
app fix: resolve 4 critical bugs found in code review 2026-03-08 15:31:11 -05:00
config refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
data/rules feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
domain refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
scripts feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
tests refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
.env.example feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
.gitignore feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
docker-compose.yml feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
Dockerfile feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
main.py refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
pyproject.toml refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
pyrightconfig.json refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00
README.md feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
setup.sh feat: initial chatbot implementation with FastAPI, ChromaDB, Discord bot, and Gitea integration 2026-03-08 15:19:26 -05:00
uv.lock refactor: hexagonal architecture with ports & adapters, DI, and test-first development 2026-03-08 15:51:16 -05:00

Strat-Chatbot

AI-powered Q&A chatbot for Strat-O-Matic baseball league rules.

Features

  • Natural language Q&A: Ask questions about league rules in plain English
  • Semantic search: Uses ChromaDB vector embeddings to find relevant rules
  • Rule citations: Always cites specific rule IDs (e.g., "Rule 5.2.1(b)")
  • Conversation threading: Maintains conversation context for follow-up questions
  • Gitea integration: Automatically creates issues for unanswered questions
  • Discord integration: Slash command /ask with reply-based follow-ups

Architecture

┌─────────┐    ┌──────────────┐    ┌─────────────┐
│ Discord │────│ FastAPI      │────│ ChromaDB    │
│ Bot     │    │ (port 8000)  │    │ (vectors)   │
└─────────┘    └──────────────┘    └─────────────┘
                                          │
                                  ┌───────▼──────┐
                                  │ Markdown     │
                                  │ Rule Files   │
                                  └──────────────┘
                                          │
                                  ┌───────▼──────┐
                                  │ OpenRouter   │
                                  │ (LLM API)    │
                                  └──────────────┘
                                          │
                                  ┌───────▼──────┐
                                  │ Gitea        │
                                  │ Issues       │
                                  └──────────────┘

Quick Start

Prerequisites

  • Docker & Docker Compose
  • OpenRouter API key
  • Discord bot token (optional, omit to run the API only)
  • Gitea token (optional, for issue creation)

Setup

  1. Clone and configure
cd strat-chatbot
cp .env.example .env
# Edit .env with your API keys and tokens
  2. Prepare rules

Place your rule documents in data/rules/ as Markdown files with YAML frontmatter:

---
rule_id: "5.2.1(b)"
title: "Stolen Base Attempts"
section: "Baserunning"
parent_rule: "5.2"
page_ref: "32"
---

When a runner attempts to steal...
  3. Ingest rules
# With Docker Compose (recommended)
docker compose up -d
docker compose exec api python scripts/ingest_rules.py

# Or locally
uv sync
uv run scripts/ingest_rules.py
  4. Start services
docker compose up -d

The API will be available at http://localhost:8000

The Discord bot will connect and sync slash commands.
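
To make the rule file format from the "Prepare rules" step concrete, here is a minimal sketch of splitting such a file into its frontmatter fields and Markdown body. It is not the project's ingestion code: parse_rule_file is a hypothetical helper, and it only handles flat `key: "value"` lines like the example. Real YAML (lists, nesting, multi-line values) needs a proper YAML parser.

```python
def parse_rule_file(text: str) -> tuple[dict[str, str], str]:
    """Split a rule document into (frontmatter fields, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")

    # Find the line that closes the frontmatter block.
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("missing closing '---' frontmatter delimiter") from None

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        # Drop surrounding whitespace and the optional double quotes.
        meta[key.strip()] = value.strip().strip('"')

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Calling it on the example document yields a dict with the five fields (rule_id, title, section, parent_rule, page_ref) and the rule text as the body.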

Runtime Configuration

Environment Variable  Required?  Description
OPENROUTER_API_KEY    Yes        OpenRouter API key
OPENROUTER_MODEL      No         Model ID (default: stepfun/step-3.5-flash:free)
DISCORD_BOT_TOKEN     No         Discord bot token (omit to run API only)
DISCORD_GUILD_ID      No         Guild ID for slash command sync (faster than global)
GITEA_TOKEN           No         Gitea API token (for issue creation)
GITEA_OWNER           No         Gitea username (default: cal)
GITEA_REPO            No         Repository name (default: strat-chatbot)
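
The table above can be read as a small settings object with one required field and documented defaults. The sketch below uses a plain standard-library dataclass to show that reading; it is a swapped-in illustration, not the project's own settings class, and the names Settings and load_settings are hypothetical.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional

DEFAULT_MODEL = "stepfun/step-3.5-flash:free"


@dataclass(frozen=True)
class Settings:
    openrouter_api_key: str
    openrouter_model: str = DEFAULT_MODEL
    discord_bot_token: Optional[str] = None  # None means "run the API only"
    discord_guild_id: Optional[str] = None
    gitea_token: Optional[str] = None
    gitea_owner: str = "cal"
    gitea_repo: str = "strat-chatbot"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, applying the documented defaults."""
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is required")
    return Settings(
        openrouter_api_key=api_key,
        # Empty strings are treated the same as unset variables.
        openrouter_model=env.get("OPENROUTER_MODEL") or DEFAULT_MODEL,
        discord_bot_token=env.get("DISCORD_BOT_TOKEN") or None,
        discord_guild_id=env.get("DISCORD_GUILD_ID") or None,
        gitea_token=env.get("GITEA_TOKEN") or None,
        gitea_owner=env.get("GITEA_OWNER") or "cal",
        gitea_repo=env.get("GITEA_REPO") or "strat-chatbot",
    )
```

Passing the environment in as a mapping, instead of reading os.environ inside the function body, keeps the loader easy to test with a plain dict.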

API Endpoints

Endpoint  Method  Description
/health   GET     Health check with stats
/chat     POST    Send a question and get a response
/stats    GET     Knowledge base and system statistics

Chat Request

{
  "message": "Can a runner steal on a 2-2 count?",
  "user_id": "123456789",
  "channel_id": "987654321",
  "conversation_id": "optional-uuid",
  "parent_message_id": "optional-parent-uuid"
}

Chat Response

{
  "response": "Yes, according to Rule 5.2.1(b)...",
  "conversation_id": "conv-uuid",
  "message_id": "msg-uuid",
  "cited_rules": ["5.2.1(b)", "5.3"],
  "confidence": 0.85,
  "needs_human": false
}
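
The request and response shapes above suggest how a follow-up question could be threaded onto an earlier answer. The sketch below assumes that a follow-up reuses the previous response's conversation_id and passes its message_id as parent_message_id. That mapping is an assumption drawn from the field names, and follow_up_payload is a hypothetical helper.

```python
def follow_up_payload(previous: dict, message: str, user_id: str, channel_id: str) -> dict:
    """Build a /chat request body that continues an existing conversation."""
    return {
        "message": message,
        "user_id": user_id,
        "channel_id": channel_id,
        # Assumed threading: carry over the IDs returned by the previous call.
        "conversation_id": previous["conversation_id"],
        "parent_message_id": previous["message_id"],
    }
```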

Development

Local Development (without Docker)

# Install dependencies
uv sync

# Ingest rules
uv run scripts/ingest_rules.py

# Run API server
uv run app/main.py

# In another terminal, run Discord bot
uv run app/discord_bot.py

Project Structure

strat-chatbot/
├── app/
│   ├── __init__.py
│   ├── config.py          # Configuration management
│   ├── database.py        # SQLAlchemy conversation state
│   ├── gitea.py           # Gitea API client
│   ├── llm.py             # OpenRouter integration
│   ├── main.py            # FastAPI app
│   ├── models.py          # Pydantic models
│   ├── vector_store.py    # ChromaDB wrapper
│   └── discord_bot.py     # Discord bot
├── data/
│   ├── chroma/            # Vector DB (auto-created)
│   └── rules/             # Your markdown rule files
├── scripts/
│   └── ingest_rules.py    # Ingestion pipeline
├── tests/                 # Test files
├── .env.example
├── Dockerfile
├── docker-compose.yml
└── pyproject.toml

Performance Optimizations

  • Embedding cache: ChromaDB persists embeddings on disk
  • Rule chunking: Each rule is stored as a separate document, so a rule's text is never split across chunks
  • Top-k search: Configurable number of rules to retrieve (default: 10)
  • Conversation TTL: 30 minutes to limit database size
  • Async operations: All I/O is non-blocking
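
The 30-minute conversation TTL can be pictured as a simple expiry check. This is a sketch under assumptions: it measures the TTL from the last activity timestamp (the README does not say whether it counts from creation or last activity), and is_expired and prune are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CONVERSATION_TTL = timedelta(minutes=30)


def is_expired(last_activity: datetime, now: Optional[datetime] = None) -> bool:
    """True when more than the TTL has passed since the last activity."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_activity > CONVERSATION_TTL


def prune(conversations: dict[str, datetime], now: Optional[datetime] = None) -> dict[str, datetime]:
    """Return only the conversations that are still within the TTL."""
    return {
        conv_id: seen
        for conv_id, seen in conversations.items()
        if not is_expired(seen, now)
    }
```

Timestamps are kept timezone-aware (UTC) so that subtraction never mixes naive and aware values.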

Testing the API

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What happens if the pitcher balks?",
    "user_id": "test123",
    "channel_id": "general"
  }'
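
For scripted checks, the same request can be built in Python with only the standard library. This is a rough equivalent of the curl call above, assuming the API is running at http://localhost:8000; build_chat_request and ask are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    message: str,
    user_id: str,
    channel_id: str,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request with a JSON body."""
    payload = {"message": message, "user_id": user_id, "channel_id": channel_id}
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(message: str, user_id: str, channel_id: str) -> dict:
    """Send the request and decode the JSON response (needs a running API)."""
    request = build_chat_request(message, user_id, channel_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Usage, with the services started: `ask("What happens if the pitcher balks?", "test123", "general")`.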

Gitea Integration

When the bot encounters a question it can't answer confidently (confidence < 0.4), it will automatically:

  1. Log the question to console
  2. Create an issue in your configured Gitea repo
  3. Include: user ID, channel, question, attempted rules, conversation link

Issues are labeled with:

  • rules-gap - needs a rule addition or clarification
  • ai-generated - created by AI bot
  • needs-review - requires human administrator attention
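
The escalation rule described in this section can be sketched as a threshold check plus a function that gathers the issue fields. The 0.4 threshold, the listed fields, and the three labels come from the text above; the function names, the title format, and the returned dict shape are assumptions, and how that dict maps onto the Gitea API is deliberately left out.

```python
CONFIDENCE_THRESHOLD = 0.4  # escalate when confidence < 0.4
ISSUE_LABELS = ("rules-gap", "ai-generated", "needs-review")


def needs_human(confidence: float) -> bool:
    """True when the answer is too uncertain to stand without review."""
    return confidence < CONFIDENCE_THRESHOLD


def build_issue(
    question: str,
    user_id: str,
    channel_id: str,
    attempted_rules: list[str],
    conversation_link: str,
) -> dict:
    """Collect the issue fields into a plain dict (not a Gitea API payload)."""
    attempted = ", ".join(attempted_rules) if attempted_rules else "none"
    body = "\n".join(
        [
            f"User ID: {user_id}",
            f"Channel: {channel_id}",
            f"Question: {question}",
            f"Attempted rules: {attempted}",
            f"Conversation: {conversation_link}",
        ]
    )
    return {
        "title": f"Unanswered rules question: {question[:60]}",
        "body": body,
        "labels": list(ISSUE_LABELS),
    }
```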

To-Do

  • Build OpenRouter Docker client with proper torch dependencies
  • Add PDF ingestion support (convert PDF → Markdown)
  • Implement rule change detection and incremental updates
  • Add rate limiting per Discord user
  • Create admin endpoints for rule management
  • Add Prometheus metrics for monitoring
  • Build unit and integration tests

License

TBD