# Voice-Controlled Automation Architecture

## Vision: Speech → Claude Code → Home Assistant Pipeline

### High-Level Flow

```
[Microphone] → [STT Engine] → [Command Parser] → [Claude Code API] → [Home Assistant] → [Actions]
```
## Component Architecture

### 1. Speech-to-Text (STT) Engine - Local Options

**Whisper (OpenAI) - Recommended**
- Excellent accuracy, runs locally
- Multiple model sizes (tiny to large)
- GPU acceleration available
- Container deployment: `onerahmet/openai-whisper-asr-webservice` (used in the compose file and sketch below)
**Alternative: Vosk**
- Lighter weight, faster response
- Good for command recognition
- Multiple language models available
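A minimal round-trip against the Whisper container can be tested before any bridge logic exists. The sketch below assumes the `onerahmet/openai-whisper-asr-webservice` image from the compose file later in this document, which exposes a `POST /asr` endpoint taking a multipart `audio_file` field; verify the parameters against the deployed image version.

```python
# Minimal STT round-trip against the whisper-api container (sketch).
# Endpoint and field names assume onerahmet/openai-whisper-asr-webservice.
import requests

WHISPER_URL = "http://localhost:9000"  # whisper-api port from the compose file

def transcribe(wav_path: str) -> str:
    """Send a WAV file to the Whisper webservice and return the transcript."""
    with open(wav_path, "rb") as f:
        resp = requests.post(
            f"{WHISPER_URL}/asr",
            params={"task": "transcribe", "output": "json"},
            files={"audio_file": f},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()["text"].strip()

if __name__ == "__main__":
    print(transcribe("test-command.wav"))  # hypothetical test recording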
### 2. Voice Activity Detection (VAD)

**Wake Word Detection**
- Porcupine (Picovoice) - local wake word detection
- Custom wake phrases: "Hey Claude", "Computer", etc.
- Always-listening with privacy protection
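A sketch of the always-listening loop, assuming the `pvporcupine` and `pvrecorder` Python packages and a Picovoice access key. `computer` is one of Porcupine's built-in keywords; a custom phrase like "Hey Claude" requires a trained `.ppn` keyword file.

```python
# Always-listening wake word loop (sketch) using Porcupine.
import pvporcupine
from pvrecorder import PvRecorder

porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_ACCESS_KEY",  # placeholder
    keywords=["computer"],                   # built-in keyword
)
recorder = PvRecorder(frame_length=porcupine.frame_length)
recorder.start()

try:
    while True:
        pcm = recorder.read()
        if porcupine.process(pcm) >= 0:
            # Hand off to the STT pipeline: record until silence, then transcribe
            print("Wake word detected")
except KeyboardInterrupt:
    pass
finally:
    recorder.delete()
    porcupine.delete()
```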
**Push-to-Talk Alternative**
- Hardware button integration
- Mobile app trigger
- Keyboard shortcut
### 3. Command Processing Pipeline

**Natural Language Parser**
- Claude Code interprets spoken commands
- Converts to Home Assistant service calls
- Handles context and ambiguity
**Command Categories:**
- Direct device control: "Turn off living room lights"
- Scene activation: "Set movie mode"
- Status queries: "What's the temperature upstairs?"
- Complex automations: "Start my morning routine"
### 4. Claude Code Integration

**API Bridge Service**
- Local service accepting STT output
- Formats requests to Claude Code API
- Maintains conversation context
- Returns structured HA commands
**Command Translation Example:**

- Speech: "Turn down the bedroom lights"
- Claude: interprets this as a `light.turn_on` call with reduced brightness (HA dims lights via `turn_on`, not a separate dim service)
- HA command: `{"service": "light.turn_on", "target": {"entity_id": "light.bedroom"}, "data": {"brightness_pct": 30}}`
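A sketch of the bridge's translate-and-execute step. The HA side (`POST /api/services/<domain>/<service>` with a Bearer token) is Home Assistant's documented REST API; the Anthropic client usage is standard, but the model id and system prompt here are illustrative placeholders, and a real prompt would need entity context.

```python
# Translate a transcript into an HA service call via Claude, then execute it.
import json
import os

import anthropic
import requests

HA_URL = os.environ["HA_URL"]      # e.g. http://homeassistant:8123
HA_TOKEN = os.environ["HA_TOKEN"]  # long-lived access token
client = anthropic.Anthropic(api_key=os.environ["CLAUDE_API_KEY"])

SYSTEM_PROMPT = (  # illustrative; the real prompt needs the HA entity list
    "Translate the user's spoken command into a Home Assistant service call. "
    'Reply with JSON only: {"domain": ..., "service": ..., "data": {...}}'
)

def interpret(transcript: str) -> dict:
    """Ask Claude to map a spoken command to a structured service call."""
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model id
        max_tokens=256,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": transcript}],
    )
    return json.loads(msg.content[0].text)

def execute(call: dict) -> None:
    """POST the service call to Home Assistant's REST API."""
    resp = requests.post(
        f"{HA_URL}/api/services/{call['domain']}/{call['service']}",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        json=call.get("data", {}),
        timeout=10,
    )
    resp.raise_for_status()

execute(interpret("Turn down the bedroom lights"))
```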
### 5. Home Assistant Integration

**RESTful API Integration**
- Direct API calls to HA instance
- WebSocket connection for real-time updates
- Authentication via long-lived access tokens
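For the real-time side, HA's WebSocket API uses a small documented handshake: the server sends `auth_required`, the client replies with the long-lived token, then subscribes to events. A sketch using the `websockets` package:

```python
# Subscribe to state_changed events over HA's WebSocket API (sketch).
import asyncio
import json
import os

import websockets

async def listen() -> None:
    url = os.environ["HA_URL"].replace("http", "ws", 1) + "/api/websocket"
    async with websockets.connect(url) as ws:
        await ws.recv()  # {"type": "auth_required"}
        await ws.send(json.dumps(
            {"type": "auth", "access_token": os.environ["HA_TOKEN"]}))
        await ws.recv()  # {"type": "auth_ok"}
        await ws.send(json.dumps(
            {"id": 1, "type": "subscribe_events", "event_type": "state_changed"}))
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "event":
                print(event["event"]["data"]["entity_id"])

asyncio.run(listen())
```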
**Voice Response Integration**
- HA TTS service for confirmations
- Status announcements
- Error handling feedback
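Confirmations can reuse the same REST endpoint with a TTS service. The service name below (`tts.google_translate_say`) and the media player entity are examples; substitute whichever TTS integration and speaker the HA instance actually has.

```python
# Spoken confirmation via an HA TTS service (sketch).
import os

import requests

def speak(message: str, player: str = "media_player.living_room") -> None:
    """Have Home Assistant announce a message on a media player."""
    requests.post(
        f"{os.environ['HA_URL']}/api/services/tts/google_translate_say",
        headers={"Authorization": f"Bearer {os.environ['HA_TOKEN']}"},
        json={"entity_id": player, "message": message},
        timeout=10,
    ).raise_for_status()

speak("Bedroom lights dimmed to thirty percent")
```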
## Deployment Architecture

### Container Stack Addition
```yaml
# Add under services: in the existing HA docker-compose.yml

# STT Service
whisper-api:
  container_name: ha-whisper
  image: onerahmet/openai-whisper-asr-webservice:latest
  ports:
    - "9000:9000"
  environment:
    - ASR_MODEL=base            # or small, medium, large
  volumes:
    - ./whisper-models:/root/.cache/whisper
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia      # Optional GPU acceleration
            count: 1
            capabilities: [gpu]

# Voice Processing Bridge
voice-bridge:
  container_name: ha-voice-bridge
  build: ./voice-bridge         # Custom service
  ports:
    - "8080:8080"
  environment:
    - CLAUDE_API_KEY=${CLAUDE_API_KEY}
    - HA_URL=http://homeassistant:8123
    - HA_TOKEN=${HA_TOKEN}
    - WHISPER_URL=http://whisper-api:9000
  volumes:
    - ./voice-bridge-config:/config
  depends_on:
    - homeassistant
    - whisper-api
```
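The `voice-bridge` service is built from `./voice-bridge`, which doesn't exist yet. A minimal sketch of its entrypoint, assuming FastAPI (the framework, endpoint name, and response shape are all assumptions): it accepts recorded audio, forwards it to `whisper-api`, and would hand the transcript to the interpret/execute step sketched in section 4.

```python
# Minimal voice-bridge entrypoint (sketch, assuming FastAPI).
# Run inside the container with: uvicorn main:app --host 0.0.0.0 --port 8080
import os

import requests
from fastapi import FastAPI, UploadFile

app = FastAPI()
WHISPER_URL = os.environ["WHISPER_URL"]  # http://whisper-api:9000

@app.post("/command")
async def command(audio: UploadFile):
    # Forward the recorded audio to the Whisper container for transcription
    stt = requests.post(
        f"{WHISPER_URL}/asr",
        params={"task": "transcribe", "output": "json"},
        files={"audio_file": (audio.filename, await audio.read())},
        timeout=60,
    )
    stt.raise_for_status()
    transcript = stt.json()["text"].strip()
    # Next: interpret(transcript) / execute(...) as in the section 4 sketch
    return {"transcript": transcript}
```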
### Hardware Requirements

**Microphone Setup:**
- USB microphone or audio interface
- Raspberry Pi with mic for remote rooms
- Existing smart speakers (if hackable)
**Processing Power:**
- Whisper base model: ~1GB RAM, CPU sufficient
- Whisper large model: ~2GB RAM, GPU recommended
- Your Proxmox setup can easily handle this
## Privacy & Security Considerations

### Local-First Design
- All STT processing on local hardware
- No cloud APIs for voice recognition
- Claude Code API calls only for command interpretation
- HA commands never leave local network
### Security Architecture

```
Internet ← [Firewall] ← [Claude API calls only] ← [Voice Bridge] ← [Local STT] ← [Microphone]
                                                        ↓
                                               [Home Assistant] ← [Local Network Only]
```
### Data Flow

1. Audio capture - stays local
2. STT processing - stays local
3. Text command - sent to Claude Code API (text only)
4. HA commands - executed locally

**No audio data ever leaves your network.**
## Implementation Phases

### Phase 1: Core STT Integration
- Deploy Whisper container
- Basic speech-to-text testing
- Integration with HA via simple commands
### Phase 2: Claude Code Bridge
- Build voice-bridge service
- Integrate Claude Code API for command interpretation
- Basic natural language processing
### Phase 3: Advanced Features
- Wake word detection
- Multi-room microphone setup
- Context-aware conversations
- Voice response integration
### Phase 4: Optimization
- GPU acceleration for STT
- Custom wake words
- Conversation memory
- Advanced natural language understanding
## Example Use Cases

### Simple Commands
- "Turn off all lights"
- "Set temperature to 72 degrees"
- "Activate movie scene"
### Complex Requests
- "Turn on the lights in rooms where people are detected"
- "Start my bedtime routine in 10 minutes"
- "If it's going to rain tomorrow, close the garage door"
### Status Queries
- "What's the status of the security system?"
- "Are all the doors locked?"
- "Show me energy usage this month"
## Integration with Existing Plans
This voice system would layer on top of your planned HA deployment:
- No changes to core HA architecture
- Additional containers for voice processing
- API integration rather than HA core modifications
- Gradual rollout after HA migration is stable
The voice system becomes another automation trigger alongside:
- Time-based automations
- Sensor-based automations
- Manual app/dashboard controls
- Voice commands via Claude Code