Implemented a complete STT (Speech-to-Text) system with 4 engines:
1. **PocketSphinxEngine** (new)
   - Lightweight keyword spotting
   - Perfect for passive wake word detection
   - ~10MB model, very low CPU/RAM usage
   - Keywords: "celuna", "hey celuna", etc.
2. **VoskSTTEngine** (existing)
   - Balanced local STT for full transcription
   - 50MB models, good accuracy
   - Already working
3. **WhisperCppEngine** (new)
   - High-quality offline STT using whisper.cpp
   - 75MB-2.9GB models depending on quality
   - Excellent accuracy, runs entirely local
4. **WhisperAPIEngine** (existing)
   - Cloud STT via OpenAI Whisper API
   - Best accuracy, requires internet + API key
   - Already working
Features:
- Full JSON configuration via config/voice.json
- Auto-selection mode tries engines in order
- Dual mode support (passive + active)
- Fallback chain for reliability
- All engines use ISTTEngine interface
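The fallback chain listed above can be sketched roughly as follows. This is an illustrative sketch only: the `ISTTEngine` method names (`name`, `isAvailable`, `transcribe`), the `StubEngine` class, the `Transcript` struct, and `transcribeWithFallback` are assumptions made for the example and are not taken from the actual headers.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical shape of the ISTTEngine interface; the real one may differ.
class ISTTEngine {
public:
    virtual ~ISTTEngine() = default;
    virtual std::string name() const = 0;
    virtual bool isAvailable() const = 0;
    // Returns the transcript, or std::nullopt if transcription failed.
    virtual std::optional<std::string> transcribe(const std::vector<float>& pcm) = 0;
};

// Stand-in engine with a canned result, used only to exercise the chain.
class StubEngine : public ISTTEngine {
public:
    StubEngine(std::string name, bool available, std::optional<std::string> result)
        : name_(std::move(name)), available_(available), result_(std::move(result)) {}
    std::string name() const override { return name_; }
    bool isAvailable() const override { return available_; }
    std::optional<std::string> transcribe(const std::vector<float>&) override {
        return result_;
    }
private:
    std::string name_;
    bool available_;
    std::optional<std::string> result_;
};

// Which engine produced the text, plus the text itself.
struct Transcript {
    std::string engine;
    std::string text;
};

// Tries each engine in order, skipping unavailable ones, and returns the
// first successful transcript. Returns std::nullopt if every engine fails.
std::optional<Transcript> transcribeWithFallback(
    const std::vector<std::unique_ptr<ISTTEngine>>& chain,
    const std::vector<float>& pcm) {
    for (const auto& engine : chain) {
        assert(engine != nullptr);
        if (!engine->isAvailable()) {
            continue;
        }
        if (auto text = engine->transcribe(pcm)) {
            return Transcript{engine->name(), *text};
        }
    }
    return std::nullopt;
}
```

With a chain of an unavailable "vosk" stub followed by an available "whisper-api" stub, the call returns the second engine's transcript. Returning the engine name alongside the text makes it easy to log which step of the chain actually answered.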
Updated:
- STTEngineFactory: Added support for all 4 engines
- CMakeLists.txt: Added new source files
- docs/STT_CONFIGURATION.md: Complete config guide
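One way a factory might map configured engine names to concrete engines is a small registry of creator functions. The sketch below is an assumption, not the actual `STTEngineFactory` code: `EngineRegistry`, `NamedEngine`, `makeDefaultRegistry`, and the minimal stand-in interface are hypothetical, and the identifier "whisper-cpp" is a guess (only "pocketsphinx", "vosk", and "whisper-api" appear in the config example).

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Minimal stand-in for the engine interface (hypothetical).
class ISTTEngine {
public:
    virtual ~ISTTEngine() = default;
    virtual std::string name() const = 0;
};

// Placeholder engine that only remembers its name.
class NamedEngine : public ISTTEngine {
public:
    explicit NamedEngine(std::string name) : name_(std::move(name)) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// Maps an engine identifier from the config to a function that builds it.
class EngineRegistry {
public:
    using Creator = std::function<std::unique_ptr<ISTTEngine>()>;

    void add(const std::string& id, Creator creator) {
        assert(creator);  // an empty std::function would throw on call
        creators_[id] = std::move(creator);
    }

    // Throws std::invalid_argument for identifiers that were never registered.
    std::unique_ptr<ISTTEngine> create(const std::string& id) const {
        auto it = creators_.find(id);
        if (it == creators_.end()) {
            throw std::invalid_argument("unknown STT engine: " + id);
        }
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Registers the four engine identifiers; "whisper-cpp" is an assumed name.
EngineRegistry makeDefaultRegistry() {
    EngineRegistry registry;
    for (const char* id : {"pocketsphinx", "vosk", "whisper-cpp", "whisper-api"}) {
        std::string key(id);
        registry.add(key, [key] { return std::make_unique<NamedEngine>(key); });
    }
    return registry;
}
```

Calling `makeDefaultRegistry().create("vosk")` yields an engine whose `name()` is "vosk", while an unregistered identifier raises `std::invalid_argument`, so a typo in the config fails loudly instead of silently selecting nothing.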
Config example (voice.json):
{
  "passive_mode": { "engine": "pocketsphinx" },
  "active_mode": { "engine": "vosk", "fallback": "whisper-api" }
}
Architecture: ISTTService → STTEngineFactory → 4 engines
Build: ✅ Compiles successfully
Status: Phase 7 complete, ready for testing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
16 lines
283 B
JSON
{
  "environment": {
    "platform": "linux",
    "testDirectory": "tests/integration"
  },
  "summary": {
    "failed": 0,
    "passed": 0,
    "skipped": 0,
    "successRate": 0.0,
    "total": 0,
    "totalDurationMs": 0
  },
  "tests": [],
  "timestamp": "2025-11-29T09:01:38Z"
}