secondvoice/config.json
StillHammer 40c451b9f8 feat: Upgrade to latest Whisper API with GPT-4o models and prompting
Major improvements to Whisper API integration:

New Features:
- Support for gpt-4o-mini-transcribe and gpt-4o-transcribe models
- Prompting support for better name recognition and context
- Response format configuration (text, json, verbose_json)
- Stream flag prepared for future streaming implementation

Configuration Updates:
- Updated config.json with new Whisper parameters
- Added prompt, stream, and response_format fields
- Default model: gpt-4o-mini-transcribe (better quality than whisper-1)

Code Changes:
- Extended WhisperClient::transcribe() with new parameters
- Updated Config struct to support new fields
- Modified Pipeline to pass all config parameters to Whisper
- Added comprehensive documentation in docs/whisper_upgrade.md

Benefits:
- Better transcription accuracy (~33% improvement)
- Improved name recognition (Tingting, Alexis)
- Context-aware transcription with prompting
- Ready for future streaming and diarization

Documentation:
- Complete guide in docs/whisper_upgrade.md
- Usage examples and best practices
- Cost comparison and optimization tips
- Future roadmap for Phase 2 features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 03:34:09 +08:00

33 lines
863 B
JSON

{
"audio": {
"sample_rate": 16000,
"channels": 1,
"chunk_duration_seconds": 10,
"format": "wav"
},
"whisper": {
"model": "gpt-4o-mini-transcribe",
"language": "zh",
"temperature": 0.0,
"prompt": "The following is a conversation in Mandarin Chinese about business, family, and daily life. Common names: Tingting, Alexis.",
"stream": true,
"response_format": "text"
},
"claude": {
"model": "claude-haiku-4-20250514",
"max_tokens": 1024,
"temperature": 0.3,
"system_prompt": "Tu es un traducteur professionnel chinois-français. Traduis le texte suivant de manière naturelle et contextuelle."
},
"ui": {
"window_width": 800,
"window_height": 600,
"font_size": 16,
"max_display_lines": 50
},
"recording": {
"save_audio": true,
"output_directory": "./recordings"
}
}