# SecondVoice

Real-time Chinese-to-French translation system for live meetings.

## Overview

SecondVoice captures audio, transcribes Chinese speech using OpenAI's Whisper API, and translates it to French using Claude in real time. It is intended for following Chinese-language meetings on the fly.

## Features

- 🎤 Real-time audio capture
- 🗣️ Chinese speech-to-text (Whisper API)
- 🌐 Chinese to French translation (Claude API)
- 🖥️ Clean ImGui interface
- 💾 Full recording saved to disk
- ⚙️ Configurable chunk sizes and settings

## Requirements

### System Dependencies (Linux)

```bash
# PortAudio (ALSA development headers)
sudo apt install libasound2-dev

# OpenGL
sudo apt install libgl1-mesa-dev libglu1-mesa-dev
```

### vcpkg

Install vcpkg if not already installed:

```bash
git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
export VCPKG_ROOT=$(pwd)
```

## Setup

1. **Clone the repository**

   ```bash
   # Replace <repository-url> with the URL of this repository
   git clone <repository-url>
   cd secondvoice
   ```

2. **Create `.env` file** (copy from `.env.example`)

   ```bash
   cp .env.example .env
   # Edit .env and add your API keys:
   # OPENAI_API_KEY=sk-...
   # ANTHROPIC_API_KEY=sk-ant-...
   ```

3. **Configure settings** (optional)

   Edit `config.json` to customize:
   - Audio chunk duration (default: 10 s)
   - Sample rate (default: 16 kHz)
   - UI window size
   - Output directory

4. **Build the project**

   ```bash
   # Configure with vcpkg
   cmake -B build -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake

   # Build
   cmake --build build -j$(nproc)
   ```

## Usage

```bash
cd build
./SecondVoice
```

The application will:

1. Open an ImGui window
2. Start capturing audio from your microphone
3. Display Chinese transcriptions and French translations in real time
4. Keep running until you click the **STOP RECORDING** button
5. Save the full audio recording to `recordings/recording_YYYYMMDD_HHMMSS.wav`

## Architecture

```
Audio Capture (PortAudio)
        ↓
Whisper API (Speech-to-Text)
        ↓
Claude API (Translation)
        ↓
ImGui UI (Display)
```

### Threading Model

- **Thread 1**: Audio capture (PortAudio callback)
- **Thread 2**: AI processing (Whisper + Claude API calls)
- **Thread 3**: UI rendering (ImGui + OpenGL)

## Configuration

### config.json

```json
{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_duration_seconds": 10
  },
  "whisper": {
    "model": "whisper-1",
    "language": "zh"
  },
  "claude": {
    "model": "claude-haiku-4-20250514",
    "max_tokens": 1024
  }
}
```

### .env

```env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

## Cost Estimation

- **Whisper**: ~$0.006/minute (~$0.36/hour)
- **Claude Haiku**: ~$0.03-0.05/hour
- **Total**: ~$0.40/hour of recording

## Project Structure

```
secondvoice/
├── src/
│   ├── main.cpp        # Entry point
│   ├── audio/          # Audio capture & buffer
│   ├── api/            # Whisper & Claude clients
│   ├── ui/             # ImGui interface
│   ├── utils/          # Config & thread-safe queue
│   └── core/           # Pipeline orchestration
├── docs/               # Documentation
├── recordings/         # Output recordings
├── config.json         # Runtime configuration
├── .env                # API keys (not committed)
└── CMakeLists.txt      # Build configuration
```

## Development

### Building in Debug Mode

```bash
cmake -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake
cmake --build build
```

### Running Tests

```bash
# TODO: Add tests
```

## Troubleshooting

### No audio capture

- Check microphone permissions
- Verify PortAudio is properly installed: `pa_devs` (if available)
- Try selecting a different audio device in the code

### API errors

- Verify that the API keys in `.env` are correct
- Check your internet connection
- Monitor API rate limits

### Build errors

- Ensure vcpkg is properly set up
- Check that all system dependencies are installed
- Try `cmake --build build --clean-first`

## Roadmap

### Phase 1 - MVP (Current)

- ✅ Audio capture
- ✅ Whisper integration
- ✅ Claude integration
- ✅ ImGui UI
- ✅ Stop button

### Phase 2 - Enhancement

- ⬜ Auto-summary post-meeting
- ⬜ Export transcripts
- ⬜ Search functionality
- ⬜ Speaker diarization
- ⬜ Replay mode

## License

See LICENSE file.

## Contributing

This is a personal project, but suggestions and bug reports are welcome via issues.

## Contact

See `docs/SecondVoice.md` for project context and motivation.
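
## Appendix: Threading Model Sketch

The threading model described under Architecture hands audio chunks from the capture thread to the AI processing thread, and the project structure mentions a thread-safe queue in `utils/`. The following is a minimal sketch of how such a hand-off could look using only the C++ standard library. It is not the project's actual code: `AudioChunk`, `ThreadSafeQueue`, and `run_pipeline_demo` are hypothetical names, and the consumer only counts samples where a real pipeline would call the Whisper and Claude APIs. A real PortAudio callback should also avoid taking locks, which this sketch ignores for brevity.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one chunk of captured mono audio.
struct AudioChunk {
    std::vector<float> samples;
};

// Minimal blocking queue (illustrative, not the project's implementation).
template <typename T>
class ThreadSafeQueue {
public:
    // Adds an item and wakes one waiting consumer.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        cv_.notify_one();
    }

    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until an item is available or the queue is closed and drained.
    // Returns false only when the queue is closed and empty.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Runs a producer (standing in for audio capture) and a consumer (standing in
// for the AI processing thread). Returns the number of samples consumed.
std::size_t run_pipeline_demo(std::size_t chunk_count,
                              std::size_t samples_per_chunk) {
    ThreadSafeQueue<AudioChunk> queue;
    std::size_t total_samples = 0;

    std::thread consumer([&] {
        AudioChunk chunk;
        while (queue.pop(chunk)) {
            // A real pipeline would transcribe and translate the chunk here.
            total_samples += chunk.samples.size();
        }
    });

    std::thread producer([&] {
        for (std::size_t i = 0; i < chunk_count; ++i) {
            queue.push(AudioChunk{std::vector<float>(samples_per_chunk, 0.0f)});
        }
        queue.close();
    });

    producer.join();
    consumer.join();
    return total_samples;
}
```

With the default settings from `config.json` (10-second chunks at 16 kHz), each chunk holds 160000 samples, so `run_pipeline_demo(3, 160000)` returns 480000. Closing the queue is what lets the consumer thread exit cleanly once recording stops.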