# SecondVoice

Real-time Chinese to French translation system for live meetings.

## Overview

SecondVoice captures audio, transcribes Chinese speech using OpenAI's Whisper API, and translates it to French using Claude AI in real time. Perfect for understanding Chinese meetings on the fly.

## Features

- 🎤 Real-time audio capture
- 🗣️ Chinese speech-to-text (Whisper API)
- 🌐 Chinese to French translation (Claude API)
- 🖥️ Clean ImGui interface
- 💾 Full recording saved to disk
- ⚙️ Configurable chunk sizes and settings

## Requirements

### Cross-Platform Support

SecondVoice works on **Windows** and **Linux**.

#### Windows
- Visual Studio 2019 or later (with C++ tools)
- vcpkg package manager
- See detailed guide: [docs/build_windows.md](docs/build_windows.md)

#### Linux
- GCC/Clang with C++17 support
- System dependencies: `libasound2-dev`, `libgl1-mesa-dev`, `libglu1-mesa-dev`
- vcpkg package manager

### vcpkg Installation

**Linux**:
```bash
git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
# Applies to the current shell only; add it to your shell profile to persist it
export VCPKG_ROOT="$(pwd)"
```

**Windows**:
```powershell
git clone https://github.com/microsoft/vcpkg.git C:\vcpkg
cd C:\vcpkg
.\bootstrap-vcpkg.bat
# setx only affects new terminals; open a new one before building
setx VCPKG_ROOT "C:\vcpkg"
```

## Setup

1. **Clone the repository**

   ```bash
   git clone <repository-url>
   cd secondvoice
   ```

2. **Create `.env` file** (copy from `.env.example`)

   **Linux**:
   ```bash
   cp .env.example .env
   nano .env
   # Add your API keys:
   # OPENAI_API_KEY=sk-...
   # ANTHROPIC_API_KEY=sk-ant-...
   ```

   **Windows**:
   ```powershell
   copy .env.example .env
   notepad .env
   # Add your API keys
   ```

3. **Build the project**

   **Linux**:
   ```bash
   ./build.sh
   # Or manually:
   # cmake -B build -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"
   # cmake --build build -j$(nproc)
   ```

   **Windows**:
   ```batch
   build.bat --release
   REM Or see detailed guide: docs/build_windows.md
   ```

## Usage

**Linux**:
```bash
cd build
./SecondVoice
```

**Windows**:
```batch
cd build\windows-release\Release
SecondVoice.exe
```

When you run the application:
1. An ImGui window opens.
2. Audio capture from your microphone starts.
3. Chinese transcriptions and French translations are displayed in real time.
4. Click the **STOP RECORDING** button to finish.
5. The full audio recording is saved to `recordings/recording_YYYYMMDD_HHMMSS.wav`.

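
The recording file name pattern `recordings/recording_YYYYMMDD_HHMMSS.wav` can be produced with `std::strftime`. The following is a minimal sketch; `make_recording_path` is a hypothetical helper name and is not taken from the project sources.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Builds "recordings/recording_YYYYMMDD_HHMMSS.wav" from a broken-down time.
// make_recording_path is a hypothetical helper, not the project's own code.
std::string make_recording_path(const std::tm& t) {
    char name[64];
    const std::size_t written =
        std::strftime(name, sizeof(name), "recording_%Y%m%d_%H%M%S.wav", &t);
    assert(written > 0);  // the fixed-width pattern always fits in 64 bytes
    (void)written;        // silence unused-variable warnings in release builds
    return std::string("recordings/") + name;
}
```

A caller would typically pass the current local time, for example the result of `std::localtime` applied to `std::time(nullptr)`.
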
## Architecture

```
Audio Capture (PortAudio)
↓
Whisper API (Speech-to-Text)
↓
Claude API (Translation)
↓
ImGui UI (Display)
```

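
The stage order in the diagram can be sketched as two injected functions applied to each audio chunk. This is an illustration only: `AudioChunk`, `Transcriber`, `Translator`, `Subtitle`, and `process_chunk` are hypothetical names, the 16-bit sample type is an assumption, and the real clients wrap HTTP calls to the Whisper and Claude APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// One chunk of captured audio; 16-bit PCM is an assumption.
using AudioChunk = std::vector<std::int16_t>;

// Hypothetical stage signatures standing in for the API clients.
using Transcriber = std::function<std::string(const AudioChunk&)>;   // Whisper stage
using Translator  = std::function<std::string(const std::string&)>;  // Claude stage

struct Subtitle {
    std::string chinese;  // transcription shown in the UI
    std::string french;   // translation shown in the UI
};

// Runs one chunk through the two API stages in pipeline order.
Subtitle process_chunk(const AudioChunk& chunk,
                       const Transcriber& transcribe,
                       const Translator& translate) {
    assert(transcribe && translate);  // both stages must be set
    Subtitle out;
    out.chinese = transcribe(chunk);
    out.french = translate(out.chinese);
    return out;
}
```

Passing the stages as `std::function` objects keeps the sketch testable without network access, since fake stages can be substituted for the API clients.
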
### Threading Model

- **Thread 1**: Audio capture (PortAudio callback)
- **Thread 2**: AI processing (Whisper + Claude API calls)
- **Thread 3**: UI rendering (ImGui + OpenGL)

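
Threads like these usually hand data to each other through a thread-safe queue. The sketch below shows one common shape for such a queue, built on a mutex and a condition variable. It is not the project's actual implementation, and the `ThreadSafeQueue` name and its `close()` method are assumptions made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Minimal blocking queue; a sketch, not the project's actual implementation.
template <typename T>
class ThreadSafeQueue {
public:
    // Called by the producer to hand an item to the consumer thread.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // pushing after close() is a caller bug
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available, or returns std::nullopt once the
    // queue has been closed and fully drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

    // Wakes all waiting consumers so they can exit once the queue is empty.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};
```

A consumer loop can then be written as `while (auto item = queue.pop()) { ... }`, which ends cleanly after `close()`. Real-time audio callbacks are generally expected not to block, so a design like this may be better suited to the hand-off between the processing and UI threads than to use inside the PortAudio callback itself.
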
## Configuration

### config.json

```json
{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_duration_seconds": 10
  },
  "whisper": {
    "model": "whisper-1",
    "language": "zh"
  },
  "claude": {
    "model": "claude-haiku-4-20250514",
    "max_tokens": 1024
  }
}
```

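
The three `audio` settings together determine how much data one chunk holds. As a rough sketch, with the values above a chunk contains 16000 × 1 × 10 = 160,000 samples, which is 320,000 bytes if samples are stored as 16-bit PCM. The sample width is an assumption, since `config.json` does not state it, and `AudioConfig`, `samples_per_chunk`, and `bytes_per_chunk` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the "audio" block of config.json; field names follow the JSON keys.
struct AudioConfig {
    int sample_rate = 16000;
    int channels = 1;
    int chunk_duration_seconds = 10;
};

// Number of samples in one chunk, counted across all channels.
std::size_t samples_per_chunk(const AudioConfig& cfg) {
    assert(cfg.sample_rate > 0 && cfg.channels > 0 &&
           cfg.chunk_duration_seconds > 0);
    return static_cast<std::size_t>(cfg.sample_rate) *
           static_cast<std::size_t>(cfg.channels) *
           static_cast<std::size_t>(cfg.chunk_duration_seconds);
}

// Size of one chunk in bytes, assuming 16-bit PCM samples (an assumption;
// the sample format is not part of config.json).
std::size_t bytes_per_chunk(const AudioConfig& cfg) {
    return samples_per_chunk(cfg) * sizeof(std::int16_t);
}
```
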
### .env

```env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

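
A file in this `KEY=VALUE` form can be read with a few lines of standard C++. The sketch below is a simplified illustration and may differ from how the project actually loads its keys. `parse_env` is a hypothetical name, and quoting, escaping, and `export` prefixes are deliberately not handled.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parses KEY=VALUE lines; blank lines and lines starting with '#' are skipped.
// Only the first '=' splits the line, so values may themselves contain '='.
std::map<std::string, std::string> parse_env(std::istream& in) {
    std::map<std::string, std::string> vars;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings from files edited on Windows.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty() || line[0] == '#') continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;  // not KEY=VALUE
        const std::string key = line.substr(0, eq);
        assert(!key.empty());  // guaranteed by the eq == 0 check above
        vars[key] = line.substr(eq + 1);
    }
    return vars;
}
```

The function accepts any `std::istream`, so it works with a `std::ifstream` opened on `.env` as well as with a `std::istringstream` in tests.
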
## Cost Estimation

- **Whisper**: ~$0.006/minute (~$0.36/hour)
- **Claude Haiku**: ~$0.03-0.05/hour
- **Total**: ~$0.40/hour of recording

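
The arithmetic behind these figures can be written out as a small helper. For a 60-minute recording it gives 60 × $0.006 = $0.36 for Whisper plus $0.03 to $0.05 for Claude, so roughly $0.39 to $0.41 in total. The rates are the approximations listed above, not live pricing, and `CostRange` and `estimate_cost` are hypothetical names.

```cpp
#include <cassert>

// Rates copied from the estimates above; approximations, not live pricing.
constexpr double kWhisperUsdPerMinute = 0.006;
constexpr double kClaudeUsdPerHourLow = 0.03;
constexpr double kClaudeUsdPerHourHigh = 0.05;

struct CostRange {
    double low_usd;
    double high_usd;
};

// Estimated cost range for a recording of the given length in minutes.
CostRange estimate_cost(double minutes) {
    assert(minutes >= 0.0);
    const double hours = minutes / 60.0;
    const double whisper = minutes * kWhisperUsdPerMinute;
    return {whisper + hours * kClaudeUsdPerHourLow,
            whisper + hours * kClaudeUsdPerHourHigh};
}
```
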
## Project Structure

```
secondvoice/
├── src/
│   ├── main.cpp           # Entry point
│   ├── audio/             # Audio capture & buffer
│   ├── api/               # Whisper & Claude clients
│   ├── ui/                # ImGui interface
│   ├── utils/             # Config & thread-safe queue
│   └── core/              # Pipeline orchestration
├── docs/                  # Documentation
├── recordings/            # Output recordings
├── config.json            # Runtime configuration
├── .env                   # API keys (not committed)
└── CMakeLists.txt         # Build configuration
```

## Development

### Building in Debug Mode

```bash
cmake -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"
cmake --build build
```

### Running Tests

```bash
# TODO: Add tests
```

## Troubleshooting

### No audio capture

- Check microphone permissions
- Verify PortAudio is properly installed: `pa_devs` (if available)
- Try a different audio device in the code

### API errors

- Verify API keys in `.env` are correct
- Check internet connection
- Monitor API rate limits

### Build errors

- Ensure vcpkg is properly set up
- Check all system dependencies are installed
- Try `cmake --build build --clean-first`

## Roadmap

### Phase 1 - MVP (Current)
- ✅ Audio capture
- ✅ Whisper integration
- ✅ Claude integration
- ✅ ImGui UI
- ✅ Stop button

### Phase 2 - Enhancement
- ⬜ Auto-summary post-meeting
- ⬜ Export transcripts
- ⬜ Search functionality
- ⬜ Speaker diarization
- ⬜ Replay mode

## License

See LICENSE file.

## Contributing

This is a personal project, but suggestions and bug reports are welcome via issues.

## Contact

See docs/SecondVoice.md for project context and motivation.