SecondVoice

Real-time Chinese-to-French translation system for live meetings.

Overview

SecondVoice captures audio, transcribes Chinese speech with OpenAI's Whisper API, and translates the transcript into French with Claude, all in real time. It is intended for following Chinese-language meetings on the fly.

Features

  • 🎤 Real-time audio capture
  • 🗣️ Chinese speech-to-text (Whisper API)
  • 🌐 Chinese to French translation (Claude API)
  • 🖥️ Clean ImGui interface
  • 💾 Full recording saved to disk
  • ⚙️ Configurable chunk sizes and settings

Requirements

System Dependencies (Linux)

# ALSA development headers (needed to build PortAudio)
sudo apt install libasound2-dev

# OpenGL
sudo apt install libgl1-mesa-dev libglu1-mesa-dev

vcpkg

Install vcpkg if not already installed:

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
export VCPKG_ROOT=$(pwd)

Setup

  1. Clone the repository
git clone <repository-url>
cd secondvoice
  2. Create a .env file (copy from .env.example)
cp .env.example .env
# Edit .env and add your API keys:
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
  3. Configure settings (optional)

Edit config.json to customize:

  • Audio chunk duration (default: 10 s)
  • Sample rate (default: 16 kHz)
  • UI window size
  • Output directory
  4. Build the project
# Configure with vcpkg
cmake -B build -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake

# Build
cmake --build build -j$(nproc)

Usage

cd build
./SecondVoice

The application will:

  1. Open an ImGui window
  2. Start capturing audio from your microphone
  3. Display Chinese transcriptions and French translations in real time
  4. Keep running until you click the STOP RECORDING button
  5. Save the full audio recording to recordings/recording_YYYYMMDD_HHMMSS.wav
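
The timestamped file name in the last step can be produced with the standard library alone. The following is a minimal sketch, not the project's actual code: the helper name recording_path and the fallback file name are hypothetical.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: build "recordings/recording_YYYYMMDD_HHMMSS.wav"
// from a broken-down time value.
inline std::string recording_path(const std::tm& when) {
    char stamp[32];
    // "%Y%m%d_%H%M%S" formats the date and time as YYYYMMDD_HHMMSS.
    if (std::strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", &when) == 0) {
        // strftime reports failure with 0; fall back to a fixed name.
        return "recordings/recording_unknown.wav";
    }
    return std::string("recordings/recording_") + stamp + ".wav";
}
```

In practice the std::tm value would come from the current local time, for example via std::time followed by std::localtime.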

Architecture

Audio Capture (PortAudio)
    ↓
Whisper API (Speech-to-Text)
    ↓
Claude API (Translation)
    ↓
ImGui UI (Display)
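
The data flow above can be sketched as two chained stages applied to each audio chunk. This is an illustrative outline only: the types AudioChunk, TranslationResult, and Pipeline are hypothetical, and the two callables stand in for the Whisper and Claude HTTP clients, which are not shown.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical representation of one captured chunk (16-bit PCM assumed).
using AudioChunk = std::vector<std::int16_t>;

// Hypothetical result handed to the UI for display.
struct TranslationResult {
    std::string chinese;  // transcription of the chunk
    std::string french;   // translation of the transcription
};

// Hypothetical wiring of the stages between capture and display.
struct Pipeline {
    std::function<std::string(const AudioChunk&)> transcribe;   // speech-to-text
    std::function<std::string(const std::string&)> translate;   // Chinese to French

    // Run both stages in order for a single chunk.
    TranslationResult process(const AudioChunk& chunk) const {
        TranslationResult result;
        result.chinese = transcribe(chunk);
        result.french = translate(result.chinese);
        return result;
    }
};
```

Keeping the stages as injectable callables makes it possible to exercise the flow with stub functions before any network code exists.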

Threading Model

  • Thread 1: Audio capture (PortAudio callback)
  • Thread 2: AI processing (Whisper + Claude API calls)
  • Thread 3: UI rendering (ImGui + OpenGL)
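
Work has to be handed from one thread to the next, and the project layout lists a thread-safe queue under src/utils/. The sketch below shows one common way to build such a queue with a mutex and a condition variable. The class name ThreadSafeQueue and its methods are assumptions for illustration, not the repository's actual interface.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical blocking queue for passing items between threads,
// e.g. audio chunks from the capture side to the AI processing thread.
template <typename T>
class ThreadSafeQueue {
public:
    // Producer side: enqueue an item and wake one waiting consumer.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(item));
        }
        cv_.notify_one();
    }

    // Consumer side: block until an item is available or the queue is closed.
    // Returns std::nullopt once the queue is closed and fully drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) {
            return std::nullopt;
        }
        T item = std::move(queue_.front());
        queue_.pop();
        return item;
    }

    // Signal shutdown, for example when STOP RECORDING is clicked.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
    bool closed_ = false;
};
```

A consumer can loop with `while (auto item = queue.pop())` and exit cleanly after close() is called. PortAudio's documentation advises against blocking calls inside the stream callback, so a real capture path may prefer a lock-free ring buffer between the callback and a queue like this one.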

Configuration

config.json

{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_duration_seconds": 10
  },
  "whisper": {
    "model": "whisper-1",
    "language": "zh"
  },
  "claude": {
    "model": "claude-haiku-4-20250514",
    "max_tokens": 1024
  }
}
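
The audio settings determine how much data each chunk carries: samples per chunk = sample_rate × channels × chunk_duration_seconds. The sketch below works this out for the defaults above. The struct AudioConfig and both function names are hypothetical, and the 16-bit PCM sample format is an assumption, since this README does not state the format.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the "audio" section of config.json.
struct AudioConfig {
    std::uint32_t sample_rate = 16000;          // Hz
    std::uint32_t channels = 1;                 // mono
    std::uint32_t chunk_duration_seconds = 10;  // seconds per chunk
};

// Number of samples (across all channels) in one chunk.
constexpr std::size_t samples_per_chunk(const AudioConfig& cfg) {
    return static_cast<std::size_t>(cfg.sample_rate) * cfg.channels *
           cfg.chunk_duration_seconds;
}

// Size of one chunk in bytes, assuming 16-bit PCM samples
// (an assumption; the sample format is not given in this README).
constexpr std::size_t bytes_per_chunk(const AudioConfig& cfg) {
    return samples_per_chunk(cfg) * sizeof(std::int16_t);
}
```

With the default values this gives 160,000 samples per chunk, or 320,000 bytes under the 16-bit assumption.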

.env

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Cost Estimation

  • Whisper: ~$0.006/minute (~$0.36/hour)
  • Claude Haiku: ~$0.03-0.05/hour
  • Total: ~$0.40/hour of recording

Project Structure

secondvoice/
├── src/
│   ├── main.cpp                 # Entry point
│   ├── audio/                   # Audio capture & buffer
│   ├── api/                     # Whisper & Claude clients
│   ├── ui/                      # ImGui interface
│   ├── utils/                   # Config & thread-safe queue
│   └── core/                    # Pipeline orchestration
├── docs/                        # Documentation
├── recordings/                  # Output recordings
├── config.json                  # Runtime configuration
├── .env                         # API keys (not committed)
└── CMakeLists.txt               # Build configuration

Development

Building in Debug Mode

cmake -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake
cmake --build build

Running Tests

# TODO: Add tests

Troubleshooting

No audio capture

  • Check microphone permissions
  • Verify that PortAudio is properly installed, for example by running pa_devs (if available)
  • Try selecting a different audio device in the code

API errors

  • Verify API keys in .env are correct
  • Check internet connection
  • Monitor API rate limits

Build errors

  • Ensure vcpkg is properly set up
  • Check all system dependencies are installed
  • Try cmake --build build --clean-first

Roadmap

Phase 1 - MVP (Current)

  • Audio capture
  • Whisper integration
  • Claude integration
  • ImGui UI
  • Stop button

Phase 2 - Enhancement

  • Auto-summary post-meeting
  • Export transcripts
  • Search functionality
  • Speaker diarization
  • Replay mode

License

See LICENSE file.

Contributing

This is a personal project, but suggestions and bug reports are welcome via issues.

Contact

See docs/SecondVoice.md for project context and motivation.