
StillHammer 94ad6b4a22		2025-11-20 03:42:41 +08:00

feat: Add MinGW support - Build without Visual Studio!

Lightweight Windows build option using MinGW-w64 instead of Visual Studio:

Size Comparison:
- Visual Studio: 10-20 GB install
- MinGW: ~500 MB install (20x smaller!)

New Files:
- setup_mingw.bat: One-click installer for all tools
  - Chocolatey (package manager)
  - MinGW-w64 (GCC compiler)
  - CMake, Ninja, Git
  - vcpkg integration
- build_mingw.bat: Build script for MinGW
  - Auto-detection of GCC
  - Debug/Release modes
  - Clean build support
  - User-friendly error messages
- WINDOWS_MINGW.md: Complete MinGW guide
  - Installation instructions
  - Troubleshooting
  - Performance comparison MSVC vs GCC
  - Distribution guide

CMake Updates:
- Added mingw-debug and mingw-release presets
- GCC compiler flags: -O3 -Wall -Wextra
- Static linking for portable .exe

Documentation:
- Updated WINDOWS_QUICK_START.md with MinGW option
- Comparison table: MinGW vs Visual Studio
- Recommendation: MinGW for most users

Benefits:
- 20x smaller download (500MB vs 10-20GB)
- 5-10 min install vs 30-60 min
- Same performance as MSVC
- Portable standalone .exe
- Perfect for users without Visual Studio

Usage:
1. Run setup_mingw.bat (one time)
2. Restart terminal
3. Run build_mingw.bat --release
4. Done! Output: build/mingw-release/SecondVoice.exe

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
docs	feat: Add Windows support with .exe build	2025-11-20 03:38:18 +08:00
recordings	feat: Implement complete MVP architecture for SecondVoice	2025-11-20 03:08:03 +08:00
src	feat: Upgrade to latest Whisper API with GPT-4o models and prompting	2025-11-20 03:34:09 +08:00
.env.example	feat: Implement complete MVP architecture for SecondVoice	2025-11-20 03:08:03 +08:00
.gitignore	feat: Add Windows support with .exe build	2025-11-20 03:38:18 +08:00
build_mingw.bat	feat: Add MinGW support - Build without Visual Studio!	2025-11-20 03:42:41 +08:00
build.bat	feat: Add Windows support with .exe build	2025-11-20 03:38:18 +08:00
build.sh	feat: Implement complete MVP architecture for SecondVoice	2025-11-20 03:08:03 +08:00
CMakeLists.txt	fix: Resolve compilation errors and build successfully	2025-11-20 03:27:18 +08:00
CMakePresets.json	feat: Add MinGW support - Build without Visual Studio!	2025-11-20 03:42:41 +08:00
config.json	feat: Upgrade to latest Whisper API with GPT-4o models and prompting	2025-11-20 03:34:09 +08:00
README.md	feat: Add Windows support with .exe build	2025-11-20 03:38:18 +08:00
setup_mingw.bat	feat: Add MinGW support - Build without Visual Studio!	2025-11-20 03:42:41 +08:00
vcpkg.json	feat: Implement complete MVP architecture for SecondVoice	2025-11-20 03:08:03 +08:00
WINDOWS_MINGW.md	feat: Add MinGW support - Build without Visual Studio!	2025-11-20 03:42:41 +08:00
WINDOWS_QUICK_START.md	feat: Add MinGW support - Build without Visual Studio!	2025-11-20 03:42:41 +08:00

README.md

SecondVoice

Real-time Chinese to French translation system for live meetings.

Overview

SecondVoice captures audio, transcribes Chinese speech with OpenAI's Whisper API, and translates it to French with Claude AI in real time. It is meant for following Chinese-language meetings as they happen.

Features

🎤 Real-time audio capture
🗣️ Chinese speech-to-text (Whisper API)
🌐 Chinese to French translation (Claude API)
🖥️ Clean ImGui interface
💾 Full recording saved to disk
⚙️ Configurable chunk sizes and settings

Requirements

Cross-Platform Support

SecondVoice works on Windows and Linux.

Windows

Visual Studio 2019 or later (with C++ tools), or MinGW-w64 as a lighter alternative (see WINDOWS_MINGW.md)
vcpkg package manager
See detailed guide: docs/build_windows.md

Linux

GCC/Clang with C++17 support
System dependencies: libasound2-dev, libgl1-mesa-dev, libglu1-mesa-dev
vcpkg package manager

vcpkg Installation

Linux:

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
export VCPKG_ROOT="$(pwd)"

Windows:

git clone https://github.com/microsoft/vcpkg.git C:\vcpkg
cd C:\vcpkg
.\bootstrap-vcpkg.bat
setx VCPKG_ROOT "C:\vcpkg"
REM setx only affects new terminals; reopen the terminal before building

Setup

Clone the repository

git clone <repository-url>
cd secondvoice

Create .env file (copy from .env.example)

Linux:

cp .env.example .env
nano .env
# Add your API keys:
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...

Windows:

copy .env.example .env
notepad .env
REM Add your API keys:
REM OPENAI_API_KEY=sk-...
REM ANTHROPIC_API_KEY=sk-ant-...

Build the project

Linux:

./build.sh
# Or manually:
# cmake -B build -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"
# cmake --build build -j$(nproc)

Windows:

build.bat --release
REM Or see detailed guide: docs/build_windows.md

Usage

Linux:

cd build
./SecondVoice

Windows:

cd build\windows-release\Release
SecondVoice.exe

The application will:

Open an ImGui window
Start capturing audio from your microphone
Display Chinese transcriptions and French translations in real time
Stop capturing when you click the STOP RECORDING button
Save the full audio recording to recordings/recording_YYYYMMDD_HHMMSS.wav
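The timestamped file name above can be produced with the standard library alone. The following is a minimal sketch, not the project's actual code: `recording_path` is a hypothetical helper name, and it assumes the timestamp is the local time at which the recording is saved.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: formats a broken-down time as
// recordings/recording_YYYYMMDD_HHMMSS.wav (the pattern from the README).
inline std::string recording_path(const std::tm& when) {
    char buf[64];  // the formatted path is 40 characters, so 64 is enough
    const std::size_t n = std::strftime(
        buf, sizeof(buf), "recordings/recording_%Y%m%d_%H%M%S.wav", &when);
    return std::string(buf, n);
}
```

A caller would typically pass the result of `std::localtime` for the current time.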

Architecture

Audio Capture (PortAudio)
    ↓
Whisper API (Speech-to-Text)
    ↓
Claude API (Translation)
    ↓
ImGui UI (Display)
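The data flow in the diagram can be sketched as three pluggable stages applied to each captured chunk. This is an illustrative sketch only: `AudioChunk`, `PipelineStages`, and `process_chunk` are hypothetical names, the float sample format is an assumption, and the real classes under src/ may be organized differently.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types for illustration only; the project's real classes may differ.
using AudioChunk = std::vector<float>;  // assumed sample format

struct PipelineStages {
    std::function<std::string(const AudioChunk&)> transcribe;   // stand-in for the Whisper API call
    std::function<std::string(const std::string&)> translate;   // stand-in for the Claude API call
    std::function<void(const std::string&, const std::string&)> display;  // stand-in for the ImGui UI
};

// Runs one captured chunk through the stages in the order of the diagram.
inline void process_chunk(const AudioChunk& chunk, const PipelineStages& stages) {
    assert(stages.transcribe && stages.translate && stages.display);
    const std::string chinese = stages.transcribe(chunk);
    const std::string french = stages.translate(chinese);
    stages.display(chinese, french);
}
```

Keeping the stages as plain callables makes it possible to exercise the flow with stubs, without a microphone or network access.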

Threading Model

Thread 1: Audio capture (PortAudio callback)
Thread 2: AI processing (Whisper + Claude API calls)
Thread 3: UI rendering (ImGui + OpenGL)
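Threads like these usually hand data to each other through a thread-safe queue, which the project structure lists under utils/. The sketch below shows one common way to build such a queue with a mutex and a condition variable. `ThreadSafeQueue` and its `close()` method are assumptions for illustration, not the project's actual interface.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical queue for passing chunks or results between threads.
template <typename T>
class ThreadSafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        cv_.notify_one();  // wake one waiting consumer
    }

    // Blocks until an item is available or close() has been called.
    // Returns std::nullopt once the queue is closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return std::nullopt;
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

    // Signals consumers that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
    bool closed_ = false;
};
```

A consumer such as the AI processing thread can loop on `pop()` until it returns `std::nullopt`. Note that real-time audio callbacks generally should avoid blocking on locks, so production code may hand data off from the PortAudio callback through a lock-free buffer instead.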

Configuration

config.json

{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_duration_seconds": 10
  },
  "whisper": {
    "model": "whisper-1",
    "language": "zh"
  },
  "claude": {
    "model": "claude-haiku-4-20250514",
    "max_tokens": 1024
  }
}
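To make the audio settings concrete, the sketch below computes how many samples one chunk holds with the values shown above. `AudioConfig` and the helper functions are hypothetical names, and the byte count assumes 16-bit PCM samples, which the configuration does not state.

```cpp
#include <cstddef>

// Mirrors the "audio" section of config.json (hypothetical struct name).
struct AudioConfig {
    int sample_rate = 16000;
    int channels = 1;
    int chunk_duration_seconds = 10;
};

// Samples in one chunk: rate x channels x duration.
constexpr std::size_t samples_per_chunk(const AudioConfig& cfg) {
    return static_cast<std::size_t>(cfg.sample_rate) * cfg.channels *
           cfg.chunk_duration_seconds;
}

// Assumption: 16-bit PCM, i.e. 2 bytes per sample, unless told otherwise.
constexpr std::size_t bytes_per_chunk(const AudioConfig& cfg,
                                      std::size_t bytes_per_sample = 2) {
    return samples_per_chunk(cfg) * bytes_per_sample;
}

// With the defaults above: 16000 * 1 * 10 = 160000 samples per chunk.
static_assert(samples_per_chunk(AudioConfig{}) == 160000, "default chunk size");
```

Under the 16-bit assumption, each 10-second chunk is about 320 KB of raw audio before any container or encoding overhead.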

.env

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
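A file in this KEY=VALUE form is straightforward to read at startup. The following is a minimal sketch under stated assumptions: `parse_env` is a hypothetical function, it does not handle quoted values or inline comments, and the project may load its keys differently.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical parser: reads KEY=VALUE lines from any input stream
// (for example a std::ifstream or std::istringstream), skipping blank
// lines and lines that start with '#'.
inline std::map<std::string, std::string> parse_env(std::istream& in) {
    std::map<std::string, std::string> vars;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate Windows line endings
        if (line.empty() || line[0] == '#') continue;
        const auto eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;  // not a KEY=VALUE line
        vars[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return vars;
}
```

After parsing, the application would look up `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` and refuse to start if either is missing.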

Cost Estimation

Whisper: ~$0.006/minute (~$0.36/hour)
Claude Haiku: ~$0.03-0.05/hour
Total: ~$0.40/hour of recording
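The total follows directly from the two rates: one hour is 60 × $0.006 = $0.36 for Whisper plus $0.03 to $0.05 for Claude, or $0.39 to $0.41. The sketch below repeats that arithmetic for an arbitrary meeting length. The constants are the approximate figures quoted above, not current price-list values, and `estimate_cost_usd` is a hypothetical helper.

```cpp
// Approximate rates taken from the estimates above; actual pricing may differ.
constexpr double kWhisperUsdPerMinute = 0.006;
constexpr double kClaudeUsdPerHourLow = 0.03;
constexpr double kClaudeUsdPerHourHigh = 0.05;

struct CostRange {
    double low;   // USD, using the low Claude estimate
    double high;  // USD, using the high Claude estimate
};

// Estimated total cost for a recording of the given length in minutes.
constexpr CostRange estimate_cost_usd(double minutes) {
    const double hours = minutes / 60.0;
    const double whisper = minutes * kWhisperUsdPerMinute;
    return {whisper + hours * kClaudeUsdPerHourLow,
            whisper + hours * kClaudeUsdPerHourHigh};
}
```

For a 90-minute meeting this gives roughly $0.59 to $0.62.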

Project Structure

secondvoice/
├── src/
│   ├── main.cpp                 # Entry point
│   ├── audio/                   # Audio capture & buffer
│   ├── api/                     # Whisper & Claude clients
│   ├── ui/                      # ImGui interface
│   ├── utils/                   # Config & thread-safe queue
│   └── core/                    # Pipeline orchestration
├── docs/                        # Documentation
├── recordings/                  # Output recordings
├── config.json                  # Runtime configuration
├── .env                         # API keys (not committed)
└── CMakeLists.txt               # Build configuration

Development

Building in Debug Mode

cmake -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"
cmake --build build

Running Tests

# TODO: Add tests

Troubleshooting

No audio capture

Check microphone permissions
Verify that PortAudio is properly installed, for example by running pa_devs (if available) to list audio devices
Try a different audio device in the code

API errors

Verify API keys in .env are correct
Check internet connection
Monitor API rate limits
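One common way to cope with rate-limited or flaky requests is to retry with exponential backoff. The README does not say how SecondVoice handles this, so the following is a generic sketch rather than the project's behavior; `retry_with_backoff` and its parameters are hypothetical.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls `request` until it returns true or
// `max_attempts` is reached, doubling the wait between attempts.
inline bool retry_with_backoff(const std::function<bool()>& request,
                               int max_attempts,
                               std::chrono::milliseconds initial_delay) {
    auto delay = initial_delay;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (request()) return true;
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);  // wait before the next attempt
            delay *= 2;
        }
    }
    return false;  // every attempt failed
}
```

In a real client, `request` would wrap a single API call and return false only for errors that are worth retrying, such as a rate-limit response.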

Build errors

Ensure vcpkg is properly set up and VCPKG_ROOT points to its directory
Check all system dependencies are installed
Try cmake --build build --clean-first

Roadmap

Phase 1 - MVP (Current)

✅ Audio capture
✅ Whisper integration
✅ Claude integration
✅ ImGui UI
✅ Stop button

Phase 2 - Enhancement

⬜ Auto-summary post-meeting
⬜ Export transcripts
⬜ Search functionality
⬜ Speaker diarization
⬜ Replay mode

License

See LICENSE file.

Contributing

This is a personal project, but suggestions and bug reports are welcome via issues.

Contact

See docs/SecondVoice.md for project context and motivation.