StillHammer 99a9cc22d7 feat: Add Windows support with .exe build

Complete Windows build support for SecondVoice:

Build System:
- Added CMakePresets.json with Windows and Linux presets
- Created build.bat script for easy Windows builds
- Support for Visual Studio 2019+ with Ninja generator
- Automatic vcpkg integration and dependency installation

Scripts:
- build.bat with Debug/Release modes and clean builds
- Auto-detection of Visual Studio and compiler tools
- User-friendly error messages and setup instructions

Documentation:
- Comprehensive docs/build_windows.md guide
- Step-by-step Windows build instructions
- Troubleshooting section for common issues
- Distribution guide for portable .exe

Updates:
- Updated README.md with cross-platform instructions
- Enhanced .gitignore for Windows build artifacts
- Separate build directories for Windows/Linux

Platform Support:
- Windows 10/11 with Visual Studio 2019+
- Linux with GCC/Clang (existing)
- Shared vcpkg dependencies across platforms

Output:
- Windows: build/windows-release/Release/SecondVoice.exe
- Linux: build/SecondVoice

Next Steps:
- Build on Windows with: build.bat --release
- Executable ready for distribution
- Same config.json and .env work cross-platform

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-20 03:38:18 +08:00


SecondVoice

Real-time Chinese-to-French translation system for live meetings.

Overview

SecondVoice captures audio, transcribes Chinese speech with OpenAI's Whisper API, and translates it to French with Claude in real time. It is intended for following Chinese-language meetings as they happen.

Features

🎤 Real-time audio capture
🗣️ Chinese speech-to-text (Whisper API)
🌐 Chinese to French translation (Claude API)
🖥️ Clean ImGui interface
💾 Full recording saved to disk
⚙️ Configurable chunk sizes and settings

Requirements

Cross-Platform Support

SecondVoice works on Windows and Linux.

Windows

Visual Studio 2019 or later (with C++ tools)
vcpkg package manager
See detailed guide: docs/build_windows.md

Linux

GCC/Clang with C++17 support
System dependencies: libasound2-dev, libgl1-mesa-dev, libglu1-mesa-dev
vcpkg package manager

vcpkg Installation

Linux:

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
export VCPKG_ROOT=$(pwd)

Windows:

git clone https://github.com/microsoft/vcpkg.git C:\vcpkg
cd C:\vcpkg
.\bootstrap-vcpkg.bat
setx VCPKG_ROOT "C:\vcpkg"
REM setx does not update the current session; open a new terminal before building

Setup

Clone the repository

git clone <repository-url>
cd secondvoice

Create .env file (copy from .env.example)

Linux:

cp .env.example .env
nano .env
# Add your API keys:
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...

Windows:

copy .env.example .env
notepad .env
REM Add your API keys:
REM OPENAI_API_KEY=sk-...
REM ANTHROPIC_API_KEY=sk-ant-...

Build the project

Linux:

./build.sh
# Or manually:
# cmake -B build -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake
# cmake --build build -j$(nproc)

Windows:

build.bat --release
REM Or see detailed guide: docs/build_windows.md

Usage

Linux:

cd build
./SecondVoice

Windows:

cd build\windows-release\Release
SecondVoice.exe

The application will:

Open an ImGui window
Start capturing audio from your microphone
Display Chinese transcriptions and French translations in real-time
Keep recording until you click the STOP RECORDING button
Save the full audio recording to recordings/recording_YYYYMMDD_HHMMSS.wav
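
The timestamped file name above can be produced with the standard library alone. The following is a minimal sketch, not the project's actual code: the function name make_recording_path is hypothetical, and it assumes the timestamp is the local time at which the recording is saved.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper (name is an assumption): builds
// "recordings/recording_YYYYMMDD_HHMMSS.wav" from a broken-down time.
std::string make_recording_path(const std::tm& when) {
    char stamp[32];  // "YYYYMMDD_HHMMSS" needs 15 characters plus the terminator
    const std::size_t n =
        std::strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", &when);
    if (n == 0) {
        // strftime reports failure by returning 0; fall back to a fixed name.
        return "recordings/recording_unknown.wav";
    }
    return "recordings/recording_" + std::string(stamp, n) + ".wav";
}
```

A caller would typically obtain the std::tm from std::time and std::localtime when the recording stops.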

Architecture

Audio Capture (PortAudio)
    ↓
Whisper API (Speech-to-Text)
    ↓
Claude API (Translation)
    ↓
ImGui UI (Display)

Threading Model

Thread 1: Audio capture (PortAudio callback)
Thread 2: AI processing (Whisper + Claude API calls)
Thread 3: UI rendering (ImGui + OpenGL)
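
The threads above need a hand-off point, for example between audio capture and AI processing. The project structure lists a thread-safe queue under src/utils/, but its implementation is not shown in this README. The sketch below illustrates one common approach (a mutex plus a condition variable); the class name ThreadSafeQueue and its methods are assumptions, not the project's actual API.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical blocking queue; names are illustrative only.
template <typename T>
class ThreadSafeQueue {
public:
    // Producer side: enqueue an item and wake one waiting consumer.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }

    // Consumer side: block until an item arrives or the queue is closed.
    // Returns std::nullopt once the queue is closed and fully drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) {
            return std::nullopt;
        }
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

    // Signal shutdown, for example when recording stops.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};
```

In this arrangement the processing thread would loop on pop() until it returns std::nullopt. Locking a mutex inside a real-time audio callback can cause glitches, so a lock-free ring buffer is often preferred on the capture side; the sketch favors clarity over that concern.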

Configuration

config.json

{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_duration_seconds": 10
  },
  "whisper": {
    "model": "whisper-1",
    "language": "zh"
  },
  "claude": {
    "model": "claude-haiku-4-20250514",
    "max_tokens": 1024
  }
}
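
To see what the audio settings imply for buffer sizes, the sketch below computes the size of one chunk from the values in config.json. The AudioConfig struct and both function names are hypothetical, and the 2 bytes per sample figure assumes 16-bit PCM, which this README does not state.

```cpp
#include <cstddef>

// Hypothetical struct; field names mirror the "audio" section of config.json.
struct AudioConfig {
    int sample_rate = 16000;          // samples per second
    int channels = 1;                 // mono
    int chunk_duration_seconds = 10;  // length of one chunk
};

// Number of samples in one chunk, counted across all channels.
constexpr std::size_t samples_per_chunk(const AudioConfig& cfg) {
    return static_cast<std::size_t>(cfg.sample_rate) *
           static_cast<std::size_t>(cfg.channels) *
           static_cast<std::size_t>(cfg.chunk_duration_seconds);
}

// Chunk size in bytes; the default assumes 16-bit PCM (2 bytes per sample).
constexpr std::size_t bytes_per_chunk(const AudioConfig& cfg,
                                      std::size_t bytes_per_sample = 2) {
    return samples_per_chunk(cfg) * bytes_per_sample;
}
```

With the defaults shown, one chunk holds 160,000 samples, or 320,000 bytes under the 16-bit assumption.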

.env

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Cost Estimation

Whisper: ~$0.006/minute (~$0.36/hour)
Claude Haiku: ~$0.03-0.05/hour
Total: ~$0.40/hour of recording

Project Structure

secondvoice/
├── src/
│   ├── main.cpp                 # Entry point
│   ├── audio/                   # Audio capture & buffer
│   ├── api/                     # Whisper & Claude clients
│   ├── ui/                      # ImGui interface
│   ├── utils/                   # Config & thread-safe queue
│   └── core/                    # Pipeline orchestration
├── docs/                        # Documentation
├── recordings/                  # Output recordings
├── config.json                  # Runtime configuration
├── .env                         # API keys (not committed)
└── CMakeLists.txt              # Build configuration

Development

Building in Debug Mode

cmake -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake
cmake --build build

Running Tests

# TODO: Add tests

Troubleshooting

No audio capture

Check microphone permissions
Verify that PortAudio is installed correctly, for example by running pa_devs (if available)
Try a different audio device in code

API errors

Verify API keys in .env are correct
Check internet connection
Monitor API rate limits
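
When a request fails because of a rate limit, one common remedy is to retry after an increasing delay. The sketch below computes such delays with capped exponential backoff. It is an illustration only: the function name backoff_delay and the 500 ms and 8 s defaults are assumptions, and nothing in this README says SecondVoice retries requests this way.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from `base` on each attempt and never exceeding `cap`.
std::chrono::milliseconds backoff_delay(
    int attempt,
    std::chrono::milliseconds base = std::chrono::milliseconds(500),
    std::chrono::milliseconds cap = std::chrono::milliseconds(8000)) {
    std::chrono::milliseconds delay = base;
    for (int i = 0; i < attempt && delay < cap; ++i) {
        delay *= 2;
    }
    return std::min(delay, cap);
}
```

A caller would sleep for backoff_delay(attempt) between attempts and give up after a fixed number of retries.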

Build errors

Ensure vcpkg is properly set up
Check all system dependencies are installed
Try cmake --build build --clean-first

Roadmap

Phase 1 - MVP (Current)

✅ Audio capture
✅ Whisper integration
✅ Claude integration
✅ ImGui UI
✅ Stop button

Phase 2 - Enhancement

⬜ Auto-summary post-meeting
⬜ Export transcripts
⬜ Search functionality
⬜ Speaker diarization
⬜ Replay mode

License

See LICENSE file.

Contributing

This is a personal project, but suggestions and bug reports are welcome via issues.

Contact

See docs/SecondVoice.md for project context and motivation.
