Major features:
- Session logging system with detailed segment tracking (audio files, metadata, latencies)
- Input gain control (0.5x-5.0x amplifier) with soft clipping
- Context-aware Whisper prompts using recent transcriptions
- Comprehensive segment metadata (RMS, peak, duration, timestamps)
- API latency measurements for Whisper and Claude
- Audio hash-based duplicate detection
- Hallucination filtering with detailed logging
Changes:
- Add SessionLogger class for structured session data export
- Apply input gain before VAD and denoising (not just raw input)
- Enhanced Pipeline with segment tracking and error logging
- New UI control for input gain amplifier
- Sessions saved to sessions/ directory with transcripts/ export
- Improved Whisper prompt in config.json (French instructions)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add RNNoise neural network audio denoising (16kHz↔48kHz resampling)
- Add transient suppressor to filter claps/clicks/pops before RNNoise
- VAD now works on FILTERED audio (not raw) to avoid false triggers
- Real-time denoised audio level display in UI
- Save denoised audio previews in Opus format (.ogg)
- Add extensive Whisper hallucination filter (Tingting, music, etc.)
- Add "Clear" button to reset accumulated translations
- Double VAD thresholds (0.02/0.08) for less sensitivity
- Update Claude prompt to handle offensive content gracefully
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace cpp-httplib with native WinHTTP for HTTPS support
- Switch from whisper-1 to gpt-4o-mini-transcribe model
- Use Opus/OGG encoding instead of WAV (~10x smaller files)
- Implement sliding window audio capture with overlap
- Add transcription deduplication for overlapping segments
- Add Voice Activity Detection (VAD) to filter silence/noise
- Filter Whisper hallucinations (Amara.org, etc.)
- Add UTF-8 console support for Chinese characters
- Add Chinese font loading in ImGui
- Make Claude responses concise (translation only, no explanations)
- Configurable window size, font size, chunk duration/step
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace WAV with Opus encoding for Whisper API uploads:
- Add libopus & libogg via FetchContent (no vcpkg dependency)
- Implement AudioBuffer::saveToOpus() with Ogg container
- Configure Opus encoder for voice (VOIP mode, 32kbps VBR)
- Update WhisperClient to use Opus format (audio/ogg)
- Fix Windows temp file path compatibility
Benefits:
- 46x smaller files (37KB vs 1.7MB for 10s audio)
- Reduced API costs and bandwidth
- Faster uploads for real-time translation
- Whisper API fully supports Opus/Ogg format
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PROBLÈME RÉSOLU: Les shaders ImGui compilent maintenant avec succès!
Changements majeurs:
- Remplacé vcpkg ImGui par FetchContent (compilation from source)
- Créé wrapper GLAD pour ImGui (imgui_opengl3_glad.cpp)
- Ajout de makeContextCurrent() pour gérer le contexte OpenGL multi-thread
- Release du contexte dans initialize(), puis rendu current dans uiThread()
Root Cause Analysis:
1. Rendering s'exécute dans uiThread() (thread séparé)
2. Contexte OpenGL créé dans thread principal n'était pas accessible
3. glCreateShader retournait 0 avec GL_INVALID_OPERATION (erreur 1282)
4. Solution: Transfer du contexte OpenGL du thread principal vers uiThread
Debugging profond:
- Ajout de logs debug dans ImGui pour tracer glCreateShader
- Découvert que handle=0 indiquait échec de création (pas compilation)
- Identifié erreur WGL \"ressource en cours d'utilisation\" = contexte locked
Fichiers modifiés:
- vcpkg.json: Supprimé imgui
- CMakeLists.txt: FetchContent pour ImGui + imgui_backends library
- src/imgui_opengl3_glad.cpp: Nouveau wrapper GLAD
- src/ui/TranslationUI.{h,cpp}: Ajout makeContextCurrent()
- src/core/Pipeline.cpp: Appel makeContextCurrent() dans uiThread()
- build/.../imgui_impl_opengl3.cpp: Debug logs (temporaire)
Résultat: UI fonctionne! NVIDIA RTX 4060 GPU, OpenGL 3.3.0, shaders compilent
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
- Add GLAD dependency via vcpkg for proper OpenGL function loading
- Force NVIDIA GPU usage with game-style exports (NvOptimusEnablement)
- Create working console version (SecondVoice_Console.exe)
- Add dual executable build (UI + Console versions)
- Update to OpenGL 4.6 Core Profile with GLSL 460
- Add GPU detection and logging
- Fix GLFW header conflicts with GLFW_INCLUDE_NONE
Note: OpenGL shaders still failing to compile despite GLAD integration.
Console version is fully functional for audio capture and translation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Build fixes:
- Add missing includes (<cstdint>, <iomanip>, <sstream>, <string>, <vector>)
- Fix unused parameter warnings with (void) casts
- Fix cpp-httplib API: Use UploadFormDataItems instead of MultipartFormDataItems
- Fix portaudio linking: Use portaudio_static target instead of portaudio
All modules now compile without errors. Executable built successfully (13MB).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>