Reorganize repository structure

- Move all Python scripts to tools/ directory
- Move documentation files to docs/ directory
- Create exams/ and homework/ directories for future use
- Remove temporary test file (page1_preview.png)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
parent acbe1b4769
commit a61a32b57f
docs/chinese_audio_tts_pipeline.md (normal file, 229 lines added)
@@ -0,0 +1,229 @@

# Chinese Audio to Text Extractor - Simple Transcription

## Objective

Extract the text from MP3 recordings of Chinese lessons using Whisper.

### Problem solved

- Need to recover the text content of audio lessons
- Simple, fast MP3 → text conversion

### Solution

Minimal pipeline: MP3 → Whisper → plain text

---

## Pipeline Architecture

```
┌─────────────────────────────────────────┐
│  INPUT: cours_chinois.mp3 (45min)       │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│  Transcription (Whisper)                │
│  ├─ Model: whisper-1 (OpenAI API)       │
│  ├─ Language: zh (Mandarin)             │
│  └─ Output: transcript.txt              │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│  OUTPUT: cours_chinois.txt              │
│  你好。我叫Alexis。今天我们学习...      │
└─────────────────────────────────────────┘
```

---

## Python Implementation Plan

### Project structure

```
chinese-transcriber/
├── transcribe.py      # Main script
├── input/             # Source MP3 files
├── output/            # Generated .txt files
├── .env               # API key
└── requirements.txt
```

### Dependencies (requirements.txt)

```txt
openai>=1.0.0          # Whisper API
python-dotenv>=1.0.0   # Env variables
```

### Main Script (transcribe.py)

```python
"""
Simple MP3 → TXT transcription with Whisper
"""
import openai
from pathlib import Path
from dotenv import load_dotenv
import os


def transcribe_audio(audio_path: Path, api_key: str) -> str:
    """
    Transcribe a Chinese-language MP3 file.

    Args:
        audio_path: Path to the MP3 file
        api_key: OpenAI API key

    Returns:
        Transcribed text
    """
    client = openai.OpenAI(api_key=api_key)

    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            language="zh",          # Force Mandarin
            response_format="text"  # Plain text
        )

    return transcript


def main():
    # Load API key
    load_dotenv()
    api_key = os.getenv("OPENAI_API_KEY")

    if not api_key:
        print("Error: OPENAI_API_KEY not found in .env")
        return

    # Setup paths
    input_dir = Path("input")
    output_dir = Path("output")
    output_dir.mkdir(exist_ok=True)

    # Get MP3 files (sorted so they are processed in a stable order)
    mp3_files = sorted(input_dir.glob("*.mp3"))

    if not mp3_files:
        print(f"No MP3 files found in {input_dir}/")
        return

    print(f"Found {len(mp3_files)} MP3 files to transcribe\n")

    # Process each file; a failure on one file does not stop the others
    for mp3_file in mp3_files:
        print(f"Processing: {mp3_file.name}...")

        try:
            # Transcribe
            text = transcribe_audio(mp3_file, api_key)

            # Save to TXT
            output_path = output_dir / f"{mp3_file.stem}.txt"
            with open(output_path, "w", encoding="utf-8") as f:
                f.write(text)

            print(f"✓ Saved to: {output_path}\n")

        except Exception as e:
            print(f"✗ Error: {e}\n")

    print("=== Transcription completed ===")


if __name__ == "__main__":
    main()
```

---

### Environment Variables (.env)

```bash
OPENAI_API_KEY=sk-...
```

---

## Cost Estimate

### For 10 hours of audio lessons

| Service | Cost | Calculation |
|---------|------|-------------|
| **Whisper API** | **$3.60** | 10 h × 60 min/h × $0.006/min |

**Very affordable** for simple text extraction.

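
The arithmetic behind the table can be checked with a few lines of Python. This is a minimal sketch: the $0.006 per minute rate is simply the figure quoted in the table and may have changed since, and `estimate_cost` is a hypothetical helper that is not part of `transcribe.py`.

```python
# Rate quoted in the table above (USD per minute of audio).
# Assumption: verify it against current pricing before relying on it.
PRICE_PER_MINUTE_USD = 0.006


def estimate_cost(hours: float, price_per_minute: float = PRICE_PER_MINUTE_USD) -> float:
    """Return the estimated transcription cost in USD for `hours` of audio."""
    minutes = hours * 60
    return round(minutes * price_per_minute, 2)


if __name__ == "__main__":
    print(f"10 h of audio: ${estimate_cost(10):.2f}")
    print(f"One 45 min lesson: ${estimate_cost(0.75):.2f}")
```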

---

## Usage

### Installation

```bash
mkdir chinese-transcriber
cd chinese-transcriber

# Create the directory structure
mkdir input output

# Install dependencies
pip install openai python-dotenv

# Create .env
echo "OPENAI_API_KEY=sk-..." > .env

# Copy in the transcribe.py script
```

### Running

```bash
# 1. Put your MP3 files in input/
cp /path/to/cours*.mp3 input/

# 2. Run the script
python transcribe.py

# Output:
# Found 3 MP3 files to transcribe
#
# Processing: cours_1.mp3...
# ✓ Saved to: output/cours_1.txt
#
# Processing: cours_2.mp3...
# ✓ Saved to: output/cours_2.txt
# ...
```

### Output

`.txt` files containing raw Chinese text:

```
output/cours_1.txt:
你好。我叫Alexis。今天我们学习汉语。
第一课是关于问候的。你好吗?我很好,谢谢。
...
```
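
A quick sanity check on a generated file is to count how many of its characters are actually Chinese; a count near zero would suggest the transcription came back in the wrong language. The sketch below is an illustration and not part of `transcribe.py`: `count_cjk` is a hypothetical helper, and it only looks at the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), so punctuation such as `。` and rarer characters outside that block are not counted.

```python
def count_cjk(text: str) -> int:
    """Count characters in the basic CJK Unified Ideographs block (U+4E00-U+9FFF)."""
    return sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")


if __name__ == "__main__":
    # Sample line taken from the example output above.
    sample = "你好。我叫Alexis。今天我们学习汉语。"
    print(f"{count_cjk(sample)} of {len(sample)} characters are Chinese ideographs")
```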

---

## Status

✅ **SIMPLE PLAN - READY TO USE**

Minimal script for MP3 → TXT text extraction.

**Next steps, if needed:**

1. Test on your Chinese MP3 files
2. If automatic splitting is needed, see the full TTS pipeline options (commented out in earlier versions)
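
Before deciding whether splitting is needed, it can help to see which files are too large to upload in a single request. The sketch below assumes a 25 MB per-file upload limit for the transcription endpoint, which should be verified against the current API documentation. `MAX_UPLOAD_BYTES` and `files_needing_split` are hypothetical names that are not part of `transcribe.py`, and the helper only reports oversized files; it does not split them.

```python
from pathlib import Path

# Assumed per-file upload limit (25 MB); check the current API documentation.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def files_needing_split(input_dir: Path, limit: int = MAX_UPLOAD_BYTES) -> list[Path]:
    """Return the MP3 files in `input_dir` whose size exceeds `limit` bytes."""
    return sorted(p for p in input_dir.glob("*.mp3") if p.stat().st_size > limit)


if __name__ == "__main__":
    for path in files_needing_split(Path("input")):
        size_mb = path.stat().st_size / (1024 * 1024)
        print(f"{path.name}: {size_mb:.1f} MB exceeds the assumed limit")
```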

---

*Created: October 27, 2025*

*Stack: Python 3.10+, Whisper API only*