Compare commits
2 Commits
23bb4cd2d9...ce32ae3134
| Author | SHA1 | Date |
|---|---|---|
| | ce32ae3134 | |
| | d4ac6f5859 | |
9
.claude/settings.local.json
Normal file
@@ -0,0 +1,9 @@
{
  "permissions": {
    "allow": [
      "Bash(npm run server:*)"
    ],
    "deny": [],
    "ask": []
  }
}
128
CLAUDE.md
Normal file
@@ -0,0 +1,128 @@
# Video to MP3 Transcriptor - Instructions for Claude

## About the project
This project is a Node.js/Express API that downloads YouTube videos as MP3, then transcribes, translates, and summarizes them.

## Documentation

### API documentation
The complete API documentation is in **`docs/API.md`**.

**IMPORTANT**: This documentation must ALWAYS be kept up to date. Whenever an endpoint is modified, added, or removed, the documentation must be updated accordingly.

### Documentation maintenance responsibilities

When you modify the code, you MUST update `docs/API.md` if:
- A new endpoint is added
- An existing endpoint is modified (parameters, responses, etc.)
- An endpoint is removed
- The default models change
- New parameters are added
- The response format changes

## Project structure

```
videotoMP3Transcriptor/
├── docs/
│   └── API.md                 # Complete API documentation
├── src/
│   ├── server.js              # Express server and API routes
│   ├── services/
│   │   ├── youtube.js         # YouTube download
│   │   ├── transcription.js   # OpenAI transcription
│   │   ├── translation.js     # GPT translation
│   │   └── summarize.js       # GPT-5.1 summarization
│   └── cli.js                 # Command-line interface
├── public/                    # Web interface (if present)
├── output/                    # Default output directory
├── .env                       # Environment variables
└── package.json
```

## Configuration

### Server port
- Default port: **8888**
- Configurable via `process.env.PORT` in `.env`

### Default models
- **Transcription**: `gpt-4o-mini-transcribe`
- **Summarization**: `gpt-5.1`
- **Translation**: `gpt-4o-mini` (hardcoded)

### Required environment variables
```env
OPENAI_API_KEY=sk-...
PORT=8888              # optional
OUTPUT_DIR=./output    # optional
```

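As a minimal sketch of the defaults listed in this configuration section, the helper below resolves the port, output directory, and API key presence from an environment-like object. The name `resolveConfig` and the shape of the returned object are assumptions made for illustration; the actual startup logic in `src/server.js` is not visible in this diff.

```javascript
// Hypothetical helper (not from the repository): resolve server settings
// from environment variables, falling back to the documented defaults.
function resolveConfig(env) {
  const parsedPort = Number.parseInt(env.PORT ?? '', 10);
  const portIsValid = Number.isInteger(parsedPort) && parsedPort > 0 && parsedPort < 65536;
  return {
    // Use PORT only when it is a valid TCP port; otherwise default to 8888
    port: portIsValid ? parsedPort : 8888,
    // OUTPUT_DIR is optional and defaults to ./output
    outputDir: env.OUTPUT_DIR || './output',
    // The OpenAI key has no default; callers can check this flag up front
    hasApiKey: Boolean(env.OPENAI_API_KEY),
  };
}

console.log(resolveConfig({ PORT: '3000', OPENAI_API_KEY: 'sk-example' }));
```

In a real server the argument would typically be `process.env`; passing the object explicitly keeps the helper easy to test.
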
## Important commands

```bash
# Start the server
npm run server

# Start the CLI
npm run cli

# Install dependencies
npm install
```

## Points of attention

### outputPath parameter
All endpoints now support an optional `outputPath` parameter to specify a custom output directory. If it is not specified, the default directory `OUTPUT_DIR` is used.

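The fallback rule for `outputPath` can be sketched as a small pure function. `pickOutputDir` is a hypothetical name, and treating a blank string as "not specified" is an assumption; the real handling lives in code that this diff does not show.

```javascript
// Hypothetical helper (not from the repository): choose the directory a
// request should write to. `requestOutputPath` is the optional `outputPath`
// value taken from the query string, JSON body, or form data.
function pickOutputDir(requestOutputPath, env) {
  const custom = typeof requestOutputPath === 'string' ? requestOutputPath.trim() : '';
  // A non-empty custom path wins over the configured default
  if (custom !== '') return custom;
  // Otherwise fall back to OUTPUT_DIR, then to the documented ./output
  return env.OUTPUT_DIR || './output';
}

console.log(pickOutputDir('./custom/path', {}));
console.log(pickOutputDir(undefined, { OUTPUT_DIR: './output' }));
```

Because `outputPath` is supplied by the caller, a production server would probably also want to validate it (for example, restrict it to an allowed base directory) before writing files.
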
### Available transcription models
- `gpt-4o-mini-transcribe` (default) - Fast and economical
- `gpt-4o-transcribe` - Higher quality
- `whisper-1` - Original Whisper model (supports more formats)

### Output formats
- **Transcription**: txt, json, srt, vtt (depending on the model)
- **Translation**: txt
- **Summary**: txt

## Development rules

1. **Documentation first**: Before modifying an endpoint, check `docs/API.md`
2. **After modification**: Update `docs/API.md` immediately
3. **Tests**: Restart the server after each modification
4. **Consistency**: Keep the same response structure for all similar endpoints

## Endpoint architecture

### Streaming endpoints (SSE)
- `/download-stream`
- `/process-stream`
- `/summarize-stream`

These endpoints use Server-Sent Events to send real-time progress updates.

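For readers unfamiliar with the Server-Sent Events wire format used by the streaming endpoints, here is a minimal sketch of how one frame could be serialized. `formatSseEvent` is a hypothetical helper, and whether the server emits named `event:` lines or puts the event type inside the JSON payload is an assumption, since the `src/server.js` diff is suppressed.

```javascript
// Hypothetical helper (not from the repository): serialize one SSE frame.
// A frame is an optional `event:` line, a `data:` line, and a blank line
// that tells the client the frame is complete.
function formatSseEvent(eventName, payload) {
  const lines = [];
  if (eventName) lines.push(`event: ${eventName}`);
  // JSON.stringify without indentation never emits newlines, so one data line suffices
  lines.push(`data: ${JSON.stringify(payload)}`);
  return lines.join('\n') + '\n\n';
}

process.stdout.write(formatSseEvent('progress', { percent: 50 }));
```

In an Express handler, frames like this would typically be sent with `res.write(...)` after setting the `Content-Type` header to `text/event-stream`.
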
### Non-streaming endpoints
- `/download`
- `/process`
- All POST endpoints with file uploads

These endpoints return a single response once processing is complete.

## Maintenance

When adding new features:
1. Implement the feature in the appropriate service (`src/services/`)
2. Add the routes in `src/server.js`
3. **Update `docs/API.md` IMMEDIATELY**
4. Test the endpoint with curl or Postman
5. Verify that the documentation is clear and complete

## Important notes

- The server must always run on port **8888**
- OpenAI API keys are required for transcription/translation/summarization
- The `output/` directory is created automatically if it does not exist
- Uploaded files are stored in `OUTPUT_DIR`
- YouTube videos are downloaded as MP3 automatically
561
docs/API.md
Normal file
@@ -0,0 +1,561 @@
# API Documentation - Video to MP3 Transcriptor

## Base URL
```
http://localhost:8888
```

## Table of Contents
- [Health & Info](#health--info)
- [Download Endpoints](#download-endpoints)
- [Transcription Endpoints](#transcription-endpoints)
- [Translation Endpoints](#translation-endpoints)
- [Summarization Endpoints](#summarization-endpoints)
- [File Management](#file-management)

---

## Health & Info

### GET /health
Health check endpoint.

**Response:**
```json
{
  "status": "ok",
  "timestamp": "2025-11-28T12:00:00.000Z"
}
```

### GET /api
Get API information and available endpoints.

**Response:**
```json
{
  "name": "Video to MP3 Transcriptor API",
  "version": "1.0.0",
  "endpoints": { ... }
}
```

### GET /info
Get information about a YouTube video or playlist.

**Query Parameters:**
- `url` (required): YouTube URL

**Example:**
```bash
curl "http://localhost:8888/info?url=https://www.youtube.com/watch?v=VIDEO_ID"
```

**Response:**
```json
{
  "success": true,
  "title": "Video Title",
  "type": "video",
  "duration": 300,
  "channel": "Channel Name",
  "videoCount": 1
}
```

---

## Download Endpoints

### GET /download-stream
Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.

**Query Parameters:**
- `url` (required): YouTube URL
- `outputPath` (optional): Custom output directory path

**Example:**
```bash
curl "http://localhost:8888/download-stream?url=https://www.youtube.com/watch?v=VIDEO_ID"
```

**SSE Events:**
- `info`: Video/playlist information
- `progress`: Download progress updates
- `video-complete`: Individual video completion
- `complete`: All downloads complete
- `error`: Error occurred

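A client that reads this stream without the browser `EventSource` API has to split the incoming text into frames itself. The sketch below shows one way to do that; `parseSseBuffer` is a hypothetical helper, and it assumes the server sends named events (`event: progress`, and so on) with a JSON `data:` payload, which this document does not spell out.

```javascript
// Hypothetical helper (not from the repository): parse complete SSE frames
// out of a text buffer. Returns the parsed events plus any trailing partial
// frame, which the caller should prepend to the next chunk it receives.
function parseSseBuffer(buffer) {
  const frames = buffer.replace(/\r\n/g, '\n').split('\n\n');
  const rest = frames.pop(); // the last piece is incomplete (or empty)
  const events = [];
  for (const frame of frames) {
    let event = 'message'; // default event name when no `event:` line is present
    const dataLines = [];
    for (const line of frame.split('\n')) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) dataLines.push(line.slice(5).replace(/^ /, ''));
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join('\n') });
  }
  return { events, rest };
}

const sample = 'event: progress\ndata: {"percent":10}\n\nevent: complete\ndata: {}\n\ndata: par';
const { events, rest } = parseSseBuffer(sample);
for (const e of events) console.log(e.event, JSON.parse(e.data));
console.log('buffered for next chunk:', JSON.stringify(rest));
```

Keeping the unparsed remainder matters because network chunks do not align with frame boundaries.
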
### POST /download
Download YouTube video(s) to MP3 (non-streaming).

**Body Parameters:**
```json
{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "outputPath": "./custom/path"  // optional
}
```

**Example:**
```bash
curl -X POST http://localhost:8888/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'
```

**Response:**
```json
{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "successCount": 1,
  "failCount": 0,
  "videos": [
    {
      "success": true,
      "title": "Video Title",
      "filePath": "./output/video.mp3",
      "fileUrl": "/files/video.mp3"
    }
  ]
}
```

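The same request can be issued from JavaScript. The sketch below only builds the arguments for `fetch`, so it runs without a server; `buildDownloadRequest` is a hypothetical helper, and the commented usage assumes Node 18+ (global `fetch`) and a server listening on the documented base URL.

```javascript
// Hypothetical helper (not from the repository): build the arguments for
// fetch() to call POST /download with a JSON body.
function buildDownloadRequest(videoUrl, outputPath, baseUrl = 'http://localhost:8888') {
  const body = { url: videoUrl };
  if (outputPath) body.outputPath = outputPath; // optional, omitted when not set
  return {
    endpoint: new URL('/download', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

const request = buildDownloadRequest('https://www.youtube.com/watch?v=VIDEO_ID');
console.log(request.endpoint, request.init.body);

// Usage (not executed here; requires a running server):
//   const response = await fetch(request.endpoint, request.init);
//   const result = await response.json();
//   console.log(result.successCount, result.failCount);
```
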
---

## Transcription Endpoints

### POST /transcribe
Transcribe an existing audio file.

**Body Parameters:**
```json
{
  "filePath": "./output/audio.mp3",
  "language": "en",                   // optional (auto-detect if not specified)
  "format": "txt",                    // optional: txt, json, srt, vtt
  "model": "gpt-4o-mini-transcribe",  // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
  "outputPath": "./custom/path"       // optional
}
```

**Available Models:**
- `gpt-4o-mini-transcribe` (default) - Fast and cost-effective
- `gpt-4o-transcribe` - Higher quality
- `whisper-1` - Original Whisper model (supports more formats)

**Example:**
```bash
curl -X POST http://localhost:8888/transcribe \
  -H "Content-Type: application/json" \
  -d '{
    "filePath": "./output/audio.mp3",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'
```

**Response:**
```json
{
  "success": true,
  "filePath": "./output/audio.mp3",
  "transcriptionPath": "./output/audio.txt",
  "transcriptionUrl": "/files/audio.txt",
  "text": "Transcribed text content..."
}
```

### POST /upload-transcribe
Upload and transcribe audio files.

**Form Data:**
- `files`: Audio file(s) (multiple files supported, max 50)
- `language`: Language code (optional)
- `model`: Transcription model (optional, default: gpt-4o-mini-transcribe)
- `outputPath`: Custom output directory (optional)

**Example:**
```bash
curl -X POST http://localhost:8888/upload-transcribe \
  -F "files=@audio1.mp3" \
  -F "files=@audio2.mp3" \
  -F "language=en" \
  -F "model=gpt-4o-mini-transcribe"
```

**Response:**
```json
{
  "success": true,
  "totalFiles": 2,
  "successCount": 2,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "audio1.mp3",
      "transcriptionPath": "./output/audio1.txt",
      "transcriptionUrl": "/files/audio1.txt",
      "text": "Transcription..."
    }
  ]
}
```

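For programmatic uploads, the multipart body from the curl example can be assembled with the standard `FormData` and `Blob` globals (available in Node 18+ and in browsers). `buildUploadForm` is a hypothetical helper, and enforcing the 50-file limit on the client side is a design choice of this sketch rather than documented server behavior.

```javascript
// Hypothetical helper (not from the repository): build multipart form data
// for POST /upload-transcribe. `files` is an array of { name, data }, where
// `data` is a Blob, Buffer, typed array, or string.
function buildUploadForm(files, options = {}) {
  if (files.length === 0 || files.length > 50) {
    throw new Error('Expected between 1 and 50 files');
  }
  const form = new FormData();
  for (const file of files) {
    // Repeating the `files` field is how multiple uploads are sent
    form.append('files', new Blob([file.data]), file.name);
  }
  for (const key of ['language', 'model', 'outputPath']) {
    if (options[key]) form.append(key, options[key]); // optional fields
  }
  return form;
}

const form = buildUploadForm(
  [{ name: 'audio1.mp3', data: 'placeholder bytes' }],
  { language: 'en', model: 'gpt-4o-mini-transcribe' },
);
console.log(form.getAll('files').length, form.get('language'));

// Usage (not executed here; requires a running server):
//   await fetch('http://localhost:8888/upload-transcribe', { method: 'POST', body: form });
```

When a `FormData` body is passed to `fetch`, the multipart `Content-Type` header (including the boundary) is set automatically, so it should not be set by hand.
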
### GET /process-stream
Download + Transcribe with SSE progress updates.

**Query Parameters:**
- `url` (required): YouTube URL
- `language` (optional): Language code
- `model` (optional): Transcription model (default: gpt-4o-mini-transcribe)
- `outputPath` (optional): Custom output directory

**Example:**
```bash
curl "http://localhost:8888/process-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&language=en&model=gpt-4o-mini-transcribe"
```

**SSE Events:**
- `info`: Video information
- `progress`: Progress updates (downloading or transcribing)
- `video-complete`: Download complete
- `transcribe-complete`: Transcription complete
- `complete`: All operations complete
- `error`: Error occurred

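When the YouTube URL itself contains `&` (for example a playlist link with a `list=` parameter), passing it unencoded in the query string would cause everything after the `&` to be parsed as separate parameters. The sketch below builds the request URL with percent-encoding; `buildProcessStreamUrl` is a hypothetical helper name.

```javascript
// Hypothetical helper (not from the repository): build a /process-stream URL,
// percent-encoding every query value so nested URLs survive intact.
function buildProcessStreamUrl(videoUrl, options = {}, baseUrl = 'http://localhost:8888') {
  const endpoint = new URL('/process-stream', baseUrl);
  endpoint.searchParams.set('url', videoUrl);
  for (const key of ['language', 'model', 'outputPath']) {
    if (options[key]) endpoint.searchParams.set(key, options[key]); // optional
  }
  return endpoint.toString();
}

console.log(
  buildProcessStreamUrl('https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID', {
    language: 'en',
    model: 'gpt-4o-mini-transcribe',
  }),
);
```

The same approach applies to the other GET endpoints that take a `url` query parameter.
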
### POST /process
Download + Transcribe (non-streaming).

**Body Parameters:**
```json
{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "language": "en",                   // optional
  "format": "txt",                    // optional
  "model": "gpt-4o-mini-transcribe",  // optional
  "outputPath": "./custom/path"       // optional
}
```

**Example:**
```bash
curl -X POST http://localhost:8888/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'
```

**Response:**
```json
{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "downloadedCount": 1,
  "transcribedCount": 1,
  "results": [
    {
      "title": "Video Title",
      "downloadSuccess": true,
      "audioPath": "./output/video.mp3",
      "audioUrl": "/files/video.mp3",
      "transcriptionSuccess": true,
      "transcriptionPath": "./output/video.txt",
      "transcriptionUrl": "/files/video.txt",
      "text": "Transcription..."
    }
  ]
}
```

---

## Translation Endpoints

### GET /languages
Get available translation languages.

**Response:**
```json
{
  "languages": {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "zh": "Chinese",
    "ja": "Japanese",
    ...
  }
}
```

### POST /translate
Translate text.

**Body Parameters:**
```json
{
  "text": "Text to translate",
  "targetLang": "fr",  // required: target language code
  "sourceLang": "en"   // optional: source language (auto-detect if not specified)
}
```

**Example:**
```bash
curl -X POST http://localhost:8888/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how are you?",
    "targetLang": "fr"
  }'
```

**Response:**
```json
{
  "success": true,
  "originalText": "Hello, how are you?",
  "translatedText": "Bonjour, comment allez-vous ?",
  "targetLanguage": "French",
  "sourceLanguage": "auto-detected",
  "chunks": 1
}
```

### POST /translate-file
Translate uploaded text files.

**Form Data:**
- `files`: Text file(s) (.txt, multiple files supported, max 50)
- `targetLang`: Target language code (required)
- `sourceLang`: Source language code (optional)
- `outputPath`: Custom output directory (optional)

**Example:**
```bash
curl -X POST http://localhost:8888/translate-file \
  -F "files=@document.txt" \
  -F "targetLang=fr" \
  -F "sourceLang=en"
```

**Response:**
```json
{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "document.txt",
      "translationPath": "./output/document_fr.txt",
      "translationUrl": "/files/document_fr.txt",
      "translatedText": "Translated content..."
    }
  ]
}
```

---

## Summarization Endpoints

### GET /summary-styles
Get available summary styles.

**Response:**
```json
{
  "styles": {
    "concise": "A brief summary capturing main points",
    "detailed": "A comprehensive summary with nuances",
    "bullet": "Key points as bullet points"
  }
}
```

### POST /summarize
Summarize text using GPT-5.1.

**Body Parameters:**
```json
{
  "text": "Long text to summarize...",
  "style": "concise",   // optional: concise (default), detailed, bullet
  "language": "same",   // optional: 'same' (default) or language code
  "model": "gpt-5.1"    // optional: default is gpt-5.1
}
```

**Example:**
```bash
curl -X POST http://localhost:8888/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Long article content...",
    "style": "bullet",
    "language": "same"
  }'
```

**Response:**
```json
{
  "success": true,
  "summary": "Summary content...",
  "model": "gpt-5.1",
  "style": "bullet",
  "inputLength": 5000,
  "chunks": 1
}
```

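The `chunks` field in the response suggests that long inputs are split before being summarized, and the summarization service in this diff defines `MAX_CHUNK_CHARS = 30000`. How the service actually splits text is not visible here, so the following is only an illustrative sketch of one possible strategy (prefer paragraph boundaries, hard-split oversized paragraphs); `splitIntoChunks` is a hypothetical name.

```javascript
// Hypothetical helper (not from the repository): split text into chunks of at
// most `maxChars`, preferring paragraph boundaries and hard-splitting any
// single paragraph that is too long to fit on its own.
function splitIntoChunks(text, maxChars = 30000) {
  const chunks = [];
  let current = '';
  for (const paragraph of text.split(/\n{2,}/)) {
    let piece = paragraph;
    // Hard-split paragraphs that cannot fit in one chunk
    while (piece.length > maxChars) {
      if (current) { chunks.push(current); current = ''; }
      chunks.push(piece.slice(0, maxChars));
      piece = piece.slice(maxChars);
    }
    if (!piece) continue;
    const candidate = current ? `${current}\n\n${piece}` : piece;
    if (candidate.length > maxChars) {
      // Adding this paragraph would overflow: close the current chunk first
      chunks.push(current);
      current = piece;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const demo = splitIntoChunks('First paragraph.\n\nSecond paragraph.', 20);
console.log(demo.length, demo);
```

Each chunk would then be summarized separately and the partial summaries combined; that combining step is outside the scope of this sketch.
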
### POST /summarize-file
Summarize uploaded text files using GPT-5.1.

**Form Data:**
- `files`: Text file(s) (.txt, multiple files supported, max 50)
- `style`: Summary style (optional, default: concise)
- `language`: Output language (optional, default: same)
- `model`: AI model (optional, default: gpt-5.1)
- `outputPath`: Custom output directory (optional)

**Example:**
```bash
curl -X POST http://localhost:8888/summarize-file \
  -F "files=@article.txt" \
  -F "style=detailed" \
  -F "language=same"
```

**Response:**
```json
{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "article.txt",
      "summaryPath": "./output/article_summary.txt",
      "summaryUrl": "/files/article_summary.txt",
      "summary": "Summary content...",
      "model": "gpt-5.1",
      "chunks": 1
    }
  ]
}
```

### GET /summarize-stream
Full pipeline: Download -> Transcribe -> Summarize with SSE progress.

**Query Parameters:**
- `url` (required): YouTube URL
- `style` (optional): Summary style (default: concise)
- `language` (optional): Output language (default: same)
- `model` (optional): Transcription model (default: gpt-4o-mini-transcribe)
- `outputPath` (optional): Custom output directory

**Example:**
```bash
curl "http://localhost:8888/summarize-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&style=bullet&model=gpt-4o-mini-transcribe"
```

**SSE Events:**
- `info`: Video information
- `progress`: Progress updates (downloading, transcribing, or summarizing)
- `video-complete`: Download complete
- `transcribe-complete`: Transcription complete
- `summarize-complete`: Summary complete
- `complete`: All operations complete
- `error`: Error occurred

---

## File Management

### GET /files-list
List all downloaded/generated files.

**Example:**
```bash
curl http://localhost:8888/files-list
```

**Response:**
```json
{
  "files": [
    {
      "name": "video.mp3",
      "url": "/files/video.mp3",
      "path": "./output/video.mp3"
    },
    {
      "name": "video.txt",
      "url": "/files/video.txt",
      "path": "./output/video.txt"
    }
  ]
}
```

### GET /files/:filename
Serve a specific file.

**Example:**
```bash
curl http://localhost:8888/files/video.mp3 --output video.mp3
```

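File names derived from video titles often contain spaces or characters such as `#` and `?`, which change the meaning of a URL if left unencoded. The sketch below builds a safe URL for `GET /files/:filename`; `buildFileUrl` is a hypothetical helper, and it assumes the server decodes the path segment in the usual way.

```javascript
// Hypothetical helper (not from the repository): build the URL for a file
// served under /files/:filename.
function buildFileUrl(filename, baseUrl = 'http://localhost:8888') {
  // encodeURIComponent keeps spaces, '#', '?' and '/' from altering the path
  return `${baseUrl.replace(/\/+$/, '')}/files/${encodeURIComponent(filename)}`;
}

console.log(buildFileUrl('video.mp3'));
console.log(buildFileUrl('My Video #1.mp3'));
```
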
---

## Error Responses

All endpoints return error responses in the following format:

```json
{
  "error": "Error message describing what went wrong"
}
```

Common HTTP status codes:
- `400` - Bad Request (missing required parameters)
- `500` - Internal Server Error (processing failed)

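A client can use the error format and status codes above to produce a readable message. The helper below is a sketch under the assumption that error bodies always carry a string `error` field, as documented; `describeApiError` is a hypothetical name.

```javascript
// Hypothetical helper (not from the repository): turn an HTTP status and a
// parsed JSON body into a readable message. Returns null for 2xx responses.
function describeApiError(status, body) {
  if (status >= 200 && status < 300) return null;
  const message = body && typeof body.error === 'string' ? body.error : 'Unknown error';
  let kind = `HTTP ${status}`;
  if (status === 400) kind = 'Bad Request';
  if (status === 500) kind = 'Internal Server Error';
  return `${kind}: ${message}`;
}

console.log(describeApiError(400, { error: 'Error message describing what went wrong' }));
```

With `fetch`, the inputs would typically come from `response.status` and `await response.json()`.
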
---

## Notes

### Output Paths
For endpoints that accept the `outputPath` parameter:
- If not specified, files are saved to the default `OUTPUT_DIR` (./output)
- If specified, files are saved to the custom path provided

### Models
- **Transcription**: Default is `gpt-4o-mini-transcribe` (cost-effective)
- **Summarization**: Default is `gpt-5.1` (latest GPT model)
- **Translation**: Uses `gpt-4o-mini` (hardcoded)

### File Formats
- **Audio**: MP3, WAV, M4A, OGG, FLAC
- **Text**: TXT files
- **Transcription outputs**: TXT, JSON, SRT, VTT (depending on model)

### API Key
Ensure `OPENAI_API_KEY` is set in your `.env` file; transcription, translation, and summarization require it.
2258
public/app.js
File diff suppressed because it is too large
1154
public/index.html
File diff suppressed because it is too large
1432
public/style.css
File diff suppressed because it is too large
2264
src/server.js
File diff suppressed because it is too large
@@ -1,145 +1,145 @@
import { execFile } from 'child_process';
import { promisify } from 'util';
import path from 'path';
import fs from 'fs';

const execFilePromise = promisify(execFile);

/**
 * Convert a video/audio file to MP3 using FFmpeg
 * @param {string} inputPath - Path to input file
 * @param {object} options - Conversion options
 * @param {string} options.outputDir - Output directory (default: same as input)
 * @param {string} options.bitrate - Audio bitrate (default: 192k)
 * @param {string} options.quality - Audio quality 0-9 (default: 2, where 0 is best)
 * @returns {Promise<object>} Conversion result with output path
 */
export async function convertToMP3(inputPath, options = {}) {
  const {
    outputDir = path.dirname(inputPath),
    bitrate = '192k',
    quality = '2',
  } = options;

  // Ensure input file exists
  if (!fs.existsSync(inputPath)) {
    throw new Error(`Input file not found: ${inputPath}`);
  }

  // Generate output path
  const inputFilename = path.basename(inputPath, path.extname(inputPath));
  const outputPath = path.join(outputDir, `${inputFilename}.mp3`);

  // Check if output already exists
  if (fs.existsSync(outputPath)) {
    // Add timestamp to make it unique
    const timestamp = Date.now();
    const uniqueOutputPath = path.join(outputDir, `${inputFilename}_${timestamp}.mp3`);
    return convertToMP3Internal(inputPath, uniqueOutputPath, bitrate, quality);
  }

  return convertToMP3Internal(inputPath, outputPath, bitrate, quality);
}

/**
 * Internal conversion function
 */
async function convertToMP3Internal(inputPath, outputPath, bitrate, quality) {
  try {
    // FFmpeg arguments to convert to MP3. They are passed as an array (no
    // shell) so that quotes or shell metacharacters in file names cannot
    // break or alter the command.
    // -i: input file
    // -vn: no video (audio only)
    // -ar 44100: audio sample rate 44.1kHz
    // -ac 2: stereo
    // -b:a: audio bitrate
    // -q:a: audio quality (VBR)
    // Note: with FFmpeg's libmp3lame encoder, -q:a (VBR) generally takes
    // precedence when both -b:a and -q:a are given.
    const args = [
      '-i', inputPath,
      '-vn',
      '-ar', '44100',
      '-ac', '2',
      '-b:a', String(bitrate),
      '-q:a', String(quality),
      outputPath,
    ];

    console.log(`Converting: ${path.basename(inputPath)} -> ${path.basename(outputPath)}`);

    await execFilePromise('ffmpeg', args);

    // Verify output file was created
    if (!fs.existsSync(outputPath)) {
      throw new Error('Conversion failed: output file not created');
    }

    const stats = fs.statSync(outputPath);

    return {
      success: true,
      inputPath,
      outputPath,
      filename: path.basename(outputPath),
      size: stats.size,
      sizeHuman: formatBytes(stats.size),
    };
  } catch (error) {
    console.error(`Conversion error: ${error.message}`);
    throw new Error(`FFmpeg conversion failed: ${error.message}`);
  }
}

/**
 * Convert multiple files to MP3
 * @param {string[]} inputPaths - Array of input file paths
 * @param {object} options - Conversion options
 * @returns {Promise<object>} Batch conversion results
 */
export async function convertMultipleToMP3(inputPaths, options = {}) {
  const results = [];
  let successCount = 0;
  let failCount = 0;

  for (let i = 0; i < inputPaths.length; i++) {
    const inputPath = inputPaths[i];
    console.log(`[${i + 1}/${inputPaths.length}] Converting: ${path.basename(inputPath)}`);

    try {
      const result = await convertToMP3(inputPath, options);
      results.push({ ...result, index: i });
      successCount++;
    } catch (error) {
      results.push({
        success: false,
        inputPath,
        error: error.message,
        index: i,
      });
      failCount++;
      console.error(`Failed to convert ${path.basename(inputPath)}: ${error.message}`);
    }
  }

  return {
    totalFiles: inputPaths.length,
    successCount,
    failCount,
    results,
  };
}

/**
|
/**
|
||||||
* Format bytes to human readable format
|
* Format bytes to human readable format
|
||||||
*/
|
*/
|
||||||
function formatBytes(bytes, decimals = 2) {
|
function formatBytes(bytes, decimals = 2) {
|
||||||
if (bytes === 0) return '0 Bytes';
|
if (bytes === 0) return '0 Bytes';
|
||||||
|
|
||||||
const k = 1024;
|
const k = 1024;
|
||||||
const dm = decimals < 0 ? 0 : decimals;
|
const dm = decimals < 0 ? 0 : decimals;
|
||||||
const sizes = ['Bytes', 'KB', 'MB', 'GB'];
|
const sizes = ['Bytes', 'KB', 'MB', 'GB'];
|
||||||
|
|
||||||
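const i = Math.max(0, Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1)); // clamp so sub-byte and >= 1 TB values still map to a unit
|
const i = Math.max(0, Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1)); // clamp so sub-byte and >= 1 TB values still map to a unit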
|
||||||
|
|
||||||
return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
|
return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
|
||||||
}
|
}
|
||||||
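A short worked example of the byte formatting, as a self-contained sketch. It assumes a variant of `formatBytes` whose unit index is clamped to the `sizes` array, so inputs of 1 TB or more fall back to the largest listed unit instead of printing `undefined`:

```javascript
// Sketch of formatBytes with a clamped unit index (the clamp is an assumption
// of this example; the unit list is the one used in the function above).
function formatBytes(bytes, decimals = 2) {
  if (bytes === 0) return '0 Bytes';

  const k = 1024;
  const dm = decimals < 0 ? 0 : decimals;
  const sizes = ['Bytes', 'KB', 'MB', 'GB'];

  // Pick the largest unit that fits, but never index outside `sizes`.
  const raw = Math.floor(Math.log(bytes) / Math.log(k));
  const i = Math.max(0, Math.min(raw, sizes.length - 1));

  // parseFloat drops trailing zeros: "1.50" becomes 1.5.
  return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}

console.log(formatBytes(0));               // 0 Bytes
console.log(formatBytes(1536));            // 1.5 KB
console.log(formatBytes(5 * 1024 * 1024)); // 5 MB
console.log(formatBytes(2 * 1024 ** 4));   // 2048 GB
```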
|
|
||||||
/**
|
/**
|
||||||
* Get supported input formats
|
* Get supported input formats
|
||||||
*/
|
*/
|
||||||
export function getSupportedFormats() {
|
export function getSupportedFormats() {
|
||||||
return {
|
return {
|
||||||
video: ['.mp4', '.avi', '.mkv', '.mov', '.wmv', '.flv', '.webm', '.m4v'],
|
video: ['.mp4', '.avi', '.mkv', '.mov', '.wmv', '.flv', '.webm', '.m4v'],
|
||||||
audio: ['.m4a', '.wav', '.flac', '.ogg', '.aac', '.wma', '.opus'],
|
audio: ['.m4a', '.wav', '.flac', '.ogg', '.aac', '.wma', '.opus'],
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|||||||
@ -1,193 +1,195 @@
|
|||||||
import OpenAI from 'openai';
|
import OpenAI from 'openai';
|
||||||
import fs from 'fs';
|
import fs from 'fs';
|
||||||
import path from 'path';
|
import path from 'path';
|
||||||
|
|
||||||
let openai = null;
|
let openai = null;
|
||||||
|
|
||||||
// Max characters per chunk for summarization
|
// Max characters per chunk for summarization
|
||||||
const MAX_CHUNK_CHARS = 30000;
|
const MAX_CHUNK_CHARS = 30000;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get OpenAI client (lazy initialization)
|
* Get OpenAI client (lazy initialization)
|
||||||
*/
|
*/
|
||||||
function getOpenAI() {
|
function getOpenAI() {
|
||||||
if (!openai) {
|
if (!openai) {
|
||||||
if (!process.env.OPENAI_API_KEY) {
|
if (!process.env.OPENAI_API_KEY) {
|
||||||
throw new Error('OPENAI_API_KEY environment variable is not set');
|
throw new Error('OPENAI_API_KEY environment variable is not set');
|
||||||
}
|
}
|
||||||
openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
|
openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
|
||||||
}
|
}
|
||||||
return openai;
|
return openai;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Summarize text using an OpenAI chat model (default: gpt-5.1)
|
* Summarize text using an OpenAI chat model (default: gpt-5.1)
|
||||||
*/
|
*/
|
||||||
export async function summarizeText(text, options = {}) {
|
export async function summarizeText(text, options = {}) {
|
||||||
const {
|
const {
|
||||||
model = 'gpt-5.1', // GPT-5.1 - latest OpenAI model (Nov 2025)
|
model = 'gpt-5.1', // GPT-5.1 - latest OpenAI model (Nov 2025)
|
||||||
language = 'same', // 'same' = same as input, or specify language code
|
language = 'same', // 'same' = same as input, or specify language code
|
||||||
style = 'concise', // 'concise', 'detailed', 'bullet'
|
style = 'concise', // 'concise', 'detailed', 'bullet'
|
||||||
maxLength = null, // optional max length in words
|
maxLength = null, // optional max length in words
|
||||||
} = options;
|
} = options;
|
||||||
|
|
||||||
const client = getOpenAI();
|
const client = getOpenAI();
|
||||||
|
|
||||||
let styleInstruction = '';
|
let styleInstruction = '';
|
||||||
switch (style) {
|
switch (style) {
|
||||||
case 'detailed':
|
case 'detailed':
|
||||||
styleInstruction = 'Provide a detailed summary that captures all important points and nuances.';
|
styleInstruction = 'Provide a detailed summary that captures all important points and nuances.';
|
||||||
break;
|
break;
|
||||||
case 'bullet':
|
case 'bullet':
|
||||||
styleInstruction = 'Provide the summary as bullet points, highlighting the key points.';
|
styleInstruction = 'Provide the summary as bullet points, highlighting the key points.';
|
||||||
break;
|
break;
|
||||||
case 'concise':
|
case 'concise':
|
||||||
default:
|
default:
|
||||||
styleInstruction = 'Provide a concise summary that captures the main points.';
|
styleInstruction = 'Provide a concise summary that captures the main points.';
|
||||||
}
|
}
|
||||||
|
|
||||||
let languageInstruction = '';
|
let languageInstruction = '';
|
||||||
if (language === 'same') {
|
if (language === 'same') {
|
||||||
languageInstruction = 'Write the summary in the same language as the input text.';
|
languageInstruction = 'Write the summary in the same language as the input text.';
|
||||||
} else {
|
} else {
|
||||||
languageInstruction = `Write the summary in ${language}.`;
|
languageInstruction = `Write the summary in ${language}.`;
|
||||||
}
|
}
|
||||||
|
|
||||||
let lengthInstruction = '';
|
let lengthInstruction = '';
|
||||||
if (maxLength) {
|
if (maxLength) {
|
||||||
lengthInstruction = `Keep the summary under ${maxLength} words.`;
|
lengthInstruction = `Keep the summary under ${maxLength} words.`;
|
||||||
}
|
}
|
||||||
|
|
||||||
const systemPrompt = `You are an expert summarizer. ${styleInstruction} ${languageInstruction} ${lengthInstruction}
|
const systemPrompt = `You are an expert summarizer. ${styleInstruction} ${languageInstruction} ${lengthInstruction}
|
||||||
Focus on the most important information and main ideas. Be accurate and objective.`;
|
Focus on the most important information and main ideas. Be accurate and objective.`;
|
||||||
|
|
||||||
// Handle long texts by chunking
|
// Handle long texts by chunking
|
||||||
if (text.length > MAX_CHUNK_CHARS) {
|
if (text.length > MAX_CHUNK_CHARS) {
|
||||||
return await summarizeLongText(text, { model, systemPrompt, style });
|
return await summarizeLongText(text, { model, systemPrompt, style });
|
||||||
}
|
}
|
||||||
|
|
||||||
const response = await client.chat.completions.create({
|
const response = await client.chat.completions.create({
|
||||||
model,
|
model,
|
||||||
messages: [
|
messages: [
|
||||||
{ role: 'system', content: systemPrompt },
|
{ role: 'system', content: systemPrompt },
|
||||||
{ role: 'user', content: `Please summarize the following text:\n\n${text}` },
|
{ role: 'user', content: `Please summarize the following text:\n\n${text}` },
|
||||||
],
|
],
|
||||||
temperature: 0.3,
|
temperature: 0.3,
|
||||||
});
|
});
|
||||||
|
|
||||||
return {
|
return {
|
||||||
summary: response.choices[0].message.content,
|
summary: response.choices[0].message.content,
|
||||||
model,
|
model,
|
||||||
style,
|
style,
|
||||||
inputLength: text.length,
|
inputLength: text.length,
|
||||||
chunks: 1,
|
chunks: 1,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Summarize long text by chunking and combining summaries
|
* Summarize long text by chunking and combining summaries
|
||||||
*/
|
*/
|
||||||
async function summarizeLongText(text, options) {
|
async function summarizeLongText(text, options) {
|
||||||
const { model, systemPrompt, style } = options;
|
const { model, systemPrompt, style } = options;
|
||||||
const client = getOpenAI();
|
const client = getOpenAI();
|
||||||
|
|
||||||
// Split into chunks
|
// Split into chunks
|
||||||
const chunks = [];
|
const chunks = [];
|
||||||
let currentChunk = '';
|
let currentChunk = '';
|
||||||
const sentences = text.split(/(?<=[.!?。!?\n])\s*/);
|
const sentences = text.split(/(?<=[.!?。!?\n])\s*/);
|
||||||
|
|
||||||
for (const sentence of sentences) {
|
for (const sentence of sentences) {
|
||||||
if ((currentChunk + sentence).length > MAX_CHUNK_CHARS && currentChunk) {
|
if ((currentChunk + sentence).length > MAX_CHUNK_CHARS && currentChunk) {
|
||||||
chunks.push(currentChunk.trim());
|
chunks.push(currentChunk.trim());
|
||||||
currentChunk = sentence;
|
currentChunk = sentence;
|
||||||
} else {
|
} else {
|
||||||
currentChunk += ' ' + sentence;
|
currentChunk += ' ' + sentence;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if (currentChunk.trim()) {
|
if (currentChunk.trim()) {
|
||||||
chunks.push(currentChunk.trim());
|
chunks.push(currentChunk.trim());
|
||||||
}
|
}
|
||||||
|
|
||||||
console.log(`Summarizing ${chunks.length} chunks...`);
|
console.log(`Summarizing ${chunks.length} chunks...`);
|
||||||
|
|
||||||
// Summarize each chunk
|
// Summarize each chunk
|
||||||
const chunkSummaries = [];
|
const chunkSummaries = [];
|
||||||
for (let i = 0; i < chunks.length; i++) {
|
for (let i = 0; i < chunks.length; i++) {
|
||||||
console.log(`[${i + 1}/${chunks.length}] Summarizing chunk...`);
|
console.log(`[${i + 1}/${chunks.length}] Summarizing chunk...`);
|
||||||
const response = await client.chat.completions.create({
|
const response = await client.chat.completions.create({
|
||||||
model,
|
model,
|
||||||
messages: [
|
messages: [
|
||||||
{ role: 'system', content: systemPrompt },
|
{ role: 'system', content: systemPrompt },
|
||||||
{ role: 'user', content: `Please summarize the following text (part ${i + 1} of ${chunks.length}):\n\n${chunks[i]}` },
|
{ role: 'user', content: `Please summarize the following text (part ${i + 1} of ${chunks.length}):\n\n${chunks[i]}` },
|
||||||
],
|
],
|
||||||
temperature: 0.3,
|
temperature: 0.3,
|
||||||
});
|
});
|
||||||
chunkSummaries.push(response.choices[0].message.content);
|
chunkSummaries.push(response.choices[0].message.content);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Combine summaries if multiple chunks
|
// Combine summaries if multiple chunks
|
||||||
if (chunkSummaries.length === 1) {
|
if (chunkSummaries.length === 1) {
|
||||||
return {
|
return {
|
||||||
summary: chunkSummaries[0],
|
summary: chunkSummaries[0],
|
||||||
model,
|
model,
|
||||||
style,
|
style,
|
||||||
inputLength: text.length,
|
inputLength: text.length,
|
||||||
chunks: 1,
|
chunks: 1,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create final combined summary
|
// Create final combined summary
|
||||||
const combinedText = chunkSummaries.join('\n\n---\n\n');
|
const combinedText = chunkSummaries.join('\n\n---\n\n');
|
||||||
const finalResponse = await client.chat.completions.create({
|
const finalResponse = await client.chat.completions.create({
|
||||||
model,
|
model,
|
||||||
messages: [
|
messages: [
|
||||||
{ role: 'system', content: `You are an expert summarizer. Combine and synthesize the following partial summaries into a single coherent ${style} summary. Remove redundancy and ensure a smooth flow.` },
|
{ role: 'system', content: `You are an expert summarizer. Combine and synthesize the following partial summaries into a single coherent ${style} summary. Remove redundancy and ensure a smooth flow.` },
|
||||||
{ role: 'user', content: `Please combine these summaries into one:\n\n${combinedText}` },
|
{ role: 'user', content: `Please combine these summaries into one:\n\n${combinedText}` },
|
||||||
],
|
],
|
||||||
temperature: 0.3,
|
temperature: 0.3,
|
||||||
});
|
});
|
||||||
|
|
||||||
return {
|
return {
|
||||||
summary: finalResponse.choices[0].message.content,
|
summary: finalResponse.choices[0].message.content,
|
||||||
model,
|
model,
|
||||||
style,
|
style,
|
||||||
inputLength: text.length,
|
inputLength: text.length,
|
||||||
chunks: chunks.length,
|
chunks: chunks.length,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
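The chunking step in `summarizeLongText` can be looked at on its own. The sketch below groups sentences into chunks of at most `maxChars` characters; `chunkBySentence` is a hypothetical name, the split pattern is reduced to ASCII punctuation for brevity, and unlike the loop above it does not prepend a space to the first sentence of a chunk:

```javascript
// Hypothetical helper: split on sentence boundaries, then pack sentences
// greedily into chunks that stay within maxChars where possible.
function chunkBySentence(text, maxChars) {
  // Lookbehind keeps the punctuation attached to the sentence it ends.
  const sentences = text.split(/(?<=[.!?\n])\s*/);
  const chunks = [];
  let current = '';

  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length > maxChars && current) {
      chunks.push(current.trim()); // close the full chunk, start a new one
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

console.log(chunkBySentence('One. Two. Three.', 10)); // [ 'One. Two.', 'Three.' ]
```

A single sentence longer than `maxChars` is still emitted as one oversized chunk, which matches the behaviour of the loop above.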
|
|
||||||
/**
|
/**
|
||||||
* Summarize a text file
|
* Summarize a text file
|
||||||
*/
|
*/
|
||||||
export async function summarizeFile(filePath, options = {}) {
|
export async function summarizeFile(filePath, options = {}) {
|
||||||
if (!fs.existsSync(filePath)) {
|
if (!fs.existsSync(filePath)) {
|
||||||
throw new Error(`File not found: ${filePath}`);
|
throw new Error(`File not found: ${filePath}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
const text = fs.readFileSync(filePath, 'utf-8');
|
const { outputDir, ...otherOptions } = options;
|
||||||
const result = await summarizeText(text, options);
|
|
||||||
|
const text = fs.readFileSync(filePath, 'utf-8');
|
||||||
// Save summary to file
|
const result = await summarizeText(text, otherOptions);
|
||||||
const dir = path.dirname(filePath);
|
|
||||||
const baseName = path.basename(filePath, path.extname(filePath));
|
// Save summary to file
|
||||||
const summaryPath = path.join(dir, `${baseName}_summary.txt`);
|
const dir = outputDir || path.dirname(filePath);
|
||||||
|
const baseName = path.basename(filePath, path.extname(filePath));
|
||||||
fs.writeFileSync(summaryPath, result.summary, 'utf-8');
|
const summaryPath = path.join(dir, `${baseName}_summary.txt`);
|
||||||
|
|
||||||
return {
|
fs.writeFileSync(summaryPath, result.summary, 'utf-8');
|
||||||
...result,
|
|
||||||
filePath,
|
return {
|
||||||
summaryPath,
|
...result,
|
||||||
};
|
filePath,
|
||||||
}
|
summaryPath,
|
||||||
|
};
|
||||||
/**
|
}
|
||||||
* Get available summary styles
|
|
||||||
*/
|
/**
|
||||||
export function getSummaryStyles() {
|
* Get available summary styles
|
||||||
return {
|
*/
|
||||||
concise: 'A brief summary capturing main points',
|
export function getSummaryStyles() {
|
||||||
detailed: 'A comprehensive summary with nuances',
|
return {
|
||||||
bullet: 'Key points as bullet points',
|
concise: 'A brief summary capturing main points',
|
||||||
};
|
detailed: 'A comprehensive summary with nuances',
|
||||||
}
|
bullet: 'Key points as bullet points',
|
||||||
|
};
|
||||||
|
}
|
||||||
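The change to `summarizeFile` in this hunk separates `outputDir` from the options forwarded to `summarizeText`. A minimal sketch of that separation, assuming a hypothetical helper name `splitSummaryOptions`:

```javascript
// Hypothetical helper: pull outputDir out of the options bag so only
// summarization options (model, language, style, maxLength) are forwarded.
function splitSummaryOptions(options = {}) {
  const { outputDir, ...otherOptions } = options;
  return { outputDir: outputDir || null, otherOptions };
}

const { outputDir, otherOptions } = splitSummaryOptions({
  outputDir: '/tmp/summaries',
  style: 'bullet',
});
console.log(outputDir);    // /tmp/summaries
console.log(otherOptions); // { style: 'bullet' }
```

When `outputDir` is absent, the caller can fall back to the directory of the input file, as the new `summarizeFile` does with `outputDir || path.dirname(filePath)`.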
|
|||||||
@ -1,178 +1,178 @@
|
|||||||
import OpenAI from 'openai';
|
import OpenAI from 'openai';
|
||||||
import fs from 'fs';
|
import fs from 'fs';
|
||||||
import path from 'path';
|
import path from 'path';
|
||||||
|
|
||||||
let openai = null;
|
let openai = null;
|
||||||
|
|
||||||
// Available transcription models
|
// Available transcription models
|
||||||
const MODELS = {
|
const MODELS = {
|
||||||
'gpt-4o-transcribe': {
|
'gpt-4o-transcribe': {
|
||||||
name: 'gpt-4o-transcribe',
|
name: 'gpt-4o-transcribe',
|
||||||
formats: ['json', 'text'],
|
formats: ['json', 'text'],
|
||||||
supportsLanguage: true,
|
supportsLanguage: true,
|
||||||
},
|
},
|
||||||
'gpt-4o-mini-transcribe': {
|
'gpt-4o-mini-transcribe': {
|
||||||
name: 'gpt-4o-mini-transcribe',
|
name: 'gpt-4o-mini-transcribe',
|
||||||
formats: ['json', 'text'],
|
formats: ['json', 'text'],
|
||||||
supportsLanguage: true,
|
supportsLanguage: true,
|
||||||
},
|
},
|
||||||
'whisper-1': {
|
'whisper-1': {
|
||||||
name: 'whisper-1',
|
name: 'whisper-1',
|
||||||
formats: ['json', 'text', 'srt', 'vtt', 'verbose_json'],
|
formats: ['json', 'text', 'srt', 'vtt', 'verbose_json'],
|
||||||
supportsLanguage: true,
|
supportsLanguage: true,
|
||||||
},
|
},
|
||||||
};
|
};
|
||||||
|
|
||||||
const DEFAULT_MODEL = 'gpt-4o-transcribe';
|
const DEFAULT_MODEL = 'gpt-4o-mini-transcribe';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get OpenAI client (lazy initialization)
|
* Get OpenAI client (lazy initialization)
|
||||||
*/
|
*/
|
||||||
function getOpenAI() {
|
function getOpenAI() {
|
||||||
if (!openai) {
|
if (!openai) {
|
||||||
if (!process.env.OPENAI_API_KEY) {
|
if (!process.env.OPENAI_API_KEY) {
|
||||||
throw new Error('OPENAI_API_KEY environment variable is not set');
|
throw new Error('OPENAI_API_KEY environment variable is not set');
|
||||||
}
|
}
|
||||||
openai = new OpenAI({
|
openai = new OpenAI({
|
||||||
apiKey: process.env.OPENAI_API_KEY,
|
apiKey: process.env.OPENAI_API_KEY,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
return openai;
|
return openai;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get available models
|
* Get available models
|
||||||
*/
|
*/
|
||||||
export function getAvailableModels() {
|
export function getAvailableModels() {
|
||||||
return Object.keys(MODELS);
|
return Object.keys(MODELS);
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Transcribe an audio file using OpenAI API
|
* Transcribe an audio file using OpenAI API
|
||||||
* @param {string} filePath - Path to audio file
|
* @param {string} filePath - Path to audio file
|
||||||
* @param {Object} options - Transcription options
|
* @param {Object} options - Transcription options
|
||||||
* @param {string} options.language - Language code (e.g., 'en', 'fr', 'es', 'zh')
|
* @param {string} options.language - Language code (e.g., 'en', 'fr', 'es', 'zh')
|
||||||
* @param {string} options.responseFormat - Output format: 'json' or 'text' (all models), or 'srt'/'vtt'/'verbose_json' (whisper-1 only)
|
* @param {string} options.responseFormat - Output format: 'json' or 'text' (all models), or 'srt'/'vtt'/'verbose_json' (whisper-1 only)
|
||||||
* @param {string} options.prompt - Optional context prompt for better accuracy
|
* @param {string} options.prompt - Optional context prompt for better accuracy
|
||||||
* @param {string} options.model - Model to use (default: gpt-4o-transcribe)
|
* @param {string} options.model - Model to use (default: gpt-4o-mini-transcribe)
|
||||||
*/
|
*/
|
||||||
export async function transcribeFile(filePath, options = {}) {
|
export async function transcribeFile(filePath, options = {}) {
|
||||||
const {
|
const {
|
||||||
language = null, // Auto-detect if null
|
language = null, // Auto-detect if null
|
||||||
responseFormat = 'text', // json or text for gpt-4o models
|
responseFormat = 'text', // json or text for gpt-4o models
|
||||||
prompt = null, // Optional context prompt
|
prompt = null, // Optional context prompt
|
||||||
model = DEFAULT_MODEL,
|
model = DEFAULT_MODEL,
|
||||||
} = options;
|
} = options;
|
||||||
|
|
||||||
if (!fs.existsSync(filePath)) {
|
if (!fs.existsSync(filePath)) {
|
||||||
throw new Error(`File not found: ${filePath}`);
|
throw new Error(`File not found: ${filePath}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
const modelConfig = MODELS[model] || MODELS[DEFAULT_MODEL];
|
const modelConfig = MODELS[model] || MODELS[DEFAULT_MODEL];
|
||||||
const actualModel = modelConfig.name;
|
const actualModel = modelConfig.name;
|
||||||
|
|
||||||
// Validate response format for model
|
// Validate response format for model
|
||||||
let actualFormat = responseFormat;
|
let actualFormat = responseFormat;
|
||||||
if (!modelConfig.formats.includes(responseFormat)) {
|
if (!modelConfig.formats.includes(responseFormat)) {
|
||||||
console.warn(`Format '${responseFormat}' not supported by ${actualModel}, using 'text'`);
|
console.warn(`Format '${responseFormat}' not supported by ${actualModel}, using 'text'`);
|
||||||
actualFormat = 'text';
|
actualFormat = 'text';
|
||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const transcriptionOptions = {
|
const transcriptionOptions = {
|
||||||
file: fs.createReadStream(filePath),
|
file: fs.createReadStream(filePath),
|
||||||
model: actualModel,
|
model: actualModel,
|
||||||
response_format: actualFormat,
|
response_format: actualFormat,
|
||||||
};
|
};
|
||||||
|
|
||||||
if (language) {
|
if (language) {
|
||||||
transcriptionOptions.language = language;
|
transcriptionOptions.language = language;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (prompt) {
|
if (prompt) {
|
||||||
transcriptionOptions.prompt = prompt;
|
transcriptionOptions.prompt = prompt;
|
||||||
}
|
}
|
||||||
|
|
||||||
console.log(`Using model: ${actualModel}, format: ${actualFormat}${language ? `, language: ${language}` : ''}`);
|
console.log(`Using model: ${actualModel}, format: ${actualFormat}${language ? `, language: ${language}` : ''}`);
|
||||||
|
|
||||||
const transcription = await getOpenAI().audio.transcriptions.create(transcriptionOptions);
|
const transcription = await getOpenAI().audio.transcriptions.create(transcriptionOptions);
|
||||||
|
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
filePath,
|
filePath,
|
||||||
text: actualFormat === 'json' || actualFormat === 'verbose_json'
|
text: actualFormat === 'json' || actualFormat === 'verbose_json'
|
||||||
? transcription.text
|
? transcription.text
|
||||||
: transcription,
|
: transcription,
|
||||||
format: actualFormat,
|
format: actualFormat,
|
||||||
model: actualModel,
|
model: actualModel,
|
||||||
};
|
};
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
throw new Error(`Transcription failed: ${error.message}`);
|
throw new Error(`Transcription failed: ${error.message}`);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
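Before calling the API, `transcribeFile` resolves the model and falls back to `'text'` when the requested format is not supported by that model. A sketch of that resolution step in isolation; `resolveTranscriptionRequest` is a hypothetical name, and the table is a trimmed copy of `MODELS` using the new default from this diff:

```javascript
// Trimmed copy of the MODELS table above (only the fields used here).
const MODELS = {
  'gpt-4o-transcribe': { name: 'gpt-4o-transcribe', formats: ['json', 'text'] },
  'gpt-4o-mini-transcribe': { name: 'gpt-4o-mini-transcribe', formats: ['json', 'text'] },
  'whisper-1': { name: 'whisper-1', formats: ['json', 'text', 'srt', 'vtt', 'verbose_json'] },
};
const DEFAULT_MODEL = 'gpt-4o-mini-transcribe';

// Hypothetical helper: unknown models fall back to the default,
// unsupported formats fall back to 'text'.
function resolveTranscriptionRequest(model, responseFormat = 'text') {
  const config = MODELS[model] || MODELS[DEFAULT_MODEL];
  const format = config.formats.includes(responseFormat) ? responseFormat : 'text';
  return { model: config.name, format };
}

console.log(resolveTranscriptionRequest('whisper-1', 'srt'));
// { model: 'whisper-1', format: 'srt' }
console.log(resolveTranscriptionRequest('gpt-4o-transcribe', 'srt'));
// { model: 'gpt-4o-transcribe', format: 'text' }
```

One consequence worth noting: a caller who asks for subtitles (`srt` or `vtt`) with a gpt-4o model gets plain text back with only a console warning, so subtitle output needs `model: 'whisper-1'` to be passed explicitly.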
|
|
||||||
/**
|
/**
|
||||||
* Transcribe and save to file
|
* Transcribe and save to file
|
||||||
*/
|
*/
|
||||||
export async function transcribeAndSave(filePath, options = {}) {
|
export async function transcribeAndSave(filePath, options = {}) {
|
||||||
const { outputFormat = 'txt', outputDir = null } = options;
|
const { outputFormat = 'txt', outputDir = null } = options;
|
||||||
|
|
||||||
const result = await transcribeFile(filePath, options);
|
const result = await transcribeFile(filePath, options);
|
||||||
|
|
||||||
// Determine output path
|
// Determine output path
|
||||||
const baseName = path.basename(filePath, path.extname(filePath));
|
const baseName = path.basename(filePath, path.extname(filePath));
|
||||||
const outputPath = path.join(
|
const outputPath = path.join(
|
||||||
outputDir || path.dirname(filePath),
|
outputDir || path.dirname(filePath),
|
||||||
`${baseName}.${outputFormat}`
|
`${baseName}.${outputFormat}`
|
||||||
);
|
);
|
||||||
|
|
||||||
// Save transcription
|
// Save transcription
|
||||||
fs.writeFileSync(outputPath, result.text, 'utf-8');
|
fs.writeFileSync(outputPath, result.text, 'utf-8');
|
||||||
|
|
||||||
return {
|
return {
|
||||||
...result,
|
...result,
|
||||||
transcriptionPath: outputPath,
|
transcriptionPath: outputPath,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Transcribe multiple files
|
* Transcribe multiple files
|
||||||
*/
|
*/
|
||||||
export async function transcribeMultiple(filePaths, options = {}) {
|
export async function transcribeMultiple(filePaths, options = {}) {
|
||||||
const { onProgress, onFileComplete } = options;
|
const { onProgress, onFileComplete } = options;
|
||||||
const results = [];
|
const results = [];
|
||||||
|
|
||||||
for (let i = 0; i < filePaths.length; i++) {
|
for (let i = 0; i < filePaths.length; i++) {
|
||||||
const filePath = filePaths[i];
|
const filePath = filePaths[i];
|
||||||
|
|
||||||
if (onProgress) {
|
if (onProgress) {
|
||||||
onProgress({ current: i + 1, total: filePaths.length, filePath });
|
onProgress({ current: i + 1, total: filePaths.length, filePath });
|
||||||
}
|
}
|
||||||
|
|
||||||
console.log(`[${i + 1}/${filePaths.length}] Transcribing: ${path.basename(filePath)}`);
|
console.log(`[${i + 1}/${filePaths.length}] Transcribing: ${path.basename(filePath)}`);
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = await transcribeAndSave(filePath, options);
|
const result = await transcribeAndSave(filePath, options);
|
||||||
results.push(result);
|
results.push(result);
|
||||||
|
|
||||||
if (onFileComplete) {
|
if (onFileComplete) {
|
||||||
onFileComplete(result);
|
onFileComplete(result);
|
||||||
}
|
}
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
console.error(`Failed to transcribe ${filePath}: ${error.message}`);
|
console.error(`Failed to transcribe ${filePath}: ${error.message}`);
|
||||||
results.push({
|
results.push({
|
||||||
success: false,
|
success: false,
|
||||||
filePath,
|
filePath,
|
||||||
error: error.message,
|
error: error.message,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
results,
|
results,
|
||||||
totalFiles: filePaths.length,
|
totalFiles: filePaths.length,
|
||||||
successCount: results.filter(r => r.success).length,
|
successCount: results.filter(r => r.success).length,
|
||||||
failCount: results.filter(r => !r.success).length,
|
failCount: results.filter(r => !r.success).length,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|||||||
@ -1,270 +1,271 @@
|
|||||||
import OpenAI from 'openai';
|
import OpenAI from 'openai';
|
||||||
import fs from 'fs';
|
import fs from 'fs';
|
||||||
import path from 'path';
|
import path from 'path';
|
||||||
|
|
||||||
let openai = null;
|
let openai = null;
|
||||||
|
|
||||||
// Max characters per chunk (~6000 tokens ≈ 24000 characters for most languages)
|
// Max characters per chunk (~6000 tokens ≈ 24000 characters for most languages)
|
||||||
const MAX_CHUNK_CHARS = 20000;
|
const MAX_CHUNK_CHARS = 20000;
|
||||||
|
|
||||||
const LANGUAGES = {
|
const LANGUAGES = {
|
||||||
en: 'English',
|
en: 'English',
|
||||||
fr: 'French',
|
fr: 'French',
|
||||||
es: 'Spanish',
|
es: 'Spanish',
|
||||||
de: 'German',
|
de: 'German',
|
||||||
it: 'Italian',
|
it: 'Italian',
|
||||||
pt: 'Portuguese',
|
pt: 'Portuguese',
|
||||||
zh: 'Chinese',
|
zh: 'Chinese',
|
||||||
ja: 'Japanese',
|
ja: 'Japanese',
|
||||||
ko: 'Korean',
|
ko: 'Korean',
|
||||||
ru: 'Russian',
|
ru: 'Russian',
|
||||||
ar: 'Arabic',
|
ar: 'Arabic',
|
||||||
hi: 'Hindi',
|
hi: 'Hindi',
|
||||||
nl: 'Dutch',
|
nl: 'Dutch',
|
||||||
pl: 'Polish',
|
pl: 'Polish',
|
||||||
tr: 'Turkish',
|
tr: 'Turkish',
|
||||||
vi: 'Vietnamese',
|
vi: 'Vietnamese',
|
||||||
th: 'Thai',
|
th: 'Thai',
|
||||||
sv: 'Swedish',
|
sv: 'Swedish',
|
||||||
da: 'Danish',
|
da: 'Danish',
|
||||||
fi: 'Finnish',
|
fi: 'Finnish',
|
||||||
no: 'Norwegian',
|
no: 'Norwegian',
|
||||||
cs: 'Czech',
|
cs: 'Czech',
|
||||||
el: 'Greek',
|
el: 'Greek',
|
||||||
he: 'Hebrew',
|
he: 'Hebrew',
|
||||||
id: 'Indonesian',
|
id: 'Indonesian',
|
||||||
ms: 'Malay',
|
ms: 'Malay',
|
||||||
ro: 'Romanian',
|
ro: 'Romanian',
|
||||||
uk: 'Ukrainian',
|
uk: 'Ukrainian',
|
||||||
};
|
};
|
||||||
|
|
||||||
// Sentence ending patterns for different languages
|
// Sentence ending patterns for different languages
|
||||||
const SENTENCE_ENDINGS = /[.!?。!?。\n]/g;
|
const SENTENCE_ENDINGS = /[.!?。!?。\n]/g;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get OpenAI client (lazy initialization)
|
* Get OpenAI client (lazy initialization)
|
||||||
*/
|
*/
|
||||||
function getOpenAI() {
|
function getOpenAI() {
|
||||||
if (!openai) {
|
if (!openai) {
|
||||||
if (!process.env.OPENAI_API_KEY) {
|
if (!process.env.OPENAI_API_KEY) {
|
||||||
throw new Error('OPENAI_API_KEY environment variable is not set');
|
throw new Error('OPENAI_API_KEY environment variable is not set');
|
||||||
}
|
}
|
||||||
openai = new OpenAI({
|
openai = new OpenAI({
|
||||||
apiKey: process.env.OPENAI_API_KEY,
|
apiKey: process.env.OPENAI_API_KEY,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
return openai;
|
return openai;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Split text into chunks at sentence boundaries
|
* Split text into chunks at sentence boundaries
|
||||||
* @param {string} text - Text to split
|
* @param {string} text - Text to split
|
||||||
* @param {number} maxChars - Maximum characters per chunk
|
* @param {number} maxChars - Maximum characters per chunk
|
||||||
* @returns {string[]} Array of text chunks
|
* @returns {string[]} Array of text chunks
|
||||||
*/
|
*/
|
||||||
function splitIntoChunks(text, maxChars = MAX_CHUNK_CHARS) {
|
function splitIntoChunks(text, maxChars = MAX_CHUNK_CHARS) {
|
||||||
if (text.length <= maxChars) {
|
if (text.length <= maxChars) {
|
||||||
return [text];
|
return [text];
|
||||||
}
|
}
|
||||||
|
|
||||||
const chunks = [];
|
const chunks = [];
|
||||||
let currentPos = 0;
|
let currentPos = 0;
|
||||||
|
|
||||||
while (currentPos < text.length) {
|
while (currentPos < text.length) {
|
||||||
let endPos = currentPos + maxChars;
|
let endPos = currentPos + maxChars;
|
||||||
|
|
||||||
// If we're at the end, just take the rest
|
// If we're at the end, just take the rest
|
||||||
if (endPos >= text.length) {
|
if (endPos >= text.length) {
|
||||||
chunks.push(text.slice(currentPos));
|
chunks.push(text.slice(currentPos));
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Find the last sentence ending before maxChars
|
// Find the last sentence ending before maxChars
|
||||||
const searchText = text.slice(currentPos, endPos);
|
const searchText = text.slice(currentPos, endPos);
|
||||||
let lastSentenceEnd = -1;
|
let lastSentenceEnd = -1;
|
||||||
|
|
||||||
// Find all sentence endings in the search range
|
// Find all sentence endings in the search range
|
||||||
let match;
|
let match;
|
||||||
SENTENCE_ENDINGS.lastIndex = 0;
|
SENTENCE_ENDINGS.lastIndex = 0;
|
||||||
while ((match = SENTENCE_ENDINGS.exec(searchText)) !== null) {
|
while ((match = SENTENCE_ENDINGS.exec(searchText)) !== null) {
|
||||||
lastSentenceEnd = match.index + 1; // Include the punctuation
|
lastSentenceEnd = match.index + 1; // Include the punctuation
|
||||||
}
|
}
|
||||||
|
|
||||||
// If we found a sentence ending, cut there
|
// If we found a sentence ending, cut there
|
||||||
// Otherwise, look for the next sentence ending after maxChars (up to 20% more)
|
// Otherwise, look for the next sentence ending after maxChars (up to 20% more)
|
||||||
if (lastSentenceEnd > maxChars * 0.5) {
|
if (lastSentenceEnd > maxChars * 0.5) {
|
||||||
endPos = currentPos + lastSentenceEnd;
|
endPos = currentPos + lastSentenceEnd;
|
||||||
} else {
|
} else {
|
||||||
// Look forward for a sentence ending (up to 20% more characters)
|
// Look forward for a sentence ending (up to 20% more characters)
|
||||||
const extendedSearch = text.slice(endPos, endPos + maxChars * 0.2);
|
const extendedSearch = text.slice(endPos, endPos + maxChars * 0.2);
|
||||||
SENTENCE_ENDINGS.lastIndex = 0;
|
SENTENCE_ENDINGS.lastIndex = 0;
|
||||||
const forwardMatch = SENTENCE_ENDINGS.exec(extendedSearch);
|
const forwardMatch = SENTENCE_ENDINGS.exec(extendedSearch);
|
||||||
if (forwardMatch) {
|
if (forwardMatch) {
|
||||||
endPos = endPos + forwardMatch.index + 1;
|
endPos = endPos + forwardMatch.index + 1;
|
||||||
}
|
}
|
||||||
// If still no sentence ending found, just cut at maxChars
|
// If still no sentence ending found, just cut at maxChars
|
||||||
}
|
}
|
||||||
|
|
||||||
chunks.push(text.slice(currentPos, endPos).trim());
|
chunks.push(text.slice(currentPos, endPos).trim());
|
||||||
currentPos = endPos;
|
currentPos = endPos;
|
||||||
|
|
||||||
// Skip any leading whitespace for the next chunk
|
// Skip any leading whitespace for the next chunk
|
||||||
while (currentPos < text.length && /\s/.test(text[currentPos])) {
|
while (currentPos < text.length && /\s/.test(text[currentPos])) {
|
||||||
currentPos++;
|
currentPos++;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return chunks.filter(chunk => chunk.length > 0);
|
return chunks.filter(chunk => chunk.length > 0);
|
||||||
}
|
}
|
||||||
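`splitIntoChunks` relies on a global (`/g`) regular expression, which keeps its scan position in `lastIndex` between `exec` calls. That is why the function resets `lastIndex` to 0 before each scan. A small sketch of the "last sentence ending in a window" step, with a hypothetical helper name and an ASCII-only pattern:

```javascript
// ASCII-only stand-in for SENTENCE_ENDINGS; the /g flag makes exec() stateful.
const SENTENCE_END = /[.!?\n]/g;

// Hypothetical helper: returns the offset just past the last sentence ending
// in windowText, or -1 when the window contains none.
function lastSentenceEnd(windowText) {
  SENTENCE_END.lastIndex = 0; // always restart the scan from the beginning
  let last = -1;
  let match;
  while ((match = SENTENCE_END.exec(windowText)) !== null) {
    last = match.index + 1; // include the punctuation character itself
  }
  return last;
}

console.log(lastSentenceEnd('Hi. Bye! ok')); // 8
console.log(lastSentenceEnd('no ending'));   // -1
```

In the function above this offset is only used as a cut point when it lies past half of `maxChars`; otherwise the code looks forward up to 20% beyond the limit and finally cuts at `maxChars`.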
|
|
||||||
/**
|
/**
|
||||||
* Get available languages
|
* Get available languages
|
||||||
*/
|
*/
|
||||||
export function getLanguages() {
|
export function getLanguages() {
|
||||||
return LANGUAGES;
|
return LANGUAGES;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Translate a single chunk of text
|
* Translate a single chunk of text
|
||||||
*/
|
*/
|
||||||
async function translateChunk(text, targetLanguage, sourceLanguage) {
|
async function translateChunk(text, targetLanguage, sourceLanguage) {
|
||||||
const prompt = sourceLanguage
|
const prompt = sourceLanguage
|
||||||
? `Translate the following text from ${sourceLanguage} to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`
|
? `Translate the following text from ${sourceLanguage} to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`
|
||||||
: `Translate the following text to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`;
|
: `Translate the following text to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`;
|
||||||
|
|
||||||
const response = await getOpenAI().chat.completions.create({
|
const response = await getOpenAI().chat.completions.create({
|
||||||
model: 'gpt-4o-mini',
|
model: 'gpt-4o-mini',
|
||||||
max_tokens: 16384,
|
max_tokens: 16384,
|
||||||
messages: [
|
messages: [
|
||||||
{
|
{
|
||||||
role: 'user',
|
role: 'user',
|
||||||
content: prompt,
|
content: prompt,
|
||||||
},
|
},
|
||||||
],
|
],
|
||||||
});
|
});
|
||||||
|
|
||||||
return response.choices[0].message.content;
|
return response.choices[0].message.content;
|
||||||
}
|
}
|
||||||
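The prompt construction in `translateChunk` can be checked without calling the API by pulling it into a pure function. The helper name `buildTranslationPrompt` is hypothetical; the strings it produces are the same two templates used above.

```javascript
// Hypothetical helper: builds the same prompt text as translateChunk, with no network call.
function buildTranslationPrompt(text, targetLanguage, sourceLanguage = null) {
  // With a known source language the prompt names both ends; otherwise only the target.
  const direction = sourceLanguage
    ? `from ${sourceLanguage} to ${targetLanguage}`
    : `to ${targetLanguage}`;
  return `Translate the following text ${direction}. Only output the translation, nothing else:\n\n${text}`;
}

console.log(buildTranslationPrompt('Bonjour', 'English', 'French'));
console.log(buildTranslationPrompt('Bonjour', 'English'));
```

Keeping the prompt builder separate makes it possible to unit-test wording changes without an `OPENAI_API_KEY`.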
|
|
||||||
/**
|
/**
|
||||||
* Translate text using GPT-4o-mini with chunking for long texts
|
* Translate text using GPT-4o-mini with chunking for long texts
|
||||||
* @param {string} text - Text to translate
|
* @param {string} text - Text to translate
|
||||||
* @param {string} targetLang - Target language code (e.g., 'en', 'fr')
|
* @param {string} targetLang - Target language code (e.g., 'en', 'fr')
|
||||||
* @param {string} sourceLang - Source language code (optional, auto-detect if null)
|
* @param {string} sourceLang - Source language code (optional, auto-detect if null)
|
||||||
*/
|
*/
|
||||||
export async function translateText(text, targetLang, sourceLang = null) {
|
export async function translateText(text, targetLang, sourceLang = null) {
|
||||||
if (!text || !text.trim()) {
|
if (!text || !text.trim()) {
|
||||||
throw new Error('No text provided for translation');
|
throw new Error('No text provided for translation');
|
||||||
}
|
}
|
||||||
|
|
||||||
const targetLanguage = LANGUAGES[targetLang] || targetLang;
|
const targetLanguage = LANGUAGES[targetLang] || targetLang;
|
||||||
const sourceLanguage = sourceLang ? (LANGUAGES[sourceLang] || sourceLang) : null;
|
const sourceLanguage = sourceLang ? (LANGUAGES[sourceLang] || sourceLang) : null;
|
||||||
|
|
||||||
try {
|
try {
|
||||||
// Split text into chunks
|
// Split text into chunks
|
||||||
const chunks = splitIntoChunks(text);
|
const chunks = splitIntoChunks(text);
|
||||||
|
|
||||||
if (chunks.length === 1) {
|
if (chunks.length === 1) {
|
||||||
// Single chunk - translate directly
|
// Single chunk - translate directly
|
||||||
const translation = await translateChunk(text, targetLanguage, sourceLanguage);
|
const translation = await translateChunk(text, targetLanguage, sourceLanguage);
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
originalText: text,
|
originalText: text,
|
||||||
translatedText: translation,
|
translatedText: translation,
|
||||||
targetLanguage: targetLanguage,
|
targetLanguage: targetLanguage,
|
||||||
sourceLanguage: sourceLanguage || 'auto-detected',
|
sourceLanguage: sourceLanguage || 'auto-detected',
|
||||||
chunks: 1,
|
chunks: 1,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
// Multiple chunks - translate each and combine
|
// Multiple chunks - translate each and combine
|
||||||
console.log(`Splitting text into ${chunks.length} chunks for translation...`);
|
console.log(`Splitting text into ${chunks.length} chunks for translation...`);
|
||||||
const translations = [];
|
const translations = [];
|
||||||
|
|
||||||
for (let i = 0; i < chunks.length; i++) {
|
for (let i = 0; i < chunks.length; i++) {
|
||||||
console.log(` Translating chunk ${i + 1}/${chunks.length} (${chunks[i].length} chars)...`);
|
console.log(` Translating chunk ${i + 1}/${chunks.length} (${chunks[i].length} chars)...`);
|
||||||
const translation = await translateChunk(chunks[i], targetLanguage, sourceLanguage);
|
const translation = await translateChunk(chunks[i], targetLanguage, sourceLanguage);
|
||||||
translations.push(translation);
|
translations.push(translation);
|
||||||
}
|
}
|
||||||
|
|
||||||
const combinedTranslation = translations.join('\n\n');
|
const combinedTranslation = translations.join('\n\n');
|
||||||
|
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
originalText: text,
|
originalText: text,
|
||||||
translatedText: combinedTranslation,
|
translatedText: combinedTranslation,
|
||||||
targetLanguage: targetLanguage,
|
targetLanguage: targetLanguage,
|
||||||
sourceLanguage: sourceLanguage || 'auto-detected',
|
sourceLanguage: sourceLanguage || 'auto-detected',
|
||||||
chunks: chunks.length,
|
chunks: chunks.length,
|
||||||
};
|
};
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
throw new Error(`Translation failed: ${error.message}`);
|
throw new Error(`Translation failed: ${error.message}`);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
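The multi-chunk path of `translateText` translates chunks one at a time and joins the results with a blank line. A minimal sketch of that flow follows, with the translator injected so it can run without the OpenAI client; `translateInChunks` and `fakeTranslate` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper mirroring the loop above; translateOne stands in for translateChunk.
async function translateInChunks(chunks, translateOne) {
  const translations = [];
  for (const chunk of chunks) {
    // Sequential on purpose: one request in flight at a time, output order preserved.
    translations.push(await translateOne(chunk));
  }
  // Same join as the hunk above: chunks are separated by a blank line.
  return { translatedText: translations.join('\n\n'), chunks: chunks.length };
}

// Usage with a stand-in translator (no network call).
const fakeTranslate = async (chunk) => chunk.toUpperCase();
translateInChunks(['first part', 'second part'], fakeTranslate)
  .then((result) => console.log(result));
```

Because the join always inserts `\n\n`, the combined translation can contain paragraph breaks that were not in the original text, depending on where the splitter cut.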
|
|
||||||
/**
|
/**
|
||||||
* Translate a text file
|
* Translate a text file
|
||||||
* @param {string} filePath - Path to text file
|
* @param {string} filePath - Path to text file
|
||||||
* @param {string} targetLang - Target language code
|
* @param {string} targetLang - Target language code
|
||||||
* @param {string} sourceLang - Source language code (optional)
|
* @param {string} sourceLang - Source language code (optional)
|
||||||
*/
|
* @param {string} outputDir - Output directory (optional)
|
||||||
export async function translateFile(filePath, targetLang, sourceLang = null) {
|
*/
|
||||||
if (!fs.existsSync(filePath)) {
|
export async function translateFile(filePath, targetLang, sourceLang = null, outputDir = null) {
|
||||||
throw new Error(`File not found: ${filePath}`);
|
if (!fs.existsSync(filePath)) {
|
||||||
}
|
throw new Error(`File not found: ${filePath}`);
|
||||||
|
}
|
||||||
const text = fs.readFileSync(filePath, 'utf-8');
|
|
||||||
const result = await translateText(text, targetLang, sourceLang);
|
const text = fs.readFileSync(filePath, 'utf-8');
|
||||||
|
const result = await translateText(text, targetLang, sourceLang);
|
||||||
// Save translation
|
|
||||||
const baseName = path.basename(filePath, path.extname(filePath));
|
// Save translation
|
||||||
const outputPath = path.join(
|
const baseName = path.basename(filePath, path.extname(filePath));
|
||||||
path.dirname(filePath),
|
const outputPath = path.join(
|
||||||
`${baseName}_${targetLang}.txt`
|
outputDir || path.dirname(filePath),
|
||||||
);
|
`${baseName}_${targetLang}.txt`
|
||||||
|
);
|
||||||
fs.writeFileSync(outputPath, result.translatedText, 'utf-8');
|
|
||||||
|
fs.writeFileSync(outputPath, result.translatedText, 'utf-8');
|
||||||
return {
|
|
||||||
...result,
|
return {
|
||||||
originalPath: filePath,
|
...result,
|
||||||
translationPath: outputPath,
|
originalPath: filePath,
|
||||||
};
|
translationPath: outputPath,
|
||||||
}
|
};
|
||||||
|
}
|
||||||
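The only behavioral change to `translateFile` in this hunk is where the output file lands: `outputDir` when given, otherwise the source file's directory. The sketch below isolates that path computation. `translationOutputPath` is a hypothetical name, and `require` is used only so the snippet runs standalone as a CommonJS script (the repository's files use `import path from 'path'`).

```javascript
const path = require('node:path');

// Hypothetical helper: the same path expression as the new translateFile.
function translationOutputPath(filePath, targetLang, outputDir = null) {
  const baseName = path.basename(filePath, path.extname(filePath));
  return path.join(outputDir || path.dirname(filePath), `${baseName}_${targetLang}.txt`);
}

console.log(translationOutputPath('/data/talk.txt', 'fr'));         // next to the source file
console.log(translationOutputPath('/data/talk.txt', 'fr', '/out')); // inside the given directory
```

The function shown in the hunk does not create `outputDir`. If the directory is missing, `fs.writeFileSync` would fail, so callers presumably ensure it exists beforehand.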
/**
|
|
||||||
* Translate multiple files
|
/**
|
||||||
*/
|
* Translate multiple files
|
||||||
export async function translateMultiple(filePaths, targetLang, sourceLang = null, onProgress = null) {
|
*/
|
||||||
const results = [];
|
export async function translateMultiple(filePaths, targetLang, sourceLang = null, outputDir = null, onProgress = null) {
|
||||||
|
const results = [];
|
||||||
for (let i = 0; i < filePaths.length; i++) {
|
|
||||||
const filePath = filePaths[i];
|
for (let i = 0; i < filePaths.length; i++) {
|
||||||
|
const filePath = filePaths[i];
|
||||||
if (onProgress) {
|
|
||||||
onProgress({ current: i + 1, total: filePaths.length, filePath });
|
if (onProgress) {
|
||||||
}
|
onProgress({ current: i + 1, total: filePaths.length, filePath });
|
||||||
|
}
|
||||||
console.log(`[${i + 1}/${filePaths.length}] Translating: ${path.basename(filePath)}`);
|
|
||||||
|
console.log(`[${i + 1}/${filePaths.length}] Translating: ${path.basename(filePath)}`);
|
||||||
try {
|
|
||||||
const result = await translateFile(filePath, targetLang, sourceLang);
|
try {
|
||||||
results.push(result);
|
const result = await translateFile(filePath, targetLang, sourceLang, outputDir);
|
||||||
} catch (error) {
|
results.push(result);
|
||||||
console.error(`Failed to translate ${filePath}: ${error.message}`);
|
} catch (error) {
|
||||||
results.push({
|
console.error(`Failed to translate ${filePath}: ${error.message}`);
|
||||||
success: false,
|
results.push({
|
||||||
originalPath: filePath,
|
success: false,
|
||||||
error: error.message,
|
originalPath: filePath,
|
||||||
});
|
error: error.message,
|
||||||
}
|
});
|
||||||
}
|
}
|
||||||
|
}
|
||||||
return {
|
|
||||||
success: true,
|
return {
|
||||||
results,
|
success: true,
|
||||||
totalFiles: filePaths.length,
|
results,
|
||||||
successCount: results.filter(r => r.success).length,
|
totalFiles: filePaths.length,
|
||||||
failCount: results.filter(r => !r.success).length,
|
successCount: results.filter(r => r.success).length,
|
||||||
};
|
failCount: results.filter(r => !r.success).length,
|
||||||
}
|
};
|
||||||
|
}
|
||||||
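The new `translateMultiple` signature inserts `outputDir` before `onProgress`. A caller written against the old order, `(filePaths, targetLang, sourceLang, onProgress)`, would now pass its callback as `outputDir`. One defensive option, sketched here under the assumption that such callers exist, is to detect a function in the `outputDir` slot. `normalizeBatchArgs` is a hypothetical helper and is not part of the commit.

```javascript
// Hypothetical helper: accepts both (outputDir, onProgress) and the old (onProgress) order.
function normalizeBatchArgs(outputDir = null, onProgress = null) {
  if (typeof outputDir === 'function') {
    // Old call order: the fourth positional argument was the progress callback.
    return { outputDir: null, onProgress: outputDir };
  }
  return { outputDir, onProgress };
}

const report = ({ current, total }) => console.log(`${current}/${total}`);
console.log(normalizeBatchArgs(report).outputDir);          // null
console.log(normalizeBatchArgs('./out', report).outputDir); // ./out
```

An options object would avoid the positional ambiguity altogether, but that changes the public signature and is a separate design decision.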
|
|||||||
@ -1,291 +1,291 @@
|
|||||||
import { createRequire } from 'module';
|
import { createRequire } from 'module';
|
||||||
import path from 'path';
|
import path from 'path';
|
||||||
import fs from 'fs';
|
import fs from 'fs';
|
||||||
import { spawn } from 'child_process';
|
import { spawn } from 'child_process';
|
||||||
|
|
||||||
// Use the yt-dlp binary named by YTDLP_PATH, falling back to `yt-dlp` on the system PATH
|
// Use the yt-dlp binary named by YTDLP_PATH, falling back to `yt-dlp` on the system PATH
|
||||||
const YTDLP_PATH = process.env.YTDLP_PATH || 'yt-dlp';
|
const YTDLP_PATH = process.env.YTDLP_PATH || 'yt-dlp';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
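* Run yt-dlp and resolve with its stdout parsed as JSON (or the raw stdout if it is not valid JSON)
|
* Run yt-dlp and resolve with its stdout parsed as JSON (or the raw stdout if it is not valid JSON)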
|
||||||
*/
|
*/
|
||||||
async function ytdlp(url, args = []) {
|
async function ytdlp(url, args = []) {
|
||||||
return new Promise((resolve, reject) => {
|
return new Promise((resolve, reject) => {
|
||||||
const proc = spawn(YTDLP_PATH, [...args, url]);
|
const proc = spawn(YTDLP_PATH, [...args, url]);
|
||||||
let stdout = '';
|
let stdout = '';
|
||||||
let stderr = '';
|
let stderr = '';
|
||||||
|
|
||||||
proc.stdout.on('data', (data) => { stdout += data; });
|
proc.stdout.on('data', (data) => { stdout += data; });
|
||||||
proc.stderr.on('data', (data) => { stderr += data; });
|
proc.stderr.on('data', (data) => { stderr += data; });
|
||||||
|
|
||||||
proc.on('close', (code) => {
|
proc.on('close', (code) => {
|
||||||
if (code === 0) {
|
if (code === 0) {
|
||||||
try {
|
try {
|
||||||
resolve(JSON.parse(stdout));
|
resolve(JSON.parse(stdout));
|
||||||
} catch {
|
} catch {
|
||||||
resolve(stdout);
|
resolve(stdout);
|
||||||
}
|
}
|
||||||
} else {
|
} else {
|
||||||
reject(new Error(stderr || `yt-dlp exited with code ${code}`));
|
reject(new Error(stderr || `yt-dlp exited with code ${code}`));
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
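The `ytdlp` wrapper resolves with parsed JSON when stdout is valid JSON and with the raw string otherwise. That fallback can be sketched as a pure function; `parseYtdlpOutput` is a hypothetical name.

```javascript
// Hypothetical helper: the same JSON-or-raw fallback as the close handler above.
function parseYtdlpOutput(stdout) {
  try {
    return JSON.parse(stdout);
  } catch {
    // Not JSON (for example plain-text output): hand back the raw string.
    return stdout;
  }
}

console.log(parseYtdlpOutput('{"_type":"playlist","title":"Demo"}')._type); // playlist
console.log(parseYtdlpOutput('plain text output'));                         // plain text output
```

A consequence worth keeping in mind: when the fallback returns a string, property reads such as `info._type` or `info.title` evaluate to `undefined` rather than throwing.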
|
|
||||||
/**
|
/**
|
||||||
* Execute yt-dlp command with progress callback
|
* Execute yt-dlp command with progress callback
|
||||||
*/
|
*/
|
||||||
function ytdlpExec(url, args = [], onProgress) {
|
function ytdlpExec(url, args = [], onProgress) {
|
||||||
return new Promise((resolve, reject) => {
|
return new Promise((resolve, reject) => {
|
||||||
const proc = spawn(YTDLP_PATH, [...args, url]);
|
const proc = spawn(YTDLP_PATH, [...args, url]);
|
||||||
let stderr = '';
|
let stderr = '';
|
||||||
|
|
||||||
proc.stdout.on('data', (data) => {
|
proc.stdout.on('data', (data) => {
|
||||||
const line = data.toString();
|
const line = data.toString();
|
||||||
if (onProgress) {
|
if (onProgress) {
|
||||||
const progressMatch = line.match(/\[download\]\s+(\d+\.?\d*)%/);
|
const progressMatch = line.match(/\[download\]\s+(\d+\.?\d*)%/);
|
||||||
const etaMatch = line.match(/ETA\s+(\d+:\d+)/);
|
const etaMatch = line.match(/ETA\s+(\d+:\d+)/);
|
||||||
const speedMatch = line.match(/at\s+([\d.]+\w+\/s)/);
|
const speedMatch = line.match(/at\s+([\d.]+\w+\/s)/);
|
||||||
|
|
||||||
if (progressMatch) {
|
if (progressMatch) {
|
||||||
onProgress({
|
onProgress({
|
||||||
percent: parseFloat(progressMatch[1]),
|
percent: parseFloat(progressMatch[1]),
|
||||||
eta: etaMatch ? etaMatch[1] : null,
|
eta: etaMatch ? etaMatch[1] : null,
|
||||||
speed: speedMatch ? speedMatch[1] : null,
|
speed: speedMatch ? speedMatch[1] : null,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
proc.stderr.on('data', (data) => { stderr += data; });
|
proc.stderr.on('data', (data) => { stderr += data; });
|
||||||
|
|
||||||
proc.on('close', (code) => {
|
proc.on('close', (code) => {
|
||||||
if (code === 0) {
|
if (code === 0) {
|
||||||
resolve();
|
resolve();
|
||||||
} else {
|
} else {
|
||||||
reject(new Error(stderr || `yt-dlp exited with code ${code}`));
|
reject(new Error(stderr || `yt-dlp exited with code ${code}`));
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
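`ytdlpExec` applies its three regexes to each stdout `data` event as a whole. A single event can carry several progress lines, in which case only the first match is reported. The sketch below uses the same regexes but splits the chunk into lines first; `parseProgressLine` and `parseProgressChunk` are hypothetical names, and the sample lines are merely shaped to fit the regexes rather than copied from real yt-dlp output.

```javascript
// Hypothetical helper: parse one line with the same regexes as ytdlpExec.
function parseProgressLine(line) {
  const progressMatch = line.match(/\[download\]\s+(\d+\.?\d*)%/);
  if (!progressMatch) return null;
  const etaMatch = line.match(/ETA\s+(\d+:\d+)/);
  const speedMatch = line.match(/at\s+([\d.]+\w+\/s)/);
  return {
    percent: parseFloat(progressMatch[1]),
    eta: etaMatch ? etaMatch[1] : null,
    speed: speedMatch ? speedMatch[1] : null,
  };
}

// Hypothetical helper: one stdout chunk may contain several lines.
function parseProgressChunk(chunk) {
  return chunk
    .toString()
    .split(/\r?\n/)
    .map(parseProgressLine)
    .filter(Boolean);
}

console.log(parseProgressLine('[download]  42.5% of 3.50MiB at 1.20MiB/s ETA 00:03'));
// { percent: 42.5, eta: '00:03', speed: '1.20MiB/s' }
```

This sketch does not buffer partial lines, so a line split across two `data` events would still be missed or misparsed. Handling that would need a small carry-over buffer.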
|
|
||||||
const OUTPUT_DIR = process.env.OUTPUT_DIR || './output';
|
const OUTPUT_DIR = process.env.OUTPUT_DIR || './output';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Sanitize filename to remove invalid characters
|
* Sanitize filename to remove invalid characters
|
||||||
*/
|
*/
|
||||||
function sanitizeFilename(filename) {
|
function sanitizeFilename(filename) {
|
||||||
return filename
|
return filename
|
||||||
.replace(/[<>:"/\\|?*]/g, '')
|
.replace(/[<>:"/\\|?*]/g, '')
|
||||||
.replace(/\s+/g, '_')
|
.replace(/\s+/g, '_')
|
||||||
.substring(0, 200);
|
.substring(0, 200);
|
||||||
}
|
}
|
||||||
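To make the effect of `sanitizeFilename` concrete, the function is repeated here with a few sample inputs. The inputs are made up for illustration.

```javascript
// Same logic as sanitizeFilename above, repeated so the snippet runs standalone.
function sanitizeFilename(filename) {
  return filename
    .replace(/[<>:"/\\|?*]/g, '') // drop characters that are invalid in Windows file names
    .replace(/\s+/g, '_')         // collapse each whitespace run into one underscore
    .substring(0, 200);           // cap the length
}

console.log(sanitizeFilename('My: Video / Title?'));    // My_Video_Title
console.log(sanitizeFilename('a'.repeat(300)).length);  // 200
console.log(JSON.stringify(sanitizeFilename('???')));   // ""
```

The last case shows that a title made only of stripped characters becomes an empty string, which would produce an output file named just `.mp3`. Two different titles can also collapse to the same name.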
|
|
||||||
/**
|
/**
|
||||||
* Check if URL contains a playlist parameter
|
* Check if URL contains a playlist parameter
|
||||||
*/
|
*/
|
||||||
function hasPlaylistParam(url) {
|
function hasPlaylistParam(url) {
|
||||||
try {
|
try {
|
||||||
const urlObj = new URL(url);
|
const urlObj = new URL(url);
|
||||||
return urlObj.searchParams.has('list');
|
return urlObj.searchParams.has('list');
|
||||||
} catch {
|
} catch {
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
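* Build the canonical playlist URL from the URL's list= parameter, or return null if it has none
|
* Build the canonical playlist URL from the URL's list= parameter, or return null if it has none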
|
||||||
*/
|
*/
|
||||||
function extractPlaylistUrl(url) {
|
function extractPlaylistUrl(url) {
|
||||||
const urlObj = new URL(url);
|
const urlObj = new URL(url);
|
||||||
const listId = urlObj.searchParams.get('list');
|
const listId = urlObj.searchParams.get('list');
|
||||||
if (listId) {
|
if (listId) {
|
||||||
return `https://www.youtube.com/playlist?list=${listId}`;
|
return `https://www.youtube.com/playlist?list=${listId}`;
|
||||||
}
|
}
|
||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
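`hasPlaylistParam` guards `new URL(url)` with try/catch, while `extractPlaylistUrl` lets the `TypeError` from an unparseable URL propagate to its caller. A non-throwing variant is sketched below; `playlistUrlOrNull` is a hypothetical name and not part of the commit.

```javascript
// Hypothetical variant of extractPlaylistUrl that returns null instead of throwing.
function playlistUrlOrNull(url) {
  try {
    const listId = new URL(url).searchParams.get('list');
    return listId ? `https://www.youtube.com/playlist?list=${listId}` : null;
  } catch {
    return null; // the input was not a parseable URL
  }
}

console.log(playlistUrlOrNull('https://www.youtube.com/watch?v=abc&list=PL123'));
// https://www.youtube.com/playlist?list=PL123
console.log(playlistUrlOrNull('https://www.youtube.com/watch?v=abc')); // null
console.log(playlistUrlOrNull('not a url'));                            // null
```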
|
|
||||||
/**
|
/**
|
||||||
* Get video/playlist info without downloading
|
* Get video/playlist info without downloading
|
||||||
*/
|
*/
|
||||||
export async function getInfo(url, forcePlaylist = false) {
|
export async function getInfo(url, forcePlaylist = false) {
|
||||||
try {
|
try {
|
||||||
// When playlist mode is forced and the URL carries a list= ID, query the playlist URL instead
|
// When playlist mode is forced and the URL carries a list= ID, query the playlist URL instead
|
||||||
const playlistUrl = extractPlaylistUrl(url);
|
const playlistUrl = extractPlaylistUrl(url);
|
||||||
const targetUrl = (forcePlaylist && playlistUrl) ? playlistUrl : url;
|
const targetUrl = (forcePlaylist && playlistUrl) ? playlistUrl : url;
|
||||||
|
|
||||||
const info = await ytdlp(targetUrl, [
|
const info = await ytdlp(targetUrl, [
|
||||||
'--dump-single-json',
|
'--dump-single-json',
|
||||||
'--no-download',
|
'--no-download',
|
||||||
'--no-warnings',
|
'--no-warnings',
|
||||||
'--flat-playlist',
|
'--flat-playlist',
|
||||||
]);
|
]);
|
||||||
return info;
|
return info;
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
throw new Error(`Failed to get info: ${error.message}`);
|
throw new Error(`Failed to get info: ${error.message}`);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Check if URL is a playlist
|
* Check if URL is a playlist
|
||||||
*/
|
*/
|
||||||
export async function isPlaylist(url) {
|
export async function isPlaylist(url) {
|
||||||
const info = await getInfo(url);
|
const info = await getInfo(url);
|
||||||
return info._type === 'playlist';
|
return info._type === 'playlist';
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Download a single video as MP3
|
* Download a single video as MP3
|
||||||
*/
|
*/
|
||||||
export async function downloadVideo(url, options = {}) {
|
export async function downloadVideo(url, options = {}) {
|
||||||
const { outputDir = OUTPUT_DIR, onProgress, onDownloadProgress } = options;
|
const { outputDir = OUTPUT_DIR, onProgress, onDownloadProgress } = options;
|
||||||
|
|
||||||
// Ensure output directory exists
|
// Ensure output directory exists
|
||||||
if (!fs.existsSync(outputDir)) {
|
if (!fs.existsSync(outputDir)) {
|
||||||
fs.mkdirSync(outputDir, { recursive: true });
|
fs.mkdirSync(outputDir, { recursive: true });
|
||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
// Get video info first
|
// Get video info first
|
||||||
const info = await ytdlp(url, [
|
const info = await ytdlp(url, [
|
||||||
'--dump-single-json',
|
'--dump-single-json',
|
||||||
'--no-download',
|
'--no-download',
|
||||||
'--no-warnings',
|
'--no-warnings',
|
||||||
]);
|
]);
|
||||||
|
|
||||||
const title = sanitizeFilename(info.title);
|
const title = sanitizeFilename(info.title);
|
||||||
const outputPath = path.join(outputDir, `${title}.mp3`);
|
const outputPath = path.join(outputDir, `${title}.mp3`);
|
||||||
|
|
||||||
// Download and convert to MP3 with progress
|
// Download and convert to MP3 with progress
|
||||||
await ytdlpExec(url, [
|
await ytdlpExec(url, [
|
||||||
'--extract-audio',
|
'--extract-audio',
|
||||||
'--audio-format', 'mp3',
|
'--audio-format', 'mp3',
|
||||||
'--audio-quality', '0',
|
'--audio-quality', '0',
|
||||||
'-o', outputPath,
|
'-o', outputPath,
|
||||||
'--no-warnings',
|
'--no-warnings',
|
||||||
'--newline',
|
'--newline',
|
||||||
], (progress) => {
|
], (progress) => {
|
||||||
if (onDownloadProgress) {
|
if (onDownloadProgress) {
|
||||||
onDownloadProgress({
|
onDownloadProgress({
|
||||||
...progress,
|
...progress,
|
||||||
title: info.title,
|
title: info.title,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
title: info.title,
|
title: info.title,
|
||||||
duration: info.duration,
|
duration: info.duration,
|
||||||
filePath: outputPath,
|
filePath: outputPath,
|
||||||
url: url,
|
url: url,
|
||||||
};
|
};
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
throw new Error(`Failed to download: ${error.message}`);
|
throw new Error(`Failed to download: ${error.message}`);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Download all videos from a playlist as MP3
|
* Download all videos from a playlist as MP3
|
||||||
*/
|
*/
|
||||||
export async function downloadPlaylist(url, options = {}) {
|
export async function downloadPlaylist(url, options = {}) {
|
||||||
const { outputDir = OUTPUT_DIR, onProgress, onVideoComplete, onDownloadProgress, forcePlaylist = false } = options;
|
const { outputDir = OUTPUT_DIR, onProgress, onVideoComplete, onDownloadProgress, forcePlaylist = false } = options;
|
||||||
|
|
||||||
// Ensure output directory exists
|
// Ensure output directory exists
|
||||||
if (!fs.existsSync(outputDir)) {
|
if (!fs.existsSync(outputDir)) {
|
||||||
fs.mkdirSync(outputDir, { recursive: true });
|
fs.mkdirSync(outputDir, { recursive: true });
|
||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
// Get playlist info (force playlist mode if URL has list= param)
|
// Get playlist info (force playlist mode if URL has list= param)
|
||||||
const info = await getInfo(url, forcePlaylist || hasPlaylistParam(url));
|
const info = await getInfo(url, forcePlaylist || hasPlaylistParam(url));
|
||||||
|
|
||||||
if (info._type !== 'playlist') {
|
if (info._type !== 'playlist') {
|
||||||
// Single video, redirect to downloadVideo
|
// Single video, redirect to downloadVideo
|
||||||
const result = await downloadVideo(url, { ...options, onDownloadProgress });
|
const result = await downloadVideo(url, { ...options, onDownloadProgress });
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
playlistTitle: result.title,
|
playlistTitle: result.title,
|
||||||
videos: [result],
|
videos: [result],
|
||||||
totalVideos: 1,
|
totalVideos: 1,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
const results = [];
|
const results = [];
|
||||||
const entries = info.entries || [];
|
const entries = info.entries || [];
|
||||||
|
|
||||||
console.log(`Playlist: ${info.title} (${entries.length} videos)`);
|
console.log(`Playlist: ${info.title} (${entries.length} videos)`);
|
||||||
|
|
||||||
for (let i = 0; i < entries.length; i++) {
|
for (let i = 0; i < entries.length; i++) {
|
||||||
const entry = entries[i];
|
const entry = entries[i];
|
||||||
const videoUrl = entry.url || `https://www.youtube.com/watch?v=${entry.id}`;
|
const videoUrl = entry.url || `https://www.youtube.com/watch?v=${entry.id}`;
|
||||||
|
|
||||||
try {
|
try {
|
||||||
if (onProgress) {
|
if (onProgress) {
|
||||||
onProgress({ current: i + 1, total: entries.length, title: entry.title });
|
onProgress({ current: i + 1, total: entries.length, title: entry.title });
|
||||||
}
|
}
|
||||||
|
|
||||||
console.log(`[${i + 1}/${entries.length}] Downloading: ${entry.title}`);
|
console.log(`[${i + 1}/${entries.length}] Downloading: ${entry.title}`);
|
||||||
|
|
||||||
// Wrap progress callback to include playlist context
|
// Wrap progress callback to include playlist context
|
||||||
const wrappedProgress = onDownloadProgress ? (progress) => {
|
const wrappedProgress = onDownloadProgress ? (progress) => {
|
||||||
onDownloadProgress({
|
onDownloadProgress({
|
||||||
...progress,
|
...progress,
|
||||||
videoIndex: i + 1,
|
videoIndex: i + 1,
|
||||||
totalVideos: entries.length,
|
totalVideos: entries.length,
|
||||||
playlistTitle: info.title,
|
playlistTitle: info.title,
|
||||||
});
|
});
|
||||||
} : undefined;
|
} : undefined;
|
||||||
|
|
||||||
const result = await downloadVideo(videoUrl, { outputDir, onDownloadProgress: wrappedProgress });
|
const result = await downloadVideo(videoUrl, { outputDir, onDownloadProgress: wrappedProgress });
|
||||||
results.push(result);
|
results.push(result);
|
||||||
|
|
||||||
if (onVideoComplete) {
|
if (onVideoComplete) {
|
||||||
onVideoComplete(result);
|
onVideoComplete(result);
|
||||||
}
|
}
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
console.error(`Failed to download ${entry.title}: ${error.message}`);
|
console.error(`Failed to download ${entry.title}: ${error.message}`);
|
||||||
results.push({
|
results.push({
|
||||||
success: false,
|
success: false,
|
||||||
title: entry.title,
|
title: entry.title,
|
||||||
url: videoUrl,
|
url: videoUrl,
|
||||||
error: error.message,
|
error: error.message,
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
playlistTitle: info.title,
|
playlistTitle: info.title,
|
||||||
videos: results,
|
videos: results,
|
||||||
totalVideos: entries.length,
|
totalVideos: entries.length,
|
||||||
successCount: results.filter(r => r.success).length,
|
successCount: results.filter(r => r.success).length,
|
||||||
failCount: results.filter(r => !r.success).length,
|
failCount: results.filter(r => !r.success).length,
|
||||||
};
|
};
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
throw new Error(`Failed to download playlist: ${error.message}`);
|
throw new Error(`Failed to download playlist: ${error.message}`);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
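`downloadPlaylist` records one result per entry and reports `success: true` at the top level even if individual videos failed. The per-video outcome lives in `successCount` and `failCount`. The sketch below isolates that summary step; `summarizePlaylist` is a hypothetical helper, and the sample results are invented.

```javascript
// Hypothetical helper: builds the same summary object as the end of downloadPlaylist.
function summarizePlaylist(playlistTitle, results) {
  return {
    success: true, // means "the batch ran", not "every video downloaded"
    playlistTitle,
    videos: results,
    totalVideos: results.length,
    successCount: results.filter((r) => r.success).length,
    failCount: results.filter((r) => !r.success).length,
  };
}

const summary = summarizePlaylist('Demo playlist', [
  { success: true, title: 'First' },
  { success: false, title: 'Second', error: 'unavailable' },
]);
console.log(summary.successCount, summary.failCount); // 1 1
```

Callers that need all-or-nothing semantics should therefore check `failCount === 0` rather than `success`. Note also that the single-video branch of `downloadPlaylist` shown above returns an object without `successCount` and `failCount`, unlike `download`.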
|
|
||||||
/**
|
/**
|
||||||
* Smart download - detects if URL is video or playlist
|
* Smart download - detects if URL is video or playlist
|
||||||
*/
|
*/
|
||||||
export async function download(url, options = {}) {
|
export async function download(url, options = {}) {
|
||||||
// If URL contains list= parameter, treat it as a playlist
|
// If URL contains list= parameter, treat it as a playlist
|
||||||
const isPlaylistUrl = hasPlaylistParam(url);
|
const isPlaylistUrl = hasPlaylistParam(url);
|
||||||
const info = await getInfo(url, isPlaylistUrl);
|
const info = await getInfo(url, isPlaylistUrl);
|
||||||
|
|
||||||
if (info._type === 'playlist') {
|
if (info._type === 'playlist') {
|
||||||
return downloadPlaylist(url, { ...options, forcePlaylist: true });
|
return downloadPlaylist(url, { ...options, forcePlaylist: true });
|
||||||
} else {
|
} else {
|
||||||
const result = await downloadVideo(url, options);
|
const result = await downloadVideo(url, options);
|
||||||
return {
|
return {
|
||||||
success: true,
|
success: true,
|
||||||
playlistTitle: null,
|
playlistTitle: null,
|
||||||
videos: [result],
|
videos: [result],
|
||||||
totalVideos: 1,
|
totalVideos: 1,
|
||||||
successCount: 1,
|
successCount: 1,
|
||||||
failCount: 0,
|
failCount: 0,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
61
start-server.bat
Normal file
@ -0,0 +1,61 @@
|
|||||||
|
@echo off
|
||||||
|
REM Video to MP3 Transcriptor Server Starter
|
||||||
|
REM This script starts the API server on port 8888
|
||||||
|
|
||||||
|
echo ==========================================
|
||||||
|
echo Video to MP3 Transcriptor API
|
||||||
|
echo ==========================================
|
||||||
|
echo.
|
||||||
|
|
||||||
|
REM Check if node is installed
|
||||||
|
where node >nul 2>nul
|
||||||
|
if %ERRORLEVEL% NEQ 0 (
|
||||||
|
echo Error: Node.js is not installed
|
||||||
|
echo Please install Node.js from https://nodejs.org/
|
||||||
|
pause
|
||||||
|
exit /b 1
|
||||||
|
)
|
||||||
|
|
||||||
|
REM Check if npm is installed
|
||||||
|
where npm >nul 2>nul
|
||||||
|
if %ERRORLEVEL% NEQ 0 (
|
||||||
|
echo Error: npm is not installed
|
||||||
|
echo Please install npm
|
||||||
|
pause
|
||||||
|
exit /b 1
|
||||||
|
)
|
||||||
|
|
||||||
|
REM Check if .env file exists
|
||||||
|
if not exist .env (
|
||||||
|
echo Warning: .env file not found
|
||||||
|
echo Creating .env file...
|
||||||
|
(
|
||||||
|
echo OPENAI_API_KEY=
|
||||||
|
echo PORT=8888
|
||||||
|
echo OUTPUT_DIR=./output
|
||||||
|
) > .env
|
||||||
|
echo.
|
||||||
|
echo Please edit .env and add your OPENAI_API_KEY
|
||||||
|
echo.
|
||||||
|
)
|
||||||
|
|
||||||
|
REM Check if node_modules exists
|
||||||
|
if not exist node_modules (
|
||||||
|
echo Installing dependencies...
|
||||||
|
call npm install
|
||||||
|
echo.
|
||||||
|
)
|
||||||
|
|
||||||
|
REM Kill any process using port 8888
|
||||||
|
echo Checking port 8888...
|
||||||
|
npx kill-port 8888 >nul 2>nul
|
||||||
|
|
||||||
|
echo.
|
||||||
|
echo Starting server on http://localhost:8888
|
||||||
|
echo Press Ctrl+C to stop the server
|
||||||
|
echo.
|
||||||
|
echo ==========================================
|
||||||
|
echo.
|
||||||
|
|
||||||
|
REM Start the server
|
||||||
|
call npm run server
|
||||||
58
start-server.sh
Normal file
@ -0,0 +1,58 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
# Video to MP3 Transcriptor Server Starter
|
||||||
|
# This script starts the API server on port 8888
|
||||||
|
|
||||||
|
echo "=========================================="
|
||||||
|
echo "Video to MP3 Transcriptor API"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check if node is installed
|
||||||
|
if ! command -v node &> /dev/null
|
||||||
|
then
|
||||||
|
echo "Error: Node.js is not installed"
|
||||||
|
echo "Please install Node.js from https://nodejs.org/"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check if npm is installed
|
||||||
|
if ! command -v npm &> /dev/null
|
||||||
|
then
|
||||||
|
echo "Error: npm is not installed"
|
||||||
|
echo "Please install npm"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check if .env file exists
|
||||||
|
if [ ! -f .env ]; then
|
||||||
|
echo "Warning: .env file not found"
|
||||||
|
echo "Creating .env file..."
|
||||||
|
echo "OPENAI_API_KEY=" > .env
|
||||||
|
echo "PORT=8888" >> .env
|
||||||
|
echo "OUTPUT_DIR=./output" >> .env
|
||||||
|
echo ""
|
||||||
|
echo "Please edit .env and add your OPENAI_API_KEY"
|
||||||
|
echo ""
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check if node_modules exists
|
||||||
|
if [ ! -d "node_modules" ]; then
|
||||||
|
echo "Installing dependencies..."
|
||||||
|
npm install
|
||||||
|
echo ""
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Kill any process using port 8888
|
||||||
|
echo "Checking port 8888..."
|
||||||
|
npx kill-port 8888 2>/dev/null
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Starting server on http://localhost:8888"
|
||||||
|
echo "Press Ctrl+C to stop the server"
|
||||||
|
echo ""
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Start the server
|
||||||
|
npm run server
|
||||||