Initial commit: Video to MP3 Transcriptor

- YouTube video/playlist download as MP3 (yt-dlp)
- Audio transcription with OpenAI (gpt-4o-transcribe, whisper-1)
- Translation with GPT-4o-mini (chunking for long texts)
- Web interface with progress bars and drag & drop
- CLI and REST API interfaces
- Linux shell scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
StillHammer 2025-11-24 11:40:23 +08:00
commit 849412c3bd
18 changed files with 5537 additions and 0 deletions

11
.env.example Normal file

@@ -0,0 +1,11 @@
# OpenAI API Key for Whisper transcription
OPENAI_API_KEY=your_openai_api_key_here
# Anthropic API Key for Claude Haiku translation (optional)
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# Server port (optional, default: 3000)
PORT=3000
# Output directory (optional, default: ./output)
OUTPUT_DIR=./output

43
.gitignore vendored Normal file

@@ -0,0 +1,43 @@
# Dependencies
node_modules/
# Environment
.env
# Output directory
output/
# Audio files
*.mp3
*.wav
*.m4a
*.ogg
*.flac
*.aac
# Video files
*.mp4
*.webm
*.mkv
*.avi
# Text/transcription files
*.txt
# Logs
*.log
npm-debug.log*
# OS files
.DS_Store
Thumbs.db
# IDE
.vscode/
.idea/
*.swp
*.swo
# Temporary files
*.tmp
*.temp

235
README.md Normal file

@@ -0,0 +1,235 @@
# Video to MP3 Transcriptor
Download YouTube videos and playlists as MP3 and transcribe them with the OpenAI transcription API (`gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, or `whisper-1`).
## Features
- Download single YouTube videos as MP3
- Download entire playlists as MP3
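- Transcribe audio files using the OpenAI transcription API
- Translate transcriptions with GPT-4o-mini (long texts are split into chunks)
- Web interface with progress bars and drag & drop
- Command-line interface for quick operations
- REST API for integration with other systems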
## Prerequisites
- **Node.js** 18+
- **yt-dlp** installed on your system
- **ffmpeg** installed (for audio conversion)
- **OpenAI API key** (for transcription)
### Installing yt-dlp
```bash
# Windows (winget)
winget install yt-dlp
# macOS
brew install yt-dlp
# Linux
sudo apt install yt-dlp
# or
pip install yt-dlp
```
### Installing ffmpeg
```bash
# Windows (winget)
winget install ffmpeg
# macOS
brew install ffmpeg
# Linux
sudo apt install ffmpeg
```
## Installation
```bash
# From the cloned repository, install dependencies
cd videotoMP3Transcriptor
npm install
# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```
## Usage
### CLI
```bash
# Download a video as MP3
npm run cli download "https://youtube.com/watch?v=VIDEO_ID"
# Download a playlist
npm run cli download "https://youtube.com/playlist?list=PLAYLIST_ID"
# Download with custom output directory
npm run cli download "URL" -o ./my-folder
# Get info about a video/playlist
npm run cli info "URL"
# Transcribe an existing MP3
npm run cli transcribe ./output/video.mp3
# Transcribe with specific language
npm run cli transcribe ./output/video.mp3 -l fr
# Transcribe with specific model
npm run cli transcribe ./output/video.mp3 -m gpt-4o-mini-transcribe
# Download AND transcribe
npm run cli process "URL"
# Download and transcribe with options
npm run cli process "URL" -l en -m gpt-4o-transcribe
```
### Linux Scripts
Convenience scripts are available in the `scripts/` directory:
```bash
# Make scripts executable (first time only)
chmod +x scripts/*.sh
# Download video/playlist
./scripts/download.sh "https://youtube.com/watch?v=VIDEO_ID"
# Transcribe a file
./scripts/transcribe.sh ./output/video.mp3 fr
# Download + transcribe
./scripts/process.sh "https://youtube.com/watch?v=VIDEO_ID" en
# Start the API server
./scripts/server.sh
# Get video info
./scripts/info.sh "https://youtube.com/watch?v=VIDEO_ID"
```
### API Server
```bash
# Start the server
npm run server
```
The server runs on `http://localhost:3000` by default; set `PORT` in `.env` to use a different port.
#### Endpoints
##### GET /health
Health check endpoint.
##### GET /info?url=YOUTUBE_URL
Get info about a video or playlist.
```bash
curl "http://localhost:3000/info?url=https://youtube.com/watch?v=VIDEO_ID"
```
##### POST /download
Download video(s) as MP3.
```bash
curl -X POST http://localhost:3000/download \
-H "Content-Type: application/json" \
-d '{"url": "https://youtube.com/watch?v=VIDEO_ID"}'
```
##### POST /transcribe
Transcribe an existing audio file.
```bash
curl -X POST http://localhost:3000/transcribe \
-H "Content-Type: application/json" \
-d '{"filePath": "./output/video.mp3", "language": "en"}'
```
##### POST /process
Download and transcribe in one call.
```bash
curl -X POST http://localhost:3000/process \
-H "Content-Type: application/json" \
-d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "language": "en", "format": "txt"}'
```
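The same call can be made from JavaScript. The following is a minimal sketch, assuming Node.js 18+ (which provides a global `fetch`), a server on the default port, and a JSON response body; `buildProcessRequest` and `processVideo` are hypothetical helper names and are not part of this project.

```javascript
// Hypothetical helper: builds the arguments for a POST /process call.
// The body fields (url, language, format) mirror the curl example above.
function buildProcessRequest(videoUrl, { language, format = 'txt', baseUrl = 'http://localhost:3000' } = {}) {
  const body = { url: videoUrl, format };
  if (language) body.language = language; // left out when no language is given
  return {
    endpoint: `${baseUrl}/process`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical wrapper: sends the request with the global fetch (Node.js 18+).
// Assumes the server answers with JSON; the response shape is not documented here.
async function processVideo(videoUrl, options) {
  const { endpoint, init } = buildProcessRequest(videoUrl, options);
  const response = await fetch(endpoint, init);
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
  return response.json();
}
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before anything is sent.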
##### GET /files-list
List all downloaded files.
##### GET /files/:filename
Download/stream a specific file.
## Configuration
Environment variables (`.env`):
| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | Your OpenAI API key | None (required for transcription) |
| `PORT` | Server port | 3000 |
| `OUTPUT_DIR` | Download directory | ./output |
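
As an illustration of how these variables and defaults fit together, here is a minimal sketch that resolves them from an environment object. `loadConfig` is a hypothetical helper, not the project's actual code; it assumes the variables have already been loaded into `process.env` (for example by `dotenv`, which is listed in `package.json`).

```javascript
// Hypothetical helper: resolves settings from an env-like object,
// applying the defaults listed in the table above.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (Number.isNaN(port)) throw new Error(`Invalid PORT: ${env.PORT}`);
  return {
    openaiApiKey: env.OPENAI_API_KEY ?? null, // only needed for transcription
    port,
    outputDir: env.OUTPUT_DIR ?? './output',
  };
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object, for example `loadConfig({ PORT: '8080' })`.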
## Transcription Models
| Model | Description | Formats |
|-------|-------------|---------|
| `gpt-4o-transcribe` | Best quality, latest GPT-4o (default) | txt, json |
| `gpt-4o-mini-transcribe` | Faster, cheaper, good quality | txt, json |
| `whisper-1` | Legacy Whisper model | txt, json, srt, vtt |
## Transcription Formats
- `txt` - Plain text (all models)
- `json` - JSON response (all models)
- `srt` - SubRip subtitles (whisper-1 only)
- `vtt` - WebVTT subtitles (whisper-1 only)
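
Because subtitle formats are only available with `whisper-1`, a client may want to check the model/format combination before sending a request. The following is a minimal sketch of such a check; `SUPPORTED_FORMATS` and `isFormatSupported` are hypothetical names, and the mapping is copied from the tables above rather than from the project's source.

```javascript
// Formats per model, copied from the tables above.
const SUPPORTED_FORMATS = {
  'gpt-4o-transcribe': ['txt', 'json'],
  'gpt-4o-mini-transcribe': ['txt', 'json'],
  'whisper-1': ['txt', 'json', 'srt', 'vtt'],
};

// Hypothetical helper: returns true when the model can produce the format.
// Unknown models (and inherited object keys) are rejected by the array check.
function isFormatSupported(model, format) {
  const formats = SUPPORTED_FORMATS[model];
  return Array.isArray(formats) && formats.includes(format);
}
```

For instance, `isFormatSupported('gpt-4o-transcribe', 'srt')` is false, so a caller could switch to `whisper-1` or fall back to `txt`.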
## Language Codes
Common language codes for the `-l` option:
- `en` - English
- `fr` - French
- `es` - Spanish
- `de` - German
- `it` - Italian
- `pt` - Portuguese
- `zh` - Chinese
- `ja` - Japanese
- `ko` - Korean
- `ru` - Russian
Omit the language option to let the model detect the spoken language automatically.
## Project Structure
```
videotoMP3Transcriptor/
├── src/
│ ├── services/
│ │ ├── youtube.js # YouTube download service
│ │ └── transcription.js # OpenAI transcription service
│ ├── cli.js # CLI entry point
│ └── server.js # Express API server
├── public/ # Web interface (index.html, app.js, style.css)
├── scripts/ # Linux convenience scripts
│ ├── download.sh # Download video/playlist
│ ├── transcribe.sh # Transcribe audio file
│ ├── process.sh # Download + transcribe
│ ├── server.sh # Start API server
│ └── info.sh # Get video info
├── output/ # Downloaded files
├── .env # Configuration
└── package.json
```
## License
MIT

1834
package-lock.json generated Normal file

File diff suppressed because it is too large

34
package.json Normal file

@@ -0,0 +1,34 @@
{
"name": "video-to-mp3-transcriptor",
"version": "1.0.0",
"description": "Download YouTube videos/playlists to MP3 and transcribe them using OpenAI Whisper API",
"main": "src/index.js",
"type": "module",
"bin": {
"ytmp3": "./src/cli.js"
},
"scripts": {
"start": "node src/index.js",
"cli": "node src/cli.js",
"server": "node src/server.js"
},
"keywords": [
"youtube",
"mp3",
"transcription",
"whisper",
"openai"
],
"author": "",
"license": "MIT",
"dependencies": {
"@anthropic-ai/sdk": "^0.70.1",
"commander": "^12.1.0",
"cors": "^2.8.5",
"dotenv": "^16.4.5",
"express": "^4.21.0",
"multer": "^2.0.2",
"openai": "^4.67.0",
"youtube-dl-exec": "^3.0.7"
}
}

636
public/app.js Normal file

@@ -0,0 +1,636 @@
// API Base URL
const API_URL = '';
// Tab switching
document.querySelectorAll('.tab').forEach(tab => {
tab.addEventListener('click', () => {
document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
document.querySelectorAll('.tab-content').forEach(c => c.classList.remove('active'));
tab.classList.add('active');
document.getElementById(tab.dataset.tab).classList.add('active');
});
});
// Helper: Show result
function showResult(elementId, success, content) {
const el = document.getElementById(elementId);
el.className = `result show ${success ? 'success' : 'error'}`;
el.innerHTML = content;
}
// Helper: Set loading state
function setLoading(button, loading) {
button.disabled = loading;
button.classList.toggle('loading', loading);
}
// Format seconds to MM:SS or HH:MM:SS
function formatTime(seconds) {
if (!seconds || seconds < 0) return '--:--';
const hrs = Math.floor(seconds / 3600);
const mins = Math.floor((seconds % 3600) / 60);
const secs = Math.floor(seconds % 60);
if (hrs > 0) {
return `${hrs}:${String(mins).padStart(2, '0')}:${String(secs).padStart(2, '0')}`;
}
return `${mins}:${String(secs).padStart(2, '0')}`;
}
// Format file size
function formatSize(bytes) {
if (!bytes) return '';
const units = ['B', 'KB', 'MB', 'GB'];
let i = 0;
while (bytes >= 1024 && i < units.length - 1) {
bytes /= 1024;
i++;
}
return `${bytes.toFixed(1)} ${units[i]}`;
}
// ==================== DOWNLOAD TAB ====================
const progressContainer = document.getElementById('download-progress');
const progressFill = document.getElementById('progress-fill');
const progressPercent = document.getElementById('progress-percent');
const progressTitle = document.getElementById('progress-title');
const progressEta = document.getElementById('progress-eta');
const progressInfo = document.getElementById('progress-info');
const progressSpeed = document.getElementById('progress-speed');
const progressCurrent = document.getElementById('progress-current');
function updateDownloadProgress(data) {
progressFill.style.width = `${data.percent}%`;
progressPercent.textContent = `${data.percent}%`;
if (data.totalVideos > 1) {
progressInfo.textContent = `Video ${data.currentVideo}/${data.totalVideos}`;
} else {
progressInfo.textContent = '';
}
if (data.speed) progressSpeed.textContent = data.speed;
if (data.estimatedRemaining) {
progressEta.textContent = `ETA: ${formatTime(data.estimatedRemaining)}`;
} else if (data.eta) {
progressEta.textContent = `ETA: ${data.eta}`;
}
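if (data.title) {
// Use textContent so an untrusted video title is never parsed as HTML
const titleSpan = document.createElement('span');
titleSpan.className = 'video-title';
titleSpan.textContent = data.title;
progressCurrent.replaceChildren('Downloading: ', titleSpan);
}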
}
function resetDownloadProgress() {
progressFill.style.width = '0%';
progressPercent.textContent = '0%';
progressTitle.textContent = 'Downloading...';
progressEta.textContent = '';
progressInfo.textContent = '';
progressSpeed.textContent = '';
progressCurrent.textContent = '';
}
document.getElementById('download-form').addEventListener('submit', async (e) => {
e.preventDefault();
const button = e.target.querySelector('button[type="submit"]');
const url = document.getElementById('download-url').value;
const resultDiv = document.getElementById('download-result');
setLoading(button, true);
resetDownloadProgress();
progressContainer.style.display = 'block';
resultDiv.classList.remove('show');
const eventSource = new EventSource(`${API_URL}/download-stream?url=${encodeURIComponent(url)}`);
eventSource.addEventListener('status', (e) => {
progressTitle.textContent = JSON.parse(e.data).message;
});
eventSource.addEventListener('info', (e) => {
const data = JSON.parse(e.data);
progressTitle.textContent = data.totalVideos > 1
? `Downloading playlist: ${data.playlistTitle} (${data.totalVideos} videos)`
: `Downloading: ${data.title}`;
});
eventSource.addEventListener('progress', (e) => updateDownloadProgress(JSON.parse(e.data)));
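eventSource.addEventListener('video-complete', (e) => {
const data = JSON.parse(e.data);
// Use textContent so an untrusted video title is never parsed as HTML
const titleSpan = document.createElement('span');
titleSpan.className = 'video-title';
titleSpan.textContent = data.title;
progressCurrent.replaceChildren('Completed: ', titleSpan, ` (${data.videosCompleted}/${data.totalVideos})`);
});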
eventSource.addEventListener('complete', (e) => {
const data = JSON.parse(e.data);
eventSource.close();
progressFill.style.width = '100%';
progressPercent.textContent = '100%';
progressTitle.textContent = 'Download Complete!';
progressEta.textContent = `Total: ${formatTime(data.totalTime)}`;
showResult('download-result', true, `
<h3>Download Complete!</h3>
<p>${data.successCount}/${data.totalVideos} videos downloaded</p>
${data.playlistTitle ? `<p>Playlist: ${data.playlistTitle}</p>` : ''}
<ul>${data.videos.map(v => `
<li>
<span class="${v.success ? 'icon-success' : 'icon-error'}">${v.success ? '✓' : '✗'}</span>
${v.title}
${v.success && v.fileUrl ? `<a href="${v.fileUrl}" target="_blank">Download</a>` : ''}
${v.error ? `<small>(${v.error})</small>` : ''}
</li>
`).join('')}</ul>
`);
setLoading(button, false);
});
// A server-sent "error" event carries JSON data; a dropped connection fires
// the same event type without data. One listener handles both cases, because
// a separate onerror handler would also run and overwrite the server message.
eventSource.addEventListener('error', (e) => {
let errorMsg = 'Connection lost';
if (e.data) {
errorMsg = 'Download failed';
try { errorMsg = JSON.parse(e.data).message || errorMsg; } catch {}
}
eventSource.close();
progressContainer.style.display = 'none';
showResult('download-result', false, `<h3>Error</h3><p>${errorMsg}</p>`);
setLoading(button, false);
});
});
// ==================== TRANSCRIBE TAB (Drag & Drop) ====================
let selectedFiles = [];
const dropZone = document.getElementById('drop-zone');
const fileInput = document.getElementById('file-input');
const selectedFilesDiv = document.getElementById('selected-files');
const filesList = document.getElementById('files-list');
const transcribeBtn = document.getElementById('transcribe-btn');
const clearFilesBtn = document.getElementById('clear-files');
function updateFilesList() {
if (selectedFiles.length === 0) {
selectedFilesDiv.style.display = 'none';
transcribeBtn.disabled = true;
return;
}
selectedFilesDiv.style.display = 'block';
transcribeBtn.disabled = false;
filesList.innerHTML = selectedFiles.map((file, index) => `
<li>
<span class="file-name">${file.name}</span>
<span class="file-size">${formatSize(file.size)}</span>
<button type="button" class="remove-file" data-index="${index}">×</button>
</li>
`).join('');
// Add remove handlers
filesList.querySelectorAll('.remove-file').forEach(btn => {
btn.addEventListener('click', () => {
selectedFiles.splice(parseInt(btn.dataset.index, 10), 1);
updateFilesList();
});
});
}
function addFiles(files) {
const audioFiles = Array.from(files).filter(f =>
f.type.startsWith('audio/') || f.name.match(/\.(mp3|wav|m4a|ogg|flac)$/i)
);
selectedFiles = [...selectedFiles, ...audioFiles];
updateFilesList();
}
// Drag & Drop events
dropZone.addEventListener('click', () => fileInput.click());
dropZone.addEventListener('dragover', (e) => {
e.preventDefault();
dropZone.classList.add('drag-over');
});
dropZone.addEventListener('dragleave', () => {
dropZone.classList.remove('drag-over');
});
dropZone.addEventListener('drop', (e) => {
e.preventDefault();
dropZone.classList.remove('drag-over');
addFiles(e.dataTransfer.files);
});
fileInput.addEventListener('change', () => {
addFiles(fileInput.files);
fileInput.value = '';
});
clearFilesBtn.addEventListener('click', () => {
selectedFiles = [];
updateFilesList();
});
// Transcribe form submit
document.getElementById('transcribe-form').addEventListener('submit', async (e) => {
e.preventDefault();
if (selectedFiles.length === 0) return;
const button = transcribeBtn;
const language = document.getElementById('transcribe-lang').value;
const model = document.getElementById('transcribe-model').value;
const transcribeProgress = document.getElementById('transcribe-progress');
const transcribeProgressFill = document.getElementById('transcribe-progress-fill');
const transcribeProgressTitle = document.getElementById('transcribe-progress-title');
const transcribeProgressPercent = document.getElementById('transcribe-progress-percent');
const transcribeProgressInfo = document.getElementById('transcribe-progress-info');
const transcribeProgressCurrent = document.getElementById('transcribe-progress-current');
setLoading(button, true);
transcribeProgress.style.display = 'block';
transcribeProgressFill.style.width = '0%';
transcribeProgressTitle.textContent = 'Uploading and transcribing...';
transcribeProgressPercent.textContent = '0%';
transcribeProgressInfo.textContent = `0/${selectedFiles.length} files`;
document.getElementById('transcribe-result').classList.remove('show');
const formData = new FormData();
selectedFiles.forEach(file => formData.append('files', file));
if (language) formData.append('language', language);
formData.append('model', model);
try {
const response = await fetch(`${API_URL}/upload-transcribe`, {
method: 'POST',
body: formData
});
const data = await response.json();
if (!response.ok) throw new Error(data.error || 'Transcription failed');
transcribeProgressFill.style.width = '100%';
transcribeProgressPercent.textContent = '100%';
transcribeProgressTitle.textContent = 'Transcription Complete!';
transcribeProgressInfo.textContent = `${data.successCount}/${data.totalFiles} files`;
showResult('transcribe-result', true, `
<h3>Transcription Complete!</h3>
<p>${data.successCount}/${data.totalFiles} files transcribed</p>
<ul>${data.results.map(r => `
<li>
<span class="${r.success ? 'icon-success' : 'icon-error'}">${r.success ? '✓' : '✗'}</span>
${r.fileName}
${r.success && r.transcriptionUrl ? `<a href="${r.transcriptionUrl}" target="_blank">View</a>` : ''}
${r.error ? `<small>(${r.error})</small>` : ''}
</li>
`).join('')}</ul>
${data.results[0]?.text ? `
<h4>Preview (first file):</h4>
<div class="preview">${data.results[0].text.substring(0, 1000)}${data.results[0].text.length > 1000 ? '...' : ''}</div>
` : ''}
`);
selectedFiles = [];
updateFilesList();
} catch (error) {
transcribeProgress.style.display = 'none';
showResult('transcribe-result', false, `<h3>Error</h3><p>${error.message}</p>`);
} finally {
setLoading(button, false);
}
});
// ==================== PROCESS TAB (Download + Transcribe) ====================
const processProgress = document.getElementById('process-progress');
const processProgressFill = document.getElementById('process-progress-fill');
const processProgressTitle = document.getElementById('process-progress-title');
const processProgressPercent = document.getElementById('process-progress-percent');
const processProgressPhase = document.getElementById('process-progress-phase');
const processProgressSpeed = document.getElementById('process-progress-speed');
const processProgressCurrent = document.getElementById('process-progress-current');
const processProgressEta = document.getElementById('process-progress-eta');
function resetProcessProgress() {
processProgressFill.style.width = '0%';
processProgressPercent.textContent = '0%';
processProgressTitle.textContent = 'Processing...';
processProgressPhase.textContent = '';
processProgressSpeed.textContent = '';
processProgressCurrent.textContent = '';
processProgressEta.textContent = '';
}
document.getElementById('process-form').addEventListener('submit', async (e) => {
e.preventDefault();
const button = e.target.querySelector('button[type="submit"]');
const url = document.getElementById('process-url').value;
const language = document.getElementById('process-lang').value;
const model = document.getElementById('process-model').value;
const resultDiv = document.getElementById('process-result');
setLoading(button, true);
resetProcessProgress();
processProgress.style.display = 'block';
resultDiv.classList.remove('show');
const params = new URLSearchParams({ url });
if (language) params.append('language', language);
params.append('model', model);
const eventSource = new EventSource(`${API_URL}/process-stream?${params}`);
eventSource.addEventListener('status', (e) => {
const data = JSON.parse(e.data);
processProgressTitle.textContent = data.message;
if (data.phase === 'transcribing') {
processProgressPhase.textContent = 'Transcribing';
}
});
eventSource.addEventListener('info', (e) => {
const data = JSON.parse(e.data);
processProgressTitle.textContent = data.totalVideos > 1
? `Processing playlist: ${data.playlistTitle} (${data.totalVideos} videos)`
: `Processing: ${data.title}`;
});
eventSource.addEventListener('progress', (e) => {
const data = JSON.parse(e.data);
processProgressFill.style.width = `${data.percent}%`;
processProgressPercent.textContent = `${Math.round(data.percent)}%`;
processProgressPhase.textContent = data.phaseLabel || '';
if (data.speed) processProgressSpeed.textContent = data.speed;
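if (data.title) {
// Use textContent so an untrusted video title is never parsed as HTML
const titleSpan = document.createElement('span');
titleSpan.className = 'video-title';
titleSpan.textContent = data.title;
processProgressCurrent.replaceChildren(`${data.phaseLabel || 'Processing'}: `, titleSpan);
// Append the counter only after a fresh rebuild, so it cannot accumulate
if (data.totalVideos > 1) {
processProgressCurrent.append(` (${data.currentVideo}/${data.totalVideos})`);
}
}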
});
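eventSource.addEventListener('video-complete', (e) => {
const data = JSON.parse(e.data);
// Use textContent so an untrusted video title is never parsed as HTML
const titleSpan = document.createElement('span');
titleSpan.className = 'video-title';
titleSpan.textContent = data.title;
processProgressCurrent.replaceChildren('Downloaded: ', titleSpan);
});
eventSource.addEventListener('transcribe-complete', (e) => {
const data = JSON.parse(e.data);
const titleSpan = document.createElement('span');
titleSpan.className = 'video-title';
titleSpan.textContent = data.title;
processProgressCurrent.replaceChildren('Transcribed: ', titleSpan, ` (${data.videosCompleted}/${data.totalFiles})`);
});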
eventSource.addEventListener('complete', (e) => {
const data = JSON.parse(e.data);
eventSource.close();
processProgressFill.style.width = '100%';
processProgressPercent.textContent = '100%';
processProgressTitle.textContent = 'Processing Complete!';
processProgressPhase.textContent = '';
processProgressEta.textContent = `Total: ${formatTime(data.totalTime)}`;
showResult('process-result', true, `
<h3>Processing Complete!</h3>
${data.playlistTitle ? `<p>Playlist: ${data.playlistTitle}</p>` : ''}
<p>Downloaded: ${data.downloadedCount}/${data.totalVideos}</p>
<p>Transcribed: ${data.transcribedCount}/${data.totalVideos}</p>
<ul>${data.results.map(r => `
<li>
<span class="${r.transcriptionSuccess ? 'icon-success' : 'icon-error'}">${r.transcriptionSuccess ? '✓' : '✗'}</span>
${r.title}
${r.audioUrl ? `<a href="${r.audioUrl}" target="_blank">MP3</a>` : ''}
${r.transcriptionUrl ? `<a href="${r.transcriptionUrl}" target="_blank">TXT</a>` : ''}
${r.error ? `<small>(${r.error})</small>` : ''}
</li>
`).join('')}</ul>
${data.results[0]?.text ? `
<h4>Preview (first file):</h4>
<div class="preview">${data.results[0].text.substring(0, 1000)}${data.results[0].text.length > 1000 ? '...' : ''}</div>
` : ''}
`);
setLoading(button, false);
});
// A server-sent "error" event carries JSON data; a dropped connection fires
// the same event type without data. One listener handles both cases, because
// a separate onerror handler would also run and overwrite the server message.
eventSource.addEventListener('error', (e) => {
let errorMsg = 'Connection lost';
if (e.data) {
errorMsg = 'Processing failed';
try { errorMsg = JSON.parse(e.data).message || errorMsg; } catch {}
}
eventSource.close();
processProgress.style.display = 'none';
showResult('process-result', false, `<h3>Error</h3><p>${errorMsg}</p>`);
setLoading(button, false);
});
});
// ==================== TRANSLATE CHECKBOXES (Transcribe & Process tabs) ====================
// Transcribe tab checkbox
const transcribeTranslateCheckbox = document.getElementById('transcribe-translate');
const transcribeTranslateLang = document.getElementById('transcribe-translate-lang');
transcribeTranslateCheckbox.addEventListener('change', () => {
transcribeTranslateLang.disabled = !transcribeTranslateCheckbox.checked;
});
// Process tab checkbox
const processTranslateCheckbox = document.getElementById('process-translate');
const processTranslateLang = document.getElementById('process-translate-lang');
processTranslateCheckbox.addEventListener('change', () => {
processTranslateLang.disabled = !processTranslateCheckbox.checked;
});
// ==================== TRANSLATE TAB ====================
// Mode switching
const translateModeBtns = document.querySelectorAll('.mode-btn');
const translateTextMode = document.getElementById('translate-text-mode');
const translateFileMode = document.getElementById('translate-file-mode');
translateModeBtns.forEach(btn => {
btn.addEventListener('click', () => {
translateModeBtns.forEach(b => b.classList.remove('active'));
btn.classList.add('active');
if (btn.dataset.mode === 'text') {
translateTextMode.style.display = 'block';
translateFileMode.style.display = 'none';
} else {
translateTextMode.style.display = 'none';
translateFileMode.style.display = 'block';
}
});
});
// Text translation form
document.getElementById('translate-text-form').addEventListener('submit', async (e) => {
e.preventDefault();
const button = document.getElementById('translate-text-btn');
const text = document.getElementById('translate-input').value;
const sourceLang = document.getElementById('translate-source').value;
const targetLang = document.getElementById('translate-target').value;
if (!text.trim()) {
showResult('translate-text-result', false, '<h3>Error</h3><p>Please enter text to translate</p>');
return;
}
setLoading(button, true);
document.getElementById('translate-text-result').classList.remove('show');
try {
const response = await fetch(`${API_URL}/translate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text, targetLang, sourceLang: sourceLang || null })
});
const data = await response.json();
if (!response.ok) throw new Error(data.error || 'Translation failed');
showResult('translate-text-result', true, `
<h3>Translation Complete!</h3>
<p><strong>From:</strong> ${data.sourceLanguage} <strong>To:</strong> ${data.targetLanguage}</p>
<div class="translation-output">${data.translatedText}</div>
`);
} catch (error) {
showResult('translate-text-result', false, `<h3>Error</h3><p>${error.message}</p>`);
} finally {
setLoading(button, false);
}
});
// File translation - Drag & Drop
let translateSelectedFiles = [];
const translateDropZone = document.getElementById('translate-drop-zone');
const translateFileInput = document.getElementById('translate-file-input');
const translateSelectedFilesDiv = document.getElementById('translate-selected-files');
const translateFilesList = document.getElementById('translate-files-list');
const translateFileBtn = document.getElementById('translate-file-btn');
const translateClearFilesBtn = document.getElementById('translate-clear-files');
function updateTranslateFilesList() {
if (translateSelectedFiles.length === 0) {
translateSelectedFilesDiv.style.display = 'none';
translateFileBtn.disabled = true;
return;
}
translateSelectedFilesDiv.style.display = 'block';
translateFileBtn.disabled = false;
translateFilesList.innerHTML = translateSelectedFiles.map((file, index) => `
<li>
<span class="file-name">${file.name}</span>
<span class="file-size">${formatSize(file.size)}</span>
<button type="button" class="remove-file" data-index="${index}">×</button>
</li>
`).join('');
translateFilesList.querySelectorAll('.remove-file').forEach(btn => {
btn.addEventListener('click', () => {
translateSelectedFiles.splice(parseInt(btn.dataset.index, 10), 1);
updateTranslateFilesList();
});
});
}
function addTranslateFiles(files) {
const textFiles = Array.from(files).filter(f =>
f.type === 'text/plain' || f.name.endsWith('.txt')
);
translateSelectedFiles = [...translateSelectedFiles, ...textFiles];
updateTranslateFilesList();
}
translateDropZone.addEventListener('click', () => translateFileInput.click());
translateDropZone.addEventListener('dragover', (e) => {
e.preventDefault();
translateDropZone.classList.add('drag-over');
});
translateDropZone.addEventListener('dragleave', () => {
translateDropZone.classList.remove('drag-over');
});
translateDropZone.addEventListener('drop', (e) => {
e.preventDefault();
translateDropZone.classList.remove('drag-over');
addTranslateFiles(e.dataTransfer.files);
});
translateFileInput.addEventListener('change', () => {
addTranslateFiles(translateFileInput.files);
translateFileInput.value = '';
});
translateClearFilesBtn.addEventListener('click', () => {
translateSelectedFiles = [];
updateTranslateFilesList();
});
// File translation form submit
document.getElementById('translate-file-form').addEventListener('submit', async (e) => {
e.preventDefault();
if (translateSelectedFiles.length === 0) return;
const button = translateFileBtn;
const sourceLang = document.getElementById('translate-file-source').value;
const targetLang = document.getElementById('translate-file-target').value;
setLoading(button, true);
document.getElementById('translate-file-result').classList.remove('show');
const formData = new FormData();
translateSelectedFiles.forEach(file => formData.append('files', file));
formData.append('targetLang', targetLang);
if (sourceLang) formData.append('sourceLang', sourceLang);
try {
const response = await fetch(`${API_URL}/translate-file`, {
method: 'POST',
body: formData
});
const data = await response.json();
if (!response.ok) throw new Error(data.error || 'Translation failed');
showResult('translate-file-result', true, `
<h3>Translation Complete!</h3>
<p>${data.successCount}/${data.totalFiles} files translated</p>
<ul>${data.results.map(r => `
<li>
<span class="${r.success ? 'icon-success' : 'icon-error'}">${r.success ? '✓' : '✗'}</span>
${r.fileName || r.originalPath}
${r.success && r.translationUrl ? `<a href="${r.translationUrl}" target="_blank">View</a>` : ''}
${r.error ? `<small>(${r.error})</small>` : ''}
</li>
`).join('')}</ul>
${data.results[0]?.translatedText ? `
<h4>Preview (first file):</h4>
<div class="translation-output">${data.results[0].translatedText.substring(0, 1000)}${data.results[0].translatedText.length > 1000 ? '...' : ''}</div>
` : ''}
`);
translateSelectedFiles = [];
updateTranslateFilesList();
} catch (error) {
showResult('translate-file-result', false, `<h3>Error</h3><p>${error.message}</p>`);
} finally {
setLoading(button, false);
}
});

345
public/index.html Normal file

@@ -0,0 +1,345 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Video to MP3 Transcriptor</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<div class="container">
<header>
<h1>Video to MP3 Transcriptor</h1>
<p class="subtitle">Download YouTube videos, transcribe and translate them</p>
</header>
<!-- Tabs -->
<nav class="tabs">
<button class="tab active" data-tab="download">Download</button>
<button class="tab" data-tab="transcribe">Transcribe</button>
<button class="tab" data-tab="process">Download + Transcribe</button>
<button class="tab" data-tab="translate">Translate</button>
</nav>
<!-- Download Tab -->
<section id="download" class="tab-content active">
<h2>Download YouTube Video/Playlist</h2>
<form id="download-form">
<div class="form-group">
<label for="download-url">YouTube URL</label>
<input type="url" id="download-url" placeholder="https://youtube.com/watch?v=..." required>
</div>
<button type="submit" class="btn btn-primary">
<span class="btn-text">Download MP3</span>
<span class="btn-loading">Downloading...</span>
</button>
</form>
<div id="download-progress" class="progress-container" style="display: none;">
<div class="progress-header">
<span id="progress-title">Downloading...</span>
<span id="progress-eta"></span>
</div>
<div class="progress-bar">
<div id="progress-fill" class="progress-fill"></div>
</div>
<div class="progress-details">
<span id="progress-percent">0%</span>
<span id="progress-info"></span>
<span id="progress-speed"></span>
</div>
<div id="progress-current" class="progress-current"></div>
</div>
<div id="download-result" class="result"></div>
</section>
<!-- Transcribe Tab -->
<section id="transcribe" class="tab-content">
<h2>Transcribe Audio File</h2>
<div id="drop-zone" class="drop-zone">
<div class="drop-zone-content">
<div class="drop-zone-icon">
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/>
<polyline points="17 8 12 3 7 8"/>
<line x1="12" y1="3" x2="12" y2="15"/>
</svg>
</div>
<p class="drop-zone-text">Drag & drop audio files here</p>
<p class="drop-zone-hint">or click to select files</p>
<input type="file" id="file-input" multiple accept="audio/*,.mp3,.wav,.m4a,.ogg,.flac" style="display: none;">
</div>
</div>
<div id="selected-files" class="selected-files" style="display: none;">
<h3>Selected Files</h3>
<ul id="files-list"></ul>
<button type="button" id="clear-files" class="btn btn-small btn-secondary">Clear</button>
</div>
<form id="transcribe-form">
<div class="form-row">
<div class="form-group">
<label for="transcribe-lang">Language</label>
<select id="transcribe-lang">
<option value="">Auto-detect</option>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<div class="form-group">
<label for="transcribe-model">Model</label>
<select id="transcribe-model">
<option value="gpt-4o-transcribe">gpt-4o-transcribe (Best)</option>
<option value="gpt-4o-mini-transcribe">gpt-4o-mini-transcribe (Faster)</option>
<option value="whisper-1">whisper-1 (Legacy)</option>
</select>
</div>
</div>
<!-- Translate option -->
<div class="form-group checkbox-group">
<label class="checkbox-label">
<input type="checkbox" id="transcribe-translate">
<span>Translate after transcription (GPT-4o-mini)</span>
</label>
<select id="transcribe-translate-lang" class="translate-lang-select" aria-label="Translation target language" disabled>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<button type="submit" class="btn btn-primary" id="transcribe-btn" disabled>
<span class="btn-text">Transcribe</span>
<span class="btn-loading">Transcribing...</span>
</button>
</form>
<div id="transcribe-progress" class="progress-container" style="display: none;">
<div class="progress-header">
<span id="transcribe-progress-title">Transcribing...</span>
<span id="transcribe-progress-info"></span>
</div>
<div class="progress-bar">
<div id="transcribe-progress-fill" class="progress-fill"></div>
</div>
<div class="progress-details">
<span id="transcribe-progress-percent">0%</span>
<span id="transcribe-progress-current"></span>
</div>
</div>
<div id="transcribe-result" class="result"></div>
</section>
<!-- Process Tab -->
<section id="process" class="tab-content">
<h2>Download + Transcribe</h2>
<form id="process-form">
<div class="form-group">
<label for="process-url">YouTube URL</label>
<input type="url" id="process-url" placeholder="https://youtube.com/watch?v=..." required>
</div>
<div class="form-row">
<div class="form-group">
<label for="process-lang">Language</label>
<select id="process-lang">
<option value="">Auto-detect</option>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<div class="form-group">
<label for="process-model">Model</label>
<select id="process-model">
<option value="gpt-4o-transcribe">gpt-4o-transcribe (Best)</option>
<option value="gpt-4o-mini-transcribe">gpt-4o-mini-transcribe (Faster)</option>
<option value="whisper-1">whisper-1 (Legacy)</option>
</select>
</div>
</div>
<!-- Translate option -->
<div class="form-group checkbox-group">
<label class="checkbox-label">
<input type="checkbox" id="process-translate">
<span>Translate after transcription (GPT-4o-mini)</span>
</label>
<select id="process-translate-lang" class="translate-lang-select" aria-label="Translation target language" disabled>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<button type="submit" class="btn btn-primary">
<span class="btn-text">Download + Transcribe</span>
<span class="btn-loading">Processing...</span>
</button>
</form>
<div id="process-progress" class="progress-container" style="display: none;">
<div class="progress-header">
<span id="process-progress-title">Processing...</span>
<span id="process-progress-eta"></span>
</div>
<div class="progress-bar">
<div id="process-progress-fill" class="progress-fill"></div>
</div>
<div class="progress-details">
<span id="process-progress-percent">0%</span>
<span id="process-progress-phase"></span>
<span id="process-progress-speed"></span>
</div>
<div id="process-progress-current" class="progress-current"></div>
</div>
<div id="process-result" class="result"></div>
</section>
<!-- Translate Tab -->
<section id="translate" class="tab-content">
<h2>Translate Text</h2>
<!-- Mode selector -->
<div class="translate-mode-selector">
<button type="button" class="mode-btn active" data-mode="text">Text</button>
<button type="button" class="mode-btn" data-mode="file">Files</button>
</div>
<!-- Text mode -->
<div id="translate-text-mode" class="translate-mode">
<form id="translate-text-form">
<div class="form-group">
<label for="translate-input">Text to translate</label>
<textarea id="translate-input" rows="8" placeholder="Enter text to translate..."></textarea>
</div>
<div class="form-row">
<div class="form-group">
<label for="translate-source">Source language</label>
<select id="translate-source">
<option value="">Auto-detect</option>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<div class="form-group">
<label for="translate-target">Target language</label>
<select id="translate-target" required>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
</div>
<button type="submit" class="btn btn-primary" id="translate-text-btn">
<span class="btn-text">Translate</span>
<span class="btn-loading">Translating...</span>
</button>
</form>
<div id="translate-text-result" class="result"></div>
</div>
<!-- File mode -->
<div id="translate-file-mode" class="translate-mode" style="display: none;">
<div id="translate-drop-zone" class="drop-zone">
<div class="drop-zone-content">
<div class="drop-zone-icon">
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/>
<polyline points="17 8 12 3 7 8"/>
<line x1="12" y1="3" x2="12" y2="15"/>
</svg>
</div>
<p class="drop-zone-text">Drag & drop text files here</p>
<p class="drop-zone-hint">or click to select files (.txt)</p>
<input type="file" id="translate-file-input" multiple accept=".txt,text/plain" style="display: none;">
</div>
</div>
<div id="translate-selected-files" class="selected-files" style="display: none;">
<h3>Selected Files</h3>
<ul id="translate-files-list"></ul>
<button type="button" id="translate-clear-files" class="btn btn-small btn-secondary">Clear</button>
</div>
<form id="translate-file-form">
<div class="form-row">
<div class="form-group">
<label for="translate-file-source">Source language</label>
<select id="translate-file-source">
<option value="">Auto-detect</option>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
<div class="form-group">
<label for="translate-file-target">Target language</label>
<select id="translate-file-target" required>
<option value="en">English</option>
<option value="fr">French</option>
<option value="es">Spanish</option>
<option value="de">German</option>
<option value="it">Italian</option>
<option value="pt">Portuguese</option>
<option value="zh">Chinese</option>
<option value="ja">Japanese</option>
<option value="ko">Korean</option>
<option value="ru">Russian</option>
</select>
</div>
</div>
<button type="submit" class="btn btn-primary" id="translate-file-btn" disabled>
<span class="btn-text">Translate Files</span>
<span class="btn-loading">Translating...</span>
</button>
</form>
<div id="translate-file-result" class="result"></div>
</div>
</section>
</div>
<script src="app.js"></script>
</body>
</html>
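The same ten-language list is repeated in every `<select>` of the page above. A minimal sketch of how that markup could be generated from one array instead, so the lists cannot drift apart; `LANGUAGES` and `buildLanguageOptions` are hypothetical names that do not appear in the project:

```javascript
// Hypothetical helper: LANGUAGES and buildLanguageOptions are illustrative
// names, not part of the project.
const LANGUAGES = [
  ['en', 'English'],
  ['fr', 'French'],
  ['es', 'Spanish'],
  ['de', 'German'],
  ['it', 'Italian'],
  ['pt', 'Portuguese'],
  ['zh', 'Chinese'],
  ['ja', 'Japanese'],
  ['ko', 'Korean'],
  ['ru', 'Russian'],
];

// Build the <option> markup for one <select>. Source-language selects pass
// includeAuto = true to get the leading "Auto-detect" entry with an empty value.
function buildLanguageOptions(includeAuto = false) {
  const options = LANGUAGES.map(
    ([code, label]) => `<option value="${code}">${label}</option>`
  );
  if (includeAuto) {
    options.unshift('<option value="">Auto-detect</option>');
  }
  return options.join('\n');
}

// In the browser this could fill an existing select, for example:
// document.getElementById('translate-target').innerHTML = buildLanguageOptions();
```

Keeping the static markup is equally valid; the sketch only shows one way to maintain a single source for the list.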

685
public/style.css Normal file

@ -0,0 +1,685 @@
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
min-height: 100vh;
color: #fff;
}
.container {
max-width: 900px;
margin: 0 auto;
padding: 2rem;
}
header {
text-align: center;
margin-bottom: 2rem;
}
header h1 {
font-size: 2.5rem;
background: linear-gradient(90deg, #e94560, #0f3460);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
}
.subtitle {
color: #8892b0;
margin-top: 0.5rem;
}
/* Tabs */
.tabs {
display: flex;
gap: 0.5rem;
margin-bottom: 2rem;
border-bottom: 2px solid #233554;
padding-bottom: 0.5rem;
flex-wrap: wrap;
}
.tab {
padding: 0.75rem 1.5rem;
background: transparent;
border: none;
color: #8892b0;
cursor: pointer;
font-size: 1rem;
border-radius: 8px 8px 0 0;
transition: all 0.3s ease;
}
.tab:hover {
color: #e94560;
background: rgba(233, 69, 96, 0.1);
}
.tab.active {
color: #e94560;
background: rgba(233, 69, 96, 0.2);
border-bottom: 2px solid #e94560;
margin-bottom: -2px;
}
/* Tab Content */
.tab-content {
display: none;
background: rgba(255, 255, 255, 0.05);
padding: 2rem;
border-radius: 12px;
backdrop-filter: blur(10px);
}
.tab-content.active {
display: block;
animation: fadeIn 0.3s ease;
}
@keyframes fadeIn {
from { opacity: 0; transform: translateY(10px); }
to { opacity: 1; transform: translateY(0); }
}
.tab-content h2 {
margin-bottom: 1.5rem;
color: #ccd6f6;
}
/* Forms */
.form-group {
margin-bottom: 1.5rem;
}
.form-group label {
display: block;
margin-bottom: 0.5rem;
color: #8892b0;
font-size: 0.9rem;
}
.form-row {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 1rem;
}
@media (max-width: 600px) {
.form-row {
grid-template-columns: 1fr;
}
}
input[type="url"],
input[type="text"],
select {
width: 100%;
padding: 0.75rem 1rem;
background: rgba(255, 255, 255, 0.1);
border: 1px solid #233554;
border-radius: 8px;
color: #fff;
font-size: 1rem;
transition: all 0.3s ease;
}
input[type="url"]:focus,
input[type="text"]:focus,
select:focus {
outline: none;
border-color: #e94560;
background: rgba(233, 69, 96, 0.1);
}
select option {
background: #1a1a2e;
color: #fff;
}
/* Buttons */
.btn {
padding: 0.75rem 2rem;
border: none;
border-radius: 8px;
font-size: 1rem;
cursor: pointer;
transition: all 0.3s ease;
display: inline-flex;
align-items: center;
gap: 0.5rem;
}
.btn-primary {
background: linear-gradient(90deg, #e94560, #0f3460);
color: #fff;
}
.btn-primary:hover {
transform: translateY(-2px);
box-shadow: 0 5px 20px rgba(233, 69, 96, 0.4);
}
.btn-primary:disabled {
opacity: 0.6;
cursor: not-allowed;
transform: none;
}
.btn-secondary {
background: rgba(255, 255, 255, 0.1);
color: #ccd6f6;
border: 1px solid #233554;
}
.btn-secondary:hover {
background: rgba(255, 255, 255, 0.2);
}
.btn-small {
padding: 0.5rem 1rem;
font-size: 0.85rem;
margin-left: 0.5rem;
}
.btn-loading {
display: none;
}
.btn.loading .btn-text {
display: none;
}
.btn.loading .btn-loading {
display: inline;
}
/* Results */
.result {
margin-top: 1.5rem;
padding: 1rem;
border-radius: 8px;
display: none;
}
.result.show {
display: block;
animation: fadeIn 0.3s ease;
}
.result.success {
background: rgba(16, 185, 129, 0.2);
border: 1px solid #10b981;
}
.result.error {
background: rgba(239, 68, 68, 0.2);
border: 1px solid #ef4444;
}
.result h3 {
margin-bottom: 0.75rem;
font-size: 1.1rem;
}
.result ul {
list-style: none;
margin-top: 0.5rem;
}
.result li {
padding: 0.5rem 0;
border-bottom: 1px solid rgba(255, 255, 255, 0.1);
display: flex;
align-items: center;
gap: 0.5rem;
}
.result li:last-child {
border-bottom: none;
}
.result .icon-success {
color: #10b981;
}
.result .icon-error {
color: #ef4444;
}
.result a {
color: #e94560;
text-decoration: none;
}
.result a:hover {
text-decoration: underline;
}
.result .preview {
margin-top: 1rem;
padding: 1rem;
background: rgba(0, 0, 0, 0.3);
border-radius: 8px;
font-family: monospace;
font-size: 0.9rem;
white-space: pre-wrap;
max-height: 300px;
overflow-y: auto;
}
/* Files Grid */
.files-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(250px, 1fr));
gap: 1rem;
margin-top: 1.5rem;
}
.file-card {
background: rgba(255, 255, 255, 0.05);
padding: 1rem;
border-radius: 8px;
border: 1px solid #233554;
transition: all 0.3s ease;
}
.file-card:hover {
border-color: #e94560;
transform: translateY(-2px);
}
.file-card .file-name {
font-weight: 600;
margin-bottom: 0.5rem;
word-break: break-all;
color: #ccd6f6;
}
.file-card .file-type {
font-size: 0.8rem;
color: #8892b0;
margin-bottom: 0.75rem;
}
.file-card .file-actions {
display: flex;
gap: 0.5rem;
}
.file-card .file-actions a {
padding: 0.4rem 0.8rem;
background: rgba(233, 69, 96, 0.2);
color: #e94560;
text-decoration: none;
border-radius: 4px;
font-size: 0.85rem;
transition: all 0.3s ease;
}
.file-card .file-actions a:hover {
background: #e94560;
color: #fff;
}
/* Loading spinner */
@keyframes spin {
to { transform: rotate(360deg); }
}
.spinner {
display: inline-block;
width: 16px;
height: 16px;
border: 2px solid rgba(255, 255, 255, 0.3);
border-top-color: #fff;
border-radius: 50%;
animation: spin 0.8s linear infinite;
margin-right: 0.5rem;
}
/* Status indicator */
.status {
display: inline-block;
padding: 0.25rem 0.5rem;
border-radius: 4px;
font-size: 0.8rem;
margin-left: 0.5rem;
}
.status.downloading {
background: rgba(59, 130, 246, 0.2);
color: #3b82f6;
}
.status.transcribing {
background: rgba(168, 85, 247, 0.2);
color: #a855f7;
}
.status.complete {
background: rgba(16, 185, 129, 0.2);
color: #10b981;
}
/* Empty state */
.empty-state {
text-align: center;
padding: 3rem;
color: #8892b0;
}
.empty-state p {
margin-top: 1rem;
}
/* Progress Bar */
.progress-container {
margin-top: 1.5rem;
padding: 1.5rem;
background: rgba(255, 255, 255, 0.05);
border-radius: 12px;
border: 1px solid #233554;
}
.progress-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 1rem;
}
.progress-header #progress-title,
.progress-header #transcribe-progress-title,
.progress-header #process-progress-title {
font-weight: 600;
color: #ccd6f6;
font-size: 1rem;
}
.progress-header #progress-eta,
.progress-header #process-progress-eta {
color: #e94560;
font-size: 0.9rem;
font-weight: 500;
}
.progress-bar {
width: 100%;
height: 12px;
background: rgba(255, 255, 255, 0.1);
border-radius: 6px;
overflow: hidden;
position: relative;
}
.progress-fill {
height: 100%;
background: linear-gradient(90deg, #e94560, #ff6b8a);
border-radius: 6px;
width: 0%;
transition: width 0.3s ease;
position: relative;
}
.progress-fill::after {
content: '';
position: absolute;
top: 0;
left: 0;
right: 0;
bottom: 0;
background: linear-gradient(
90deg,
transparent,
rgba(255, 255, 255, 0.3),
transparent
);
animation: shimmer 2s infinite;
}
@keyframes shimmer {
0% { transform: translateX(-100%); }
100% { transform: translateX(100%); }
}
.progress-details {
display: flex;
justify-content: space-between;
margin-top: 0.75rem;
font-size: 0.85rem;
color: #8892b0;
}
.progress-details #progress-percent,
.progress-details #transcribe-progress-percent,
.progress-details #process-progress-percent {
color: #ccd6f6;
font-weight: 600;
}
.progress-details #progress-speed,
.progress-details #process-progress-speed {
color: #10b981;
}
.progress-current {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid rgba(255, 255, 255, 0.1);
font-size: 0.9rem;
color: #8892b0;
}
.progress-current .video-title {
color: #ccd6f6;
font-weight: 500;
}
/* Drag and Drop Zone */
.drop-zone {
border: 2px dashed #233554;
border-radius: 12px;
padding: 3rem 2rem;
text-align: center;
cursor: pointer;
transition: all 0.3s ease;
margin-bottom: 1.5rem;
background: rgba(255, 255, 255, 0.02);
}
.drop-zone:hover {
border-color: #e94560;
background: rgba(233, 69, 96, 0.05);
}
.drop-zone.drag-over {
border-color: #e94560;
background: rgba(233, 69, 96, 0.1);
transform: scale(1.02);
}
.drop-zone-icon {
color: #8892b0;
margin-bottom: 1rem;
transition: color 0.3s ease;
}
.drop-zone:hover .drop-zone-icon {
color: #e94560;
}
.drop-zone-text {
color: #ccd6f6;
font-size: 1.1rem;
margin-bottom: 0.5rem;
}
.drop-zone-hint {
color: #8892b0;
font-size: 0.9rem;
}
/* Selected Files List */
.selected-files {
background: rgba(255, 255, 255, 0.05);
border-radius: 8px;
padding: 1rem;
margin-bottom: 1.5rem;
}
.selected-files h3 {
font-size: 0.9rem;
color: #8892b0;
margin-bottom: 0.75rem;
text-transform: uppercase;
letter-spacing: 0.5px;
}
.selected-files ul {
list-style: none;
margin-bottom: 1rem;
}
.selected-files li {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.5rem 0;
border-bottom: 1px solid rgba(255, 255, 255, 0.1);
color: #ccd6f6;
}
.selected-files li:last-child {
border-bottom: none;
}
.selected-files .file-name {
flex: 1;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
.selected-files .file-size {
color: #8892b0;
font-size: 0.85rem;
margin-left: 1rem;
}
.selected-files .remove-file {
background: none;
border: none;
color: #ef4444;
cursor: pointer;
padding: 0.25rem;
margin-left: 0.5rem;
font-size: 1.2rem;
line-height: 1;
}
.selected-files .remove-file:hover {
color: #ff6b6b;
}
/* Checkbox Group */
.checkbox-group {
margin-bottom: 1rem;
}
.checkbox-label {
display: flex;
align-items: center;
gap: 0.5rem;
cursor: pointer;
color: #ccd6f6;
font-size: 0.95rem;
}
.checkbox-label input[type="checkbox"] {
width: 18px;
height: 18px;
accent-color: #e94560;
cursor: pointer;
}
.translate-lang-select {
margin-top: 0.75rem;
width: auto;
min-width: 150px;
}
.translate-lang-select:disabled {
opacity: 0.5;
cursor: not-allowed;
}
/* Translate Tab */
.translate-mode-selector {
display: flex;
gap: 0.5rem;
margin-bottom: 1.5rem;
}
.mode-btn {
padding: 0.6rem 1.25rem;
background: rgba(255, 255, 255, 0.05);
border: 1px solid #233554;
color: #8892b0;
border-radius: 8px;
cursor: pointer;
font-size: 0.95rem;
transition: all 0.3s ease;
}
.mode-btn:hover {
border-color: #e94560;
color: #ccd6f6;
}
.mode-btn.active {
background: rgba(233, 69, 96, 0.2);
border-color: #e94560;
color: #e94560;
}
.translate-mode {
animation: fadeIn 0.3s ease;
}
/* Textarea */
textarea {
width: 100%;
padding: 1rem;
background: rgba(255, 255, 255, 0.1);
border: 1px solid #233554;
border-radius: 8px;
color: #fff;
font-size: 1rem;
font-family: inherit;
resize: vertical;
min-height: 150px;
transition: all 0.3s ease;
}
textarea:focus {
outline: none;
border-color: #e94560;
background: rgba(233, 69, 96, 0.1);
}
textarea::placeholder {
color: #8892b0;
}
/* Translation Result */
.translation-output {
margin-top: 1rem;
padding: 1rem;
background: rgba(0, 0, 0, 0.3);
border-radius: 8px;
white-space: pre-wrap;
max-height: 400px;
overflow-y: auto;
color: #ccd6f6;
line-height: 1.6;
}
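The `.progress-fill` rule starts at `width: 0%` and animates width changes, so a script has to set the width as progress arrives. A minimal sketch of a helper that turns a reported percentage into a safe width value; `progressWidth` is a hypothetical name, and how the project's own script does this is not shown here:

```javascript
// Hypothetical helper, not part of the project: turns a progress percentage
// into a value suitable for element.style.width on a .progress-fill element.
function progressWidth(percent) {
  const value = Number(percent);
  // Treat missing or non-numeric input as 0 so the bar never gets an invalid width.
  if (!Number.isFinite(value)) {
    return '0%';
  }
  // Clamp into the 0..100 range the bar can display.
  const clamped = Math.min(100, Math.max(0, value));
  return `${clamped}%`;
}

// Browser usage (assumes the #progress-fill element from index.html):
// document.getElementById('progress-fill').style.width = progressWidth(42.5);
```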

20
scripts/download.sh Normal file

@ -0,0 +1,20 @@
#!/bin/bash
# Download YouTube video/playlist as MP3
# Usage: ./download.sh <url> [output_dir]
cd "$(dirname "$0")/.."
URL="$1"
OUTPUT_DIR="${2:-./output}"
if [ -z "$URL" ]; then
echo "Usage: ./download.sh <youtube_url> [output_dir]"
echo ""
echo "Examples:"
echo " ./download.sh 'https://youtube.com/watch?v=VIDEO_ID'"
echo " ./download.sh 'https://youtube.com/playlist?list=PLAYLIST_ID'"
echo " ./download.sh 'https://youtube.com/watch?v=VIDEO_ID' ./my-folder"
exit 1
fi
# "--" tells npm to pass the remaining arguments (including -o) to the CLI
npm run cli -- download "$URL" -o "$OUTPUT_DIR"

18
scripts/info.sh Normal file

@ -0,0 +1,18 @@
#!/bin/bash
# Get info about a YouTube video/playlist
# Usage: ./info.sh <url>
cd "$(dirname "$0")/.."
URL="$1"
if [ -z "$URL" ]; then
echo "Usage: ./info.sh <youtube_url>"
echo ""
echo "Examples:"
echo " ./info.sh 'https://youtube.com/watch?v=VIDEO_ID'"
echo " ./info.sh 'https://youtube.com/playlist?list=PLAYLIST_ID'"
exit 1
fi
npm run cli info "$URL"

32
scripts/process.sh Normal file

@ -0,0 +1,32 @@
#!/bin/bash
# Download AND transcribe a YouTube video/playlist
# Usage: ./process.sh <url> [language] [model]
cd "$(dirname "$0")/.."
URL="$1"
LANGUAGE="${2:-}"
MODEL="${3:-gpt-4o-transcribe}"
if [ -z "$URL" ]; then
echo "Usage: ./process.sh <youtube_url> [language] [model]"
echo ""
echo "Languages: en, fr, es, de, it, pt, zh, ja, ko, ru, etc."
echo "Models: gpt-4o-transcribe (default), gpt-4o-mini-transcribe, whisper-1"
echo ""
echo "Examples:"
echo " ./process.sh 'https://youtube.com/watch?v=VIDEO_ID'"
echo " ./process.sh 'https://youtube.com/watch?v=VIDEO_ID' fr"
echo " ./process.sh 'https://youtube.com/watch?v=VIDEO_ID' en gpt-4o-mini-transcribe"
exit 1
fi
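# Build the argument list as an array so the URL is never re-parsed by the shell
# (the previous eval-based version would execute shell syntax embedded in a URL)
ARGS=("$URL")
if [ -n "$LANGUAGE" ]; then
ARGS+=(-l "$LANGUAGE")
fi
if [ -n "$MODEL" ] && [ "$MODEL" != "gpt-4o-transcribe" ]; then
ARGS+=(-m "$MODEL")
fi
# "--" tells npm to pass the flags to the CLI instead of interpreting them itself
npm run cli -- process "${ARGS[@]}"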

10
scripts/server.sh Normal file

@ -0,0 +1,10 @@
#!/bin/bash
# Start the API server
# Usage: ./server.sh [port]
cd "$(dirname "$0")/.."
PORT="${1:-3000}"
export PORT="$PORT"
npm run server

32
scripts/transcribe.sh Normal file

@ -0,0 +1,32 @@
#!/bin/bash
# Transcribe an audio file
# Usage: ./transcribe.sh <file> [language] [model]
cd "$(dirname "$0")/.."
FILE="$1"
LANGUAGE="${2:-}"
MODEL="${3:-gpt-4o-transcribe}"
if [ -z "$FILE" ]; then
echo "Usage: ./transcribe.sh <audio_file> [language] [model]"
echo ""
echo "Languages: en, fr, es, de, it, pt, zh, ja, ko, ru, etc."
echo "Models: gpt-4o-transcribe (default), gpt-4o-mini-transcribe, whisper-1"
echo ""
echo "Examples:"
echo " ./transcribe.sh ./output/video.mp3"
echo " ./transcribe.sh ./output/video.mp3 fr"
echo " ./transcribe.sh ./output/video.mp3 en gpt-4o-mini-transcribe"
exit 1
fi
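# Build the argument list as an array so file names containing spaces stay intact
ARGS=("$FILE")
if [ -n "$LANGUAGE" ]; then
ARGS+=(-l "$LANGUAGE")
fi
if [ -n "$MODEL" ] && [ "$MODEL" != "gpt-4o-transcribe" ]; then
ARGS+=(-m "$MODEL")
fi
# "--" tells npm to pass the flags to the CLI instead of interpreting them itself
npm run cli -- transcribe "${ARGS[@]}"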
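The wrapper script above adds `-l` and `-m` only when they are set and differ from the default. A minimal JavaScript sketch of the same assembly, assuming a caller that wants to launch the CLI from Node rather than from a shell; `buildTranscribeArgs` and `DEFAULT_MODEL` are hypothetical names:

```javascript
const DEFAULT_MODEL = 'gpt-4o-transcribe';

// Hypothetical helper mirroring the wrapper script's logic: every argument
// is a separate array element, so spaces or quotes in a path need no escaping.
function buildTranscribeArgs(file, language = '', model = DEFAULT_MODEL) {
  const args = ['transcribe', file];
  if (language) {
    args.push('-l', language);
  }
  // Like the script, only pass -m when it differs from the CLI default.
  if (model && model !== DEFAULT_MODEL) {
    args.push('-m', model);
  }
  return args;
}

// Possible use with Node's child_process (spawn takes an argument array):
// import { spawn } from 'child_process';
// spawn('npm', ['run', 'cli', '--', ...buildTranscribeArgs('./output/my video.mp3', 'fr')], { stdio: 'inherit' });
```

Because the arguments never pass through a shell as one string, no quoting or `eval` is involved.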

169
src/cli.js Normal file

@ -0,0 +1,169 @@
#!/usr/bin/env node
import { Command } from 'commander';
import dotenv from 'dotenv';
import path from 'path';
import { download, downloadVideo, downloadPlaylist, getInfo } from './services/youtube.js';
import { transcribeFile, transcribeAndSave, transcribeMultiple, getAvailableModels } from './services/transcription.js';
// Load environment variables
dotenv.config();
const program = new Command();
program
.name('ytmp3')
.description('Download YouTube videos/playlists to MP3 and transcribe them')
.version('1.0.0');
// Download command
program
.command('download <url>')
.alias('dl')
.description('Download a YouTube video or playlist as MP3')
.option('-o, --output <dir>', 'Output directory', './output')
.action(async (url, options) => {
try {
console.log('Fetching video info...');
const result = await download(url, { outputDir: options.output });
console.log('\n--- Download Complete ---');
if (result.playlistTitle) {
console.log(`Playlist: ${result.playlistTitle}`);
}
console.log(`Downloaded: ${result.successCount}/${result.totalVideos} videos`);
result.videos.forEach(v => {
if (v.success) {
console.log(`  [ok] ${v.title}`);
} else {
console.log(`  [failed] ${v.title} - ${v.error}`);
}
});
} catch (error) {
console.error(`Error: ${error.message}`);
process.exit(1);
}
});
// Transcribe command (from existing MP3)
program
.command('transcribe <file>')
.alias('tr')
.description('Transcribe an existing audio file')
.option('-l, --language <lang>', 'Language code (e.g., en, fr, zh)')
.option('-f, --format <format>', 'Output format (txt, srt, vtt)', 'txt')
.option('-m, --model <model>', 'Transcription model (gpt-4o-transcribe, gpt-4o-mini-transcribe, whisper-1)', 'gpt-4o-transcribe')
.action(async (file, options) => {
try {
if (!process.env.OPENAI_API_KEY) {
console.error('Error: OPENAI_API_KEY not set in environment');
process.exit(1);
}
console.log(`Transcribing: ${file}`);
const result = await transcribeAndSave(file, {
language: options.language,
responseFormat: options.format === 'txt' ? 'text' : options.format,
outputFormat: options.format,
model: options.model,
});
console.log('\n--- Transcription Complete ---');
console.log(`Model: ${result.model}`);
console.log(`Output: ${result.transcriptionPath}`);
console.log('\nPreview:');
console.log(result.text.substring(0, 500) + (result.text.length > 500 ? '...' : ''));
} catch (error) {
console.error(`Error: ${error.message}`);
process.exit(1);
}
});
// Download + Transcribe command
program
.command('process <url>')
.alias('p')
.description('Download and transcribe a YouTube video or playlist')
.option('-o, --output <dir>', 'Output directory', './output')
.option('-l, --language <lang>', 'Language code for transcription')
.option('-f, --format <format>', 'Transcription format (txt, srt, vtt)', 'txt')
.option('-m, --model <model>', 'Transcription model (gpt-4o-transcribe, gpt-4o-mini-transcribe, whisper-1)', 'gpt-4o-transcribe')
.action(async (url, options) => {
try {
if (!process.env.OPENAI_API_KEY) {
console.error('Error: OPENAI_API_KEY not set in environment');
process.exit(1);
}
// Step 1: Download
console.log('Step 1: Downloading...');
const downloadResult = await download(url, { outputDir: options.output });
console.log(`Downloaded: ${downloadResult.successCount}/${downloadResult.totalVideos} videos\n`);
// Step 2: Transcribe
console.log(`Step 2: Transcribing with ${options.model}...`);
const successfulDownloads = downloadResult.videos.filter(v => v.success);
const filePaths = successfulDownloads.map(v => v.filePath);
const transcribeResult = await transcribeMultiple(filePaths, {
language: options.language,
responseFormat: options.format === 'txt' ? 'text' : options.format,
outputFormat: options.format,
model: options.model,
});
console.log('\n--- Process Complete ---');
if (downloadResult.playlistTitle) {
console.log(`Playlist: ${downloadResult.playlistTitle}`);
}
console.log(`Downloaded: ${downloadResult.successCount}/${downloadResult.totalVideos}`);
console.log(`Transcribed: ${transcribeResult.successCount}/${transcribeResult.totalFiles}`);
transcribeResult.results.forEach(r => {
if (r.success) {
console.log(`  [ok] ${path.basename(r.transcriptionPath)}`);
} else {
console.log(`  [failed] ${path.basename(r.filePath)} - ${r.error}`);
}
});
} catch (error) {
console.error(`Error: ${error.message}`);
process.exit(1);
}
});
// Info command
program
.command('info <url>')
.description('Get info about a YouTube video or playlist')
.action(async (url) => {
try {
const info = await getInfo(url);
console.log('\n--- Video/Playlist Info ---');
console.log(`Title: ${info.title}`);
console.log(`Type: ${info._type || 'video'}`);
if (info._type === 'playlist') {
console.log(`Videos: ${info.entries?.length || 0}`);
if (info.entries) {
info.entries.slice(0, 10).forEach((e, i) => {
console.log(` ${i + 1}. ${e.title}`);
});
if (info.entries.length > 10) {
console.log(` ... and ${info.entries.length - 10} more`);
}
}
} else {
console.log(`Duration: ${Math.floor(info.duration / 60)}:${String(info.duration % 60).padStart(2, '0')}`);
console.log(`Channel: ${info.channel}`);
}
} catch (error) {
console.error(`Error: ${error.message}`);
process.exit(1);
}
});
program.parse();
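The `info` command above prints the duration as minutes and seconds only, and assumes `info.duration` is a whole number. A minimal sketch of a formatter that also covers durations of an hour or more, fractional seconds, and a missing value; `formatDuration` is a hypothetical helper and the `'unknown'` fallback is an assumption, not the project's behavior:

```javascript
// Hypothetical helper, not part of the project's CLI.
function formatDuration(totalSeconds) {
  const seconds = Math.floor(Number(totalSeconds));
  // The metadata may lack a usable duration; report that instead of "NaN:NaN".
  if (!Number.isFinite(seconds) || seconds < 0) {
    return 'unknown';
  }
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = String(seconds % 60).padStart(2, '0');
  // Only show the hours field when it is needed.
  return h > 0
    ? `${h}:${String(m).padStart(2, '0')}:${s}`
    : `${m}:${s}`;
}

console.log(formatDuration(125));
console.log(formatDuration(3725));
```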

746
src/server.js Normal file

@ -0,0 +1,746 @@
import express from 'express';
import cors from 'cors';
import dotenv from 'dotenv';
import path from 'path';
import fs from 'fs';
import multer from 'multer';
import { download, getInfo } from './services/youtube.js';
import { transcribeFile, transcribeAndSave, transcribeMultiple } from './services/transcription.js';
import { translateText, translateFile, translateMultiple, getLanguages } from './services/translation.js';
dotenv.config();
const app = express();
const PORT = process.env.PORT || 3000;
const OUTPUT_DIR = process.env.OUTPUT_DIR || './output';
// Ensure output directory exists
if (!fs.existsSync(OUTPUT_DIR)) {
fs.mkdirSync(OUTPUT_DIR, { recursive: true });
}
// Configure multer for file uploads
const storage = multer.diskStorage({
destination: (req, file, cb) => {
cb(null, OUTPUT_DIR);
},
filename: (req, file, cb) => {
// Keep original filename but sanitize it
const safeName = file.originalname.replace(/[^a-zA-Z0-9._-]/g, '_');
cb(null, safeName);
}
});
const upload = multer({
storage,
fileFilter: (req, file, cb) => {
const allowedTypes = ['audio/mpeg', 'audio/mp3', 'audio/wav', 'audio/m4a', 'audio/ogg', 'audio/flac', 'audio/x-m4a'];
if (allowedTypes.includes(file.mimetype) || file.originalname.match(/\.(mp3|wav|m4a|ogg|flac)$/i)) {
cb(null, true);
} else {
cb(new Error('Invalid file type. Only audio files are allowed.'));
}
}
});
// Upload handler for text files (for translation)
const uploadText = multer({
storage,
fileFilter: (req, file, cb) => {
if (file.mimetype === 'text/plain' || file.originalname.endsWith('.txt')) {
cb(null, true);
} else {
cb(new Error('Invalid file type. Only text files (.txt) are allowed.'));
}
}
});
app.use(cors());
app.use(express.json());
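// Serve static files (HTML interface)
// fileURLToPath, unlike URL.pathname, decodes percent-escapes (e.g. spaces)
// and yields a valid path on Windows. Import declarations are hoisted, so
// this one is valid here even though it follows other statements.
import { fileURLToPath } from 'url';
const __dirname = path.dirname(fileURLToPath(import.meta.url));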
app.use(express.static(path.join(__dirname, '../public')));
// Serve downloaded files
app.use('/files', express.static(OUTPUT_DIR));
// API info endpoint
app.get('/api', (req, res) => {
res.json({
name: 'Video to MP3 Transcriptor API',
version: '1.0.0',
endpoints: {
'GET /health': 'Health check',
'GET /info?url=': 'Get video/playlist info',
'POST /download': 'Download as MP3',
'GET /download-stream?url=': 'Download as MP3 with SSE progress updates',
'POST /transcribe': 'Transcribe audio file',
'POST /process': 'Download + transcribe',
'GET /files-list': 'List downloaded files',
'GET /files/<name>': 'Serve downloaded files',
},
});
});
// Health check
app.get('/health', (req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
/**
* GET /info?url=<youtube_url>
* Get info about a video or playlist
*/
app.get('/info', async (req, res) => {
try {
const { url } = req.query;
if (!url) {
return res.status(400).json({ error: 'URL parameter required' });
}
// Check if URL contains playlist parameter
const hasPlaylist = url.includes('list=');
const info = await getInfo(url, hasPlaylist);
res.json({
success: true,
title: info.title,
type: info._type || 'video',
duration: info.duration,
channel: info.channel,
entries: info._type === 'playlist'
? info.entries?.map(e => ({ id: e.id, title: e.title }))
: null,
videoCount: info._type === 'playlist' ? info.entries?.length : 1,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /download-stream
* Download with SSE progress updates
* Query: url (required)
*/
app.get('/download-stream', async (req, res) => {
const { url } = req.query;
if (!url) {
return res.status(400).json({ error: 'URL parameter required' });
}
// Set up SSE
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
res.setHeader('Access-Control-Allow-Origin', '*');
const sendEvent = (event, data) => {
res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
};
// Track timing for estimation
const startTime = Date.now();
let videosCompleted = 0;
let totalVideos = 1;
const videoTimes = [];
try {
// First, get info to know total videos
sendEvent('status', { message: 'Fetching video info...', phase: 'info' });
const hasPlaylist = url.includes('list=');
const info = await getInfo(url, hasPlaylist);
totalVideos = info._type === 'playlist' ? (info.entries?.length || 1) : 1;
sendEvent('info', {
title: info.title,
type: info._type || 'video',
totalVideos,
playlistTitle: info._type === 'playlist' ? info.title : null,
});
console.log(`Downloading: ${url}`);
let videoStartTime = Date.now();
const result = await download(url, {
outputDir: OUTPUT_DIR,
onDownloadProgress: (progress) => {
// Calculate overall progress
const videoProgress = progress.percent || 0;
const overallPercent = ((videosCompleted + (videoProgress / 100)) / totalVideos) * 100;
// Estimate remaining time
let estimatedRemaining = null;
if (videosCompleted > 0 && videoTimes.length > 0) {
const avgTimePerVideo = videoTimes.reduce((a, b) => a + b, 0) / videoTimes.length;
const remainingVideos = totalVideos - videosCompleted - (videoProgress / 100);
estimatedRemaining = Math.round(avgTimePerVideo * remainingVideos / 1000);
} else if (progress.eta) {
// Parse ETA from yt-dlp (format: MM:SS)
const [mins, secs] = progress.eta.split(':').map(Number);
const currentVideoRemaining = mins * 60 + secs;
const remainingVideos = totalVideos - videosCompleted - 1;
// Estimate based on current video
if (videoProgress > 10) {
const elapsed = (Date.now() - videoStartTime) / 1000;
const estimatedVideoTime = (elapsed / videoProgress) * 100;
estimatedRemaining = Math.round(currentVideoRemaining + (remainingVideos * estimatedVideoTime));
}
}
sendEvent('progress', {
percent: Math.round(overallPercent * 10) / 10,
videoPercent: Math.round(videoProgress * 10) / 10,
currentVideo: progress.videoIndex || 1,
totalVideos: progress.totalVideos || totalVideos,
title: progress.title,
speed: progress.speed,
eta: progress.eta,
estimatedRemaining,
phase: 'downloading',
});
},
onVideoComplete: (video) => {
const videoTime = Date.now() - videoStartTime;
videoTimes.push(videoTime);
videosCompleted++;
videoStartTime = Date.now();
sendEvent('video-complete', {
title: video.title,
success: video.success,
videosCompleted,
totalVideos,
});
},
});
// Send final result
sendEvent('complete', {
success: true,
playlistTitle: result.playlistTitle,
totalVideos: result.totalVideos,
successCount: result.successCount,
failCount: result.failCount,
totalTime: Math.round((Date.now() - startTime) / 1000),
videos: result.videos.map(v => ({
success: v.success,
title: v.title,
filePath: v.filePath,
fileUrl: v.filePath ? `/files/${path.basename(v.filePath)}` : null,
error: v.error,
})),
});
} catch (error) {
sendEvent('error', { message: error.message });
} finally {
res.end();
}
});
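// A minimal sketch of the SSE framing that sendEvent writes, together with a matching
// parser a client could use on one received frame. formatSseEvent and parseSseFrame are
// hypothetical helper names that do not exist in this repository, and the parser assumes
// each frame carries a single JSON payload, as the handlers here appear to send.

```javascript
// Build one SSE frame: an "event:" field, a "data:" field, then a blank line.
function formatSseEvent(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Parse a single frame back into { event, data }.
function parseSseFrame(frame) {
  let event = 'message'; // default event name when no "event:" field is present
  const dataLines = [];
  for (const line of frame.split('\n')) {
    if (line.startsWith('event:')) {
      event = line.slice('event:'.length).trim();
    } else if (line.startsWith('data:')) {
      dataLines.push(line.slice('data:'.length).trimStart());
    }
  }
  return { event, data: JSON.parse(dataLines.join('\n')) };
}

const frame = formatSseEvent('progress', { percent: 42.5, phase: 'downloading' });
const parsed = parseSseFrame(frame);
console.log(parsed.event, parsed.data.percent);
```

// In a browser, EventSource does this parsing itself; a listener registered with
// addEventListener('progress', ...) receives the JSON string in event.data.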
/**
* POST /download
* Download a video or playlist as MP3 (non-streaming version)
* Body: { url: string, outputDir?: string }
*/
app.post('/download', async (req, res) => {
try {
const { url, outputDir = OUTPUT_DIR } = req.body;
if (!url) {
return res.status(400).json({ error: 'URL required in request body' });
}
console.log(`Downloading: ${url}`);
const result = await download(url, { outputDir });
res.json({
success: true,
playlistTitle: result.playlistTitle,
totalVideos: result.totalVideos,
successCount: result.successCount,
failCount: result.failCount,
videos: result.videos.map(v => ({
success: v.success,
title: v.title,
filePath: v.filePath,
fileUrl: v.filePath ? `/files/${path.basename(v.filePath)}` : null,
error: v.error,
})),
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* POST /transcribe
* Transcribe an existing audio file
* Body: { filePath: string, language?: string, format?: string, model?: string }
*/
app.post('/transcribe', async (req, res) => {
try {
const { filePath, language, format = 'txt', model = 'gpt-4o-transcribe' } = req.body;
if (!filePath) {
return res.status(400).json({ error: 'filePath required in request body' });
}
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
console.log(`Transcribing: ${filePath} with model ${model}`);
const result = await transcribeAndSave(filePath, {
language,
responseFormat: format === 'txt' ? 'text' : format,
outputFormat: format,
model,
});
res.json({
success: true,
filePath: result.filePath,
transcriptionPath: result.transcriptionPath,
transcriptionUrl: `/files/${path.basename(result.transcriptionPath)}`,
text: result.text,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* POST /upload-transcribe
* Upload audio files and transcribe them
*/
app.post('/upload-transcribe', upload.array('files', 50), async (req, res) => {
try {
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
if (!req.files || req.files.length === 0) {
return res.status(400).json({ error: 'No files uploaded' });
}
const { language, model = 'gpt-4o-transcribe' } = req.body;
const results = [];
console.log(`Transcribing ${req.files.length} uploaded files with model ${model}`);
for (let i = 0; i < req.files.length; i++) {
const file = req.files[i];
console.log(`[${i + 1}/${req.files.length}] Transcribing: ${file.originalname}`);
try {
const result = await transcribeAndSave(file.path, {
language: language || undefined,
responseFormat: 'text',
outputFormat: 'txt',
model,
});
results.push({
success: true,
fileName: file.originalname,
filePath: file.path,
transcriptionPath: result.transcriptionPath,
transcriptionUrl: `/files/${path.basename(result.transcriptionPath)}`,
text: result.text,
});
} catch (error) {
console.error(`Failed to transcribe ${file.originalname}: ${error.message}`);
results.push({
success: false,
fileName: file.originalname,
error: error.message,
});
}
}
res.json({
success: true,
totalFiles: req.files.length,
successCount: results.filter(r => r.success).length,
failCount: results.filter(r => !r.success).length,
results,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /process-stream
* Download and transcribe with SSE progress updates
* Query: url, language?, model?
*/
app.get('/process-stream', async (req, res) => {
const { url, language, model = 'gpt-4o-transcribe' } = req.query;
if (!url) {
return res.status(400).json({ error: 'URL parameter required' });
}
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
// Set up SSE
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
res.setHeader('Access-Control-Allow-Origin', '*');
const sendEvent = (event, data) => {
res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
};
const startTime = Date.now();
let videosDownloaded = 0;
let videosTranscribed = 0;
let totalVideos = 1;
const videoTimes = [];
try {
// Phase 1: Get info
sendEvent('status', { message: 'Fetching video info...', phase: 'info' });
const hasPlaylist = url.includes('list=');
const info = await getInfo(url, hasPlaylist);
totalVideos = info._type === 'playlist' ? (info.entries?.length || 1) : 1;
sendEvent('info', {
title: info.title,
type: info._type || 'video',
totalVideos,
playlistTitle: info._type === 'playlist' ? info.title : null,
});
// Phase 2: Download
console.log(`Processing: ${url}`);
let videoStartTime = Date.now();
const downloadResult = await download(url, {
outputDir: OUTPUT_DIR,
onDownloadProgress: (progress) => {
const videoProgress = progress.percent || 0;
// Download is 50% of total, transcribe is other 50%
const overallPercent = ((videosDownloaded + (videoProgress / 100)) / totalVideos) * 50;
sendEvent('progress', {
percent: Math.round(overallPercent * 10) / 10,
videoPercent: Math.round(videoProgress * 10) / 10,
currentVideo: progress.videoIndex || 1,
totalVideos: progress.totalVideos || totalVideos,
title: progress.title,
speed: progress.speed,
eta: progress.eta,
phase: 'downloading',
phaseLabel: 'Downloading',
});
},
onVideoComplete: (video) => {
const videoTime = Date.now() - videoStartTime;
videoTimes.push(videoTime);
videosDownloaded++;
videoStartTime = Date.now();
sendEvent('video-complete', {
title: video.title,
success: video.success,
phase: 'downloading',
videosCompleted: videosDownloaded,
totalVideos,
});
},
});
// Phase 3: Transcribe
sendEvent('status', { message: 'Starting transcription...', phase: 'transcribing' });
const successfulDownloads = downloadResult.videos.filter(v => v.success);
const filePaths = successfulDownloads.map(v => v.filePath);
const transcribeResults = [];
for (let i = 0; i < filePaths.length; i++) {
const filePath = filePaths[i];
const video = successfulDownloads[i];
sendEvent('progress', {
percent: Math.round((50 + (i / filePaths.length) * 50) * 10) / 10,
currentVideo: i + 1,
totalVideos: filePaths.length,
title: video.title,
phase: 'transcribing',
phaseLabel: 'Transcribing',
});
try {
const result = await transcribeAndSave(filePath, {
language: language || undefined,
responseFormat: 'text',
outputFormat: 'txt',
model,
});
transcribeResults.push(result);
videosTranscribed++;
sendEvent('transcribe-complete', {
title: video.title,
success: true,
videosCompleted: videosTranscribed,
totalFiles: filePaths.length,
});
} catch (error) {
transcribeResults.push({
success: false,
filePath,
error: error.message,
});
sendEvent('transcribe-complete', {
title: video.title,
success: false,
error: error.message,
videosCompleted: videosTranscribed,
totalFiles: filePaths.length,
});
}
}
// Combine results
const combinedResults = downloadResult.videos.map(v => {
const transcription = transcribeResults.find(t => t.filePath === v.filePath);
return {
title: v.title,
downloadSuccess: v.success,
audioUrl: v.filePath ? `/files/${path.basename(v.filePath)}` : null,
transcriptionSuccess: transcription?.success || false,
transcriptionUrl: transcription?.transcriptionPath
? `/files/${path.basename(transcription.transcriptionPath)}`
: null,
text: transcription?.text,
error: v.error || transcription?.error,
};
});
sendEvent('complete', {
success: true,
playlistTitle: downloadResult.playlistTitle,
totalVideos: downloadResult.totalVideos,
downloadedCount: downloadResult.successCount,
transcribedCount: videosTranscribed,
totalTime: Math.round((Date.now() - startTime) / 1000),
results: combinedResults,
});
} catch (error) {
sendEvent('error', { message: error.message });
} finally {
res.end();
}
});
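// The progress arithmetic in the two streaming handlers can be sketched as two small pure
// functions. downloadPercent and phasedPercent are hypothetical names introduced only for
// illustration; the 50/50 split between downloading and transcribing mirrors the comment in
// the handler and is a presentation choice, not a measured ratio.

```javascript
// Share of the whole download done, as a percentage rounded to one decimal.
// videoPercent is the 0-100 progress of the video currently downloading.
function downloadPercent(videosCompleted, videoPercent, totalVideos) {
  const raw = ((videosCompleted + videoPercent / 100) / totalVideos) * 100;
  return Math.round(raw * 10) / 10;
}

// Map a phase-local percentage into the 0-50 band (downloading)
// or the 50-100 band (transcribing).
function phasedPercent(phase, phasePercent) {
  const offset = phase === 'transcribing' ? 50 : 0;
  return Math.round((offset + phasePercent / 2) * 10) / 10;
}

console.log(downloadPercent(1, 50, 4));
console.log(phasedPercent('downloading', 40));
console.log(phasedPercent('transcribing', 50));
```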
/**
* POST /process
* Download and transcribe a video or playlist (non-streaming)
* Body: { url: string, language?: string, format?: string, outputDir?: string, model?: string }
*/
app.post('/process', async (req, res) => {
try {
const { url, language, format = 'txt', outputDir = OUTPUT_DIR, model = 'gpt-4o-transcribe' } = req.body;
if (!url) {
return res.status(400).json({ error: 'URL required in request body' });
}
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
// Step 1: Download
console.log(`Step 1: Downloading ${url}`);
const downloadResult = await download(url, { outputDir });
// Step 2: Transcribe
console.log(`Step 2: Transcribing with model ${model}...`);
const successfulDownloads = downloadResult.videos.filter(v => v.success);
const filePaths = successfulDownloads.map(v => v.filePath);
const transcribeResult = await transcribeMultiple(filePaths, {
language,
responseFormat: format === 'txt' ? 'text' : format,
outputFormat: format,
model,
});
// Combine results
const combinedResults = downloadResult.videos.map(v => {
const transcription = transcribeResult.results.find(
t => t.filePath === v.filePath
);
return {
title: v.title,
downloadSuccess: v.success,
audioPath: v.filePath,
audioUrl: v.filePath ? `/files/${path.basename(v.filePath)}` : null,
transcriptionSuccess: transcription?.success || false,
transcriptionPath: transcription?.transcriptionPath,
transcriptionUrl: transcription?.transcriptionPath
? `/files/${path.basename(transcription.transcriptionPath)}`
: null,
text: transcription?.text,
error: v.error || transcription?.error,
};
});
res.json({
success: true,
playlistTitle: downloadResult.playlistTitle,
totalVideos: downloadResult.totalVideos,
downloadedCount: downloadResult.successCount,
transcribedCount: transcribeResult.successCount,
results: combinedResults,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /files-list
* List all downloaded files
*/
app.get('/files-list', (req, res) => {
try {
if (!fs.existsSync(OUTPUT_DIR)) {
return res.json({ files: [] });
}
const files = fs.readdirSync(OUTPUT_DIR).map(file => ({
name: file,
url: `/files/${file}`,
path: path.join(OUTPUT_DIR, file),
}));
res.json({ files });
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /languages
* Get available translation languages
*/
app.get('/languages', (req, res) => {
res.json({ languages: getLanguages() });
});
/**
* POST /translate
* Translate text
* Body: { text: string, targetLang: string, sourceLang?: string }
*/
app.post('/translate', async (req, res) => {
try {
const { text, targetLang, sourceLang } = req.body;
if (!text) {
return res.status(400).json({ error: 'text required in request body' });
}
if (!targetLang) {
return res.status(400).json({ error: 'targetLang required in request body' });
}
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
console.log(`Translating text to ${targetLang}`);
const result = await translateText(text, targetLang, sourceLang);
res.json({
success: true,
...result,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
/**
* POST /translate-file
* Translate uploaded text files
*/
app.post('/translate-file', uploadText.array('files', 50), async (req, res) => {
try {
if (!process.env.OPENAI_API_KEY) {
return res.status(500).json({ error: 'OPENAI_API_KEY not configured' });
}
if (!req.files || req.files.length === 0) {
return res.status(400).json({ error: 'No files uploaded' });
}
const { targetLang, sourceLang } = req.body;
if (!targetLang) {
return res.status(400).json({ error: 'targetLang required' });
}
const results = [];
console.log(`Translating ${req.files.length} files to ${targetLang}`);
for (let i = 0; i < req.files.length; i++) {
const file = req.files[i];
console.log(`[${i + 1}/${req.files.length}] Translating: ${file.originalname}`);
try {
const result = await translateFile(file.path, targetLang, sourceLang || null);
results.push({
success: true,
fileName: file.originalname,
translationPath: result.translationPath,
translationUrl: `/files/${path.basename(result.translationPath)}`,
translatedText: result.translatedText,
});
} catch (error) {
console.error(`Failed to translate ${file.originalname}: ${error.message}`);
results.push({
success: false,
fileName: file.originalname,
error: error.message,
});
}
}
res.json({
success: true,
totalFiles: req.files.length,
successCount: results.filter(r => r.success).length,
failCount: results.filter(r => !r.success).length,
results,
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
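app.listen(PORT, () => {
console.log(`Server running on http://localhost:${PORT}`);
console.log('\nEndpoints:');
console.log(' GET /health - Health check');
console.log(' GET /info?url= - Get video/playlist info');
console.log(' GET /download-stream - Download as MP3 with SSE progress');
console.log(' POST /download - Download as MP3');
console.log(' POST /transcribe - Transcribe audio file');
console.log(' POST /upload-transcribe - Upload and transcribe audio files');
console.log(' GET /process-stream - Download + transcribe with SSE progress');
console.log(' POST /process - Download + transcribe');
console.log(' GET /files-list - List downloaded files');
console.log(' GET /files/<name> - Serve downloaded files');
console.log(' GET /languages - List translation languages');
console.log(' POST /translate - Translate text');
console.log(' POST /translate-file - Translate uploaded text files');
});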


@ -0,0 +1,178 @@
import OpenAI from 'openai';
import fs from 'fs';
import path from 'path';
let openai = null;
// Available transcription models
const MODELS = {
'gpt-4o-transcribe': {
name: 'gpt-4o-transcribe',
formats: ['json', 'text'],
supportsLanguage: true,
},
'gpt-4o-mini-transcribe': {
name: 'gpt-4o-mini-transcribe',
formats: ['json', 'text'],
supportsLanguage: true,
},
'whisper-1': {
name: 'whisper-1',
formats: ['json', 'text', 'srt', 'vtt', 'verbose_json'],
supportsLanguage: true,
},
};
const DEFAULT_MODEL = 'gpt-4o-transcribe';
/**
* Get OpenAI client (lazy initialization)
*/
function getOpenAI() {
if (!openai) {
if (!process.env.OPENAI_API_KEY) {
throw new Error('OPENAI_API_KEY environment variable is not set');
}
openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
}
return openai;
}
/**
* Get available models
*/
export function getAvailableModels() {
return Object.keys(MODELS);
}
/**
* Transcribe an audio file using OpenAI API
* @param {string} filePath - Path to audio file
* @param {Object} options - Transcription options
* @param {string} options.language - Language code (e.g., 'en', 'fr', 'es', 'zh')
* @param {string} options.responseFormat - Output format: 'json' or 'text' (all models); 'srt', 'vtt' and 'verbose_json' are accepted for whisper-1 only
* @param {string} options.prompt - Optional context prompt for better accuracy
* @param {string} options.model - Model to use (default: gpt-4o-transcribe)
*/
export async function transcribeFile(filePath, options = {}) {
const {
language = null, // Auto-detect if null
responseFormat = 'text', // json or text for gpt-4o models
prompt = null, // Optional context prompt
model = DEFAULT_MODEL,
} = options;
if (!fs.existsSync(filePath)) {
throw new Error(`File not found: ${filePath}`);
}
const modelConfig = MODELS[model] || MODELS[DEFAULT_MODEL];
const actualModel = modelConfig.name;
// Validate response format for model
let actualFormat = responseFormat;
if (!modelConfig.formats.includes(responseFormat)) {
console.warn(`Format '${responseFormat}' not supported by ${actualModel}, using 'text'`);
actualFormat = 'text';
}
try {
const transcriptionOptions = {
file: fs.createReadStream(filePath),
model: actualModel,
response_format: actualFormat,
};
if (language) {
transcriptionOptions.language = language;
}
if (prompt) {
transcriptionOptions.prompt = prompt;
}
console.log(`Using model: ${actualModel}, format: ${actualFormat}${language ? `, language: ${language}` : ''}`);
const transcription = await getOpenAI().audio.transcriptions.create(transcriptionOptions);
return {
success: true,
filePath,
text: actualFormat === 'json' || actualFormat === 'verbose_json'
? transcription.text
: transcription,
format: actualFormat,
model: actualModel,
};
} catch (error) {
throw new Error(`Transcription failed: ${error.message}`);
}
}
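// A small sketch of the model and format fallback that transcribeFile applies before calling
// the API. FORMATS_BY_MODEL is a hypothetical copy of the MODELS table in this file, and
// resolveRequest is a hypothetical helper name; neither exists in this repository.

```javascript
// Formats each model is configured to accept (copied from the MODELS table for illustration).
const FORMATS_BY_MODEL = {
  'gpt-4o-transcribe': ['json', 'text'],
  'gpt-4o-mini-transcribe': ['json', 'text'],
  'whisper-1': ['json', 'text', 'srt', 'vtt', 'verbose_json'],
};
const FALLBACK_MODEL = 'gpt-4o-transcribe';

// Unknown models fall back to the default model; unsupported formats fall back to 'text'.
function resolveRequest(model, responseFormat) {
  const actualModel = Object.hasOwn(FORMATS_BY_MODEL, model) ? model : FALLBACK_MODEL;
  const allowed = FORMATS_BY_MODEL[actualModel];
  const actualFormat = allowed.includes(responseFormat) ? responseFormat : 'text';
  return { model: actualModel, format: actualFormat };
}

console.log(resolveRequest('gpt-4o-transcribe', 'srt'));
console.log(resolveRequest('whisper-1', 'srt'));
```

// Because the fallback is silent apart from a console warning, a caller that asks a gpt-4o
// model for 'srt' receives plain text; whisper-1 is the model to request for subtitle formats.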
/**
* Transcribe a file and save the text as <basename>.<outputFormat>,
* next to the audio file unless options.outputDir is set
*/
export async function transcribeAndSave(filePath, options = {}) {
const { outputFormat = 'txt', outputDir = null } = options;
const result = await transcribeFile(filePath, options);
// Determine output path
const baseName = path.basename(filePath, path.extname(filePath));
const outputPath = path.join(
outputDir || path.dirname(filePath),
`${baseName}.${outputFormat}`
);
// Save transcription
fs.writeFileSync(outputPath, result.text, 'utf-8');
return {
...result,
transcriptionPath: outputPath,
};
}
/**
* Transcribe multiple files
*/
export async function transcribeMultiple(filePaths, options = {}) {
const { onProgress, onFileComplete } = options;
const results = [];
for (let i = 0; i < filePaths.length; i++) {
const filePath = filePaths[i];
if (onProgress) {
onProgress({ current: i + 1, total: filePaths.length, filePath });
}
console.log(`[${i + 1}/${filePaths.length}] Transcribing: ${path.basename(filePath)}`);
try {
const result = await transcribeAndSave(filePath, options);
results.push(result);
if (onFileComplete) {
onFileComplete(result);
}
} catch (error) {
console.error(`Failed to transcribe ${filePath}: ${error.message}`);
results.push({
success: false,
filePath,
error: error.message,
});
}
}
return {
success: true,
results,
totalFiles: filePaths.length,
successCount: results.filter(r => r.success).length,
failCount: results.filter(r => !r.success).length,
};
}

270
src/services/translation.js Normal file

@ -0,0 +1,270 @@
import OpenAI from 'openai';
import fs from 'fs';
import path from 'path';
let openai = null;
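// Max characters per chunk. At a rough 4 characters per token for English text this is about
// 5000 tokens; languages such as Chinese or Japanese typically use more tokens per character.
const MAX_CHUNK_CHARS = 20000;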
const LANGUAGES = {
en: 'English',
fr: 'French',
es: 'Spanish',
de: 'German',
it: 'Italian',
pt: 'Portuguese',
zh: 'Chinese',
ja: 'Japanese',
ko: 'Korean',
ru: 'Russian',
ar: 'Arabic',
hi: 'Hindi',
nl: 'Dutch',
pl: 'Polish',
tr: 'Turkish',
vi: 'Vietnamese',
th: 'Thai',
sv: 'Swedish',
da: 'Danish',
fi: 'Finnish',
no: 'Norwegian',
cs: 'Czech',
el: 'Greek',
he: 'Hebrew',
id: 'Indonesian',
ms: 'Malay',
ro: 'Romanian',
uk: 'Ukrainian',
};
// Sentence-ending punctuation (ASCII, CJK full-width and half-width forms) and newlines
const SENTENCE_ENDINGS = /[.!?。!?。\n]/g;
/**
* Get OpenAI client (lazy initialization)
*/
function getOpenAI() {
if (!openai) {
if (!process.env.OPENAI_API_KEY) {
throw new Error('OPENAI_API_KEY environment variable is not set');
}
openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
}
return openai;
}
/**
* Split text into chunks at sentence boundaries
* @param {string} text - Text to split
* @param {number} maxChars - Maximum characters per chunk
* @returns {string[]} Array of text chunks
*/
function splitIntoChunks(text, maxChars = MAX_CHUNK_CHARS) {
if (text.length <= maxChars) {
return [text];
}
const chunks = [];
let currentPos = 0;
while (currentPos < text.length) {
let endPos = currentPos + maxChars;
// If we're at the end, just take the rest
if (endPos >= text.length) {
chunks.push(text.slice(currentPos));
break;
}
// Find the last sentence ending before maxChars
const searchText = text.slice(currentPos, endPos);
let lastSentenceEnd = -1;
// Find all sentence endings in the search range
let match;
SENTENCE_ENDINGS.lastIndex = 0;
while ((match = SENTENCE_ENDINGS.exec(searchText)) !== null) {
lastSentenceEnd = match.index + 1; // Include the punctuation
}
// If the last sentence ending falls in the second half of the window, cut there
// Otherwise, look for the next sentence ending after maxChars (up to 20% more)
if (lastSentenceEnd > maxChars * 0.5) {
endPos = currentPos + lastSentenceEnd;
} else {
// Look forward for a sentence ending (up to 20% more characters)
const extendedSearch = text.slice(endPos, endPos + maxChars * 0.2);
SENTENCE_ENDINGS.lastIndex = 0;
const forwardMatch = SENTENCE_ENDINGS.exec(extendedSearch);
if (forwardMatch) {
endPos = endPos + forwardMatch.index + 1;
}
// If still no sentence ending found, just cut at maxChars
}
chunks.push(text.slice(currentPos, endPos).trim());
currentPos = endPos;
// Skip any leading whitespace for the next chunk
while (currentPos < text.length && /\s/.test(text[currentPos])) {
currentPos++;
}
}
return chunks.filter(chunk => chunk.length > 0);
}
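// A deliberately simplified sketch of sentence-boundary chunking, to show the basic idea
// behind splitIntoChunks. chunkAtSentences is a hypothetical name; unlike the function above
// it only looks backwards for ASCII terminators and newlines, and it omits the 50% threshold
// and the 20% forward search.

```javascript
function chunkAtSentences(text, maxChars) {
  const chunks = [];
  let pos = 0;
  while (pos < text.length) {
    let end = Math.min(pos + maxChars, text.length);
    if (end < text.length) {
      const span = text.slice(pos, end);
      // Index just past the last sentence terminator inside the span (0 if none).
      const cut = Math.max(
        span.lastIndexOf('.'),
        span.lastIndexOf('!'),
        span.lastIndexOf('?'),
        span.lastIndexOf('\n')
      ) + 1;
      if (cut > 0) end = pos + cut; // otherwise fall back to a hard cut at maxChars
    }
    const piece = text.slice(pos, end).trim();
    if (piece) chunks.push(piece);
    pos = end;
  }
  return chunks;
}

const sample = 'One. Two. Three. Four.';
const pieces = chunkAtSentences(sample, 10);
console.log(pieces);
```

// Every chunk ends on a sentence terminator whenever one exists inside the window, so each
// translation request receives whole sentences rather than fragments.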
/**
* Get available languages
*/
export function getLanguages() {
return LANGUAGES;
}
/**
* Translate a single chunk of text
*/
async function translateChunk(text, targetLanguage, sourceLanguage) {
const prompt = sourceLanguage
? `Translate the following text from ${sourceLanguage} to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`
: `Translate the following text to ${targetLanguage}. Only output the translation, nothing else:\n\n${text}`;
const response = await getOpenAI().chat.completions.create({
model: 'gpt-4o-mini',
max_tokens: 16384,
messages: [
{
role: 'user',
content: prompt,
},
],
});
return response.choices[0].message.content;
}
/**
* Translate text using GPT-4o-mini with chunking for long texts
* @param {string} text - Text to translate
* @param {string} targetLang - Target language code (e.g., 'en', 'fr')
* @param {string} sourceLang - Source language code (optional, auto-detect if null)
*/
export async function translateText(text, targetLang, sourceLang = null) {
if (!text || !text.trim()) {
throw new Error('No text provided for translation');
}
const targetLanguage = LANGUAGES[targetLang] || targetLang;
const sourceLanguage = sourceLang ? (LANGUAGES[sourceLang] || sourceLang) : null;
try {
// Split text into chunks
const chunks = splitIntoChunks(text);
if (chunks.length === 1) {
// Single chunk - translate directly
const translation = await translateChunk(text, targetLanguage, sourceLanguage);
return {
success: true,
originalText: text,
translatedText: translation,
targetLanguage: targetLanguage,
sourceLanguage: sourceLanguage || 'auto-detected',
chunks: 1,
};
}
// Multiple chunks - translate each and combine
console.log(`Splitting text into ${chunks.length} chunks for translation...`);
const translations = [];
for (let i = 0; i < chunks.length; i++) {
console.log(` Translating chunk ${i + 1}/${chunks.length} (${chunks[i].length} chars)...`);
const translation = await translateChunk(chunks[i], targetLanguage, sourceLanguage);
translations.push(translation);
}
const combinedTranslation = translations.join('\n\n');
return {
success: true,
originalText: text,
translatedText: combinedTranslation,
targetLanguage: targetLanguage,
sourceLanguage: sourceLanguage || 'auto-detected',
chunks: chunks.length,
};
} catch (error) {
throw new Error(`Translation failed: ${error.message}`);
}
}
/**
* Translate a text file
* @param {string} filePath - Path to text file
* @param {string} targetLang - Target language code
* @param {string} sourceLang - Source language code (optional)
*/
export async function translateFile(filePath, targetLang, sourceLang = null) {
if (!fs.existsSync(filePath)) {
throw new Error(`File not found: ${filePath}`);
}
const text = fs.readFileSync(filePath, 'utf-8');
const result = await translateText(text, targetLang, sourceLang);
// Save translation
const baseName = path.basename(filePath, path.extname(filePath));
const outputPath = path.join(
path.dirname(filePath),
`${baseName}_${targetLang}.txt`
);
fs.writeFileSync(outputPath, result.translatedText, 'utf-8');
return {
...result,
originalPath: filePath,
translationPath: outputPath,
};
}
/**
* Translate multiple files
*/
export async function translateMultiple(filePaths, targetLang, sourceLang = null, onProgress = null) {
const results = [];
for (let i = 0; i < filePaths.length; i++) {
const filePath = filePaths[i];
if (onProgress) {
onProgress({ current: i + 1, total: filePaths.length, filePath });
}
console.log(`[${i + 1}/${filePaths.length}] Translating: ${path.basename(filePath)}`);
try {
const result = await translateFile(filePath, targetLang, sourceLang);
results.push(result);
} catch (error) {
console.error(`Failed to translate ${filePath}: ${error.message}`);
results.push({
success: false,
originalPath: filePath,
error: error.message,
});
}
}
return {
success: true,
results,
totalFiles: filePaths.length,
successCount: results.filter(r => r.success).length,
failCount: results.filter(r => !r.success).length,
};
}

239
src/services/youtube.js Normal file

@ -0,0 +1,239 @@
import youtubedl from 'youtube-dl-exec';
import path from 'path';
import fs from 'fs';
const OUTPUT_DIR = process.env.OUTPUT_DIR || './output';
/**
* Sanitize filename to remove invalid characters
*/
function sanitizeFilename(filename) {
return filename
.replace(/[<>:"/\\|?*]/g, '')
.replace(/\s+/g, '_')
.substring(0, 200);
}
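// A usage sketch for the sanitizing rules above. toSafeFilename is a hypothetical stand-in
// that applies the same three steps; the 'untitled' fallback for titles that sanitize to an
// empty string is an added assumption and is not part of sanitizeFilename.

```javascript
function toSafeFilename(title) {
  const cleaned = title
    .replace(/[<>:"/\\|?*]/g, '') // drop characters that are invalid in Windows filenames
    .replace(/\s+/g, '_')         // collapse each whitespace run into one underscore
    .substring(0, 200);           // keep the name well under common length limits
  return cleaned || 'untitled';   // hypothetical fallback, see the note above
}

console.log(toSafeFilename('My Video: Part 1/2?'));
console.log(toSafeFilename('???'));
```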
/**
* Check if URL contains a playlist parameter
*/
function hasPlaylistParam(url) {
try {
const urlObj = new URL(url);
return urlObj.searchParams.has('list');
} catch {
return false;
}
}
/**
* Extract playlist URL if present in the URL
*/
function extractPlaylistUrl(url) {
const urlObj = new URL(url);
const listId = urlObj.searchParams.get('list');
if (listId) {
return `https://www.youtube.com/playlist?list=${listId}`;
}
return null;
}
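// A sketch of the playlist-URL normalization performed by the two helpers above, combined
// into one function that returns null instead of throwing on malformed input. toPlaylistUrl
// is a hypothetical name, and the video and playlist IDs in the example are placeholders.

```javascript
function toPlaylistUrl(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a parseable absolute URL
  }
  const listId = parsed.searchParams.get('list');
  return listId
    ? `https://www.youtube.com/playlist?list=${encodeURIComponent(listId)}`
    : null;
}

console.log(toPlaylistUrl('https://www.youtube.com/watch?v=abc123&list=PLxyz'));
console.log(toPlaylistUrl('https://www.youtube.com/watch?v=abc123'));
```

// Rewriting a watch URL that carries a list parameter into the canonical playlist URL is
// what lets the info lookup return the playlist rather than only the single video.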
/**
* Get video/playlist info without downloading
*/
export async function getInfo(url, forcePlaylist = false) {
try {
// If URL contains a playlist ID and we want to force playlist mode
const playlistUrl = extractPlaylistUrl(url);
const targetUrl = (forcePlaylist && playlistUrl) ? playlistUrl : url;
const info = await youtubedl(targetUrl, {
dumpSingleJson: true,
noDownload: true,
noWarnings: true,
flatPlaylist: true,
});
return info;
} catch (error) {
throw new Error(`Failed to get info: ${error.message}`);
}
}
/**
* Check if URL is a playlist
*/
export async function isPlaylist(url) {
const info = await getInfo(url);
return info._type === 'playlist';
}
/**
* Download a single video as MP3
*/
export async function downloadVideo(url, options = {}) {
const { outputDir = OUTPUT_DIR, onDownloadProgress } = options;
// Ensure output directory exists
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}
try {
// Get video info first
const info = await youtubedl(url, {
dumpSingleJson: true,
noDownload: true,
noWarnings: true,
});
const title = sanitizeFilename(info.title);
const outputPath = path.join(outputDir, `${title}.mp3`);
// Download and convert to MP3 with progress
const subprocess = youtubedl.exec(url, {
extractAudio: true,
audioFormat: 'mp3',
audioQuality: 0,
output: outputPath,
noWarnings: true,
newline: true,
});
// Parse progress from yt-dlp output
if (onDownloadProgress && subprocess.stdout) {
subprocess.stdout.on('data', (data) => {
const line = data.toString();
// Parse progress: [download] 45.2% of 10.5MiB at 1.2MiB/s ETA 00:05
const progressMatch = line.match(/\[download\]\s+(\d+\.?\d*)%/);
const etaMatch = line.match(/ETA\s+(\d+:\d+)/);
const speedMatch = line.match(/at\s+([\d.]+\w+\/s)/);
if (progressMatch) {
onDownloadProgress({
percent: parseFloat(progressMatch[1]),
eta: etaMatch ? etaMatch[1] : null,
speed: speedMatch ? speedMatch[1] : null,
title: info.title,
});
}
});
}
await subprocess;
return {
success: true,
title: info.title,
duration: info.duration,
filePath: outputPath,
url: url,
};
} catch (error) {
throw new Error(`Failed to download: ${error.message}`);
}
}
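// A standalone sketch of the progress-line parsing done in the stdout handler above, plus a
// helper that converts the ETA to seconds. parseProgressLine and etaToSeconds are hypothetical
// names. The ETA pattern here is extended to also accept HH:MM:SS, which is an assumption
// beyond the MM:SS form matched by the original regex.

```javascript
// Extract percent, speed and ETA from one yt-dlp "[download]" progress line.
function parseProgressLine(line) {
  const percent = line.match(/\[download\]\s+(\d+\.?\d*)%/);
  if (!percent) return null; // not a progress line
  const eta = line.match(/ETA\s+(\d+(?::\d+)+)/);
  const speed = line.match(/at\s+([\d.]+\w+\/s)/);
  return {
    percent: parseFloat(percent[1]),
    eta: eta ? eta[1] : null,
    speed: speed ? speed[1] : null,
  };
}

// Convert "MM:SS" or "HH:MM:SS" into a number of seconds.
function etaToSeconds(eta) {
  return eta
    .split(':')
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

const progress = parseProgressLine('[download]  45.2% of 10.5MiB at 1.2MiB/s ETA 00:05');
console.log(progress, etaToSeconds(progress.eta));
```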
/**
* Download all videos from a playlist as MP3
*/
export async function downloadPlaylist(url, options = {}) {
const { outputDir = OUTPUT_DIR, onProgress, onVideoComplete, onDownloadProgress, forcePlaylist = false } = options;
// Ensure output directory exists
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}
try {
// Get playlist info (force playlist mode if URL has list= param)
const info = await getInfo(url, forcePlaylist || hasPlaylistParam(url));
if (info._type !== 'playlist') {
// Single video, redirect to downloadVideo
const result = await downloadVideo(url, { ...options, onDownloadProgress });
return {
success: true,
playlistTitle: result.title,
videos: [result],
totalVideos: 1,
successCount: 1,
failCount: 0,
};
}
const results = [];
const entries = info.entries || [];
console.log(`Playlist: ${info.title} (${entries.length} videos)`);
for (let i = 0; i < entries.length; i++) {
const entry = entries[i];
const videoUrl = entry.url || `https://www.youtube.com/watch?v=${entry.id}`;
try {
if (onProgress) {
onProgress({ current: i + 1, total: entries.length, title: entry.title });
}
console.log(`[${i + 1}/${entries.length}] Downloading: ${entry.title}`);
// Wrap progress callback to include playlist context
const wrappedProgress = onDownloadProgress ? (progress) => {
onDownloadProgress({
...progress,
videoIndex: i + 1,
totalVideos: entries.length,
playlistTitle: info.title,
});
} : undefined;
const result = await downloadVideo(videoUrl, { outputDir, onDownloadProgress: wrappedProgress });
results.push(result);
if (onVideoComplete) {
onVideoComplete(result);
}
} catch (error) {
console.error(`Failed to download ${entry.title}: ${error.message}`);
results.push({
success: false,
title: entry.title,
url: videoUrl,
error: error.message,
});
}
}
return {
success: true,
playlistTitle: info.title,
videos: results,
totalVideos: entries.length,
successCount: results.filter(r => r.success).length,
failCount: results.filter(r => !r.success).length,
};
} catch (error) {
throw new Error(`Failed to download playlist: ${error.message}`);
}
}
/**
* Smart download - detects if URL is video or playlist
*/
export async function download(url, options = {}) {
// If URL contains list= parameter, treat it as a playlist
const isPlaylistUrl = hasPlaylistParam(url);
const info = await getInfo(url, isPlaylistUrl);
if (info._type === 'playlist') {
return downloadPlaylist(url, { ...options, forcePlaylist: true });
} else {
const result = await downloadVideo(url, options);
return {
success: true,
playlistTitle: null,
videos: [result],
totalVideos: 1,
successCount: 1,
failCount: 0,
};
}
}