# Video to MP3 Transcriptor

Download YouTube videos and playlists as MP3 and transcribe them with the OpenAI transcription API.

## Features

- Download single YouTube videos as MP3
- Download entire playlists as MP3
- Transcribe audio files using the OpenAI transcription API
- CLI interface for quick operations
- REST API for integration with other systems

## Prerequisites

- **Node.js** 18+
- **yt-dlp** installed on your system
- **ffmpeg** installed (for audio conversion)
- **OpenAI API key** (for transcription)

### Installing yt-dlp

```bash
# Windows (winget)
winget install yt-dlp

# macOS
brew install yt-dlp

# Linux
sudo apt install yt-dlp
# or
pip install yt-dlp
```

### Installing ffmpeg

```bash
# Windows (winget)
winget install ffmpeg

# macOS
brew install ffmpeg

# Linux
sudo apt install ffmpeg
```

## Installation

```bash
# From the cloned repository, install dependencies
cd videotoMP3Transcriptor
npm install

# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

## Usage

### CLI

```bash
# Note: options (-o, -l, -m) go after `--` so that npm forwards them to the CLI

# Download a video as MP3
npm run cli download "https://youtube.com/watch?v=VIDEO_ID"

# Download a playlist
npm run cli download "https://youtube.com/playlist?list=PLAYLIST_ID"

# Download with custom output directory
npm run cli -- download "URL" -o ./my-folder

# Get info about a video/playlist
npm run cli info "URL"

# Transcribe an existing MP3
npm run cli transcribe ./output/video.mp3

# Transcribe with specific language
npm run cli -- transcribe ./output/video.mp3 -l fr

# Transcribe with specific model
npm run cli -- transcribe ./output/video.mp3 -m gpt-4o-mini-transcribe

# Download AND transcribe
npm run cli process "URL"

# Download and transcribe with options
npm run cli -- process "URL" -l en -m gpt-4o-transcribe
```

### Linux Scripts

Convenience scripts are available in the `scripts/` directory:

```bash
# Make scripts executable (first time only)
chmod +x scripts/*.sh

# Download video/playlist
./scripts/download.sh "https://youtube.com/watch?v=VIDEO_ID"

# Transcribe a file
./scripts/transcribe.sh ./output/video.mp3 fr

# Download + transcribe
./scripts/process.sh "https://youtube.com/watch?v=VIDEO_ID" en

# Start the API server
./scripts/server.sh

# Get video info
./scripts/info.sh "https://youtube.com/watch?v=VIDEO_ID"
```

### API Server

```bash
# Start the server
npm run server
```

The server runs on `http://localhost:3000` by default; set `PORT` to use a different port.

#### Endpoints

##### GET /health
Health check endpoint.

##### GET /info?url=YOUTUBE_URL
Get info about a video or playlist.

```bash
curl "http://localhost:3000/info?url=https://youtube.com/watch?v=VIDEO_ID"
```

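
The `url` query parameter is itself a URL, so any extra parameters it carries (for example `&list=...`) need to be percent-encoded or they will be read as parameters of `/info`. The following is a minimal sketch of building the request URL in Node.js 18+; `buildInfoUrl` and `baseUrl` are illustrative names, not part of the project.

```javascript
// Sketch: build the GET /info request URL with an encoded `url` parameter.
// `buildInfoUrl` is a hypothetical helper, not a function exported by this project.
function buildInfoUrl(videoUrl, baseUrl = "http://localhost:3000") {
  const endpoint = new URL("/info", baseUrl);
  // searchParams.set percent-encodes reserved characters such as `&`, `=` and `?`
  endpoint.searchParams.set("url", videoUrl);
  return endpoint.toString();
}

const originalUrl = "https://youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID";
const infoUrl = buildInfoUrl(originalUrl);
console.log(infoUrl);
```

The resulting string can be passed to `fetch(infoUrl)` while the server is running. The shape of the response is not documented here, so inspect it before relying on specific fields.
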
##### POST /download
Download video(s) as MP3.

```bash
curl -X POST http://localhost:3000/download \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID"}'
```

##### POST /transcribe
Transcribe an existing audio file.

```bash
curl -X POST http://localhost:3000/transcribe \
  -H "Content-Type: application/json" \
  -d '{"filePath": "./output/video.mp3", "language": "en"}'
```

##### POST /process
Download and transcribe in one call.

```bash
curl -X POST http://localhost:3000/process \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "language": "en", "format": "txt"}'
```

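
The three POST endpoints all take a JSON body, so the same calls can be made from Node.js 18+ with the built-in `fetch`. This is a sketch only: `buildPostRequest` is a hypothetical helper, and the payload fields are the ones shown in the curl examples above.

```javascript
// Sketch: build fetch() arguments for the JSON POST endpoints
// (/download, /transcribe, /process). `buildPostRequest` is an illustrative name.
function buildPostRequest(endpoint, payload, baseUrl = "http://localhost:3000") {
  return {
    url: new URL(endpoint, baseUrl).toString(),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Same request as the `curl` example for POST /process
const processRequest = buildPostRequest("/process", {
  url: "https://youtube.com/watch?v=VIDEO_ID",
  language: "en",
  format: "txt",
});

console.log(processRequest.url);
// To send it (the server must be running; the response shape is not documented here):
//   const response = await fetch(processRequest.url, processRequest.options);
```

Keeping the request construction separate from the network call makes it easy to log or test the payload before sending it.
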
##### GET /files-list
List all downloaded files.

##### GET /files/:filename
Download/stream a specific file.

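
Downloaded file names often contain spaces or characters such as `#`, which have to be escaped when they are placed in the `/files/:filename` path. A minimal sketch, assuming the server decodes the path segment in the usual way; `buildFileUrl` is a hypothetical helper.

```javascript
// Sketch: build a GET /files/:filename URL for a file name with special characters.
// `buildFileUrl` is an illustrative name, not part of the project.
function buildFileUrl(filename, baseUrl = "http://localhost:3000") {
  // encodeURIComponent escapes spaces, `#`, `?` and `/`, keeping the name a single path segment
  return new URL(`/files/${encodeURIComponent(filename)}`, baseUrl).toString();
}

const fileUrl = buildFileUrl("My Video #1.mp3");
console.log(fileUrl);
```
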
## Configuration

Environment variables (`.env`):

| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | Your OpenAI API key | None (required for transcription) |
| `PORT` | Server port | `3000` |
| `OUTPUT_DIR` | Download directory | `./output` |

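
The table above can be read as the following resolution logic. This is a sketch under assumptions: `loadConfig` and `requireApiKey` are hypothetical helpers (the project's actual configuration code is not shown), and the variables are assumed to be present in `process.env` already, for example loaded from `.env` by a package such as dotenv.

```javascript
// Sketch: resolve the documented variables with their documented defaults.
// `loadConfig` and `requireApiKey` are illustrative names, not the project's API.
function loadConfig(env = process.env) {
  return {
    openaiApiKey: env.OPENAI_API_KEY || null, // only needed for transcription
    port: Number.parseInt(env.PORT || "3000", 10),
    outputDir: env.OUTPUT_DIR || "./output",
  };
}

// Fail early, with a clear message, when transcription is requested without a key.
function requireApiKey(config) {
  if (!config.openaiApiKey) {
    throw new Error("OPENAI_API_KEY is not set; it is required for transcription");
  }
  return config.openaiApiKey;
}

const config = loadConfig({ PORT: "8080" });
console.log(config.port, config.outputDir);
```

Downloading does not need the key, so the check lives in a separate function that only the transcription path would call.
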
## Transcription Models

| Model | Description | Formats |
|-------|-------------|---------|
| `gpt-4o-transcribe` | Best quality, latest GPT-4o (default) | txt, json |
| `gpt-4o-mini-transcribe` | Faster, cheaper, good quality | txt, json |
| `whisper-1` | Legacy Whisper model | txt, json, srt, vtt |

## Transcription Formats

- `txt` - Plain text (all models)
- `json` - JSON response (all models)
- `srt` - SubRip subtitles (whisper-1 only)
- `vtt` - WebVTT subtitles (whisper-1 only)

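
Because subtitle formats are only available with `whisper-1`, it can help to check the model/format pair before sending a request. The lookup below simply restates the model and format lists from this README; `isFormatSupported` is a hypothetical helper, not a function exported by the project.

```javascript
// Sketch: the model/format compatibility from the tables above as a lookup.
// `SUPPORTED_FORMATS` and `isFormatSupported` are illustrative names.
const SUPPORTED_FORMATS = {
  "gpt-4o-transcribe": ["txt", "json"],
  "gpt-4o-mini-transcribe": ["txt", "json"],
  "whisper-1": ["txt", "json", "srt", "vtt"],
};

// The README lists gpt-4o-transcribe as the default model.
const DEFAULT_MODEL = "gpt-4o-transcribe";

function isFormatSupported(format, model = DEFAULT_MODEL) {
  // Object.hasOwn avoids matching inherited keys such as "toString"
  if (!Object.hasOwn(SUPPORTED_FORMATS, model)) return false;
  return SUPPORTED_FORMATS[model].includes(format);
}

console.log(isFormatSupported("srt"));              // default model: no subtitle output
console.log(isFormatSupported("srt", "whisper-1")); // whisper-1 supports srt
```
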
## Language Codes

Common language codes for the `-l` option:
- `en` - English
- `fr` - French
- `es` - Spanish
- `de` - German
- `it` - Italian
- `pt` - Portuguese
- `zh` - Chinese
- `ja` - Japanese
- `ko` - Korean
- `ru` - Russian

Omit the option to let the language be auto-detected.

## Project Structure

```
videotoMP3Transcriptor/
├── src/
│   ├── services/
│   │   ├── youtube.js         # YouTube download service
│   │   └── transcription.js   # OpenAI transcription service
│   ├── cli.js                 # CLI entry point
│   └── server.js              # Express API server
├── scripts/                   # Linux convenience scripts
│   ├── download.sh            # Download video/playlist
│   ├── transcribe.sh          # Transcribe audio file
│   ├── process.sh             # Download + transcribe
│   ├── server.sh              # Start API server
│   └── info.sh                # Get video info
├── output/                    # Downloaded files
├── .env                       # Configuration
└── package.json
```

## License

MIT