Video to MP3 Transcriptor
Download YouTube videos or playlists as MP3 and transcribe them with OpenAI's speech-to-text API (GPT-4o transcription models or Whisper).
Features
- Download single YouTube videos as MP3
- Download entire playlists as MP3
- Transcribe audio files using OpenAI Whisper API
- CLI interface for quick operations
- REST API for integration with other systems
Prerequisites
- Node.js 18+
- yt-dlp installed on your system
- ffmpeg installed (for audio conversion)
- OpenAI API key (for transcription)
Installing yt-dlp
# Windows (winget)
winget install yt-dlp
# macOS
brew install yt-dlp
# Linux
sudo apt install yt-dlp
# or
pip install yt-dlp
Installing ffmpeg
# Windows (winget)
winget install ffmpeg
# macOS
brew install ffmpeg
# Linux
sudo apt install ffmpeg
Installation
# Clone the repository, then install dependencies from its root
cd videotoMP3Transcriptor
npm install
# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
Usage
CLI
# Download a video as MP3
npm run cli download "https://youtube.com/watch?v=VIDEO_ID"
# Download a playlist
npm run cli download "https://youtube.com/playlist?list=PLAYLIST_ID"
# Download with custom output directory
# (the -- separator makes npm forward -o to the CLI instead of parsing it itself)
npm run cli -- download "URL" -o ./my-folder
# Get info about a video/playlist
npm run cli info "URL"
# Transcribe an existing MP3
npm run cli transcribe ./output/video.mp3
# Transcribe with a specific language (-- forwards the flag to the CLI)
npm run cli -- transcribe ./output/video.mp3 -l fr
# Transcribe with a specific model
npm run cli -- transcribe ./output/video.mp3 -m gpt-4o-mini-transcribe
# Download AND transcribe
npm run cli process "URL"
# Download and transcribe with options (-- forwards the flags to the CLI)
npm run cli -- process "URL" -l en -m gpt-4o-transcribe
Linux Scripts
Convenience scripts are available in the scripts/ directory:
# Make scripts executable (first time only)
chmod +x scripts/*.sh
# Download video/playlist
./scripts/download.sh "https://youtube.com/watch?v=VIDEO_ID"
# Transcribe a file
./scripts/transcribe.sh ./output/video.mp3 fr
# Download + transcribe
./scripts/process.sh "https://youtube.com/watch?v=VIDEO_ID" en
# Start the API server
./scripts/server.sh
# Get video info
./scripts/info.sh "https://youtube.com/watch?v=VIDEO_ID"
API Server
# Start the server
npm run server
The server listens on http://localhost:3000 by default; set PORT to use a different port.
Endpoints
GET /health
Health check endpoint.
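When the server is started from a script, it can help to wait until /health answers before sending work. The following is a minimal sketch for Node.js 18+, not part of this project: `waitForHealthy` and its options are hypothetical names, and it assumes that any 2xx response means the server is ready (the response body is not documented here and is ignored).

```javascript
// Hypothetical helper: polls GET /health until it answers with a 2xx status.
// Assumption: a 2xx response means "ready"; the body is ignored.
async function waitForHealthy(
  baseUrl = "http://localhost:3000",
  { attempts = 10, delayMs = 500, fetchImpl = fetch } = {}
) {
  const healthUrl = new URL("/health", baseUrl);
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetchImpl(healthUrl);
      if (response.ok) return true;
    } catch {
      // Connection refused or similar: the server is not up yet.
    }
    if (attempt < attempts) {
      // Pause before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

The `fetchImpl` option exists only so the helper can be exercised without a running server; in normal use the built-in `fetch` is used.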
GET /info?url=YOUTUBE_URL
Get info about a video or playlist.
curl "http://localhost:3000/info?url=https://youtube.com/watch?v=VIDEO_ID"
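When calling this endpoint from code, it is safer to percent-encode the video URL: a URL that contains `&` (for example `watch?v=...&list=...`) would otherwise be split into separate query parameters. A minimal sketch for Node.js 18+, assuming the default base URL; `buildInfoUrl` is a hypothetical helper, not part of this project.

```javascript
// Hypothetical helper: builds the GET /info request URL with the
// video URL percent-encoded as the `url` query parameter.
function buildInfoUrl(videoUrl, baseUrl = "http://localhost:3000") {
  const endpoint = new URL("/info", baseUrl);
  endpoint.searchParams.set("url", videoUrl); // encodes ?, =, & and /
  return endpoint.toString();
}

console.log(buildInfoUrl("https://youtube.com/watch?v=VIDEO_ID"));
```

The resulting string can be passed directly to `fetch` or to `curl`.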
POST /download
Download video(s) as MP3.
curl -X POST http://localhost:3000/download \
-H "Content-Type: application/json" \
-d '{"url": "https://youtube.com/watch?v=VIDEO_ID"}'
POST /transcribe
Transcribe an existing audio file.
curl -X POST http://localhost:3000/transcribe \
-H "Content-Type: application/json" \
-d '{"filePath": "./output/video.mp3", "language": "en"}'
POST /process
Download and transcribe in one call.
curl -X POST http://localhost:3000/process \
-H "Content-Type: application/json" \
-d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "language": "en", "format": "txt"}'
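The same request can be sent from Node.js 18+ with the built-in `fetch`. This is a sketch under stated assumptions: `buildProcessRequest` and `processVideo` are hypothetical names, the body fields mirror the curl example above, and the response is assumed to be JSON because its schema is not documented in this README.

```javascript
// Hypothetical helper: assembles the fetch() arguments for POST /process.
// The body fields (url, language, format) mirror the curl example.
function buildProcessRequest(videoUrl, { language, format } = {}, baseUrl = "http://localhost:3000") {
  const body = { url: videoUrl };
  if (language) body.language = language; // left out when not provided
  if (format) body.format = format;
  return {
    endpoint: new URL("/process", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request. Assumption: the server replies with JSON; the parsed
// value is returned as-is and a non-2xx status is treated as an error.
async function processVideo(videoUrl, options) {
  const { endpoint, init } = buildProcessRequest(videoUrl, options);
  const response = await fetch(endpoint, init);
  if (!response.ok) {
    throw new Error(`POST /process failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

With the server running, `processVideo("https://youtube.com/watch?v=VIDEO_ID", { language: "en", format: "txt" })` would send the same payload as the curl command.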
GET /files-list
List all downloaded files.
GET /files/:filename
Download/stream a specific file.
Configuration
Environment variables (.env):
| Variable | Description | Default |
|---|---|---|
| OPENAI_API_KEY | Your OpenAI API key | Required for transcription |
| PORT | Server port | 3000 |
| OUTPUT_DIR | Download directory | ./output |
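To illustrate how these variables and their defaults fit together, here is a minimal sketch of a loader. It is not the project's actual configuration code: `loadConfig` and the returned property names are hypothetical, and only the three documented variables are read.

```javascript
// Hypothetical loader: reads the three documented variables from an
// environment object and applies the defaults from the table above.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT || "3000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    openaiApiKey: env.OPENAI_API_KEY || null, // only required for transcription
    port,
    outputDir: env.OUTPUT_DIR || "./output",
  };
}

console.log(loadConfig({ PORT: "8080" }));
```

Passing the environment as an argument keeps the function easy to test and means values are read at call time rather than cached when the module loads.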
Transcription Models
| Model | Description | Formats |
|---|---|---|
| gpt-4o-transcribe | Best quality, latest GPT-4o (default) | txt, json |
| gpt-4o-mini-transcribe | Faster, cheaper, good quality | txt, json |
| whisper-1 | Legacy Whisper model | txt, json, srt, vtt |
Transcription Formats
- txt - Plain text (all models)
- json - JSON response (all models)
- srt - SubRip subtitles (whisper-1 only)
- vtt - WebVTT subtitles (whisper-1 only)
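Because subtitle formats are limited to whisper-1, a client may want to check the model and format combination before sending a request. The sketch below encodes the table from the Transcription Models section; `SUPPORTED_FORMATS` and `isFormatSupported` are hypothetical names and this is not necessarily how the project validates input.

```javascript
// Formats each model supports, copied from the Transcription Models table.
const SUPPORTED_FORMATS = {
  "gpt-4o-transcribe": ["txt", "json"],
  "gpt-4o-mini-transcribe": ["txt", "json"],
  "whisper-1": ["txt", "json", "srt", "vtt"],
};

// Hypothetical check: true only for a known model that lists the format.
function isFormatSupported(model, format) {
  if (!Object.hasOwn(SUPPORTED_FORMATS, model)) return false;
  return SUPPORTED_FORMATS[model].includes(format);
}

console.log(isFormatSupported("whisper-1", "srt"));         // true
console.log(isFormatSupported("gpt-4o-transcribe", "srt")); // false
```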
Language Codes
Common language codes for the -l option:
- en - English
- fr - French
- es - Spanish
- de - German
- it - Italian
- pt - Portuguese
- zh - Chinese
- ja - Japanese
- ko - Korean
- ru - Russian

Omit the -l option to have the language detected automatically.
Project Structure
videotoMP3Transcriptor/
├── src/
│ ├── services/
│ │ ├── youtube.js # YouTube download service
│ │ └── transcription.js # OpenAI transcription service
│ ├── cli.js # CLI entry point
│ └── server.js # Express API server
├── scripts/ # Linux convenience scripts
│ ├── download.sh # Download video/playlist
│ ├── transcribe.sh # Transcribe audio file
│ ├── process.sh # Download + transcribe
│ ├── server.sh # Start API server
│ └── info.sh # Get video info
├── output/ # Downloaded files
├── .env # Configuration
└── package.json
License
MIT