Commit 23bb4cd2d9 (Trouve Alexis, 2025-11-29 22:56:58 +08:00)

Add video/audio to MP3 conversion feature

Implement drag-and-drop interface for converting video and audio files to MP3 format using FFmpeg. Users can now upload files (MP4, M4A, AVI, MKV, MOV, WAV, FLAC, OGG) and convert them with customizable bitrate and quality settings.

- Add conversion service with FFmpeg integration
- Add /convert-to-mp3 and /supported-formats API endpoints
- Add new "Video to MP3" tab with drag-and-drop UI
- Support multiple file uploads with batch conversion
- Add bitrate (128k-320k) and VBR quality options

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Video to MP3 Transcriptor

Download YouTube videos and playlists as MP3 and transcribe them with the OpenAI transcription API (`gpt-4o-transcribe` by default; `whisper-1` is also supported).

Features

Download single YouTube videos as MP3
Download entire playlists as MP3
Transcribe audio files using OpenAI Whisper API
CLI interface for quick operations
REST API for integration with other systems

Prerequisites

Node.js 18+
yt-dlp installed on your system
ffmpeg installed (for audio conversion)
OpenAI API key (for transcription)

Installing yt-dlp

# Windows (winget)
winget install yt-dlp

# macOS
brew install yt-dlp

# Linux
sudo apt install yt-dlp
# or
pip install yt-dlp

Installing ffmpeg

# Windows (winget)
winget install ffmpeg

# macOS
brew install ffmpeg

# Linux
sudo apt install ffmpeg

Installation

# Clone the repository (substitute the actual repository URL), then install dependencies
git clone <repository-url>
cd videotoMP3Transcriptor
npm install

# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

Usage

CLI

# Download a video as MP3
npm run cli download "https://youtube.com/watch?v=VIDEO_ID"

# Download a playlist
npm run cli download "https://youtube.com/playlist?list=PLAYLIST_ID"

# Download with custom output directory
npm run cli download "URL" -o ./my-folder

# Get info about a video/playlist
npm run cli info "URL"

# Transcribe an existing MP3
npm run cli transcribe ./output/video.mp3

# Transcribe with specific language
npm run cli transcribe ./output/video.mp3 -l fr

# Transcribe with specific model
npm run cli transcribe ./output/video.mp3 -m gpt-4o-mini-transcribe

# Download AND transcribe
npm run cli process "URL"

# Download and transcribe with options
npm run cli process "URL" -l en -m gpt-4o-transcribe

Linux Scripts

Convenience scripts are available in the scripts/ directory:

# Make scripts executable (first time only)
chmod +x scripts/*.sh

# Download video/playlist
./scripts/download.sh "https://youtube.com/watch?v=VIDEO_ID"

# Transcribe a file
./scripts/transcribe.sh ./output/video.mp3 fr

# Download + transcribe
./scripts/process.sh "https://youtube.com/watch?v=VIDEO_ID" en

# Start the API server
./scripts/server.sh

# Get video info
./scripts/info.sh "https://youtube.com/watch?v=VIDEO_ID"

API Server

# Start the server
npm run server

The server listens on http://localhost:3000 by default; set PORT in .env to use a different port.

Endpoints

GET /health

Health check endpoint.

GET /info?url=YOUTUBE_URL

Get info about a video or playlist.

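# -G sends the data as a query string; --data-urlencode escapes the ?, = and & inside the YouTube URL
curl -G "http://localhost:3000/info" \
  --data-urlencode "url=https://youtube.com/watch?v=VIDEO_ID"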
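A YouTube URL carries its own `?`, `=` and often `&` characters, so it should be percent-encoded before it is placed in the `url` query parameter. The sketch below shows one way to do that from Node.js 18+. `buildInfoUrl` and `getInfo` are hypothetical helper names, and the assumption that the endpoint answers with JSON is not confirmed by this README.

```javascript
// Default server address from this README; adjust if PORT is changed.
const BASE_URL = "http://localhost:3000";

// Build the GET /info URL with the YouTube URL safely percent-encoded.
function buildInfoUrl(videoUrl, baseUrl = BASE_URL) {
  const endpoint = new URL("/info", baseUrl);
  // searchParams.set() escapes "?", "=", "&", ":" and "/" inside the value.
  endpoint.searchParams.set("url", videoUrl);
  return endpoint.toString();
}

// Hypothetical helper: assumes the server replies with a JSON body.
async function getInfo(videoUrl) {
  const response = await fetch(buildInfoUrl(videoUrl));
  if (!response.ok) {
    throw new Error(`GET /info failed with HTTP ${response.status}`);
  }
  return response.json();
}

console.log(buildInfoUrl("https://youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID"));
// http://localhost:3000/info?url=https%3A%2F%2Fyoutube.com%2Fwatch%3Fv%3DVIDEO_ID%26list%3DPLAYLIST_ID
```

Without the encoding, the `&list=...` part would be parsed as a second query parameter of `/info` instead of as part of the YouTube URL.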

POST /download

Download video(s) as MP3.

curl -X POST http://localhost:3000/download \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID"}'

POST /transcribe

Transcribe an existing audio file.

curl -X POST http://localhost:3000/transcribe \
  -H "Content-Type: application/json" \
  -d '{"filePath": "./output/video.mp3", "language": "en"}'

POST /process

Download and transcribe in one call.

curl -X POST http://localhost:3000/process \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "language": "en", "format": "txt"}'
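The three POST endpoints above (`/download`, `/transcribe`, `/process`) all accept a JSON body, so a single small client function can cover them. This is a minimal sketch for Node.js 18+, not the project's own client: `buildPostRequest` and `postJson` are hypothetical names, and the JSON response is an assumption since the response shape is not documented here.

```javascript
// Default server address from this README; adjust if PORT is changed.
const BASE_URL = "http://localhost:3000";

// Build the fetch() arguments for one of the JSON POST endpoints.
function buildPostRequest(path, body, baseUrl = BASE_URL) {
  return {
    url: new URL(path, baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: assumes the server replies with a JSON body.
async function postJson(path, body) {
  const { url, init } = buildPostRequest(path, body);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`POST ${path} failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (requires the server to be running):
// postJson("/process", {
//   url: "https://youtube.com/watch?v=VIDEO_ID",
//   language: "en",
//   format: "txt",
// }).then(console.log, console.error);
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running server.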

GET /files-list

List all downloaded files.

GET /files/:filename

Download/stream a specific file.
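File names derived from video titles often contain spaces or characters such as `#` and `?`, which must be escaped when used as the `:filename` path segment. A small sketch, where `buildFileUrl` is a hypothetical helper name and the sample file name is made up for illustration:

```javascript
// Default server address from this README; adjust if PORT is changed.
const BASE_URL = "http://localhost:3000";

// Build the GET /files/:filename URL with the file name escaped as one path segment.
function buildFileUrl(filename, baseUrl = BASE_URL) {
  // encodeURIComponent escapes spaces, "#", "?" and "/" so the name stays a single segment.
  return new URL(`/files/${encodeURIComponent(filename)}`, baseUrl).toString();
}

console.log(buildFileUrl("My Video #1.mp3"));
// http://localhost:3000/files/My%20Video%20%231.mp3
```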

Configuration

Environment variables (.env):

Variable	Description	Default
`OPENAI_API_KEY`	Your OpenAI API key (required for transcription)	none
`PORT`	Server port	3000
`OUTPUT_DIR`	Download directory	./output
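The defaults in the table can be expressed as a small loader. This is only a sketch of how such settings might be read, not the project's actual code: `loadConfig` is a hypothetical name, and it assumes the variables from `.env` have already been loaded into `process.env`.

```javascript
// Read the three documented settings, falling back to the documented defaults.
function loadConfig(env = process.env) {
  // An empty or missing PORT falls back to 3000.
  const port = Number.parseInt(env.PORT || "3000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${env.PORT}`);
  }
  return {
    // No default: transcription cannot work without a key.
    openaiApiKey: env.OPENAI_API_KEY || null,
    port,
    outputDir: env.OUTPUT_DIR || "./output",
  };
}

const config = loadConfig({ PORT: "8080" });
console.log(config.port, config.outputDir, config.openaiApiKey);
// 8080 ./output null
```

Returning `null` for a missing key, rather than throwing, lets download-only commands run without an OpenAI account; a transcription command can check the value and fail with a clear message.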

Transcription Models

Model	Description	Formats
`gpt-4o-transcribe`	Best quality, latest GPT-4o (default)	txt, json
`gpt-4o-mini-transcribe`	Faster, cheaper, good quality	txt, json
`whisper-1`	Legacy Whisper model	txt, json, srt, vtt

Transcription Formats

txt - Plain text (all models)
json - JSON response (all models)
srt - SubRip subtitles (whisper-1 only)
vtt - WebVTT subtitles (whisper-1 only)
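Because subtitle formats are limited to `whisper-1`, it can help to check a model/format pair before sending a request. The lookup below simply restates the table and list above; `SUPPORTED_FORMATS` and `isFormatSupported` are hypothetical names, not part of the project's code.

```javascript
// Output formats per model, copied from the tables in this README.
const SUPPORTED_FORMATS = {
  "gpt-4o-transcribe": ["txt", "json"],
  "gpt-4o-mini-transcribe": ["txt", "json"],
  "whisper-1": ["txt", "json", "srt", "vtt"],
};

// Return true only when the model is known and lists the requested format.
function isFormatSupported(model, format) {
  const formats = SUPPORTED_FORMATS[model];
  return Array.isArray(formats) && formats.includes(format);
}

console.log(isFormatSupported("whisper-1", "srt"));         // true
console.log(isFormatSupported("gpt-4o-transcribe", "srt")); // false
```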

Language Codes

Common language codes for the -l option:

en - English
fr - French
es - Spanish
de - German
it - Italian
pt - Portuguese
zh - Chinese
ja - Japanese
ko - Korean
ru - Russian

Omit the -l option to have the language detected automatically.

Project Structure

videotoMP3Transcriptor/
├── src/
│   ├── services/
│   │   ├── youtube.js       # YouTube download service
│   │   └── transcription.js # OpenAI transcription service
│   ├── cli.js               # CLI entry point
│   └── server.js            # Express API server
├── scripts/                  # Linux convenience scripts
│   ├── download.sh          # Download video/playlist
│   ├── transcribe.sh        # Transcribe audio file
│   ├── process.sh           # Download + transcribe
│   ├── server.sh            # Start API server
│   └── info.sh              # Get video info
├── output/                   # Downloaded files
├── .env                      # Configuration
└── package.json

License

MIT