API Documentation - Video to MP3 Transcriptor
Base URL
http://localhost:3001
🔐 Authentication
⚠️ IMPORTANT: All API endpoints except /health, /api, and /public/download/:filename require authentication using an API token.
How to Authenticate
Include your API token in one of these ways:
Option 1: X-API-Key header (Recommended)
curl -H "X-API-Key: your_api_token_here" http://localhost:3001/endpoint
Option 2: Authorization Bearer header
curl -H "Authorization: Bearer your_api_token_here" http://localhost:3001/endpoint
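For clients written in code rather than curl, the two options above could be wrapped as in the following minimal Python sketch. The helper name `auth_headers` and its `scheme` argument are illustrative assumptions, not part of the API.

```python
def auth_headers(token: str, scheme: str = "x-api-key") -> dict:
    """Return the HTTP header that carries the API token.

    scheme="x-api-key" mirrors Option 1, scheme="bearer" mirrors Option 2.
    Both names are hypothetical labels chosen for this sketch.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    if scheme == "x-api-key":
        return {"X-API-Key": token}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

The returned dictionary can be passed as the headers of whichever HTTP client the caller already uses.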
Configuration
- Set your API token in .env:
API_TOKEN=your_secure_token_here
- Generate a secure token for production:
# Linux/Mac
openssl rand -hex 32
# Or use any secure random string generator
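Where openssl is not available, Python's standard `secrets` module can produce a token of the same shape (32 random bytes, hex-encoded). This is a sketch of one alternative generator; the function name `generate_api_token` is made up for illustration.

```python
import secrets


def generate_api_token(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness as hex text."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into .env
    print(f"API_TOKEN={generate_api_token()}")
```

With the default of 32 bytes the token is 64 hexadecimal characters long.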
Security Notes
- Public endpoints (no auth required): /health, /api, /public/download/:filename
- Protected endpoints: All other endpoints require authentication
- In development: If API_TOKEN is not set, the API will work without authentication (with a warning)
- In production: Always set a strong API_TOKEN
Error Responses
401 Unauthorized - No API key provided:
{
"error": "Unauthorized",
"message": "API key required. Provide X-API-Key header or Authorization: Bearer <token>"
}
403 Forbidden - Invalid API key:
{
"error": "Forbidden",
"message": "Invalid API key"
}
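A client can tell the two authentication failures apart from the status code alone and use the `message` field for display. The sketch below assumes only the two response bodies shown above; the function name and the wording of the returned strings are illustrative.

```python
import json
from typing import Optional


def describe_auth_failure(status: int, body: str) -> Optional[str]:
    """Return a readable message for a 401/403 response, or None otherwise."""
    if status not in (401, 403):
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    message = payload.get("message", "no message in response body")
    if status == 401:
        # 401: the request carried no API key at all
        return f"Missing API key: {message}"
    # 403: a key was sent but the server rejected it
    return f"Rejected API key: {message}"
```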
Table of Contents
- Authentication
- Health & Info
- Public Download Endpoint
- Download Endpoints
- Transcription Endpoints
- Conversion Endpoints
- Translation Endpoints
- Summarization Endpoints
- File Management
- Admin Endpoints
- Security Configuration
Health & Info
GET /health
Health check endpoint.
Authentication: Not required (public)
Response:
{
"status": "ok",
"timestamp": "2025-11-28T12:00:00.000Z"
}
GET /api
Get API information and available endpoints.
Authentication: Not required (public)
Response:
{
"name": "Video to MP3 Transcriptor API",
"version": "1.0.0",
"endpoints": { ... }
}
Public Download Endpoint
GET /public/download/:filename
Public endpoint to download files without authentication.
Authentication: Not required (public)
Purpose: Share direct download links for generated files (MP3, transcriptions, translations, summaries) without requiring API authentication.
URL Parameters:
- filename (required): Name of the file to download
Security:
- Directory traversal protection enabled (uses path.basename())
- Only files in the configured OUTPUT_DIR are accessible
- No authentication required
Example:
# Direct download (no auth needed)
curl -O http://localhost:3001/public/download/my_video.mp3
# Or simply open in browser
http://localhost:3001/public/download/my_video.mp3
Response (Success):
- File download with proper Content-Disposition headers
- Browser will prompt to download the file
Response (Error - 404):
{
"error": "File not found",
"message": "File 'my_video.mp3' does not exist"
}
Response (Error - 500):
{
"error": "Download failed",
"message": "Error details..."
}
Use Cases:
- Share download links via email/chat
- Embed in web applications
- Direct browser downloads
- Public file sharing
Note: After processing (download, transcription, etc.), take the filePath or fileUrl returned by the authenticated endpoint and build the public URL from its basename:
/public/download/{basename_of_filePath}
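The construction described in the note can be sketched in Python as follows. The function name `public_download_url` and the `BASE_URL` constant are assumptions for illustration; percent-encoding the file name is an extra precaution added here for names containing spaces or other reserved characters.

```python
import posixpath
from urllib.parse import quote

BASE_URL = "http://localhost:3001"  # base URL from this document


def public_download_url(file_path: str, base_url: str = BASE_URL) -> str:
    """Build a /public/download URL from a returned filePath or fileUrl."""
    # Normalise Windows separators, then keep only the last path component
    name = posixpath.basename(file_path.replace("\\", "/"))
    if not name:
        raise ValueError(f"no file name in path: {file_path!r}")
    return f"{base_url.rstrip('/')}/public/download/{quote(name)}"
```

For example, a filePath of ./output/my_video.mp3 and a fileUrl of /files/my_video.mp3 both map to http://localhost:3001/public/download/my_video.mp3.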
GET /info
Get information about a YouTube video or playlist.
Query Parameters:
- url (required): YouTube URL
Example:
curl -H "X-API-Key: your_token" \
"http://localhost:3001/info?url=https://www.youtube.com/watch?v=VIDEO_ID"
Response:
{
"success": true,
"title": "Video Title",
"type": "video",
"duration": 300,
"channel": "Channel Name",
"videoCount": 1
}
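The curl example passes the YouTube URL unescaped, which works for a plain watch URL. A URL that itself contains `&` (for instance a watch link with an extra `list=` parameter) would be split into separate query parameters unless it is percent-encoded. A minimal sketch of building the query string safely, with the helper name `build_query_url` being an assumption:

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, endpoint: str, **params) -> str:
    """Join base URL and endpoint, percent-encoding every query parameter."""
    # Drop parameters that were not supplied so optional ones stay optional
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    return f"{url}?{query}" if query else url
```

The same helper applies to the other GET endpoints that take a url query parameter.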
Download Endpoints
GET /download-stream
Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.
Query Parameters:
- url (required): YouTube URL
- outputPath (optional): Custom output directory path
Example:
curl -H "X-API-Key: your_token" \
"http://localhost:3001/download-stream?url=https://www.youtube.com/watch?v=VIDEO_ID"
SSE Events:
- info: Video/playlist information
- progress: Download progress updates
- video-complete: Individual video completion
- complete: All downloads complete
- error: Error occurred
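To consume these events programmatically, a client has to split the stream into individual events. The sketch below parses the standard Server-Sent Events wire format (`event:` and `data:` fields, blank line as terminator). It assumes the server names its events with the `event:` field and sends JSON in `data:`, which this document does not confirm; the `percent` key in the usage example is likewise hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data
            if data:
                payload = "\n".join(data)
                try:
                    parsed: Any = json.loads(payload)
                except json.JSONDecodeError:
                    parsed = payload  # fall back to the raw text
                yield event, parsed
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Usage, with a hand-written stream standing in for the HTTP response body:

```python
stream = ["event: progress\n", 'data: {"percent": 50}\n', "\n"]
for name, payload in parse_sse(stream):
    print(name, payload)
```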
POST /download
Download YouTube video(s) to MP3 (non-streaming).
Body Parameters:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"outputPath": "./custom/path" // optional
}
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/download \
-H "Content-Type: application/json" \
-d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'
Response:
{
"success": true,
"playlistTitle": null,
"totalVideos": 1,
"successCount": 1,
"failCount": 0,
"videos": [
{
"success": true,
"title": "Video Title",
"filePath": "./output/video.mp3",
"fileUrl": "/files/video.mp3"
}
]
}
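Because a playlist download can partly fail, a client will usually filter the videos array rather than trust the top-level success flag alone. A small sketch, assuming failed entries carry "success": false (the shape of a failed entry is not shown in this document):

```python
from typing import Any, Dict, List


def successful_file_urls(response: Dict[str, Any]) -> List[str]:
    """Return the fileUrl of every video entry that reports success."""
    return [
        video["fileUrl"]
        for video in response.get("videos", [])
        if video.get("success") and video.get("fileUrl")
    ]
```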
Transcription Endpoints
POST /transcribe
Transcribe an existing audio file.
Body Parameters:
{
"filePath": "./output/audio.mp3",
"language": "en", // optional (auto-detect if not specified)
"format": "txt", // optional: txt, json, srt, vtt
"model": "gpt-4o-mini-transcribe", // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
"outputPath": "./custom/path" // optional
}
Available Models:
- gpt-4o-mini-transcribe (default) - Fast and cost-effective
- gpt-4o-transcribe - Higher quality
- whisper-1 - Original Whisper model (supports more formats)
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/transcribe \
-H "Content-Type: application/json" \
-d '{
"filePath": "./output/audio.mp3",
"language": "en",
"model": "gpt-4o-mini-transcribe"
}'
Response:
{
"success": true,
"filePath": "./output/audio.mp3",
"transcriptionPath": "./output/audio.txt",
"transcriptionUrl": "/files/audio.txt",
"text": "Transcribed text content..."
}
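The body parameters listed for POST /transcribe are annotated with // comments, which are not valid JSON and must be left out of a real request. The sketch below builds a clean body and rejects model or format values outside the lists given in this section. The function name and the client-side validation are assumptions; the server may accept or reject values differently.

```python
import json
from typing import Optional

ALLOWED_MODELS = ("gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1")
ALLOWED_FORMATS = ("txt", "json", "srt", "vtt")


def build_transcribe_body(
    file_path: str,
    language: Optional[str] = None,
    fmt: Optional[str] = None,
    model: Optional[str] = None,
    output_path: Optional[str] = None,
) -> str:
    """Return the JSON body for POST /transcribe, omitting unset options."""
    if not file_path:
        raise ValueError("filePath is required")
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    body = {
        "filePath": file_path,
        "language": language,
        "format": fmt,
        "model": model,
        "outputPath": output_path,
    }
    # Leave optional keys out entirely so the server applies its defaults
    return json.dumps({k: v for k, v in body.items() if v is not None})
```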
POST /upload-transcribe
Upload and transcribe audio files.
Form Data:
- files: Audio file(s) (multiple files supported, max 50)
- language: Language code (optional)
- model: Transcription model (optional, default: gpt-4o-mini-transcribe)
- outputPath: Custom output directory (optional)
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/upload-transcribe \
-F "files=@audio1.mp3" \
-F "files=@audio2.mp3" \
-F "language=en" \
-F "model=gpt-4o-mini-transcribe"
Response:
{
"success": true,
"totalFiles": 2,
"successCount": 2,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "audio1.mp3",
"transcriptionPath": "./output/audio1.txt",
"transcriptionUrl": "/files/audio1.txt",
"text": "Transcription..."
}
]
}
GET /process-stream
Download + Transcribe with SSE progress updates.
Query Parameters:
- url (required): YouTube URL
- language (optional): Language code
- model (optional): Transcription model (default: gpt-4o-mini-transcribe)
- outputPath (optional): Custom output directory
Example:
curl -H "X-API-Key: your_token" \
"http://localhost:3001/process-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&language=en&model=gpt-4o-mini-transcribe"
SSE Events:
- info: Video information
- progress: Progress updates (downloading or transcribing)
- video-complete: Download complete
- transcribe-complete: Transcription complete
- complete: All operations complete
- error: Error occurred
POST /process
Download + Transcribe (non-streaming).
Body Parameters:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"language": "en", // optional
"format": "txt", // optional
"model": "gpt-4o-mini-transcribe", // optional
"outputPath": "./custom/path" // optional
}
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/process \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"language": "en",
"model": "gpt-4o-mini-transcribe"
}'
Response:
{
"success": true,
"playlistTitle": null,
"totalVideos": 1,
"downloadedCount": 1,
"transcribedCount": 1,
"results": [
{
"title": "Video Title",
"downloadSuccess": true,
"audioPath": "./output/video.mp3",
"audioUrl": "/files/video.mp3",
"transcriptionSuccess": true,
"transcriptionPath": "./output/video.txt",
"transcriptionUrl": "/files/video.txt",
"text": "Transcription..."
}
]
}
POST /upload-process
🎯 Smart endpoint that auto-detects input and processes accordingly:
- Video files (MP4, AVI, MKV, etc.) → Convert to MP3 → Transcribe
- Audio files (MP3, WAV, M4A, etc.) → Transcribe directly
- URL parameter → Download from YouTube → Transcribe
- Mixed input → Process both uploaded files AND URL
This endpoint intelligently handles whatever you send it!
Form Data:
- files: Video or audio file(s) (optional, multiple files supported, max 50)
- url: YouTube URL (optional)
- language: Language code for transcription (optional)
- model: Transcription model (optional, default: gpt-4o-mini-transcribe)
- outputPath: Custom output directory (optional)
Note: You must provide either files, url, or both.
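A client can check these two constraints (at least one of files or url, and at most 50 files) before sending anything. This is a client-side sketch only; the function name is hypothetical and the server's own validation messages are not documented here.

```python
from typing import Optional, Sequence

MAX_FILES = 50  # per-request limit stated in the form data description


def check_upload_process_input(files: Sequence[str], url: Optional[str]) -> None:
    """Raise ValueError if the request would violate the documented rules."""
    if not files and not url:
        raise ValueError("provide files, url, or both")
    if len(files) > MAX_FILES:
        raise ValueError(f"too many files: {len(files)} (max {MAX_FILES})")
```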
Example 1: Upload video files
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/upload-process \
-F "files=@meeting.mp4" \
-F "files=@interview.avi" \
-F "language=en" \
-F "model=gpt-4o-mini-transcribe"
Example 2: Upload audio files
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/upload-process \
-F "files=@podcast.mp3" \
-F "files=@lecture.wav" \
-F "language=fr"
Example 3: Process YouTube URL
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/upload-process \
-F "url=https://www.youtube.com/watch?v=VIDEO_ID" \
-F "language=en"
Example 4: Mixed - Files AND URL
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/upload-process \
-F "files=@local_video.mp4" \
-F "url=https://www.youtube.com/watch?v=VIDEO_ID" \
-F "language=en"
Response:
{
"success": true,
"totalFiles": 3,
"successCount": 3,
"failCount": 0,
"results": [
{
"success": true,
"source": "upload",
"sourceType": "video",
"fileName": "meeting.mp4",
"converted": true,
"audioPath": "./output/meeting.mp3",
"audioUrl": "/files/meeting.mp3",
"transcriptionPath": "./output/meeting.txt",
"transcriptionUrl": "/files/meeting.txt",
"text": "Transcribed content..."
},
{
"success": true,
"source": "upload",
"sourceType": "audio",
"fileName": "podcast.mp3",
"converted": false,
"audioPath": "./output/podcast.mp3",
"audioUrl": "/files/podcast.mp3",
"transcriptionPath": "./output/podcast.txt",
"transcriptionUrl": "/files/podcast.txt",
"text": "Transcribed content..."
},
{
"success": true,
"source": "url",
"sourceType": "youtube",
"title": "Video Title from YouTube",
"audioPath": "./output/video_title.mp3",
"audioUrl": "/files/video_title.mp3",
"transcriptionPath": "./output/video_title.txt",
"transcriptionUrl": "/files/video_title.txt",
"text": "Transcribed content..."
}
]
}
Supported Video Formats:
- MP4, AVI, MKV, MOV, WMV, FLV, WebM, M4V
Supported Audio Formats:
- MP3, WAV, M4A, FLAC, OGG, AAC
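Using the two format lists above, a client can predict which path an uploaded file will take. The sketch classifies by file extension, which is an assumption: this document does not say how the server actually detects the input type. The function name and the returned labels are illustrative.

```python
from pathlib import PurePath

VIDEO_EXTS = {".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".m4v"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}


def expected_pipeline(filename: str) -> str:
    """Return the processing path implied by the file's extension."""
    ext = PurePath(filename).suffix.lower()
    if ext in VIDEO_EXTS:
        return "convert-then-transcribe"  # video: convert to MP3 first
    if ext in AUDIO_EXTS:
        return "transcribe"  # audio: transcribed directly
    raise ValueError(f"unsupported file type: {ext or filename!r}")
```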
Conversion Endpoints
POST /convert-to-mp3
Upload video or audio files and convert them to MP3 format.
Form Data:
- files: Video or audio file(s) (multiple files supported, max 50)
- bitrate: Audio bitrate (optional, default: 192k)
- quality: Audio quality 0-9, where 0 is best (optional, default: 2)
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/convert-to-mp3 \
-F "files=@video.mp4" \
-F "files=@another_video.avi" \
-F "bitrate=320k" \
-F "quality=0"
Response:
{
"success": true,
"totalFiles": 2,
"successCount": 2,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "video.mp4",
"inputPath": "./output/video.mp4",
"outputPath": "./output/video.mp3",
"outputUrl": "/files/video.mp3",
"size": "5.2 MB"
},
{
"success": true,
"fileName": "another_video.avi",
"inputPath": "./output/another_video.avi",
"outputPath": "./output/another_video.mp3",
"outputUrl": "/files/another_video.mp3",
"size": "3.8 MB"
}
]
}
GET /supported-formats
Get list of supported video and audio formats for conversion.
Example:
curl -H "X-API-Key: your_token" \
http://localhost:3001/supported-formats
Response:
{
"formats": {
"video": [".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".m4v"],
"audio": [".m4a", ".wav", ".flac", ".ogg", ".aac", ".wma", ".opus"]
}
}
Translation Endpoints
GET /languages
Get available translation languages.
Response:
{
"languages": {
"en": "English",
"fr": "French",
"es": "Spanish",
"de": "German",
"zh": "Chinese",
"ja": "Japanese",
...
}
}
POST /translate
Translate text.
Body Parameters:
{
"text": "Text to translate",
"targetLang": "fr", // required: target language code
"sourceLang": "en" // optional: source language (auto-detect if not specified)
}
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/translate \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, how are you?",
"targetLang": "fr"
}'
Response:
{
"success": true,
"originalText": "Hello, how are you?",
"translatedText": "Bonjour, comment allez-vous ?",
"targetLanguage": "French",
"sourceLanguage": "auto-detected",
"chunks": 1
}
POST /translate-file
Translate uploaded text files.
Form Data:
- files: Text file(s) (.txt, multiple files supported, max 50)
- targetLang: Target language code (required)
- sourceLang: Source language code (optional)
- outputPath: Custom output directory (optional)
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/translate-file \
-F "files=@document.txt" \
-F "targetLang=fr" \
-F "sourceLang=en"
Response:
{
"success": true,
"totalFiles": 1,
"successCount": 1,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "document.txt",
"translationPath": "./output/document_fr.txt",
"translationUrl": "/files/document_fr.txt",
"translatedText": "Translated content..."
}
]
}
Summarization Endpoints
GET /summary-styles
Get available summary styles.
Response:
{
"styles": {
"concise": "A brief summary capturing main points",
"detailed": "A comprehensive summary with nuances",
"bullet": "Key points as bullet points"
}
}
POST /summarize
Summarize text using GPT-5.1.
Body Parameters:
{
"text": "Long text to summarize...",
"style": "concise", // optional: concise (default), detailed, bullet
"language": "same", // optional: 'same' (default) or language code
"model": "gpt-5.1" // optional: default is gpt-5.1
}
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/summarize \
-H "Content-Type: application/json" \
-d '{
"text": "Long article content...",
"style": "bullet",
"language": "same"
}'
Response:
{
"success": true,
"summary": "Summary content...",
"model": "gpt-5.1",
"style": "bullet",
"inputLength": 5000,
"chunks": 1
}
POST /summarize-file
Summarize uploaded text files using GPT-5.1.
Form Data:
- files: Text file(s) (.txt, multiple files supported, max 50)
- style: Summary style (optional, default: concise)
- language: Output language (optional, default: same)
- model: AI model (optional, default: gpt-5.1)
- outputPath: Custom output directory (optional)
Example:
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/summarize-file \
-F "files=@article.txt" \
-F "style=detailed" \
-F "language=same"
Response:
{
"success": true,
"totalFiles": 1,
"successCount": 1,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "article.txt",
"summaryPath": "./output/article_summary.txt",
"summaryUrl": "/files/article_summary.txt",
"summary": "Summary content...",
"model": "gpt-5.1",
"chunks": 1
}
]
}
GET /summarize-stream
Full pipeline: Download -> Transcribe -> Summarize with SSE progress.
Query Parameters:
- url (required): YouTube URL
- style (optional): Summary style (default: concise)
- language (optional): Output language (default: same)
- model (optional): Transcription model (default: gpt-4o-mini-transcribe)
- outputPath (optional): Custom output directory
Example:
curl -H "X-API-Key: your_token" \
"http://localhost:3001/summarize-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&style=bullet&model=gpt-4o-mini-transcribe"
SSE Events:
- info: Video information
- progress: Progress updates (downloading, transcribing, or summarizing)
- video-complete: Download complete
- transcribe-complete: Transcription complete
- summarize-complete: Summary complete
- complete: All operations complete
- error: Error occurred
File Management
GET /files-list
List all downloaded/generated files.
Example:
curl -H "X-API-Key: your_token" \
http://localhost:3001/files-list
Response:
{
"files": [
{
"name": "video.mp3",
"url": "/files/video.mp3",
"path": "./output/video.mp3"
},
{
"name": "video.txt",
"url": "/files/video.txt",
"path": "./output/video.txt"
}
]
}
GET /files/:filename
Serve a specific file.
Example:
curl -H "X-API-Key: your_token" \
http://localhost:3001/files/video.mp3 --output video.mp3
Error Responses
All endpoints return error responses in the following format:
{
"error": "Error message describing what went wrong"
}
Common HTTP status codes:
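- 400 - Bad Request (missing required parameters)
- 401 - Unauthorized (no API key provided)
- 403 - Forbidden (invalid API key)
- 404 - Not Found (requested file does not exist)
- 500 - Internal Server Error (processing failed)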
Notes
Output Paths
All endpoints that support outputPath parameter:
- If not specified, files are saved to the default OUTPUT_DIR (./output)
- If specified, files are saved to the custom path provided
Models
- Transcription: Default is gpt-4o-mini-transcribe (cost-effective)
- Summarization: Default is gpt-5.1 (latest GPT model)
- Translation: Uses gpt-4o-mini (hardcoded)
File Formats
- Audio: MP3, WAV, M4A, OGG, FLAC
- Text: TXT files
- Transcription outputs: TXT, JSON, SRT, VTT (depending on model)
API Key
Ensure OPENAI_API_KEY is set in your .env file for transcription, translation, and summarization features to work.
Admin Endpoints
POST /admin/upload-cookies
Upload a YouTube cookies file so that downloads can pass YouTube's bot detection.
Purpose: When YouTube blocks downloads with "Sign in to confirm you're not a bot", this endpoint allows you to upload cookies from your browser to authenticate requests.
Authentication: Required (use your API token)
Request:
- Method: POST
- Content-Type: multipart/form-data
- Body: File upload with field name cookies
Example (cURL):
# Upload cookies file
curl -X POST \
-H "X-API-Key: your_api_token" \
-F "cookies=@youtube-cookies.txt" \
http://localhost:3001/admin/upload-cookies
Example (Using the automation script):
# Extract cookies from browser and upload automatically
export API_TOKEN="your_api_token"
export API_URL="http://localhost:3001"
./extract-and-upload-cookies.sh
Response (Success - 200):
{
"success": true,
"message": "Cookies uploaded successfully",
"paths": {
"local": "/home/user/project/youtube-cookies.txt",
"persistent": "/tmp/share/youtube-cookies.txt"
},
"note": "Cookies are now active. No restart required."
}
Response (Error - 400):
{
"error": "No file uploaded",
"message": "Please upload a cookies.txt file",
"help": "Export cookies from your browser using a 'Get cookies.txt' extension"
}
Response (Error - 500):
{
"error": "Failed to upload cookies",
"message": "Error details..."
}
How to Get YouTube Cookies
Method 1: Automated Script (Recommended)
Use the provided extract-and-upload-cookies.sh script:
# Set your API credentials
export API_TOKEN="your_api_token"
export API_URL="http://localhost:3001"
# Run the script - it will auto-detect your browser
./extract-and-upload-cookies.sh
The script will:
- Detect installed browsers (Chrome, Firefox, Edge)
- Extract cookies using yt-dlp
- Upload them to the API automatically
Method 2: Manual Export
- Install browser extension:
  - Chrome/Edge: Get cookies.txt LOCALLY
  - Firefox: cookies.txt
- Log in to YouTube:
  - Visit https://www.youtube.com
  - Make sure you're logged into your account
- Export cookies:
  - Click the extension icon
  - Click "Export" or "Download"
  - Save the file as youtube-cookies.txt
- Upload via API:
curl -X POST \
-H "X-API-Key: your_api_token" \
-F "cookies=@youtube-cookies.txt" \
http://localhost:3001/admin/upload-cookies
Cookie Storage
Cookies are saved to two locations:
- Local project directory: /path/to/project/youtube-cookies.txt
  - Used immediately by the API
  - Active without restart
- Persistent storage: /tmp/share/youtube-cookies.txt
  - Persists across server restarts
  - Automatically loaded on startup (via refresh-cookies.sh)
Cookie Expiration
- YouTube cookies typically expire after 2-4 weeks
- When expired, you'll see "YouTube Bot Detection" errors
- Re-upload fresh cookies using the same method
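Rather than waiting for a bot-detection error, the exported file can be checked for expired entries before uploading. The sketch assumes the file is in the Netscape cookies.txt format (tab-separated, seven fields per line, expiry as a Unix timestamp in the fifth field), which is the format yt-dlp reads; the function name is hypothetical, and which cookies YouTube actually requires is not covered here.

```python
import time
from typing import List, Optional


def expired_cookie_names(cookie_text: str, now: Optional[float] = None) -> List[str]:
    """Return names of cookies whose expiry timestamp is already in the past."""
    now = time.time() if now is None else now
    expired = []
    for line in cookie_text.splitlines():
        if line.startswith("#HttpOnly_"):
            # HttpOnly cookies are stored behind this prefix, not as comments
            line = line[len("#HttpOnly_"):]
        elif not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a well-formed cookie row
        expires, name = fields[4], fields[5]
        # An expiry of 0 marks a session cookie, which has no fixed end date
        if expires.isdigit() and 0 < int(expires) < now:
            expired.append(name)
    return expired
```

An empty result means no cookie in the file has passed its stated expiry; it does not guarantee that YouTube still accepts the session.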
Security Notes
⚠️ Important Cookie Security:
- Cookies = Your YouTube session (treat like a password)
- Never commit youtube-cookies.txt to git (already in .gitignore)
- Don't share publicly
- File permissions are automatically set to 600 (owner read/write only)
- Re-export periodically when they expire
Security Configuration
Environment Variables
Required security variables in .env:
# API Authentication Token
API_TOKEN=your_secure_random_token_here
# CORS - Allowed Origins (comma-separated)
# Development: * (all origins)
# Production: https://yourdomain.com,https://app.yourdomain.com
ALLOWED_ORIGINS=*
# Server Port
PORT=3001
# Output Directory
OUTPUT_DIR=./output
# OpenAI API Key (required for AI features)
OPENAI_API_KEY=sk-...
Security Features
The API implements the following security measures:
- API Token Authentication
  - All endpoints (except /health, /api, and /public/download/:filename) require authentication
  - Supports both X-API-Key and Authorization: Bearer headers
- CORS Protection
  - Configurable allowed origins via ALLOWED_ORIGINS
  - Restricts cross-origin requests to trusted domains
- HTTP Security Headers
  - X-Content-Type-Options: nosniff
  - X-Frame-Options: DENY
  - X-XSS-Protection: 1; mode=block
  - Strict-Transport-Security: max-age=31536000; includeSubDomains
  - Content-Security-Policy with strict policies
- Input Validation
  - File type validation for uploads
  - Parameter validation on all endpoints
Production Deployment Checklist
Before deploying to production:
- Generate a strong, unique API_TOKEN (min 32 characters)
- Set ALLOWED_ORIGINS to your specific domains (remove *)
- Ensure OPENAI_API_KEY is properly set
- Use HTTPS (not HTTP) for all connections
- Set up rate limiting (recommended via reverse proxy)
- Configure firewall rules
- Set up monitoring and logging
- Review and secure file upload limits
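The first three checklist items concern environment variables and can be verified mechanically before a deploy. The following sketch checks them against a mapping such as `os.environ`; the function name, the warning texts, and the extra check that each origin starts with https:// are assumptions added for illustration. The remaining items (rate limiting, firewall, monitoring, upload limits) need to be reviewed outside the application.

```python
from typing import List, Mapping


def production_warnings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the security-related variables."""
    warnings = []
    if len(env.get("API_TOKEN", "")) < 32:
        warnings.append("API_TOKEN is missing or shorter than 32 characters")
    # ALLOWED_ORIGINS is a comma-separated list of origins
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    if not origins or "*" in origins:
        warnings.append("ALLOWED_ORIGINS is unset or still allows all origins (*)")
    elif any(not o.startswith("https://") for o in origins):
        warnings.append("ALLOWED_ORIGINS contains a non-HTTPS origin")
    if not env.get("OPENAI_API_KEY"):
        warnings.append("OPENAI_API_KEY is not set")
    return warnings
```

Usage:

```python
import os

for problem in production_warnings(os.environ):
    print("WARNING:", problem)
```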
Example Authenticated Requests
Using X-API-Key header:
# Download endpoint
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/download \
-H "Content-Type: application/json" \
-d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'
# Transcribe endpoint
curl -H "X-API-Key: your_token" \
-X POST http://localhost:3001/transcribe \
-H "Content-Type: application/json" \
-d '{"filePath":"./output/audio.mp3"}'
Using Authorization Bearer:
curl -H "Authorization: Bearer your_token" \
-X POST http://localhost:3001/summarize \
-H "Content-Type: application/json" \
-d '{"text":"Long text to summarize..."}'