
API Documentation - Video to MP3 Transcriptor

Base URL

http://localhost:3001

🔐 Authentication

⚠️ IMPORTANT: All API endpoints except /health, /api, and /public/download/:filename require authentication using an API token.

How to Authenticate

Include your API token in one of these ways:

Option 1: X-API-Key header (Recommended)

curl -H "X-API-Key: your_api_token_here" http://localhost:3001/endpoint

Option 2: Authorization Bearer header

curl -H "Authorization: Bearer your_api_token_here" http://localhost:3001/endpoint
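For clients written in Python, the two header options can be sketched as a small helper. The function name `auth_headers` and the `scheme` argument are illustrative and not part of the API; only the header names come from this document.

```python
def auth_headers(token: str, scheme: str = "x-api-key") -> dict:
    """Return the HTTP header for one of the two documented auth options."""
    if scheme == "x-api-key":
        # Option 1 (recommended): X-API-Key header
        return {"X-API-Key": token}
    if scheme == "bearer":
        # Option 2: Authorization Bearer header
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

Either dictionary can be passed as the headers of any HTTP client call against a protected endpoint.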

Configuration

  1. Set your API token in .env:

    API_TOKEN=your_secure_token_here
    
  2. Generate a secure token for production:

    # Linux/Mac
    openssl rand -hex 32
    
    # Or use any secure random string generator
    

Security Notes

  • Public endpoints (no auth required): /health, /api, /public/download/:filename
  • Protected endpoints: All other endpoints require authentication
  • In development: If API_TOKEN is not set, the API will work without authentication (with a warning)
  • In production: Always set a strong API_TOKEN

Error Responses

401 Unauthorized - No API key provided:

{
  "error": "Unauthorized",
  "message": "API key required. Provide X-API-Key header or Authorization: Bearer <token>"
}

403 Forbidden - Invalid API key:

{
  "error": "Forbidden",
  "message": "Invalid API key"
}


Health & Info

GET /health

Health check endpoint.

Authentication: Not required (public)

Response:

{
  "status": "ok",
  "timestamp": "2025-11-28T12:00:00.000Z"
}

GET /api

Get API information and available endpoints.

Authentication: Not required (public)

Response:

{
  "name": "Video to MP3 Transcriptor API",
  "version": "1.0.0",
  "endpoints": { ... }
}

Public Download Endpoint

GET /public/download/:filename

Public endpoint to download files without authentication.

Authentication: Not required (public)

Purpose: Share direct download links for generated files (MP3, transcriptions, translations, summaries) without requiring API authentication.

URL Parameters:

  • filename (required): Name of the file to download

Security:

  • Directory traversal protection enabled (uses path.basename())
  • Only files in the configured OUTPUT_DIR are accessible
  • No authentication required
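The basename-based protection mentioned above can be illustrated as follows. This is a Python sketch of the idea, not the server's own code (the document only states that the server uses path.basename()); the function name `resolve_download_path` and the default directory are assumptions.

```python
import posixpath


def resolve_download_path(requested: str, output_dir: str = "./output") -> str:
    """Map a client-supplied filename to a path inside output_dir."""
    # Keep only the final path component, discarding any directories
    # (including "../" sequences) the client supplied.
    name = posixpath.basename(requested)
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    return posixpath.join(output_dir, name)
```

With this approach a request for `../../etc/passwd` resolves to `./output/passwd`, which stays inside the output directory and simply produces a 404 if no such file exists.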

Example:

# Direct download (no auth needed)
curl -O http://localhost:3001/public/download/my_video.mp3

# Or simply open in browser
http://localhost:3001/public/download/my_video.mp3

Response (Success):

  • File download with proper Content-Disposition headers
  • Browser will prompt to download the file

Response (Error - 404):

{
  "error": "File not found",
  "message": "File 'my_video.mp3' does not exist"
}

Response (Error - 500):

{
  "error": "Download failed",
  "message": "Error details..."
}

Use Cases:

  • Share download links via email/chat
  • Embed in web applications
  • Direct browser downloads
  • Public file sharing

Note: After processing (download, transcription, etc.), take the filePath or fileUrl returned by the authenticated endpoint and build the public URL from its base name:

/public/download/{basename_of_filePath}
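A minimal sketch of that construction in Python. The helper name `public_download_url` is illustrative; percent-encoding the filename is an added precaution for names containing spaces or other special characters.

```python
import posixpath
from urllib.parse import quote


def public_download_url(base_url: str, file_path: str) -> str:
    """Build a /public/download/ URL from a returned filePath or fileUrl."""
    name = posixpath.basename(file_path)  # "./output/video.mp3" -> "video.mp3"
    return f"{base_url.rstrip('/')}/public/download/{quote(name)}"
```

For example, a filePath of `./output/video.mp3` with base URL `http://localhost:3001` gives `http://localhost:3001/public/download/video.mp3`.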

GET /info

Get information about a YouTube video or playlist.

Query Parameters:

  • url (required): YouTube URL

Example:

curl -H "X-API-Key: your_token" \
  "http://localhost:3001/info?url=https://www.youtube.com/watch?v=VIDEO_ID"

Response:

{
  "success": true,
  "title": "Video Title",
  "type": "video",
  "duration": 300,
  "channel": "Channel Name",
  "videoCount": 1
}
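The curl example passes the YouTube URL unencoded, which works for a single `v=` parameter. A URL that contains `&` (for instance a playlist link) would be split into separate query parameters unless it is percent-encoded. A sketch of building the request URL safely, with `info_url` as an illustrative name:

```python
from urllib.parse import urlencode


def info_url(base_url: str, video_url: str) -> str:
    """Build the GET /info URL with the video URL percent-encoded."""
    query = urlencode({"url": video_url})
    return f"{base_url.rstrip('/')}/info?{query}"
```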

Download Endpoints

GET /download-stream

Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.

Query Parameters:

  • url (required): YouTube URL
  • outputPath (optional): Custom output directory path

Example:

curl -H "X-API-Key: your_token" \
  "http://localhost:3001/download-stream?url=https://www.youtube.com/watch?v=VIDEO_ID"

SSE Events:

  • info: Video/playlist information
  • progress: Download progress updates
  • video-complete: Individual video completion
  • complete: All downloads complete
  • error: Error occurred
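A client can consume these events with a small parser. The sketch below assumes the server uses standard SSE framing (`event:` and `data:` lines, with a blank line ending each event) and that the event names above appear in the `event:` field; this document does not show the wire format, so verify it against real output. The `percent` field in the usage example is hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def _decode(payload: str) -> Any:
    """Decode a data payload as JSON, falling back to the raw string."""
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return payload


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, data) pairs from the lines of an SSE stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment or keep-alive line
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips one optional leading space from the value
            data.append(value[1:] if value.startswith(" ") else value)
        elif line == "":
            if data:  # a blank line dispatches the buffered event
                yield event, _decode("\n".join(data))
            event, data = "message", []
```

Usage against a captured stream:

```python
stream = ["event: progress\n", 'data: {"percent": 50}\n', "\n"]
for name, payload in parse_sse(stream):
    print(name, payload)
```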

POST /download

Download YouTube video(s) to MP3 (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "outputPath": "./custom/path"  // optional
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'
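The same request can be prepared from Python with only the standard library. This is a sketch; `build_download_request` is an illustrative name, and sending the request requires a running server.

```python
import json
import urllib.request


def build_download_request(base_url: str, token: str, video_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST /download request."""
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/download",
        data=body,
        method="POST",
        headers={"X-API-Key": token, "Content-Type": "application/json"},
    )


# To send it against a running server:
# with urllib.request.urlopen(build_download_request(...)) as resp:
#     result = json.load(resp)
```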

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "successCount": 1,
  "failCount": 0,
  "videos": [
    {
      "success": true,
      "title": "Video Title",
      "filePath": "./output/video.mp3",
      "fileUrl": "/files/video.mp3"
    }
  ]
}

Transcription Endpoints

POST /transcribe

Transcribe an existing audio file.

Body Parameters:

{
  "filePath": "./output/audio.mp3",
  "language": "en",  // optional (auto-detect if not specified)
  "format": "txt",   // optional: txt, json, srt, vtt
  "model": "gpt-4o-mini-transcribe",  // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
  "outputPath": "./custom/path"  // optional
}

Available Models:

  • gpt-4o-mini-transcribe (default) - Fast and cost-effective
  • gpt-4o-transcribe - Higher quality
  • whisper-1 - Original Whisper model (supports more formats)
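A client may want to validate the model and format values before sending a request. The sketch below builds a /transcribe body from the values listed in this section; the function name `transcribe_payload` is illustrative, and optional fields are left out when not set so the server applies its own defaults.

```python
MODELS = ("gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1")
FORMATS = ("txt", "json", "srt", "vtt")


def transcribe_payload(file_path, language=None, fmt=None, model=None, output_path=None):
    """Build a JSON-serializable body for POST /transcribe."""
    if model is not None and model not in MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    payload = {"filePath": file_path}
    # Only include the optional fields that were actually provided
    optional = {"language": language, "format": fmt, "model": model, "outputPath": output_path}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```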

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/transcribe \
  -H "Content-Type: application/json" \
  -d '{
    "filePath": "./output/audio.mp3",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "filePath": "./output/audio.mp3",
  "transcriptionPath": "./output/audio.txt",
  "transcriptionUrl": "/files/audio.txt",
  "text": "Transcribed text content..."
}

POST /upload-transcribe

Upload and transcribe audio files.

Form Data:

  • files: Audio file(s) (multiple files supported, max 50)
  • language: Language code (optional)
  • model: Transcription model (optional, default: gpt-4o-mini-transcribe)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/upload-transcribe \
  -F "files=@audio1.mp3" \
  -F "files=@audio2.mp3" \
  -F "language=en" \
  -F "model=gpt-4o-mini-transcribe"

Response:

{
  "success": true,
  "totalFiles": 2,
  "successCount": 2,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "audio1.mp3",
      "transcriptionPath": "./output/audio1.txt",
      "transcriptionUrl": "/files/audio1.txt",
      "text": "Transcription..."
    }
  ]
}
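Batch responses like this one report per-file outcomes in results. A sketch of separating successes from failures on the client side follows; the shape of a failed entry is not shown in this document, so the code relies only on the success flag. The name `split_results` is illustrative.

```python
def split_results(response: dict):
    """Return (succeeded, failed) lists from a batch response."""
    results = response.get("results", [])
    succeeded = [r for r in results if r.get("success")]
    failed = [r for r in results if not r.get("success")]
    return succeeded, failed
```

The same helper applies to the other batch endpoints that return a results array with per-item success flags.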

GET /process-stream

Download + Transcribe with SSE progress updates.

Query Parameters:

  • url (required): YouTube URL
  • language (optional): Language code
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -H "X-API-Key: your_token" \
  "http://localhost:3001/process-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&language=en&model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading or transcribing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • complete: All operations complete
  • error: Error occurred

POST /process

Download + Transcribe (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "language": "en",  // optional
  "format": "txt",   // optional
  "model": "gpt-4o-mini-transcribe",  // optional
  "outputPath": "./custom/path"  // optional
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "downloadedCount": 1,
  "transcribedCount": 1,
  "results": [
    {
      "title": "Video Title",
      "downloadSuccess": true,
      "audioPath": "./output/video.mp3",
      "audioUrl": "/files/video.mp3",
      "transcriptionSuccess": true,
      "transcriptionPath": "./output/video.txt",
      "transcriptionUrl": "/files/video.txt",
      "text": "Transcription..."
    }
  ]
}

POST /upload-process

🎯 Smart endpoint that auto-detects input and processes accordingly:

  • Video files (MP4, AVI, MKV, etc.) → Convert to MP3 → Transcribe
  • Audio files (MP3, WAV, M4A, etc.) → Transcribe directly
  • URL parameter → Download from YouTube → Transcribe
  • Mixed input → Process both uploaded files AND URL

This endpoint intelligently handles whatever you send it!

Form Data:

  • files: Video or audio file(s) (optional, multiple files supported, max 50)
  • url: YouTube URL (optional)
  • language: Language code for transcription (optional)
  • model: Transcription model (optional, default: gpt-4o-mini-transcribe)
  • outputPath: Custom output directory (optional)

Note: You must provide either files, url, or both.
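A client-side check of this rule, together with the 50-file limit, might look like the sketch below. The function name and error messages are illustrative; the server's own validation may differ.

```python
from typing import List, Optional

MAX_FILES = 50  # documented upload limit


def check_upload_process_input(files: List[str], url: Optional[str]) -> None:
    """Raise ValueError if the request would be rejected as incomplete."""
    if not files and not url:
        raise ValueError("provide files, url, or both")
    if len(files) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files per request")
```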

Example 1: Upload video files

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/upload-process \
  -F "files=@meeting.mp4" \
  -F "files=@interview.avi" \
  -F "language=en" \
  -F "model=gpt-4o-mini-transcribe"

Example 2: Upload audio files

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/upload-process \
  -F "files=@podcast.mp3" \
  -F "files=@lecture.wav" \
  -F "language=fr"

Example 3: Process YouTube URL

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/upload-process \
  -F "url=https://www.youtube.com/watch?v=VIDEO_ID" \
  -F "language=en"

Example 4: Mixed - Files AND URL

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/upload-process \
  -F "files=@local_video.mp4" \
  -F "url=https://www.youtube.com/watch?v=VIDEO_ID" \
  -F "language=en"

Response:

{
  "success": true,
  "totalFiles": 3,
  "successCount": 3,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "source": "upload",
      "sourceType": "video",
      "fileName": "meeting.mp4",
      "converted": true,
      "audioPath": "./output/meeting.mp3",
      "audioUrl": "/files/meeting.mp3",
      "transcriptionPath": "./output/meeting.txt",
      "transcriptionUrl": "/files/meeting.txt",
      "text": "Transcribed content..."
    },
    {
      "success": true,
      "source": "upload",
      "sourceType": "audio",
      "fileName": "podcast.mp3",
      "converted": false,
      "audioPath": "./output/podcast.mp3",
      "audioUrl": "/files/podcast.mp3",
      "transcriptionPath": "./output/podcast.txt",
      "transcriptionUrl": "/files/podcast.txt",
      "text": "Transcribed content..."
    },
    {
      "success": true,
      "source": "url",
      "sourceType": "youtube",
      "title": "Video Title from YouTube",
      "audioPath": "./output/video_title.mp3",
      "audioUrl": "/files/video_title.mp3",
      "transcriptionPath": "./output/video_title.txt",
      "transcriptionUrl": "/files/video_title.txt",
      "text": "Transcribed content..."
    }
  ]
}

Supported Video Formats:

  • MP4, AVI, MKV, MOV, WMV, FLV, WebM, M4V

Supported Audio Formats:

  • MP3, WAV, M4A, FLAC, OGG, AAC
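The auto-detection described for this endpoint can be pictured as a lookup on the file extension. The sketch below uses the two format lists above; how the server actually detects the type (extension, MIME type, or probing the file) is not stated in this document, and the step names are hypothetical.

```python
from pathlib import PurePosixPath

VIDEO_EXTS = {".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".m4v"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}


def plan_steps(file_name: str) -> list:
    """Return the processing steps implied by a file's extension."""
    ext = PurePosixPath(file_name).suffix.lower()
    if ext in VIDEO_EXTS:
        return ["convert_to_mp3", "transcribe"]  # video: convert first
    if ext in AUDIO_EXTS:
        return ["transcribe"]                    # audio: transcribe directly
    raise ValueError(f"unsupported file type: {ext or file_name!r}")
```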

Conversion Endpoints

POST /convert-to-mp3

Upload video or audio files and convert them to MP3 format.

Form Data:

  • files: Video or audio file(s) (multiple files supported, max 50)
  • bitrate: Audio bitrate (optional, default: 192k)
  • quality: Audio quality 0-9, where 0 is best (optional, default: 2)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/convert-to-mp3 \
  -F "files=@video.mp4" \
  -F "files=@another_video.avi" \
  -F "bitrate=320k" \
  -F "quality=0"

Response:

{
  "success": true,
  "totalFiles": 2,
  "successCount": 2,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "video.mp4",
      "inputPath": "./output/video.mp4",
      "outputPath": "./output/video.mp3",
      "outputUrl": "/files/video.mp3",
      "size": "5.2 MB"
    },
    {
      "success": true,
      "fileName": "another_video.avi",
      "inputPath": "./output/another_video.avi",
      "outputPath": "./output/another_video.mp3",
      "outputUrl": "/files/another_video.mp3",
      "size": "3.8 MB"
    }
  ]
}

GET /supported-formats

Get list of supported video and audio formats for conversion.

Example:

curl -H "X-API-Key: your_token" \
  http://localhost:3001/supported-formats

Response:

{
  "formats": {
    "video": [".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".m4v"],
    "audio": [".m4a", ".wav", ".flac", ".ogg", ".aac", ".wma", ".opus"]
  }
}

Translation Endpoints

GET /languages

Get available translation languages.

Response:

{
  "languages": {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "zh": "Chinese",
    "ja": "Japanese",
    ...
  }
}

POST /translate

Translate text.

Body Parameters:

{
  "text": "Text to translate",
  "targetLang": "fr",  // required: target language code
  "sourceLang": "en"   // optional: source language (auto-detect if not specified)
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how are you?",
    "targetLang": "fr"
  }'

Response:

{
  "success": true,
  "originalText": "Hello, how are you?",
  "translatedText": "Bonjour, comment allez-vous ?",
  "targetLanguage": "French",
  "sourceLanguage": "auto-detected",
  "chunks": 1
}

POST /translate-file

Translate uploaded text files.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • targetLang: Target language code (required)
  • sourceLang: Source language code (optional)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/translate-file \
  -F "files=@document.txt" \
  -F "targetLang=fr" \
  -F "sourceLang=en"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "document.txt",
      "translationPath": "./output/document_fr.txt",
      "translationUrl": "/files/document_fr.txt",
      "translatedText": "Translated content..."
    }
  ]
}

Summarization Endpoints

GET /summary-styles

Get available summary styles.

Response:

{
  "styles": {
    "concise": "A brief summary capturing main points",
    "detailed": "A comprehensive summary with nuances",
    "bullet": "Key points as bullet points"
  }
}

POST /summarize

Summarize text using GPT-5.1.

Body Parameters:

{
  "text": "Long text to summarize...",
  "style": "concise",  // optional: concise (default), detailed, bullet
  "language": "same",  // optional: 'same' (default) or language code
  "model": "gpt-5.1"   // optional: default is gpt-5.1
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Long article content...",
    "style": "bullet",
    "language": "same"
  }'

Response:

{
  "success": true,
  "summary": "Summary content...",
  "model": "gpt-5.1",
  "style": "bullet",
  "inputLength": 5000,
  "chunks": 1
}

POST /summarize-file

Summarize uploaded text files using GPT-5.1.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • style: Summary style (optional, default: concise)
  • language: Output language (optional, default: same)
  • model: AI model (optional, default: gpt-5.1)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/summarize-file \
  -F "files=@article.txt" \
  -F "style=detailed" \
  -F "language=same"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "article.txt",
      "summaryPath": "./output/article_summary.txt",
      "summaryUrl": "/files/article_summary.txt",
      "summary": "Summary content...",
      "model": "gpt-5.1",
      "chunks": 1
    }
  ]
}

GET /summarize-stream

Full pipeline: Download -> Transcribe -> Summarize with SSE progress.

Query Parameters:

  • url (required): YouTube URL
  • style (optional): Summary style (default: concise)
  • language (optional): Output language (default: same)
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -H "X-API-Key: your_token" \
  "http://localhost:3001/summarize-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&style=bullet&model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading, transcribing, or summarizing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • summarize-complete: Summary complete
  • complete: All operations complete
  • error: Error occurred
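For a progress display, a client can map these events onto pipeline stages. The sketch below assumes a single video and that events arrive in the Download -> Transcribe -> Summarize order described above; the stage names are hypothetical, and a playlist would repeat the per-video events.

```python
# Stage the pipeline is in after each event has been received
STAGE_AFTER = {
    "info": "downloading",
    "video-complete": "transcribing",
    "transcribe-complete": "summarizing",
    "summarize-complete": "finishing",
    "complete": "done",
    "error": "failed",
}


def track_stage(event_names) -> str:
    """Return the current stage after a sequence of SSE event names."""
    stage = "starting"
    for name in event_names:
        # progress events are not in the table, so they keep the current stage
        stage = STAGE_AFTER.get(name, stage)
    return stage
```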

File Management

GET /files-list

List all downloaded/generated files.

Example:

curl -H "X-API-Key: your_token" \
  http://localhost:3001/files-list

Response:

{
  "files": [
    {
      "name": "video.mp3",
      "url": "/files/video.mp3",
      "path": "./output/video.mp3"
    },
    {
      "name": "video.txt",
      "url": "/files/video.txt",
      "path": "./output/video.txt"
    }
  ]
}
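Because an audio file and its transcription share a base name, the list can be grouped per source on the client side. A minimal sketch, with `group_by_stem` as an illustrative name:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_stem(files: list) -> dict:
    """Group file URLs by name without extension."""
    groups = defaultdict(list)
    for entry in files:
        stem = PurePosixPath(entry["name"]).stem  # "video.mp3" -> "video"
        groups[stem].append(entry["url"])
    return dict(groups)
```

Applied to the response above, this returns one group, `video`, containing both `/files/video.mp3` and `/files/video.txt`.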

GET /files/:filename

Serve a specific file.

Example:

curl -H "X-API-Key: your_token" \
  http://localhost:3001/files/video.mp3 --output video.mp3

Error Responses

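All endpoints return errors as JSON with an error field; some responses also include a message field with more detail:

{
  "error": "Error message describing what went wrong"
}

Common HTTP status codes:

  • 400 - Bad Request (missing required parameters)
  • 401 - Unauthorized (no API key provided)
  • 403 - Forbidden (invalid API key)
  • 404 - Not Found (requested file does not exist)
  • 500 - Internal Server Error (processing failed)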
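A client can turn a status code and error body into a readable message. The sketch below covers the status codes that appear in this document (400, 401, 403, 404, 500) and prefers the message field when a response includes one; the hint texts and the name `describe_error` are illustrative.

```python
HINTS = {
    400: "check required parameters",
    401: "send an API token",
    403: "check the API token",
    404: "check the filename",
    500: "processing failed on the server",
}


def describe_error(status: int, body: dict) -> str:
    """Format an error response for logs or user-facing output."""
    text = body.get("message") or body.get("error") or "unknown error"
    hint = HINTS.get(status)
    return f"{status}: {text}" + (f" ({hint})" if hint else "")
```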

Notes

Output Paths

For endpoints that support the outputPath parameter:

  • If not specified, files are saved to the default OUTPUT_DIR (./output)
  • If specified, files are saved to the custom path provided

Models

  • Transcription: Default is gpt-4o-mini-transcribe (cost-effective)
  • Summarization: Default is gpt-5.1 (latest GPT model)
  • Translation: Uses gpt-4o-mini (hardcoded)

File Formats

  • Audio: MP3, WAV, M4A, OGG, FLAC
  • Text: TXT files
  • Transcription outputs: TXT, JSON, SRT, VTT (depending on model)

API Key

Ensure OPENAI_API_KEY is set in your .env file for transcription, translation, and summarization features to work.


Admin Endpoints

POST /admin/upload-cookies

Upload a YouTube cookies file so that downloads can get past YouTube's bot detection.

Purpose: When YouTube blocks downloads with "Sign in to confirm you're not a bot", this endpoint allows you to upload cookies from your browser to authenticate requests.

Authentication: Required (use your API token)

Request:

  • Method: POST
  • Content-Type: multipart/form-data
  • Body: File upload with field name cookies

Example (cURL):

# Upload cookies file
curl -X POST \
  -H "X-API-Key: your_api_token" \
  -F "cookies=@youtube-cookies.txt" \
  http://localhost:3001/admin/upload-cookies

Example (Using the automation script):

# Extract cookies from browser and upload automatically
export API_TOKEN="your_api_token"
export API_URL="http://localhost:3001"
./extract-and-upload-cookies.sh

Response (Success - 200):

{
  "success": true,
  "message": "Cookies uploaded successfully",
  "paths": {
    "local": "/home/user/project/youtube-cookies.txt",
    "persistent": "/tmp/share/youtube-cookies.txt"
  },
  "note": "Cookies are now active. No restart required."
}

Response (Error - 400):

{
  "error": "No file uploaded",
  "message": "Please upload a cookies.txt file",
  "help": "Export cookies from your browser using a 'Get cookies.txt' extension"
}

Response (Error - 500):

{
  "error": "Failed to upload cookies",
  "message": "Error details..."
}

How to Get YouTube Cookies

Method 1: Automated Script (Recommended)

Use the provided extract-and-upload-cookies.sh script:

# Set your API credentials
export API_TOKEN="your_api_token"
export API_URL="http://localhost:3001"

# Run the script - it will auto-detect your browser
./extract-and-upload-cookies.sh

The script will:

  1. Detect installed browsers (Chrome, Firefox, Edge)
  2. Extract cookies using yt-dlp
  3. Upload them to the API automatically

Method 2: Manual Export

  1. Install a browser extension that exports cookies in cookies.txt format.

  2. Log in to YouTube in that browser.

  3. Export cookies:

    • Click the extension icon
    • Click "Export" or "Download"
    • Save the file as youtube-cookies.txt

  4. Upload via API:

    curl -X POST \
      -H "X-API-Key: your_api_token" \
      -F "cookies=@youtube-cookies.txt" \
      http://localhost:3001/admin/upload-cookies

Cookies are saved to two locations:

  1. Local project directory: /path/to/project/youtube-cookies.txt

    • Used immediately by the API
    • Active without restart

  2. Persistent storage: /tmp/share/youtube-cookies.txt

    • Persists across server restarts
    • Automatically loaded on startup (via refresh-cookies.sh)

Cookie expiration:

  • YouTube cookies typically expire after 2-4 weeks
  • When expired, you'll see "YouTube Bot Detection" errors
  • Re-upload fresh cookies using the same method
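Since expiry shows up only as download errors, a scheduled check on the age of the cookies file can prompt a refresh earlier. The sketch below uses the lower bound of the 2-4 week range as a threshold; the threshold, the function name, and the idea of checking the file's modification time are assumptions rather than features of the API.

```python
import time
from typing import Optional

COOKIE_MAX_AGE_DAYS = 14  # lower bound of the "2-4 weeks" range above


def cookies_look_stale(mtime: float, now: Optional[float] = None) -> bool:
    """Return True if a file last modified at mtime is older than the threshold."""
    current = time.time() if now is None else now
    age_days = (current - mtime) / 86400  # seconds per day
    return age_days > COOKIE_MAX_AGE_DAYS
```

In practice `mtime` would come from `os.path.getmtime("youtube-cookies.txt")`.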

Security Notes

⚠️ Important Cookie Security:

  • Cookies = Your YouTube session (treat like a password)
  • Never commit youtube-cookies.txt to git (already in .gitignore)
  • Don't share publicly
  • File permissions are automatically set to 600 (owner read/write only)
  • Re-export periodically when they expire
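The server sets the permissions itself, but a cookies file exported by hand on a client machine deserves the same treatment before upload. A sketch for POSIX systems, with `restrict_to_owner` as an illustrative name:

```python
import os
import stat


def restrict_to_owner(path: str) -> int:
    """Set mode 600 (owner read/write only) and return the resulting mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to chmod 600
    return stat.S_IMODE(os.stat(path).st_mode)
```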

Security Configuration

Environment Variables

Required security variables in .env:

# API Authentication Token
API_TOKEN=your_secure_random_token_here

# CORS - Allowed Origins (comma-separated)
# Development: * (all origins)
# Production: https://yourdomain.com,https://app.yourdomain.com
ALLOWED_ORIGINS=*

# Server Port
PORT=3001

# Output Directory
OUTPUT_DIR=./output

# OpenAI API Key (required for AI features)
OPENAI_API_KEY=sk-...
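The comma-separated ALLOWED_ORIGINS value can be interpreted as in the sketch below. This illustrates the format only; the server's actual parsing and matching rules are not shown in this document, and both function names are hypothetical.

```python
def parse_allowed_origins(value: str) -> list:
    """Split a comma-separated ALLOWED_ORIGINS value into a list."""
    return [origin.strip() for origin in value.split(",") if origin.strip()]


def origin_allowed(origin: str, allowed: list) -> bool:
    """Return True if origin is listed, or if the wildcard * is present."""
    return "*" in allowed or origin in allowed
```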

Security Features

The API implements the following security measures:

  1. API Token Authentication

    • All endpoints except /health, /api, and /public/download/:filename require authentication
    • Supports both X-API-Key and Authorization: Bearer headers
  2. CORS Protection

    • Configurable allowed origins via ALLOWED_ORIGINS
    • Restricts cross-origin requests to trusted domains
  3. HTTP Security Headers

    • X-Content-Type-Options: nosniff
    • X-Frame-Options: DENY
    • X-XSS-Protection: 1; mode=block
    • Strict-Transport-Security: max-age=31536000; includeSubDomains
    • Content-Security-Policy with strict policies
  4. Input Validation

    • File type validation for uploads
    • Parameter validation on all endpoints

Production Deployment Checklist

Before deploying to production:

  • Generate a strong, unique API_TOKEN (min 32 characters)
  • Set ALLOWED_ORIGINS to your specific domains (remove *)
  • Ensure OPENAI_API_KEY is properly set
  • Use HTTPS (not HTTP) for all connections
  • Set up rate limiting (recommended via reverse proxy)
  • Configure firewall rules
  • Set up monitoring and logging
  • Review and secure file upload limits
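The items in this checklist that concern environment variables can be checked automatically before deployment. The sketch below covers only those three items; the function name and warning texts are illustrative.

```python
def production_warnings(env: dict) -> list:
    """Return warnings for checklist items detectable from the environment."""
    warnings = []
    if len(env.get("API_TOKEN", "")) < 32:
        warnings.append("API_TOKEN should be at least 32 characters")
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    if not origins or "*" in origins:
        warnings.append("ALLOWED_ORIGINS should list specific domains")
    if not env.get("OPENAI_API_KEY"):
        warnings.append("OPENAI_API_KEY is not set")
    return warnings
```

Calling `production_warnings(dict(os.environ))` at startup, after `import os`, and refusing to start in production when the list is non-empty is one way to apply it.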

Example Authenticated Requests

Using X-API-Key header:

# Download endpoint
curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'

# Transcribe endpoint
curl -H "X-API-Key: your_token" \
  -X POST http://localhost:3001/transcribe \
  -H "Content-Type: application/json" \
  -d '{"filePath":"./output/audio.mp3"}'

Using Authorization Bearer:

curl -H "Authorization: Bearer your_token" \
  -X POST http://localhost:3001/summarize \
  -H "Content-Type: application/json" \
  -d '{"text":"Long text to summarize..."}'