API Documentation - Video to MP3 Transcriptor

Base URL

http://localhost:8888

🔐 Authentication

⚠️ IMPORTANT: All API endpoints (except /health and /api) require authentication using an API token.

How to Authenticate

Include your API token in one of these ways:

Option 1: X-API-Key header (Recommended)

curl -H "X-API-Key: your_api_token_here" http://localhost:8888/endpoint

Option 2: Authorization Bearer header

curl -H "Authorization: Bearer your_api_token_here" http://localhost:8888/endpoint
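For clients written in Python, the two header options above can be sketched as a small helper. The function name `auth_headers` and its `scheme` argument are illustrative and not part of the API.

```python
def auth_headers(token: str, scheme: str = "x-api-key") -> dict[str, str]:
    """Return the HTTP header that carries the API token.

    scheme="x-api-key" -> X-API-Key header (the recommended option)
    scheme="bearer"    -> Authorization: Bearer header
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    if scheme == "x-api-key":
        return {"X-API-Key": token}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

The returned dictionary can be passed as the `headers` argument of most Python HTTP clients.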

Configuration

  1. Set your API token in .env:

    API_TOKEN=your_secure_token_here
    
  2. Generate a secure token for production:

    # Linux/Mac
    openssl rand -hex 32
    
    # Or use any secure random string generator
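Where openssl is not available, a token of the same shape can be produced with Python's standard `secrets` module. This is a minimal sketch; the function name `generate_api_token` is illustrative.

```python
import secrets


def generate_api_token(num_bytes: int = 32) -> str:
    """Return a random hex token; 32 random bytes give 64 hex characters."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into .env
    print(f"API_TOKEN={generate_api_token()}")
```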
    

Security Notes

  • Public endpoints (no auth required): /health, /api
  • Protected endpoints: All other endpoints require authentication
  • In development: If API_TOKEN is not set, the API will work without authentication (with a warning)
  • In production: Always set a strong API_TOKEN

Error Responses

401 Unauthorized - No API key provided:

{
  "error": "Unauthorized",
  "message": "API key required. Provide X-API-Key header or Authorization: Bearer <token>"
}

403 Forbidden - Invalid API key:

{
  "error": "Forbidden",
  "message": "Invalid API key"
}
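The rules described in this section can be summarized as a decision function. The sketch below is not the server's actual middleware. It only restates the documented behavior in Python, and the name `check_auth` and the constant-time comparison are assumptions made for illustration.

```python
import hmac
from typing import Optional

PUBLIC_PATHS = {"/health", "/api"}  # endpoints documented as public


def check_auth(path: str, headers: dict[str, str],
               api_token: Optional[str]) -> Optional[tuple[int, dict]]:
    """Return None if the request may proceed, else (status, error_body)."""
    # Public endpoints, or development mode with API_TOKEN unset.
    if path in PUBLIC_PATHS or not api_token:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    provided = lowered.get("x-api-key")
    if not provided:
        scheme, _, credential = lowered.get("authorization", "").partition(" ")
        if scheme.lower() == "bearer":
            provided = credential.strip()
    if not provided:
        return 401, {
            "error": "Unauthorized",
            "message": "API key required. Provide X-API-Key header "
                       "or Authorization: Bearer <token>",
        }
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(provided.encode("utf-8"), api_token.encode("utf-8")):
        return 403, {"error": "Forbidden", "message": "Invalid API key"}
    return None
```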

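Table of Contents

  • Health & Info
  • Download Endpoints
  • Transcription Endpoints
  • Translation Endpoints
  • Summarization Endpoints
  • File Management
  • Error Responses
  • Notes
  • Security Configuration
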
Health & Info

GET /health

Health check endpoint.

Response:

{
  "status": "ok",
  "timestamp": "2025-11-28T12:00:00.000Z"
}

GET /api

Get API information and available endpoints.

Response:

{
  "name": "Video to MP3 Transcriptor API",
  "version": "1.0.0",
  "endpoints": { ... }
}

GET /info

Get information about a YouTube video or playlist.

Query Parameters:

  • url (required): YouTube URL

Example:

curl -H "X-API-Key: your_token" \
  "http://localhost:8888/info?url=https://www.youtube.com/watch?v=VIDEO_ID"

Response:

{
  "success": true,
  "title": "Video Title",
  "type": "video",
  "duration": 300,
  "channel": "Channel Name",
  "videoCount": 1
}
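A YouTube URL passed in the query string contains characters such as `?`, `=`, and `&`. A playlist link with an extra `&list=...` part would be split into separate query parameters unless it is percent-encoded. The sketch below builds an encoded /info URL; the helper name `build_info_url` is illustrative.

```python
from urllib.parse import urlencode


def build_info_url(base_url: str, video_url: str) -> str:
    """Build the /info request URL with the video URL safely percent-encoded."""
    query = urlencode({"url": video_url})
    return f"{base_url.rstrip('/')}/info?{query}"
```

The same approach applies to the other endpoints that take `url` as a query parameter.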

Download Endpoints

GET /download-stream

Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.

Query Parameters:

  • url (required): YouTube URL
  • outputPath (optional): Custom output directory path

Example:

curl -N -H "X-API-Key: your_token" \
  "http://localhost:8888/download-stream?url=https://www.youtube.com/watch?v=VIDEO_ID"

SSE Events:

  • info: Video/playlist information
  • progress: Download progress updates
  • video-complete: Individual video completion
  • complete: All downloads complete
  • error: Error occurred
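A client has to split the stream into individual events before it can react to them. The sketch below parses standard Server-Sent Events framing (`event:` and `data:` lines, with a blank line ending each event). It assumes the server sends the event names listed above in the `event:` field and that payloads are usually JSON; neither detail is confirmed by this document, so non-JSON payloads are kept as plain text.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, object]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []          # "message" is the SSE default name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line terminates one event
            if data:
                payload = "\n".join(data)
                try:
                    parsed = json.loads(payload)
                except json.JSONDecodeError:
                    parsed = payload     # keep non-JSON data as text
                yield event, parsed
            event, data = "message", []
        elif line.startswith(":"):       # comment line, often a keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            data.append(value[1:] if value.startswith(" ") else value)
```

The function accepts any iterable of decoded lines, for example the lines of a streaming HTTP response body.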

POST /download

Download YouTube video(s) to MP3 (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "outputPath": "./custom/path"  // optional
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "successCount": 1,
  "failCount": 0,
  "videos": [
    {
      "success": true,
      "title": "Video Title",
      "filePath": "./output/video.mp3",
      "fileUrl": "/files/video.mp3"
    }
  ]
}
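The curl call above translates to Python's standard library as follows. This is a sketch: the helper name `build_download_request` is illustrative, and it only constructs the request so that it can be inspected before sending.

```python
import json
import urllib.request
from typing import Optional


def build_download_request(base_url: str, token: str, video_url: str,
                           output_path: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /download request."""
    body = {"url": video_url}
    if output_path is not None:
        body["outputPath"] = output_path  # optional custom output directory
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/download",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; the response body can then be decoded with `json.load`.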

Transcription Endpoints

POST /transcribe

Transcribe an existing audio file.

Body Parameters:

{
  "filePath": "./output/audio.mp3",
  "language": "en",  // optional (auto-detect if not specified)
  "format": "txt",   // optional: txt, json, srt, vtt
  "model": "gpt-4o-mini-transcribe",  // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
  "outputPath": "./custom/path"  // optional
}

Available Models:

  • gpt-4o-mini-transcribe (default) - Fast and cost-effective
  • gpt-4o-transcribe - Higher quality
  • whisper-1 - Original Whisper model (supports more output formats)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/transcribe \
  -H "Content-Type: application/json" \
  -d '{
    "filePath": "./output/audio.mp3",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "filePath": "./output/audio.mp3",
  "transcriptionPath": "./output/audio.txt",
  "transcriptionUrl": "/files/audio.txt",
  "text": "Transcribed text content..."
}
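A client can catch a mistyped model or format before the request leaves the machine. The sketch below builds the /transcribe body from the documented parameters and omits optional fields that are not set. The helper name `build_transcribe_body` is illustrative, and the client-side validation is an addition rather than documented server behavior.

```python
from typing import Optional

MODELS = ("gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1")
FORMATS = ("txt", "json", "srt", "vtt")


def build_transcribe_body(file_path: str, language: Optional[str] = None,
                          fmt: Optional[str] = None, model: Optional[str] = None,
                          output_path: Optional[str] = None) -> dict:
    """Return the JSON body for POST /transcribe, leaving out unset options."""
    if model is not None and model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    body = {"filePath": file_path}
    optional = (("language", language), ("format", fmt),
                ("model", model), ("outputPath", output_path))
    for key, value in optional:
        if value is not None:
            body[key] = value
    return body
```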

POST /upload-transcribe

Upload and transcribe audio files.

Form Data:

  • files: Audio file(s) (multiple files supported, max 50)
  • language: Language code (optional)
  • model: Transcription model (optional, default: gpt-4o-mini-transcribe)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/upload-transcribe \
  -F "files=@audio1.mp3" \
  -F "files=@audio2.mp3" \
  -F "language=en" \
  -F "model=gpt-4o-mini-transcribe"

Response:

{
  "success": true,
  "totalFiles": 2,
  "successCount": 2,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "audio1.mp3",
      "transcriptionPath": "./output/audio1.txt",
      "transcriptionUrl": "/files/audio1.txt",
      "text": "Transcription..."
    }
  ]
}
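Without curl's `-F` option, the multipart body has to be assembled by hand. The sketch below encodes the form fields and repeats the `files` field once per file, as the endpoint expects. The helper name `encode_multipart` is illustrative, and guessing the content type from the file name is an assumption about what the server's file type validation accepts.

```python
import mimetypes
import uuid

MAX_FILES = 50  # documented upload limit per request


def encode_multipart(fields: dict[str, str],
                     files: list[tuple[str, bytes]]) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data POST.

    `files` holds (filename, content) pairs, all sent under the "files" field.
    """
    if len(files) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files per request")
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n")
        chunks.append(part.encode("utf-8"))
    for filename, content in files:
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="files"; '
                  f'filename="{filename}"\r\n'
                  f"Content-Type: {ctype}\r\n\r\n")
        chunks.append(header.encode("utf-8") + content + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"
```

The returned content type goes into the `Content-Type` request header alongside the authentication header.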

GET /process-stream

Download + Transcribe with SSE progress updates.

Query Parameters:

  • url (required): YouTube URL
  • language (optional): Language code
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -N -H "X-API-Key: your_token" \
  "http://localhost:8888/process-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&language=en&model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading or transcribing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • complete: All operations complete
  • error: Error occurred

POST /process

Download + Transcribe (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "language": "en",  // optional
  "format": "txt",   // optional
  "model": "gpt-4o-mini-transcribe",  // optional
  "outputPath": "./custom/path"  // optional
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "downloadedCount": 1,
  "transcribedCount": 1,
  "results": [
    {
      "title": "Video Title",
      "downloadSuccess": true,
      "audioPath": "./output/video.mp3",
      "audioUrl": "/files/video.mp3",
      "transcriptionSuccess": true,
      "transcriptionPath": "./output/video.txt",
      "transcriptionUrl": "/files/video.txt",
      "text": "Transcription..."
    }
  ]
}
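Because each entry in `results` reports download and transcription success separately, a client should check both flags before reading `text`. A minimal sketch follows. The helper name `collect_transcripts` is illustrative, and it assumes that a failed entry carries a false flag; the shape of failed entries is not shown in this document.

```python
def collect_transcripts(response: dict) -> dict[str, str]:
    """Map video title -> transcript text for fully processed videos."""
    return {
        item["title"]: item["text"]
        for item in response.get("results", [])
        if item.get("downloadSuccess") and item.get("transcriptionSuccess")
    }
```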

Translation Endpoints

GET /languages

Get available translation languages.

Response:

{
  "languages": {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "zh": "Chinese",
    "ja": "Japanese",
    ...
  }
}

POST /translate

Translate text.

Body Parameters:

{
  "text": "Text to translate",
  "targetLang": "fr",  // required: target language code
  "sourceLang": "en"   // optional: source language (auto-detect if not specified)
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how are you?",
    "targetLang": "fr"
  }'

Response:

{
  "success": true,
  "originalText": "Hello, how are you?",
  "translatedText": "Bonjour, comment allez-vous ?",
  "targetLanguage": "French",
  "sourceLanguage": "auto-detected",
  "chunks": 1
}

POST /translate-file

Translate uploaded text files.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • targetLang: Target language code (required)
  • sourceLang: Source language code (optional)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/translate-file \
  -F "files=@document.txt" \
  -F "targetLang=fr" \
  -F "sourceLang=en"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "document.txt",
      "translationPath": "./output/document_fr.txt",
      "translationUrl": "/files/document_fr.txt",
      "translatedText": "Translated content..."
    }
  ]
}

Summarization Endpoints

GET /summary-styles

Get available summary styles.

Response:

{
  "styles": {
    "concise": "A brief summary capturing main points",
    "detailed": "A comprehensive summary with nuances",
    "bullet": "Key points as bullet points"
  }
}

POST /summarize

Summarize text using GPT-5.1.

Body Parameters:

{
  "text": "Long text to summarize...",
  "style": "concise",  // optional: concise (default), detailed, bullet
  "language": "same",  // optional: 'same' (default) or language code
  "model": "gpt-5.1"   // optional: default is gpt-5.1
}

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Long article content...",
    "style": "bullet",
    "language": "same"
  }'

Response:

{
  "success": true,
  "summary": "Summary content...",
  "model": "gpt-5.1",
  "style": "bullet",
  "inputLength": 5000,
  "chunks": 1
}

POST /summarize-file

Summarize uploaded text files using GPT-5.1.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • style: Summary style (optional, default: concise)
  • language: Output language (optional, default: same)
  • model: AI model (optional, default: gpt-5.1)
  • outputPath: Custom output directory (optional)

Example:

curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/summarize-file \
  -F "files=@article.txt" \
  -F "style=detailed" \
  -F "language=same"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "article.txt",
      "summaryPath": "./output/article_summary.txt",
      "summaryUrl": "/files/article_summary.txt",
      "summary": "Summary content...",
      "model": "gpt-5.1",
      "chunks": 1
    }
  ]
}

GET /summarize-stream

Full pipeline: Download -> Transcribe -> Summarize with SSE progress.

Query Parameters:

  • url (required): YouTube URL
  • style (optional): Summary style (default: concise)
  • language (optional): Output language (default: same)
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -N -H "X-API-Key: your_token" \
  "http://localhost:8888/summarize-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&style=bullet&model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading, transcribing, or summarizing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • summarize-complete: Summary complete
  • complete: All operations complete
  • error: Error occurred
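A client can derive the current pipeline stage from the event names alone. The sketch below assumes that, for a single video, events arrive in the order listed above. The stage labels and the helper name `pipeline_stage` are invented for illustration, and playlists may interleave events differently.

```python
# Stage the pipeline enters after each event (labels are illustrative).
STAGE_AFTER = {
    "info": "downloading",
    "video-complete": "transcribing",
    "transcribe-complete": "summarizing",
    "summarize-complete": "finishing",
    "complete": "done",
    "error": "failed",
}


def pipeline_stage(event_names: list[str]) -> str:
    """Return the stage reached after the given sequence of event names."""
    stage = "starting"
    for name in event_names:
        # "progress" and unknown events leave the stage unchanged.
        stage = STAGE_AFTER.get(name, stage)
    return stage
```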

File Management

GET /files-list

List all downloaded/generated files.

Example:

curl -H "X-API-Key: your_token" http://localhost:8888/files-list

Response:

{
  "files": [
    {
      "name": "video.mp3",
      "url": "/files/video.mp3",
      "path": "./output/video.mp3"
    },
    {
      "name": "video.txt",
      "url": "/files/video.txt",
      "path": "./output/video.txt"
    }
  ]
}

GET /files/:filename

Serve a specific file.

Example:

curl -H "X-API-Key: your_token" http://localhost:8888/files/video.mp3 --output video.mp3
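The `url` values returned by /files-list are relative to the base URL. The sketch below turns a listing into absolute download URLs. The helper name `file_download_urls` is illustrative, and it assumes the returned `url` values are not already percent-encoded.

```python
from urllib.parse import quote, urljoin


def file_download_urls(base_url: str, listing: dict) -> dict[str, str]:
    """Map file name -> absolute URL for each entry of a /files-list response."""
    return {
        entry["name"]: urljoin(base_url, quote(entry["url"]))  # encode spaces etc.
        for entry in listing.get("files", [])
    }
```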

Error Responses

All endpoints return error responses in the following format:

{
  "error": "Error message describing what went wrong"
}

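Authentication failures also include a message field, as shown in the Authentication section.

Common HTTP status codes:

  • 400 - Bad Request (missing required parameters)
  • 401 - Unauthorized (no API key provided)
  • 403 - Forbidden (invalid API key)
  • 500 - Internal Server Error (processing failed)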
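A client can turn any error response into a single readable line. The sketch below prefers the more specific `message` field when an authentication error supplies one and otherwise falls back to `error`. The helper name `describe_error` is illustrative.

```python
STATUS_LABELS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    500: "Internal Server Error",
}


def describe_error(status: int, body: dict) -> str:
    """Format an error response as '<status> <label>: <detail>'."""
    label = STATUS_LABELS.get(status, "Error")
    detail = body.get("message") or body.get("error") or "no details provided"
    return f"{status} {label}: {detail}"
```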

Notes

Output Paths

Endpoints that accept the outputPath parameter behave as follows:

  • If not specified, files are saved to the default OUTPUT_DIR (./output)
  • If specified, files are saved to the custom path provided

Models

  • Transcription: Default is gpt-4o-mini-transcribe (cost-effective)
  • Summarization: Default is gpt-5.1 (latest GPT model)
  • Translation: Uses gpt-4o-mini (hardcoded)

File Formats

  • Audio: MP3, WAV, M4A, OGG, FLAC
  • Text: TXT files
  • Transcription outputs: TXT, JSON, SRT, VTT (depending on model)

OpenAI API Key

Set OPENAI_API_KEY in your .env file; transcription, translation, and summarization require it. This key is separate from API_TOKEN, which protects access to this API.


Security Configuration

Environment Variables

Environment variables in .env (API_TOKEN and ALLOWED_ORIGINS control API security):

# API Authentication Token
API_TOKEN=your_secure_random_token_here

# CORS - Allowed Origins (comma-separated)
# Development: * (all origins)
# Production: https://yourdomain.com,https://app.yourdomain.com
ALLOWED_ORIGINS=*

# Server Port
PORT=8888

# Output Directory
OUTPUT_DIR=./output

# OpenAI API Key (required for AI features)
OPENAI_API_KEY=sk-...
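The comma-separated ALLOWED_ORIGINS value can be interpreted as in the sketch below. The function names are illustrative, and exact string matching of origins is an assumption; the server's actual matching rules are not described in this document.

```python
def parse_allowed_origins(value: str) -> list[str]:
    """Split a comma-separated ALLOWED_ORIGINS value into clean entries."""
    return [origin.strip() for origin in value.split(",") if origin.strip()]


def is_origin_allowed(origin: str, allowed: list[str]) -> bool:
    """Return True if the origin matches an entry or the wildcard is present."""
    return "*" in allowed or origin in allowed
```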

Security Features

The API implements the following security measures:

  1. API Token Authentication
    • All endpoints (except /health and /api) require authentication
    • Supports both X-API-Key and Authorization: Bearer headers
  2. CORS Protection
    • Configurable allowed origins via ALLOWED_ORIGINS
    • Restricts cross-origin requests to trusted domains
  3. HTTP Security Headers
    • X-Content-Type-Options: nosniff
    • X-Frame-Options: DENY
    • X-XSS-Protection: 1; mode=block
    • Strict-Transport-Security: max-age=31536000; includeSubDomains
    • Content-Security-Policy with strict policies
  4. Input Validation
    • File type validation for uploads
    • Parameter validation on all endpoints
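A deployment can be checked for the headers listed above from the client side. The sketch below compares a response's headers with the documented values and only checks that Content-Security-Policy is present, since its exact value is not given here. The helper name `missing_security_headers` is illustrative.

```python
EXPECTED_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the names of documented security headers that are absent or differ."""
    # Header names are case-insensitive, so compare on lowercase keys.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    problems = [
        name for name, expected in EXPECTED_HEADERS.items()
        if lowered.get(name.lower()) != expected
    ]
    if "content-security-policy" not in lowered:
        problems.append("Content-Security-Policy")
    return problems
```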

Production Deployment Checklist

Before deploying to production:

  • Generate a strong, unique API_TOKEN (min 32 characters)
  • Set ALLOWED_ORIGINS to your specific domains (remove *)
  • Ensure OPENAI_API_KEY is properly set
  • Use HTTPS (not HTTP) for all connections
  • Set up rate limiting (recommended via reverse proxy)
  • Configure firewall rules
  • Set up monitoring and logging
  • Review and secure file upload limits
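The checklist items that concern environment variables can be verified automatically before a deployment. The sketch below covers only those items. The helper name `production_config_issues` and the wording of the messages are illustrative, and the HTTPS check on origins is an interpretation of the HTTPS recommendation above.

```python
def production_config_issues(env: dict[str, str]) -> list[str]:
    """Return a list of checklist violations found in the given environment."""
    issues = []
    if len(env.get("API_TOKEN", "")) < 32:
        issues.append("API_TOKEN should be at least 32 characters")
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    if not origins or "*" in origins:
        issues.append("ALLOWED_ORIGINS should list specific domains instead of *")
    elif any(not o.startswith("https://") for o in origins):
        issues.append("ALLOWED_ORIGINS should use https:// origins")
    if not env.get("OPENAI_API_KEY"):
        issues.append("OPENAI_API_KEY is not set")
    return issues
```

Calling it with `dict(os.environ)` at startup and refusing to start when the list is non-empty is one possible use.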

Example Authenticated Requests

Using X-API-Key header:

# Download endpoint
curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'

# Transcribe endpoint
curl -H "X-API-Key: your_token" \
  -X POST http://localhost:8888/transcribe \
  -H "Content-Type: application/json" \
  -d '{"filePath":"./output/audio.mp3"}'

Using Authorization Bearer:

curl -H "Authorization: Bearer your_token" \
  -X POST http://localhost:8888/summarize \
  -H "Content-Type: application/json" \
  -d '{"text":"Long text to summarize..."}'