videotomp3transcriptor/docs/API.md
2025-12-04 20:57:51 +08:00

API Documentation - Video to MP3 Transcriptor

Base URL

http://localhost:8888

Health & Info

GET /health

Health check endpoint.

Response:

{
  "status": "ok",
  "timestamp": "2025-11-28T12:00:00.000Z"
}

GET /api

Get API information and available endpoints.

Response:

{
  "name": "Video to MP3 Transcriptor API",
  "version": "1.0.0",
  "endpoints": { ... }
}

GET /info

Get information about a YouTube video or playlist.

Query Parameters:

  • url (required): YouTube URL

Example:

curl -G "http://localhost:8888/info" \
  --data-urlencode "url=https://www.youtube.com/watch?v=VIDEO_ID"

Response:

{
  "success": true,
  "title": "Video Title",
  "type": "video",
  "duration": 300,
  "channel": "Channel Name",
  "videoCount": 1
}
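Because the url parameter is itself a URL, any & it contains (for example in a playlist link) must be percent-encoded or the server will read it as a separate parameter. The following is a minimal sketch of a helper that does this for any GET endpoint in this document. The helper name build_query_url and the PLAYLIST_ID placeholder are illustrative assumptions, not part of the API.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8888"  # base URL given at the top of this document


def build_query_url(path: str, **params: Optional[str]) -> str:
    """Return BASE_URL + path with every query parameter percent-encoded."""
    # Drop parameters passed as None so optional ones can simply be omitted.
    supplied = {key: value for key, value in params.items() if value is not None}
    return f"{BASE_URL}{path}?{urlencode(supplied)}"


if __name__ == "__main__":
    # PLAYLIST_ID is a placeholder, like VIDEO_ID in the examples above.
    print(build_query_url(
        "/info",
        url="https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID",
    ))
```

The same helper can be reused for the streaming endpoints, passing their optional parameters as keyword arguments.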

Download Endpoints

GET /download-stream

Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.

Query Parameters:

  • url (required): YouTube URL
  • outputPath (optional): Custom output directory path

Example:

curl -N -G "http://localhost:8888/download-stream" \
  --data-urlencode "url=https://www.youtube.com/watch?v=VIDEO_ID"

SSE Events:

  • info: Video/playlist information
  • progress: Download progress updates
  • video-complete: Individual video completion
  • complete: All downloads complete
  • error: Error occurred
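A client has to split the SSE stream into individual events before it can react to them. The sketch below parses the standard SSE text framing (event: and data: fields, with a blank line ending each event). It assumes the server sends the event names listed above in the event: field. The payload format is not documented here, so the data is returned as a raw string.

```python
from typing import Iterable, Iterator, List, Tuple
from urllib.request import urlopen


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event: str = "message"  # default event name when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_events(url: str) -> Iterator[Tuple[str, str]]:
    """Open an SSE endpoint and yield its events as they arrive."""
    with urlopen(url) as response:
        yield from parse_sse(line.decode("utf-8") for line in response)
```

A caller would loop over stream_events(...) and stop once it sees a complete or error event.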

POST /download

Download YouTube video(s) to MP3 (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "outputPath": "./custom/path"  // optional
}

Example:

curl -X POST http://localhost:8888/download \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "successCount": 1,
  "failCount": 0,
  "videos": [
    {
      "success": true,
      "title": "Video Title",
      "filePath": "./output/video.mp3",
      "fileUrl": "/files/video.mp3"
    }
  ]
}
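For callers that are not using curl, the request above can be sketched in Python with only the standard library. The helper names build_json_post and post_json are illustrative assumptions. The path, header, and body fields come from the example above.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8888"  # base URL given at the top of this document


def build_json_post(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for the given endpoint path."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    with urlopen(build_json_post(path, payload)) as response:
        return json.load(response)


if __name__ == "__main__":
    result = post_json("/download", {"url": "https://www.youtube.com/watch?v=VIDEO_ID"})
    print(result["successCount"], "of", result["totalVideos"], "downloaded")
```

The same two helpers work for the other JSON endpoints (/transcribe, /process, /translate, /summarize) by changing the path and payload.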

Transcription Endpoints

POST /transcribe

Transcribe an existing audio file.

Body Parameters:

{
  "filePath": "./output/audio.mp3",
  "language": "en",  // optional (auto-detect if not specified)
  "format": "txt",   // optional: txt, json, srt, vtt
  "model": "gpt-4o-mini-transcribe",  // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
  "outputPath": "./custom/path"  // optional
}

Available Models:

  • gpt-4o-mini-transcribe (default) - Fast and cost-effective
  • gpt-4o-transcribe - Higher quality
  • whisper-1 - Original Whisper model (supports more output formats)

Example:

curl -X POST http://localhost:8888/transcribe \
  -H "Content-Type: application/json" \
  -d '{
    "filePath": "./output/audio.mp3",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "filePath": "./output/audio.mp3",
  "transcriptionPath": "./output/audio.txt",
  "transcriptionUrl": "/files/audio.txt",
  "text": "Transcribed text content..."
}

POST /upload-transcribe

Upload and transcribe audio files.

Form Data:

  • files: Audio file(s) (multiple files supported, max 50)
  • language: Language code (optional)
  • model: Transcription model (optional, default: gpt-4o-mini-transcribe)
  • outputPath: Custom output directory (optional)

Example:

curl -X POST http://localhost:8888/upload-transcribe \
  -F "files=@audio1.mp3" \
  -F "files=@audio2.mp3" \
  -F "language=en" \
  -F "model=gpt-4o-mini-transcribe"

Response:

{
  "success": true,
  "totalFiles": 2,
  "successCount": 2,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "audio1.mp3",
      "transcriptionPath": "./output/audio1.txt",
      "transcriptionUrl": "/files/audio1.txt",
      "text": "Transcription..."
    }
  ]
}
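Since a single request accepts at most 50 files, a client with a larger collection needs to split it into several uploads. A minimal sketch of that batching step follows. The function name batch_files is an illustrative assumption, and the upload itself is left out.

```python
from typing import List, Sequence

MAX_FILES_PER_REQUEST = 50  # limit stated under Form Data above


def batch_files(paths: Sequence[str], size: int = MAX_FILES_PER_REQUEST) -> List[List[str]]:
    """Split a list of file paths into groups of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(paths[start:start + size]) for start in range(0, len(paths), size)]


if __name__ == "__main__":
    batches = batch_files([f"audio{i}.mp3" for i in range(120)])
    print([len(group) for group in batches])
```

Each group can then be sent as its own /upload-transcribe request.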

GET /process-stream

Download YouTube video(s) to MP3 and transcribe them, with Server-Sent Events (SSE) progress updates.

Query Parameters:

  • url (required): YouTube URL
  • language (optional): Language code
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -N -G "http://localhost:8888/process-stream" \
  --data-urlencode "url=https://www.youtube.com/watch?v=VIDEO_ID" \
  -d "language=en" \
  -d "model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading or transcribing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • complete: All operations complete
  • error: Error occurred

POST /process

Download YouTube video(s) to MP3 and transcribe them (non-streaming).

Body Parameters:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "language": "en",  // optional
  "format": "txt",   // optional
  "model": "gpt-4o-mini-transcribe",  // optional
  "outputPath": "./custom/path"  // optional
}

Example:

curl -X POST http://localhost:8888/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "language": "en",
    "model": "gpt-4o-mini-transcribe"
  }'

Response:

{
  "success": true,
  "playlistTitle": null,
  "totalVideos": 1,
  "downloadedCount": 1,
  "transcribedCount": 1,
  "results": [
    {
      "title": "Video Title",
      "downloadSuccess": true,
      "audioPath": "./output/video.mp3",
      "audioUrl": "/files/video.mp3",
      "transcriptionSuccess": true,
      "transcriptionPath": "./output/video.txt",
      "transcriptionUrl": "/files/video.txt",
      "text": "Transcription..."
    }
  ]
}
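Each entry in results reports download and transcription success separately, so a video can be downloaded but fail to transcribe. The sketch below groups titles by outcome using only the fields shown in the response above. The function name and bucket labels are illustrative assumptions.

```python
from typing import Dict, List


def classify_results(response: dict) -> Dict[str, List[str]]:
    """Group video titles by how far each one got through the pipeline."""
    buckets: Dict[str, List[str]] = {
        "ok": [],
        "download_failed": [],
        "transcription_failed": [],
    }
    for item in response.get("results", []):
        title = item.get("title", "<untitled>")
        if not item.get("downloadSuccess"):
            buckets["download_failed"].append(title)
        elif not item.get("transcriptionSuccess"):
            buckets["transcription_failed"].append(title)
        else:
            buckets["ok"].append(title)
    return buckets
```

For a playlist, this makes it easy to retry only the videos whose transcription failed by passing their audioPath values to POST /transcribe.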

Translation Endpoints

GET /languages

Get available translation languages.

Response:

{
  "languages": {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "zh": "Chinese",
    "ja": "Japanese",
    ...
  }
}

POST /translate

Translate text.

Body Parameters:

{
  "text": "Text to translate",
  "targetLang": "fr",  // required: target language code
  "sourceLang": "en"   // optional: source language (auto-detect if not specified)
}

Example:

curl -X POST http://localhost:8888/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how are you?",
    "targetLang": "fr"
  }'

Response:

{
  "success": true,
  "originalText": "Hello, how are you?",
  "translatedText": "Bonjour, comment allez-vous ?",
  "targetLanguage": "French",
  "sourceLanguage": "auto-detected",
  "chunks": 1
}
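A client can check the target language against the GET /languages response before calling POST /translate, which avoids a round trip for an unsupported code. This is a minimal sketch: the function name build_translate_payload is an illustrative assumption, and the languages argument is the "languages" object from that response.

```python
from typing import Dict, Optional


def build_translate_payload(
    text: str,
    target_lang: str,
    languages: Dict[str, str],
    source_lang: Optional[str] = None,
) -> dict:
    """Return the JSON body for POST /translate, validating the target language."""
    if target_lang not in languages:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    payload = {"text": text, "targetLang": target_lang}
    # sourceLang is optional; leaving it out lets the server auto-detect.
    if source_lang is not None:
        payload["sourceLang"] = source_lang
    return payload
```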

POST /translate-file

Translate uploaded text files.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • targetLang: Target language code (required)
  • sourceLang: Source language code (optional)
  • outputPath: Custom output directory (optional)

Example:

curl -X POST http://localhost:8888/translate-file \
  -F "files=@document.txt" \
  -F "targetLang=fr" \
  -F "sourceLang=en"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "document.txt",
      "translationPath": "./output/document_fr.txt",
      "translationUrl": "/files/document_fr.txt",
      "translatedText": "Translated content..."
    }
  ]
}

Summarization Endpoints

GET /summary-styles

Get available summary styles.

Response:

{
  "styles": {
    "concise": "A brief summary capturing main points",
    "detailed": "A comprehensive summary with nuances",
    "bullet": "Key points as bullet points"
  }
}

POST /summarize

Summarize text using GPT-5.1.

Body Parameters:

{
  "text": "Long text to summarize...",
  "style": "concise",  // optional: concise (default), detailed, bullet
  "language": "same",  // optional: 'same' (default) or language code
  "model": "gpt-5.1"   // optional: default is gpt-5.1
}

Example:

curl -X POST http://localhost:8888/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Long article content...",
    "style": "bullet",
    "language": "same"
  }'

Response:

{
  "success": true,
  "summary": "Summary content...",
  "model": "gpt-5.1",
  "style": "bullet",
  "inputLength": 5000,
  "chunks": 1
}

POST /summarize-file

Summarize uploaded text files using GPT-5.1.

Form Data:

  • files: Text file(s) (.txt, multiple files supported, max 50)
  • style: Summary style (optional, default: concise)
  • language: Output language (optional, default: same)
  • model: AI model (optional, default: gpt-5.1)
  • outputPath: Custom output directory (optional)

Example:

curl -X POST http://localhost:8888/summarize-file \
  -F "files=@article.txt" \
  -F "style=detailed" \
  -F "language=same"

Response:

{
  "success": true,
  "totalFiles": 1,
  "successCount": 1,
  "failCount": 0,
  "results": [
    {
      "success": true,
      "fileName": "article.txt",
      "summaryPath": "./output/article_summary.txt",
      "summaryUrl": "/files/article_summary.txt",
      "summary": "Summary content...",
      "model": "gpt-5.1",
      "chunks": 1
    }
  ]
}

GET /summarize-stream

Full pipeline: download, transcribe, then summarize, with SSE progress updates.

Query Parameters:

  • url (required): YouTube URL
  • style (optional): Summary style (default: concise)
  • language (optional): Output language (default: same)
  • model (optional): Transcription model (default: gpt-4o-mini-transcribe)
  • outputPath (optional): Custom output directory

Example:

curl -N -G "http://localhost:8888/summarize-stream" \
  --data-urlencode "url=https://www.youtube.com/watch?v=VIDEO_ID" \
  -d "style=bullet" \
  -d "model=gpt-4o-mini-transcribe"

SSE Events:

  • info: Video information
  • progress: Progress updates (downloading, transcribing, or summarizing)
  • video-complete: Download complete
  • transcribe-complete: Transcription complete
  • summarize-complete: Summary complete
  • complete: All operations complete
  • error: Error occurred
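The completion events above can be used to track how far the pipeline has progressed. The sketch below maps a sequence of received event names to the stages that finished before any error. It assumes the names arrive exactly as listed. The stage labels and the function name are illustrative.

```python
from typing import Iterable, List

# Completion events listed above, mapped to illustrative stage labels.
STAGE_EVENTS = {
    "video-complete": "download",
    "transcribe-complete": "transcription",
    "summarize-complete": "summary",
}


def completed_stages(event_names: Iterable[str]) -> List[str]:
    """Return the stages that completed, in order, stopping at the first error."""
    done: List[str] = []
    for name in event_names:
        if name == "error":
            break
        stage = STAGE_EVENTS.get(name)
        if stage is not None and stage not in done:
            done.append(stage)
    return done
```

For playlists the completion events may repeat once per video. This sketch only records that a stage completed at least once.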

File Management

GET /files-list

List all downloaded/generated files.

Example:

curl http://localhost:8888/files-list

Response:

{
  "files": [
    {
      "name": "video.mp3",
      "url": "/files/video.mp3",
      "path": "./output/video.mp3"
    },
    {
      "name": "video.txt",
      "url": "/files/video.txt",
      "path": "./output/video.txt"
    }
  ]
}
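The url values in this response are relative to the server root, so a client must join them with the base URL before downloading. A minimal sketch follows. The function name absolute_file_urls is an illustrative assumption.

```python
from typing import Dict
from urllib.parse import urljoin

BASE_URL = "http://localhost:8888"  # base URL given at the top of this document


def absolute_file_urls(listing: dict, base: str = BASE_URL) -> Dict[str, str]:
    """Map each file name in a /files-list response to its absolute URL."""
    return {
        entry["name"]: urljoin(base, entry["url"])
        for entry in listing.get("files", [])
    }


if __name__ == "__main__":
    listing = {"files": [{"name": "video.mp3", "url": "/files/video.mp3"}]}
    print(absolute_file_urls(listing))
```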

GET /files/:filename

Serve a specific file.

Example:

curl http://localhost:8888/files/video.mp3 --output video.mp3

Error Responses

All endpoints return error responses in the following format:

{
  "error": "Error message describing what went wrong"
}

Common HTTP status codes:

  • 400 - Bad Request (missing required parameters)
  • 500 - Internal Server Error (processing failed)
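Client code can turn these responses into readable messages by combining the status code with the error field. The sketch below does that and falls back to the raw body when the response is not the documented JSON shape (for example, when a proxy returns HTML). The function name error_message is an illustrative assumption.

```python
import json

# Status codes listed above; anything else is reported generically.
STATUS_LABELS = {400: "Bad Request", 500: "Internal Server Error"}


def error_message(status: int, body: bytes) -> str:
    """Format an error response as '<status> <label>: <detail>'."""
    text = body.decode("utf-8", errors="replace")
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None  # body was not JSON
    if isinstance(parsed, dict) and "error" in parsed:
        detail = str(parsed["error"])
    else:
        detail = text
    label = STATUS_LABELS.get(status, "HTTP error")
    return f"{status} {label}: {detail}"
```

With urllib, the status and body are available from the urllib.error.HTTPError exception as its code attribute and read() method.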

Notes

Output Paths

For endpoints that accept the outputPath parameter:

  • If not specified, files are saved to the default OUTPUT_DIR (./output)
  • If specified, files are saved to the custom path provided

Models

  • Transcription: Default is gpt-4o-mini-transcribe (cost-effective)
  • Summarization: Default is gpt-5.1 (override with the model parameter)
  • Translation: Uses gpt-4o-mini (hardcoded)

File Formats

  • Audio: MP3, WAV, M4A, OGG, FLAC
  • Text: TXT files
  • Transcription outputs: TXT, JSON, SRT, VTT (depending on model)

API Key

Ensure OPENAI_API_KEY is set in your .env file for transcription, translation, and summarization features to work.