API Documentation - Video to MP3 Transcriptor
Base URL
http://localhost:8888
Table of Contents
- Health & Info
- Download Endpoints
- Transcription Endpoints
- Translation Endpoints
- Summarization Endpoints
- File Management
Health & Info
GET /health
Health check endpoint.
Response:
{
"status": "ok",
"timestamp": "2025-11-28T12:00:00.000Z"
}
GET /api
Get API information and available endpoints.
Response:
{
"name": "Video to MP3 Transcriptor API",
"version": "1.0.0",
"endpoints": { ... }
}
GET /info
Get information about a YouTube video or playlist.
Query Parameters:
- url (required): YouTube URL
Example:
curl "http://localhost:8888/info?url=https://www.youtube.com/watch?v=VIDEO_ID"
Response:
{
"success": true,
"title": "Video Title",
"type": "video",
"duration": 300,
"channel": "Channel Name",
"videoCount": 1
}
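The curl example works because the video URL is the only query value, but a playlist URL containing `&` would be split into separate parameters unless it is percent-encoded. A minimal Python sketch, assuming the server is reachable at the base URL above; the helper names `build_info_url` and `fetch_info` are illustrative and not part of the API:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8888"  # base URL stated in this document


def build_info_url(video_url: str, base_url: str = BASE_URL) -> str:
    """Return the /info URL with the YouTube URL percent-encoded."""
    query = urllib.parse.urlencode({"url": video_url})
    return f"{base_url}/info?{query}"


def fetch_info(video_url: str) -> dict:
    """Call GET /info and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(build_info_url(video_url)) as resp:
        return json.load(resp)
```

With a server running, `fetch_info("https://www.youtube.com/watch?v=VIDEO_ID")` should return a dictionary shaped like the response above.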
Download Endpoints
GET /download-stream
Download YouTube video(s) to MP3 with Server-Sent Events (SSE) progress updates.
Query Parameters:
- url (required): YouTube URL
- outputPath (optional): Custom output directory path
Example:
curl "http://localhost:8888/download-stream?url=https://www.youtube.com/watch?v=VIDEO_ID"
SSE Events:
- info: Video/playlist information
- progress: Download progress updates
- video-complete: Individual video completion
- complete: All downloads complete
- error: Error occurred
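To consume these events from a script, the stream has to be split into individual events. The sketch below is a small parser for the standard `text/event-stream` framing. It assumes the server sends the event names listed above in the `event:` field and the payload in `data:` lines; the payload format itself is not documented here, so it is returned as a raw string. `parse_sse` is an illustrative name.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from the lines of a text/event-stream body."""
    event = "message"  # default event type in the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line terminates the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format strips one leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

The parser accepts any iterable of text lines, so it can be fed the decoded lines of an open HTTP response, for example `parse_sse(line.decode("utf-8") for line in resp)`.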
POST /download
Download YouTube video(s) to MP3 (non-streaming).
Body Parameters:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"outputPath": "./custom/path" // optional
}
Example:
curl -X POST http://localhost:8888/download \
-H "Content-Type: application/json" \
-d '{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'
Response:
{
"success": true,
"playlistTitle": null,
"totalVideos": 1,
"successCount": 1,
"failCount": 0,
"videos": [
{
"success": true,
"title": "Video Title",
"filePath": "./output/video.mp3",
"fileUrl": "/files/video.mp3"
}
]
}
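Because `fileUrl` is relative to the server, a client has to join it with the base URL before fetching the MP3. A minimal sketch that collects the absolute URLs of successful downloads from a response shaped like the one above; it assumes failed entries carry `"success": false`, and `downloaded_file_urls` is an illustrative helper name:

```python
from typing import List
from urllib.parse import urljoin

BASE_URL = "http://localhost:8888"  # base URL stated in this document


def downloaded_file_urls(response: dict, base_url: str = BASE_URL) -> List[str]:
    """Return absolute URLs for every video that downloaded successfully."""
    return [
        urljoin(base_url, video["fileUrl"])
        for video in response.get("videos", [])
        if video.get("success") and "fileUrl" in video
    ]
```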
Transcription Endpoints
POST /transcribe
Transcribe an existing audio file.
Body Parameters:
{
"filePath": "./output/audio.mp3",
"language": "en", // optional (auto-detect if not specified)
"format": "txt", // optional: txt, json, srt, vtt
"model": "gpt-4o-mini-transcribe", // optional: gpt-4o-mini-transcribe (default), gpt-4o-transcribe, whisper-1
"outputPath": "./custom/path" // optional
}
Available Models:
- gpt-4o-mini-transcribe (default): Fast and cost-effective
- gpt-4o-transcribe: Higher quality
- whisper-1: Original Whisper model (supports more formats)
Example:
curl -X POST http://localhost:8888/transcribe \
-H "Content-Type: application/json" \
-d '{
"filePath": "./output/audio.mp3",
"language": "en",
"model": "gpt-4o-mini-transcribe"
}'
Response:
{
"success": true,
"filePath": "./output/audio.mp3",
"transcriptionPath": "./output/audio.txt",
"transcriptionUrl": "/files/audio.txt",
"text": "Transcribed text content..."
}
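The `//` annotations in the body parameters are documentation only; strict JSON has no comment syntax, so a real request body must omit them. A minimal sketch that builds a valid body and rejects values outside the models and formats listed above; `build_transcribe_body` is an illustrative name, and the client-side validation is an assumption rather than documented server behaviour:

```python
import json
from typing import Optional

MODELS = {"gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1"}
FORMATS = {"txt", "json", "srt", "vtt"}


def build_transcribe_body(
    file_path: str,
    language: Optional[str] = None,
    fmt: str = "txt",
    model: str = "gpt-4o-mini-transcribe",
    output_path: Optional[str] = None,
) -> bytes:
    """Return the UTF-8 encoded JSON body for POST /transcribe."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    body = {"filePath": file_path, "format": fmt, "model": model}
    if language:  # omit to let the server auto-detect
        body["language"] = language
    if output_path:
        body["outputPath"] = output_path
    return json.dumps(body).encode("utf-8")
```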
POST /upload-transcribe
Upload and transcribe audio files.
Form Data:
- files: Audio file(s) (multiple files supported, max 50)
- language: Language code (optional)
- model: Transcription model (optional, default: gpt-4o-mini-transcribe)
- outputPath: Custom output directory (optional)
Example:
curl -X POST http://localhost:8888/upload-transcribe \
-F "files=@audio1.mp3" \
-F "files=@audio2.mp3" \
-F "language=en" \
-F "model=gpt-4o-mini-transcribe"
Response:
{
"success": true,
"totalFiles": 2,
"successCount": 2,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "audio1.mp3",
"transcriptionPath": "./output/audio1.txt",
"transcriptionUrl": "/files/audio1.txt",
"text": "Transcription..."
}
]
}
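For clients that cannot shell out to curl, the same upload can be assembled by hand. The sketch below encodes a `multipart/form-data` body using only the standard library, repeating the `files` field once per file as in the curl example. `encode_multipart` is an illustrative helper, and the guessed per-file content type is an assumption about what the server accepts.

```python
import mimetypes
import uuid
from typing import Dict, List, Tuple

MAX_FILES = 50  # upload limit stated in this document


def encode_multipart(
    fields: Dict[str, str], files: List[Tuple[str, bytes]]
) -> Tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data upload."""
    if len(files) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files per request")
    boundary = uuid.uuid4().hex
    chunks: List[bytes] = []
    for name, value in fields.items():
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for filename, content in files:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        chunks.append(head.encode("utf-8") + content + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"
```

The returned content type goes into the `Content-Type` header of the POST request, and the body is sent unchanged.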
GET /process-stream
Download + Transcribe with SSE progress updates.
Query Parameters:
- url (required): YouTube URL
- language (optional): Language code
- model (optional): Transcription model (default: gpt-4o-mini-transcribe)
- outputPath (optional): Custom output directory
Example:
curl "http://localhost:8888/process-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&language=en&model=gpt-4o-mini-transcribe"
SSE Events:
- info: Video information
- progress: Progress updates (downloading or transcribing)
- video-complete: Download complete
- transcribe-complete: Transcription complete
- complete: All operations complete
- error: Error occurred
POST /process
Download + Transcribe (non-streaming).
Body Parameters:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"language": "en", // optional
"format": "txt", // optional
"model": "gpt-4o-mini-transcribe", // optional
"outputPath": "./custom/path" // optional
}
Example:
curl -X POST http://localhost:8888/process \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"language": "en",
"model": "gpt-4o-mini-transcribe"
}'
Response:
{
"success": true,
"playlistTitle": null,
"totalVideos": 1,
"downloadedCount": 1,
"transcribedCount": 1,
"results": [
{
"title": "Video Title",
"downloadSuccess": true,
"audioPath": "./output/video.mp3",
"audioUrl": "/files/video.mp3",
"transcriptionSuccess": true,
"transcriptionPath": "./output/video.txt",
"transcriptionUrl": "/files/video.txt",
"text": "Transcription..."
}
]
}
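Each entry in `results` reports the download and the transcription separately, so a video can succeed at one step and fail at the other. A minimal sketch that groups titles by how far they got; it assumes a failed step is reported as `false` (or the flag is missing), and the stage labels and function names are illustrative:

```python
from typing import Dict, List


def result_stage(result: dict) -> str:
    """Classify one entry of the /process "results" array."""
    if not result.get("downloadSuccess"):
        return "download-failed"
    if not result.get("transcriptionSuccess"):
        return "transcription-failed"
    return "ok"


def group_by_stage(response: dict) -> Dict[str, List[str]]:
    """Map each stage label to the titles that ended in that stage."""
    groups: Dict[str, List[str]] = {}
    for result in response.get("results", []):
        groups.setdefault(result_stage(result), []).append(result.get("title", ""))
    return groups
```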
Translation Endpoints
GET /languages
Get available translation languages.
Response:
{
"languages": {
"en": "English",
"fr": "French",
"es": "Spanish",
"de": "German",
"zh": "Chinese",
"ja": "Japanese",
...
}
}
POST /translate
Translate text.
Body Parameters:
{
"text": "Text to translate",
"targetLang": "fr", // required: target language code
"sourceLang": "en" // optional: source language (auto-detect if not specified)
}
Example:
curl -X POST http://localhost:8888/translate \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, how are you?",
"targetLang": "fr"
}'
Response:
{
"success": true,
"originalText": "Hello, how are you?",
"translatedText": "Bonjour, comment allez-vous ?",
"targetLanguage": "French",
"sourceLanguage": "auto-detected",
"chunks": 1
}
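The same request can be prepared from Python with the standard library. This sketch only builds the request object, assuming the base URL above; `build_translate_request` is an illustrative name, and sending it with `urllib.request.urlopen` requires a running server.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8888"  # base URL stated in this document


def build_translate_request(
    text: str, target_lang: str, source_lang: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST /translate request."""
    if not target_lang:
        raise ValueError("targetLang is required")
    body = {"text": text, "targetLang": target_lang}
    if source_lang:  # omit to let the server auto-detect the source language
        body["sourceLang"] = source_lang
    return urllib.request.Request(
        f"{BASE_URL}/translate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```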
POST /translate-file
Translate uploaded text files.
Form Data:
- files: Text file(s) (.txt, multiple files supported, max 50)
- targetLang: Target language code (required)
- sourceLang: Source language code (optional)
- outputPath: Custom output directory (optional)
Example:
curl -X POST http://localhost:8888/translate-file \
-F "files=@document.txt" \
-F "targetLang=fr" \
-F "sourceLang=en"
Response:
{
"success": true,
"totalFiles": 1,
"successCount": 1,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "document.txt",
"translationPath": "./output/document_fr.txt",
"translationUrl": "/files/document_fr.txt",
"translatedText": "Translated content..."
}
]
}
Summarization Endpoints
GET /summary-styles
Get available summary styles.
Response:
{
"styles": {
"concise": "A brief summary capturing main points",
"detailed": "A comprehensive summary with nuances",
"bullet": "Key points as bullet points"
}
}
POST /summarize
Summarize text using GPT-5.1.
Body Parameters:
{
"text": "Long text to summarize...",
"style": "concise", // optional: concise (default), detailed, bullet
"language": "same", // optional: 'same' (default) or language code
"model": "gpt-5.1" // optional: default is gpt-5.1
}
Example:
curl -X POST http://localhost:8888/summarize \
-H "Content-Type: application/json" \
-d '{
"text": "Long article content...",
"style": "bullet",
"language": "same"
}'
Response:
{
"success": true,
"summary": "Summary content...",
"model": "gpt-5.1",
"style": "bullet",
"inputLength": 5000,
"chunks": 1
}
POST /summarize-file
Summarize uploaded text files using GPT-5.1.
Form Data:
- files: Text file(s) (.txt, multiple files supported, max 50)
- style: Summary style (optional, default: concise)
- language: Output language (optional, default: same)
- model: AI model (optional, default: gpt-5.1)
- outputPath: Custom output directory (optional)
Example:
curl -X POST http://localhost:8888/summarize-file \
-F "files=@article.txt" \
-F "style=detailed" \
-F "language=same"
Response:
{
"success": true,
"totalFiles": 1,
"successCount": 1,
"failCount": 0,
"results": [
{
"success": true,
"fileName": "article.txt",
"summaryPath": "./output/article_summary.txt",
"summaryUrl": "/files/article_summary.txt",
"summary": "Summary content...",
"model": "gpt-5.1",
"chunks": 1
}
]
}
GET /summarize-stream
Full pipeline: Download -> Transcribe -> Summarize with SSE progress.
Query Parameters:
- url (required): YouTube URL
- style (optional): Summary style (default: concise)
- language (optional): Output language (default: same)
- model (optional): Transcription model (default: gpt-4o-mini-transcribe)
- outputPath (optional): Custom output directory
Example:
curl "http://localhost:8888/summarize-stream?url=https://www.youtube.com/watch?v=VIDEO_ID&style=bullet&model=gpt-4o-mini-transcribe"
SSE Events:
- info: Video information
- progress: Progress updates (downloading, transcribing, or summarizing)
- video-complete: Download complete
- transcribe-complete: Transcription complete
- summarize-complete: Summary complete
- complete: All operations complete
- error: Error occurred
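The three `*-complete` events mark the end of each pipeline stage, which makes it straightforward to track progress across a playlist. A minimal sketch that folds a sequence of already-parsed `(event, data)` pairs into a status summary. The event names come from the list above, while the status keys and the function name are illustrative. Whether an `error` event ends the stream is not documented, so errors are only collected.

```python
from typing import Dict, Iterable, Tuple

# Maps each stage-completion event to the counter it increments.
STAGE_DONE = {
    "video-complete": "downloaded",
    "transcribe-complete": "transcribed",
    "summarize-complete": "summarized",
}


def pipeline_status(events: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Summarize a /summarize-stream run from its (event, data) pairs."""
    status: Dict[str, object] = {
        "downloaded": 0,
        "transcribed": 0,
        "summarized": 0,
        "finished": False,
        "errors": [],
    }
    for name, data in events:
        if name in STAGE_DONE:
            status[STAGE_DONE[name]] += 1
        elif name == "complete":
            status["finished"] = True
        elif name == "error":
            status["errors"].append(data)
        # "info" and "progress" events do not change the summary
    return status
```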
File Management
GET /files-list
List all downloaded/generated files.
Example:
curl http://localhost:8888/files-list
Response:
{
"files": [
{
"name": "video.mp3",
"url": "/files/video.mp3",
"path": "./output/video.mp3"
},
{
"name": "video.txt",
"url": "/files/video.txt",
"path": "./output/video.txt"
}
]
}
GET /files/:filename
Serve a specific file.
Example:
curl http://localhost:8888/files/video.mp3 --output video.mp3
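File names taken from `/files-list` may contain spaces or other characters that are not valid in a URL path, so it is safer to percent-encode the name when building a `/files/:filename` URL. A minimal sketch with illustrative helper names; note that every `.txt` file is matched, which includes translations and summaries as well as transcriptions:

```python
from typing import List
from urllib.parse import quote

BASE_URL = "http://localhost:8888"  # base URL stated in this document


def file_url(name: str, base_url: str = BASE_URL) -> str:
    """Return the absolute /files/:filename URL for a file name."""
    return f"{base_url}/files/{quote(name, safe='')}"


def text_file_urls(listing: dict, base_url: str = BASE_URL) -> List[str]:
    """Return URLs of all .txt files in a /files-list response."""
    return [
        file_url(entry["name"], base_url)
        for entry in listing.get("files", [])
        if entry["name"].lower().endswith(".txt")
    ]
```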
Error Responses
All endpoints return error responses in the following format:
{
"error": "Error message describing what went wrong"
}
Common HTTP status codes:
- 400: Bad Request (missing required parameters)
- 500: Internal Server Error (processing failed)
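A client can rely on the `error` field to produce a readable message. This is a minimal sketch, assuming error bodies are JSON as described above. The fallback to a bare status line for non-JSON bodies is an assumption, and `error_message` is an illustrative name.

```python
import json
from typing import Optional


def error_message(status: int, body: bytes) -> Optional[str]:
    """Return the API error message, or None for a successful response."""
    if status < 400:
        return None
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid UTF-8 and invalid JSON
        return f"HTTP {status}"
    if isinstance(payload, dict) and "error" in payload:
        return str(payload["error"])
    return f"HTTP {status}"
```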
Notes
Output Paths
For every endpoint that supports the outputPath parameter:
- If not specified, files are saved to the default OUTPUT_DIR (./output)
- If specified, files are saved to the custom path provided
Models
- Transcription: Default is gpt-4o-mini-transcribe (cost-effective)
- Summarization: Default is gpt-5.1 (latest GPT model)
- Translation: Uses gpt-4o-mini (hardcoded)
File Formats
- Audio: MP3, WAV, M4A, OGG, FLAC
- Text: TXT files
- Transcription outputs: TXT, JSON, SRT, VTT (depending on model)
API Key
Ensure OPENAI_API_KEY is set in your .env file for transcription, translation, and summarization features to work.
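A quick pre-flight check can catch a missing key before the first request fails. The sketch below scans the text of a `.env` file for a non-empty `OPENAI_API_KEY` entry. It assumes the simple `KEY=value` form with `#` comments and does not handle every syntax a dotenv loader accepts; `defines_key` is an illustrative name.

```python
def defines_key(env_text: str, key: str = "OPENAI_API_KEY") -> bool:
    """Return True if env_text assigns a non-empty value to key."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        name, sep, value = line.partition("=")
        # Require an "=" sign, a matching name, and a value that is
        # non-empty once surrounding quotes are removed.
        if sep and name.strip() == key and value.strip().strip("'\""):
            return True
    return False
```

For example, `defines_key(open(".env", encoding="utf-8").read())` checks the file in the current directory.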