
# Video to MP3 Transcriptor

Download YouTube videos and playlists as MP3 files and transcribe them with the OpenAI Whisper API.

## Features

- Download single YouTube videos as MP3
- Download entire playlists as MP3
- Transcribe audio files using the OpenAI Whisper API
- Command-line interface (CLI) for quick operations
- REST API for integration with other systems

## Prerequisites

- **Node.js** 18+
- **yt-dlp** installed on your system
- **ffmpeg** installed (for audio conversion)
- **OpenAI API key** (for transcription)

### Installing yt-dlp

```bash
# Windows (winget)
winget install yt-dlp

# macOS
brew install yt-dlp

# Linux (Debian/Ubuntu)
sudo apt install yt-dlp
# or
pip install yt-dlp
```

### Installing ffmpeg

```bash
# Windows (winget)
winget install ffmpeg

# macOS
brew install ffmpeg

# Linux (Debian/Ubuntu)
sudo apt install ffmpeg
```
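
Before running the tool, it can help to confirm that both external programs are reachable on your `PATH`. The following is a minimal sketch, not part of this project: the `isInstalled` helper is a hypothetical name, and it relies only on `yt-dlp --version` and `ffmpeg -version` exiting successfully.

```javascript
// Sketch: check that the external tools listed under Prerequisites are on PATH.
const { spawnSync } = require('node:child_process');

// Hypothetical helper: true when `command` can be started and exits with status 0.
function isInstalled(command, args) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  // `error` is set when the binary cannot be spawned (for example ENOENT).
  return !result.error && result.status === 0;
}

const tools = [
  { name: 'yt-dlp', args: ['--version'] },
  { name: 'ffmpeg', args: ['-version'] },
];

for (const tool of tools) {
  const status = isInstalled(tool.name, tool.args) ? 'found' : 'missing';
  console.log(`${tool.name}: ${status}`);
}
```

If either tool is reported as missing, install it with the commands above and open a new terminal so the updated `PATH` is picked up.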

## Installation

```bash
# After cloning the repository, install dependencies from its root
cd videotoMP3Transcriptor
npm install

# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

## Usage

### CLI

```bash
# Download a video as MP3
npm run cli download "https://youtube.com/watch?v=VIDEO_ID"

# Download a playlist
npm run cli download "https://youtube.com/playlist?list=PLAYLIST_ID"

# Download with custom output directory
npm run cli -- download "URL" -o ./my-folder

# Get info about a video/playlist
npm run cli info "URL"

# Transcribe an existing MP3
npm run cli transcribe ./output/video.mp3

# Transcribe with a specific language ("--" stops npm from parsing the flags itself)
npm run cli -- transcribe ./output/video.mp3 -l fr

# Transcribe with a specific model
npm run cli -- transcribe ./output/video.mp3 -m gpt-4o-mini-transcribe

# Download AND transcribe
npm run cli process "URL"

# Download and transcribe with options
npm run cli -- process "URL" -l en -m gpt-4o-transcribe
```

### Linux Scripts

Convenience scripts are available in the `scripts/` directory:

```bash
# Make scripts executable (first time only)
chmod +x scripts/*.sh

# Download video/playlist
./scripts/download.sh "https://youtube.com/watch?v=VIDEO_ID"

# Transcribe a file (second argument: language code)
./scripts/transcribe.sh ./output/video.mp3 fr

# Download + transcribe (second argument: language code)
./scripts/process.sh "https://youtube.com/watch?v=VIDEO_ID" en

# Start the API server
./scripts/server.sh

# Get video info
./scripts/info.sh "https://youtube.com/watch?v=VIDEO_ID"
```

### API Server

```bash
# Start the server
npm run server
```

The server listens on `http://localhost:3000` by default. Set `PORT` to use a different port.

#### Endpoints

##### GET /health
Health check endpoint.

##### GET /info?url=YOUTUBE_URL
Get info about a video or playlist.

```bash
curl "http://localhost:3000/info?url=https://youtube.com/watch?v=VIDEO_ID"
```
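
YouTube URLs contain `?`, `=`, and sometimes `&`, so the `url` parameter should be percent-encoded when the request is built programmatically. A minimal sketch, assuming the server reads the standard `url` query parameter shown above; `buildInfoUrl` is a hypothetical helper name:

```javascript
// Hypothetical helper: builds the /info request URL with an encoded `url` parameter.
function buildInfoUrl(baseUrl, videoUrl) {
  const endpoint = new URL('/info', baseUrl);
  // searchParams.set() percent-encodes the value, so an "&list=..." suffix
  // stays part of the YouTube URL instead of becoming a separate parameter.
  endpoint.searchParams.set('url', videoUrl);
  return endpoint.toString();
}

const infoUrl = buildInfoUrl(
  'http://localhost:3000',
  'https://youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID'
);
console.log(infoUrl);
```

The resulting string can be passed directly to `fetch()` or `curl`.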

##### POST /download
Download video(s) as MP3.

```bash
curl -X POST http://localhost:3000/download \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID"}'
```

##### POST /transcribe
Transcribe an existing audio file.

```bash
curl -X POST http://localhost:3000/transcribe \
  -H "Content-Type: application/json" \
  -d '{"filePath": "./output/video.mp3", "language": "en"}'
```

##### POST /process
Download and transcribe in one call.

```bash
curl -X POST http://localhost:3000/process \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "language": "en", "format": "txt"}'
```
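
The same call can be made from Node.js 18+, which ships a global `fetch`. This is a sketch under assumptions: `buildProcessRequest` and `processVideo` are hypothetical names, and the response is assumed to be JSON because its shape is not documented here.

```javascript
// Hypothetical helper: builds the fetch() arguments for POST /process.
function buildProcessRequest(baseUrl, videoUrl, options = {}) {
  const body = { url: videoUrl };
  // Only include optional fields that were actually provided.
  if (options.language) body.language = options.language;
  if (options.format) body.format = options.format;
  return {
    url: new URL('/process', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request. Assumption: the server answers with a JSON body.
async function processVideo(baseUrl, videoUrl, options) {
  const { url, init } = buildProcessRequest(baseUrl, videoUrl, options);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`POST /process failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (requires a running server):
// processVideo('http://localhost:3000', 'https://youtube.com/watch?v=VIDEO_ID',
//   { language: 'en', format: 'txt' }).then(console.log);
```

Keeping request construction separate from the network call makes the payload easy to inspect without starting the server.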

##### GET /files-list
List all downloaded files.

##### GET /files/:filename
Download/stream a specific file.
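
File names derived from video titles may contain spaces or characters such as `#`, so the `:filename` segment should be percent-encoded by the client. A minimal sketch; `buildFileUrl` is a hypothetical helper name:

```javascript
// Hypothetical helper: builds the URL for GET /files/:filename.
function buildFileUrl(baseUrl, filename) {
  // Drop trailing slashes so the path is not doubled up.
  const base = baseUrl.replace(/\/+$/, '');
  // encodeURIComponent escapes spaces, "#", "?", "/" and similar characters.
  return `${base}/files/${encodeURIComponent(filename)}`;
}

console.log(buildFileUrl('http://localhost:3000', 'My Video #1.mp3'));
```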

## Configuration

Environment variables (`.env`):

| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | Your OpenAI API key | None (required for transcription) |
| `PORT` | Server port | 3000 |
| `OUTPUT_DIR` | Download directory | ./output |
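
The defaults in this table could be applied along the following lines. This is an illustrative sketch rather than the project's actual code: `loadConfig` and `requireApiKey` are hypothetical names, and it assumes the `.env` file has already been loaded into `process.env`.

```javascript
// Hypothetical helper: reads the documented variables and applies their defaults.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT || '3000', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${env.PORT}`);
  }
  return {
    port,
    outputDir: env.OUTPUT_DIR || './output',
    // The key is only needed for transcription, so it may be absent.
    openaiApiKey: env.OPENAI_API_KEY || null,
  };
}

// Hypothetical helper: call this before any transcription request.
function requireApiKey(config) {
  if (!config.openaiApiKey) {
    throw new Error('OPENAI_API_KEY is required for transcription');
  }
  return config.openaiApiKey;
}

console.log(loadConfig({ PORT: '8080' }));
```

Deferring the API key check until transcription keeps download-only usage working without a key.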

## Transcription Models

| Model | Description | Formats |
|-------|-------------|---------|
| `gpt-4o-transcribe` | Best quality, latest GPT-4o (default) | txt, json |
| `gpt-4o-mini-transcribe` | Faster, cheaper, good quality | txt, json |
| `whisper-1` | Legacy Whisper model | txt, json, srt, vtt |

## Transcription Formats

- `txt` - Plain text (all models)
- `json` - JSON response (all models)
- `srt` - SubRip subtitles (whisper-1 only)
- `vtt` - WebVTT subtitles (whisper-1 only)
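
Because subtitle formats are limited to `whisper-1`, a client may want to validate the model and format combination before sending a request. A minimal sketch that encodes the two tables above; `isFormatSupported` is a hypothetical helper name:

```javascript
// Format support per model, copied from the tables in this README.
const SUPPORTED_FORMATS = new Map([
  ['gpt-4o-transcribe', ['txt', 'json']],
  ['gpt-4o-mini-transcribe', ['txt', 'json']],
  ['whisper-1', ['txt', 'json', 'srt', 'vtt']],
]);

// The README lists gpt-4o-transcribe as the default model.
const DEFAULT_MODEL = 'gpt-4o-transcribe';

// Hypothetical helper: true when `model` can produce `format`.
function isFormatSupported(format, model = DEFAULT_MODEL) {
  const formats = SUPPORTED_FORMATS.get(model);
  // Unknown models are treated as unsupported.
  return formats !== undefined && formats.includes(format);
}

console.log(isFormatSupported('srt', 'whisper-1'));
console.log(isFormatSupported('srt'));
```

With the default model, requesting `srt` or `vtt` fails this check, so subtitle output requires selecting `whisper-1` explicitly.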

## Language Codes

Common language codes for the `-l` option:
- `en` - English
- `fr` - French
- `es` - Spanish
- `de` - German
- `it` - Italian
- `pt` - Portuguese
- `zh` - Chinese
- `ja` - Japanese
- `ko` - Korean
- `ru` - Russian

Omit the `-l` option to let the model detect the language automatically.

## Project Structure

```
videotoMP3Transcriptor/
├── src/
│   ├── services/
│   │   ├── youtube.js       # YouTube download service
│   │   └── transcription.js # OpenAI transcription service
│   ├── cli.js               # CLI entry point
│   └── server.js            # Express API server
├── scripts/                  # Linux convenience scripts
│   ├── download.sh          # Download video/playlist
│   ├── transcribe.sh        # Transcribe audio file
│   ├── process.sh           # Download + transcribe
│   ├── server.sh            # Start API server
│   └── info.sh              # Get video info
├── output/                   # Downloaded files
├── .env                      # Configuration
└── package.json
```

## License

MIT