videotomp3transcriptor/README.md
debian.StillHammer 9c3874d879 🎵 v2.0: Fresh start with Camoufox + yt-dlp
- Remove old backend (transcription, translation, summarization)
- Add Camoufox stealth cookie extraction
- Add automatic cookie refresh (14 days)
- Add cookie validation
- Simplified to focus on YouTube → MP3 downloads
- Auto-retry on bot detection
- Streaming support with range requests
- Clean architecture (services pattern)
- Full documentation
2026-01-31 07:40:22 +00:00


# 🎵 Hanasuba Music Service v2.0
**YouTube to MP3 download service with Camoufox stealth cookies**
Built for [Hanasuba](https://git.etheryale.com/StillHammer/hanasuba) backend.
---
## ✨ Features
- **Stealth cookies** - Camoufox anti-detection Firefox
- **Auto-refresh** - Cookies refresh every 14 days automatically
- **Bot detection bypass** - Works around YouTube rate limiting
- **Audio-only downloads** - MP3 192kbps (configurable)
- **Streaming support** - HTTP range requests for audio players
- **Metadata extraction** - Title, artist, duration, thumbnail
- **Retry logic** - Auto-retry with fresh cookies if blocked
- **REST API** - Simple JSON API for integration
---
## 🏗️ Architecture
```
music-service (Node.js + Python)
├── Express API (Node.js)
│   ├── Download orchestration
│   └── File streaming
├── Camoufox (Python)
│   ├── Stealth cookie extraction
│   └── Cookie validation
└── yt-dlp
    └── YouTube download (using stealth cookies)
```
**Why this stack?**
- **Camoufox** = Undetectable Firefox (bypasses bot detection)
- **yt-dlp** = Best YouTube downloader (handles all edge cases)
- **Node.js** = Fast I/O for streaming
---
## 📦 Installation
### Prerequisites
- Node.js 18+
- Python 3.9+
- yt-dlp
- ffmpeg
### Install
```bash
# Clone repo
git clone https://git.etheryale.com/StillHammer/videotomp3transcriptor.git
cd videotomp3transcriptor
git checkout music-service-v2
# Install Node dependencies
npm install
# Install Python dependencies + browsers
npm run setup
# Configure
cp .env.example .env
nano .env # Edit PORT, STORAGE_PATH, etc.
# Start
npm start
```
---
## 🚀 Usage
### Start server
```bash
npm start
```
Server runs on `http://localhost:8889` (configurable via `.env`)
### API Endpoints
#### **POST /download**
Download YouTube video to MP3.
```bash
curl -X POST http://localhost:8889/download \
  -H "Content-Type: application/json" \
  -d '{"url": "https://youtube.com/watch?v=dQw4w9WgXcQ"}'
```
Response:
```json
{
  "success": true,
  "title": "Rick Astley - Never Gonna Give You Up",
  "duration": 212,
  "artist": "Rick Astley",
  "filePath": "/var/hanasuba/music/dQw4w9WgXcQ.mp3",
  "fileName": "dQw4w9WgXcQ.mp3",
  "youtubeId": "dQw4w9WgXcQ",
  "thumbnail": "https://..."
}
```
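In the example response, `youtubeId`, `fileName`, and `filePath` all appear to be derived from the video ID in the request URL. The following is a minimal sketch of that derivation, not the service's actual code: `extractYoutubeId` and `buildFilePath` are hypothetical helper names, and the set of accepted hostnames is an assumption.

```javascript
const path = require('node:path');

// Hypothetical helper: pull the 11-character video ID out of common YouTube URL shapes.
// Returns null for anything that is not a recognisable YouTube video URL.
function extractYoutubeId(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return null; // not a parseable URL
  }
  // Assumption: treat www., m. and music. subdomains like the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.|music\.)/, '');
  let id = null;
  if (host === 'youtu.be') {
    id = url.pathname.slice(1); // short links carry the ID in the path
  } else if (host === 'youtube.com') {
    id = url.searchParams.get('v'); // watch links carry it in ?v=
  }
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}

// Hypothetical helper: storage path + "<id>.mp3", as in the example response.
function buildFilePath(storagePath, youtubeId) {
  return path.posix.join(storagePath, `${youtubeId}.mp3`);
}
```

Validating the ID against a strict pattern before using it in a file name also keeps unexpected characters out of `STORAGE_PATH`.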
#### **GET /stream/:filename**
Stream MP3 file (supports range requests for seeking).
```bash
curl http://localhost:8889/stream/dQw4w9WgXcQ.mp3 --output song.mp3
```
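Seeking works because the player sends a `Range: bytes=start-end` header and the server answers with `206 Partial Content`. The sketch below shows one way to parse a single-range header and build the matching response headers. `parseRange` and `partialHeaders` are illustrative names, not functions from this repository, and multi-range requests are deliberately ignored.

```javascript
// Parse a single-range "bytes=start-end" header against a file of `size` bytes.
// Returns { start, end } (both inclusive), or null if the header is absent,
// malformed, or cannot be satisfied.
function parseRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!match || size <= 0) return null;
  const [, rawStart, rawEnd] = match;
  if (rawStart === '' && rawEnd === '') return null;

  let start;
  let end;
  if (rawStart === '') {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = Number(rawEnd);
    if (suffix === 0) return null;
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = Number(rawStart);
    // An open-ended or oversized end is clamped to the last byte.
    end = rawEnd === '' ? size - 1 : Math.min(Number(rawEnd), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}

// Headers for a 206 Partial Content response covering the parsed range.
function partialHeaders({ start, end }, size) {
  return {
    'Content-Range': `bytes ${start}-${end}/${size}`,
    'Accept-Ranges': 'bytes',
    'Content-Length': end - start + 1,
    'Content-Type': 'audio/mpeg',
  };
}
```

A handler would then stream only that slice, for example with `fs.createReadStream(file, { start, end })`. This sketch returns `null` both for a missing header and for an unsatisfiable range; a fuller implementation would send the whole file with `200` in the first case and `416 Range Not Satisfiable` in the second.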
#### **DELETE /file/:filename**
Delete downloaded file.
```bash
curl -X DELETE http://localhost:8889/file/dQw4w9WgXcQ.mp3
```
#### **GET /health**
Health check.
```bash
curl http://localhost:8889/health
```
#### **POST /admin/refresh-cookies**
Force refresh cookies (normally automatic).
```bash
curl -X POST http://localhost:8889/admin/refresh-cookies
```
---
## 🍪 How Cookies Work
### Automatic Refresh
Cookies are **automatically refreshed** in these cases:
1. **Every 14 days** (proactive refresh)
2. **On startup** (if invalid)
3. **Every 12 hours** (validation check)
4. **On bot detection** (retry with fresh cookies)
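The four triggers above can be read as one decision function plus a timer. This is a sketch under assumptions: `refreshReason` and `scheduleValidation` are hypothetical names, and the shape of the state object (`ageMs`, `valid`, `botDetected`) is invented for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const MAX_COOKIE_AGE_MS = 14 * DAY_MS; // proactive refresh interval
const VALIDATION_INTERVAL_MS = 12 * 60 * 60 * 1000; // periodic validation check

// Returns the reason a refresh is due, or null when the current cookies can be kept.
function refreshReason({ ageMs, valid, botDetected }) {
  if (botDetected) return 'bot-detection';
  if (!valid) return 'invalid';
  if (ageMs >= MAX_COOKIE_AGE_MS) return 'expired';
  return null;
}

// Run `check` once at startup and then every 12 hours.
// Returns the timer so callers can clear it on shutdown.
function scheduleValidation(check) {
  check();
  const timer = setInterval(check, VALIDATION_INTERVAL_MS);
  timer.unref(); // do not keep the process alive just for this timer
  return timer;
}
```

Keeping the decision in a pure function makes each trigger easy to test without touching the browser or the network.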
### Manual Refresh
```bash
# Via API
curl -X POST http://localhost:8889/admin/refresh-cookies
# Via npm script
npm run cookies:extract
```
### Validation
```bash
# Check if cookies are valid
npm run cookies:validate
```
---
## 🔧 Configuration
### Environment Variables
See `.env.example`:
```bash
PORT=8889 # Server port
STORAGE_PATH=/var/hanasuba/music # Where to save MP3 files
PYTHON_PATH=python3 # Python binary
YTDLP_PATH=yt-dlp # yt-dlp binary
ALLOWED_ORIGINS=* # CORS
```
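A minimal sketch of how these variables might be read at startup, using the values above as defaults. `loadConfig` is a hypothetical name, and treating `ALLOWED_ORIGINS` as a comma-separated list is an assumption; check the actual source before relying on it.

```javascript
// Build the service configuration from an environment object,
// falling back to the defaults shown in `.env.example`.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '8889', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  // Assumption: multiple origins are separated by commas.
  const origins = (env.ALLOWED_ORIGINS ?? '*')
    .split(',')
    .map((origin) => origin.trim())
    .filter(Boolean);
  return {
    port,
    storagePath: env.STORAGE_PATH ?? '/var/hanasuba/music',
    pythonPath: env.PYTHON_PATH ?? 'python3',
    ytdlpPath: env.YTDLP_PATH ?? 'yt-dlp',
    allowedOrigins: origins.includes('*') ? '*' : origins,
  };
}
```

Failing fast on an invalid `PORT` surfaces configuration mistakes at startup rather than on the first request.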
### Audio Quality
Pass an optional `quality` parameter in the download request. Values shown here are `"128k"`, `"192k"` (the default), and `"320k"`:
```json
{
  "url": "https://youtube.com/watch?v=...",
  "quality": "320k"
}
```
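One plausible way to map `quality` onto a yt-dlp invocation is sketched below. The yt-dlp flags used (`--extract-audio`, `--audio-format`, `--audio-quality`, `--cookies`, `--output`) are real options, but `resolveQuality`, `buildYtdlpArgs`, the strict three-value allow-list, and the parameter names are assumptions, not the service's confirmed behavior.

```javascript
const ALLOWED_QUALITIES = new Set(['128k', '192k', '320k']);
const DEFAULT_QUALITY = '192k';

// Fall back to the default when `quality` is omitted; reject unexpected values.
function resolveQuality(quality) {
  if (quality === undefined || quality === null) return DEFAULT_QUALITY;
  const normalized = String(quality).toLowerCase();
  if (!ALLOWED_QUALITIES.has(normalized)) {
    throw new Error(`Unsupported quality: ${quality}`);
  }
  return normalized;
}

// Assemble yt-dlp arguments for an audio-only MP3 download.
function buildYtdlpArgs({ url, quality, cookiesPath, outputTemplate }) {
  return [
    '--extract-audio',
    '--audio-format', 'mp3',
    // yt-dlp accepts a specific bitrate such as 192K here.
    '--audio-quality', resolveQuality(quality).toUpperCase(),
    '--cookies', cookiesPath,
    '--output', outputTemplate,
    url,
  ];
}
```

Passing the arguments as an array to `child_process.spawn` (rather than building a shell string) avoids quoting problems with user-supplied URLs.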
---
## 🐛 Troubleshooting
### "Sign in to confirm you're not a bot"
**Cause**: Cookies have expired or are invalid. Force a refresh, or restart the service:
```bash
# Force refresh
curl -X POST http://localhost:8889/admin/refresh-cookies
# Or restart service (auto-refresh on startup)
npm start
```
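The service's auto-retry on bot detection follows roughly this pattern. The sketch is illustrative only: `isBotDetection` and `downloadWithRetry` are hypothetical names, `download` and `refreshCookies` stand in for whatever the service actually calls, and the single-retry default is an assumption.

```javascript
// Hypothetical check: does the error text contain YouTube's bot-detection message?
// The "." in "you.re" tolerates both straight and curly apostrophes.
function isBotDetection(message) {
  return /sign in to confirm you.re not a bot/i.test(message || '');
}

// Run `download`; if it fails with a bot-detection error, refresh cookies and retry.
// Any other error, or a failure after the last retry, is rethrown to the caller.
async function downloadWithRetry(download, refreshCookies, maxRetries = 1) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await download();
    } catch (err) {
      if (attempt >= maxRetries || !isBotDetection(err.message)) throw err;
      await refreshCookies();
    }
  }
}
```

Bounding the number of retries matters here: if fresh cookies do not help, repeating the request only adds load and may prolong the block.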
### yt-dlp not found
```bash
# Install yt-dlp
pip install yt-dlp
# or
sudo apt install yt-dlp
```
### Camoufox install fails
```bash
# Manual install
pip install camoufox camoufox-captcha playwright
playwright install firefox
```
### Downloads slow
This is expected: YouTube throttles downloads. The service uses the `mweb` client for the best available speed.
---
## 🔐 Security
- Cookies file permissions: `600` (owner read/write only)
- Cookies **never** logged or exposed
- Cookies stored locally only
- CORS configurable via `ALLOWED_ORIGINS`
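For reference, the `600` permission on the cookies file can be enforced from Node.js as sketched here. `writeCookiesFile` is a hypothetical name, and the sketch assumes a POSIX filesystem (the mode bits have little effect on Windows).

```javascript
const fs = require('node:fs');

// Write the cookies file so that only the owning user can read or write it.
function writeCookiesFile(filePath, contents) {
  // `mode` only applies when the file is created, so chmod afterwards as well
  // to tighten permissions on a file that already existed.
  fs.writeFileSync(filePath, contents, { mode: 0o600 });
  fs.chmodSync(filePath, 0o600);
}
```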
---
## 🚢 Deployment
### PM2 (recommended)
```bash
pm2 start src/server.js --name music-service
pm2 save
pm2 startup
```
### systemd
```ini
[Unit]
Description=Hanasuba Music Service
After=network.target
[Service]
Type=simple
User=debian
WorkingDirectory=/home/debian/videotomp3transcriptor
ExecStart=/usr/bin/node src/server.js
Restart=on-failure
[Install]
WantedBy=multi-user.target
```
```bash
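# Assumes the unit above is saved as /etc/systemd/system/music-service.service
sudo systemctl daemon-reload
sudo systemctl enable music-service
sudo systemctl start music-service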
```
---
## 📊 Monitoring
Check service status:
```bash
# Health check
curl http://localhost:8889/health
# Cookies status
curl http://localhost:8889/admin/cookies-status
# Logs (PM2)
pm2 logs music-service
# Logs (systemd)
journalctl -u music-service -f
```
---
## 🔗 Integration with Hanasuba
Hanasuba (Rust) calls this service via HTTP:
```rust
// In Hanasuba src/music/client.rs
use serde_json::json;

// POST the YouTube URL to the music service and decode the JSON response.
let response = reqwest::Client::new()
    .post("http://localhost:8889/download")
    .json(&json!({ "url": youtube_url }))
    .send()
    .await?;
let result: DownloadResult = response.json().await?;
// Save metadata to PostgreSQL
```
---
## 📝 Development
```bash
# Dev mode (auto-restart on changes)
npm run dev
# Extract cookies manually
npm run cookies:extract
# Validate cookies
npm run cookies:validate
```
---
## 🆚 v1 vs v2
| Feature | v1 (legacy) | v2 (current) |
|---------|-------------|--------------|
| Cookies | Firefox standard | **Camoufox stealth** |
| Auto-refresh | ❌ Manual | ✅ Automatic (14 days) |
| Bot detection | ❌ Fails often | ✅ Auto-retry |
| Validation | ❌ None | ✅ Every 12h |
| Reliability | ~60% | **~95%** |
| Transcription | ✅ OpenAI Whisper | ❌ Removed (not needed) |
| Translation | ✅ Claude | ❌ Removed (not needed) |
v2 is **focused** on one thing: reliable YouTube → MP3 downloads.
---
## 📄 License
MIT
---
## 🙏 Credits
- [Camoufox](https://github.com/daijro/camoufox) - Stealth Firefox
- [yt-dlp](https://github.com/yt-dlp/yt-dlp) - YouTube downloader
- [Hanasuba](https://git.etheryale.com/StillHammer/hanasuba) - Main backend
---
**Built with ❤️ for Hanasuba**