Universal Media Transcriber
Convert YouTube, YouTube Music, Spotify, and direct audio/video URLs into transcript .txt files, using native captions where available and Whisper otherwise.
Features
| Feature | Detail |
|---|---|
| Native captions | YouTube captions fetched instantly, no audio download |
| Whisper fallback | faster-whisper (up to 4× faster than OpenAI Whisper) |
| Spotify | Tracks / albums / playlists via spotdl → Whisper |
| Direct audio | .mp3 .mp4 .wav .m4a .webm .ogg and more |
| Playlist support | Auto-expand playlists, channels, albums |
| Batch + parallel | Multiple URLs, concurrent workers |
| Smart cache | Re-run same URL instantly |
| Auto-install | Deps install themselves on first run |
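
The features above imply that each URL is first routed by source type. A minimal sketch of such routing follows; the function name `classify_source`, the category labels, and the host checks are assumptions for illustration, not the logic actually used in transcriber.py.

```python
from urllib.parse import urlparse

# Extensions taken from the "Direct audio" row; the real tool accepts more.
AUDIO_EXTENSIONS = (".mp3", ".mp4", ".wav", ".m4a", ".webm", ".ogg")


def classify_source(url: str) -> str:
    """Return a hypothetical source label for a media URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Covers youtu.be, youtube.com, www.youtube.com and music.youtube.com.
    if host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com"):
        return "youtube"
    if host == "open.spotify.com":
        return "spotify"
    # Fall back to the file extension of the URL path.
    if parsed.path.lower().endswith(AUDIO_EXTENSIONS):
        return "direct"
    return "unknown"
```

Checking the host rather than searching the whole URL string avoids misrouting a direct file whose path happens to contain the word "youtube".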
Quick Start
# Single YouTube video
python transcriber.py https://youtu.be/VIDEO_ID
# Multiple URLs
python transcriber.py URL1 URL2 URL3
# From a file (one URL per line)
python transcriber.py --file urls.txt
# Full YouTube playlist
python transcriber.py --playlist "https://youtube.com/playlist?list=PLAYLIST_ID"
# Spotify track
python transcriber.py https://open.spotify.com/track/TRACK_ID
# Force Whisper (ignore captions)
python transcriber.py URL --whisper
# Larger model for better accuracy
python transcriber.py URL --model large-v3
# Merge all into one file
python transcriber.py URL1 URL2 --merge
# Custom output folder
python transcriber.py URL --output ./my_transcripts
Options
urls One or more media URLs
--file, -f Text file with one URL per line
--output, -o Output directory (default: ./transcripts)
--merge, -m Merge all transcripts into one file
--whisper, -w Force Whisper (skip caption check)
--model tiny | base | small | medium | large-v2 | large-v3
--workers Parallel workers (default: 4)
--no-cache Disable transcript cache
--playlist Expand playlist/channel into individual videos
--clear-cache Wipe the cache
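
For readers who want to see how these options could map onto a parser, here is a sketch using the standard-library `argparse`. It mirrors the flags listed above, but `build_parser` is a hypothetical name, the default for `--model` is an assumption (the list above does not state one), and `--playlist` is assumed to be a boolean switch rather than an option that takes a value.

```python
import argparse

MODELS = ["tiny", "base", "small", "medium", "large-v2", "large-v3"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="transcriber.py")
    parser.add_argument("urls", nargs="*", help="One or more media URLs")
    parser.add_argument("--file", "-f", help="Text file with one URL per line")
    parser.add_argument("--output", "-o", default="./transcripts",
                        help="Output directory")
    parser.add_argument("--merge", "-m", action="store_true",
                        help="Merge all transcripts into one file")
    parser.add_argument("--whisper", "-w", action="store_true",
                        help="Force Whisper (skip caption check)")
    # Assumption: "base" as the default; the README does not say.
    parser.add_argument("--model", choices=MODELS, default="base")
    parser.add_argument("--workers", type=int, default=4,
                        help="Parallel workers")
    parser.add_argument("--no-cache", action="store_true",
                        help="Disable transcript cache")
    # Assumption: a switch; the URL itself is still a positional argument.
    parser.add_argument("--playlist", action="store_true",
                        help="Expand playlist/channel into individual videos")
    parser.add_argument("--clear-cache", action="store_true",
                        help="Wipe the cache")
    return parser
```

With this layout, `build_parser().parse_args(["URL1", "URL2", "--merge"])` yields both URLs in `args.urls` and sets `args.merge` to `True`.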
Dependencies (auto-installed)
- yt-dlp: universal media downloader
- youtube-transcript-api: instant YouTube captions
- faster-whisper: optimized Whisper (CTranslate2)
- spotdl: Spotify downloader
- rich: terminal UI
Requires ffmpeg installed on your system:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows
winget install ffmpeg
Speed Guide
| Source | Method | Speed |
|---|---|---|
| YouTube with captions | Native API | < 2 sec |
| YouTube no captions | Whisper base | ~realtime |
| Spotify music | spotdl + Whisper | depends on length |
| Direct audio | Whisper | ~realtime |
Model accuracy vs speed:
tiny (fastest) → base → small → medium → large-v3 (most accurate)
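
The Method column in the speed table amounts to a captions-first strategy with a Whisper fallback. A rough sketch of that control flow is below. The two callables are placeholders for whatever caption and Whisper backends are in use, and the `"whisper"` label is an assumption; only `native_captions` appears in this README.

```python
from typing import Callable, Optional, Tuple


def transcribe(
    url: str,
    fetch_captions: Callable[[str], Optional[str]],
    run_whisper: Callable[[str], str],
    force_whisper: bool = False,
) -> Tuple[str, str]:
    """Return (transcript_text, method) using captions first, then Whisper.

    fetch_captions and run_whisper are hypothetical hooks, injected so the
    sketch stays independent of any specific library version.
    """
    if not force_whisper:
        text = fetch_captions(url)
        if text:
            # Fast path: no audio download needed.
            return text, "native_captions"
    # Slow path: download audio and run speech recognition.
    return run_whisper(url), "whisper"
```

Passing `force_whisper=True` corresponds to the `--whisper` flag: the caption check is skipped entirely.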
Output Format
======================================================================
TITLE : My Video Title
UPLOADER : Channel Name
DURATION : 0:15:42
SOURCE : youtube
METHOD : native_captions
URL : https://youtu.be/...
======================================================================
[0:00:00] Hello and welcome to this video...
[0:00:05] Today we're going to talk about...
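
A layout like the one above could be produced with a small formatter. The sketch below is an illustration, not the tool's own code: `format_transcript`, the metadata keys, the `(start_seconds, text)` segment shape, and the rule width are all assumptions.

```python
from datetime import timedelta
from typing import Dict, Iterable, Tuple

RULE = "=" * 70  # assumed width of the separator line
FIELDS = ("title", "uploader", "duration", "source", "method", "url")


def format_timestamp(seconds: float) -> str:
    """Render seconds as H:MM:SS, matching the [0:00:05] style."""
    return str(timedelta(seconds=int(seconds)))


def format_transcript(meta: Dict[str, str],
                      segments: Iterable[Tuple[float, str]]) -> str:
    """Build the header block followed by one timestamped line per segment."""
    lines = [RULE]
    for key in FIELDS:
        lines.append(f"{key.upper()} : {meta[key]}")
    lines.append(RULE)
    for start, text in segments:
        lines.append(f"[{format_timestamp(start)}] {text}")
    return "\n".join(lines) + "\n"
```

Because the timestamp comes from `timedelta`, a duration of 942 seconds renders as `0:15:42`, the same shape as the DURATION field.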
URL File Format
# urls.txt: lines starting with # are comments
https://youtu.be/VIDEO1
https://youtu.be/VIDEO2
https://open.spotify.com/track/TRACK_ID
https://example.com/podcast.mp3
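
Reading such a file takes only a few lines. The sketch below shows one way to do it; `parse_url_lines` is a hypothetical helper, and skipping blank lines is an assumption, since the format description only mentions `#` comments.

```python
from typing import Iterable, List


def parse_url_lines(lines: Iterable[str]) -> List[str]:
    """Return the URLs from an iterable of lines, skipping comments."""
    urls = []
    for line in lines:
        stripped = line.strip()
        # Assumption: blank lines are ignored along with '#' comments.
        if not stripped or stripped.startswith("#"):
            continue
        urls.append(stripped)
    return urls
```

Since an open file is itself an iterable of lines, this can be called as `parse_url_lines(open("urls.txt", encoding="utf-8"))`.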