---
description: Transcribe YouTube videos of any length using GPT-4o-mini-transcribe
---
# YouTube Transcriber - High-Quality Video Transcription
## Script Location

Primary script: `$YOUTUBE_TRANSCRIBER_DIR/transcribe.py`
## Key Features

- **Parallel processing**: Multiple videos can be transcribed simultaneously
- **Unlimited length**: Auto-chunks videos >10 minutes to prevent API limits
- **Organized output**:
  - Transcripts → `output/` directory
  - Temp files → `temp/` directory (auto-cleaned)
- **High quality**: Uses GPT-4o-mini-transcribe by default (OpenAI's recommended model)
- **Cost options**: Can use `-m gpt-4o-transcribe` for the full-size model at 2x cost
## Basic Usage

### Single Video

```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "https://youtube.com/watch?v=VIDEO_ID"
```

Output: `output/Video_Title_2025-11-10.txt`
### Multiple Videos in Parallel

```bash
cd $YOUTUBE_TRANSCRIBER_DIR

# Launch all in background simultaneously
uv run python transcribe.py "URL1" &
uv run python transcribe.py "URL2" &
uv run python transcribe.py "URL3" &
wait
```

Why parallel works: Each transcription uses unique temp files (UUID-based) in the `temp/` directory.
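As a sketch of why parallel runs cannot collide (illustrative Python only — the helper name is hypothetical, not the actual `transcribe.py` internals), each download gets a fresh UUID in its filename:

```python
import uuid
from pathlib import Path


def unique_temp_path(temp_dir: str = "temp") -> Path:
    """Build a collision-free temp filename like temp/download_{UUID}.mp3
    (sketch of the UUID-based naming described above)."""
    Path(temp_dir).mkdir(exist_ok=True)
    return Path(temp_dir) / f"download_{uuid.uuid4().hex}.mp3"


# Two parallel runs always get distinct paths, so they never
# overwrite each other's audio.
a, b = unique_temp_path(), unique_temp_path()
assert a != b
```

Because every process writes to its own UUID-named file, no locking is needed between background jobs.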
### Higher Quality Mode

```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "URL" -m gpt-4o-transcribe
```

When to use the full model: noisy audio, critical accuracy requirements. Costs 2x more ($0.006/min vs $0.003/min).
## Command Options

```
uv run python transcribe.py [URL] [OPTIONS]

Options:
  -o, --output PATH         Custom output filename (default: auto-generated in output/)
  -m, --model MODEL         Transcription model (default: gpt-4o-mini-transcribe)
                            Options: gpt-4o-mini-transcribe, gpt-4o-transcribe, whisper-1
  -p, --prompt TEXT         Context prompt for better accuracy
  --chunk-duration MINUTES  Chunk size for long videos (default: 10 minutes)
  --keep-audio              Keep temp audio files (default: auto-delete)
```
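For reference, the interface above can be expressed as an `argparse` sketch. This is a hypothetical mirror of the documented options — the real `transcribe.py` may define its parser differently:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argparse mirror of the documented CLI (illustrative sketch)."""
    p = argparse.ArgumentParser(prog="transcribe.py")
    p.add_argument("url", help="YouTube video URL")
    p.add_argument("-o", "--output", help="custom output filename")
    p.add_argument(
        "-m", "--model",
        default="gpt-4o-mini-transcribe",
        choices=["gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1"],
    )
    p.add_argument("-p", "--prompt", help="context prompt for better accuracy")
    p.add_argument("--chunk-duration", type=int, default=10,
                   help="chunk size in minutes for long videos")
    p.add_argument("--keep-audio", action="store_true",
                   help="keep temp audio files instead of auto-deleting")
    return p


args = build_parser().parse_args(["URL", "-m", "whisper-1"])
```

Note that defaults match the documentation: `gpt-4o-mini-transcribe` and 10-minute chunks.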
## Workflow for User Requests

### Single Video Request

- Change to transcriber directory
- Run script with URL
- Report output file location in `output/` directory
### Multiple Video Request

- Change to transcriber directory
- Launch all transcriptions in parallel using background processes
- Wait for all to complete
- Report all output files in `output/` directory
### Testing/Cost-Conscious Request

- Default model (`gpt-4o-mini-transcribe`) is already the cheapest GPT-4o option
- For even cheaper: suggest Groq's Whisper API as an alternative
- Quality is excellent for most YouTube content
## Technical Details

How it works:

- Downloads audio from YouTube (via yt-dlp)
- Saves to unique temp file: `temp/download_{UUID}.mp3`
- Splits long videos (>10 min) into chunks automatically
- Transcribes with OpenAI API (GPT-4o-mini-transcribe)
- Saves transcript: `output/Video_Title_YYYY-MM-DD.txt`
- Cleans up temp files automatically
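The naming step can be sketched as follows. This is a hypothetical helper — the script's exact title-sanitization rules may differ:

```python
import re
from datetime import date
from pathlib import Path


def output_path(title: str, out_dir: str = "output") -> Path:
    """Build output/Video_Title_YYYY-MM-DD.txt from a video title
    (sketch of the naming scheme described above)."""
    # Collapse any run of non-word characters (spaces, colons, slashes)
    # into a single underscore, then trim stray underscores.
    safe = re.sub(r"[^\w]+", "_", title).strip("_")
    return Path(out_dir) / f"{safe}_{date.today().isoformat()}.txt"


print(output_path("My Video: Part 1"))
```

Sanitizing the title keeps filenames shell-safe while preserving readability.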
Parallel safety:
- Each process uses UUID-based temp files
- No file conflicts between parallel processes
- Temp files auto-cleaned after completion
Auto-chunking:
- Videos >10 minutes: Split into 10-minute chunks
- Context preserved between chunks
- Prevents API response truncation
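The chunk math behind auto-chunking can be sketched as below (illustrative only, not the script's actual code):

```python
def chunk_bounds(duration_s: float, chunk_min: int = 10) -> list[tuple[float, float]]:
    """Split a duration (seconds) into consecutive (start, end) windows
    of chunk_min minutes each — a sketch of the auto-chunking step."""
    step = chunk_min * 60
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + step, duration_s)))
        start += step
    return bounds


# A 25-minute video yields three chunks: 10 + 10 + 5 minutes.
print(chunk_bounds(25 * 60.0))
```

Each chunk is transcribed separately and the results are concatenated, which is why very long videos never hit API response limits.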
## Requirements

- OpenAI API key: `$OPENAI_API_KEY` environment variable
- Python 3.10+ with uv package manager
- FFmpeg (for audio processing)
- yt-dlp (for YouTube downloads)
Check requirements:

```bash
echo $OPENAI_API_KEY  # Should show API key
which ffmpeg          # Should show path
```
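The same checks can be run from Python with `shutil.which` (a portable sketch; the tool names are taken from the requirements list above):

```python
import os
import shutil


def check_requirements() -> list[str]:
    """Return a list of missing prerequisites (empty list = all present)."""
    missing = []
    if not os.environ.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY environment variable")
    for tool in ("ffmpeg", "yt-dlp", "uv"):
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


print(check_requirements())
```

An empty list means everything is in place; otherwise report the missing items to the user before attempting a transcription.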
## Cost Estimates

Default model (`gpt-4o-mini-transcribe` at $0.003/min):
- 5-minute video: ~$0.015
- 25-minute video: ~$0.075
- 60-minute video: ~$0.18
Full model (`gpt-4o-transcribe` at $0.006/min):
- 5-minute video: ~$0.03
- 25-minute video: ~$0.15
- 60-minute video: ~$0.36
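These figures follow directly from the per-minute rates. A minimal estimator, assuming the rates listed above:

```python
# USD per audio minute, per the estimates above.
RATES = {
    "gpt-4o-mini-transcribe": 0.003,
    "gpt-4o-transcribe": 0.006,
}


def estimate_cost(minutes: float, model: str = "gpt-4o-mini-transcribe") -> float:
    """Estimated transcription cost in USD for a video of the given length."""
    return round(minutes * RATES[model], 4)


# Matches the tables above:
assert estimate_cost(60) == 0.18
assert estimate_cost(25, "gpt-4o-transcribe") == 0.15
```

Useful for quoting a cost to the user before launching a batch of long videos.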
## Quick Reference

```bash
# Single video (default quality)
uv run python transcribe.py "URL"

# Single video (higher quality, 2x cost)
uv run python transcribe.py "URL" -m gpt-4o-transcribe

# Multiple videos in parallel
for url in URL1 URL2 URL3; do
  uv run python transcribe.py "$url" &
done
wait

# With custom output
uv run python transcribe.py "URL" -o custom_name.txt

# With context prompt
uv run python transcribe.py "URL" -p "Context about video content"
```
## Integration

With fabric: Process transcripts after generation

```bash
cat output/Video_Title_2025-11-10.txt | fabric -p extract_wisdom
```
## Notes

- Script requires being in its directory to work correctly
- Always change to `$YOUTUBE_TRANSCRIBER_DIR` first
- Parallel execution is safe and recommended for multiple videos
- Default model (`gpt-4o-mini-transcribe`) is recommended for most content
- Output files automatically named with video title + date
- Temp files automatically cleaned after transcription