claude-plugins/plugins/youtube-transcriber/commands/transcribe.md
Cal Corum 51fe634ff5 refactor: convert 5 more skills to commands, update transcriber defaults
Convert backlog, project-plan, save-doc, youtube-transcriber, and
z-image from skills/ to commands/ so they appear as user-invocable
slash commands with plugin name prefixes.

Update youtube-transcriber: switch default model from gpt-4o-transcribe
to gpt-4o-mini-transcribe (OpenAI's current recommendation, half cost)
and fix cost estimates that were 4-7x too high.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 14:41:37 -05:00

165 lines
4.8 KiB
Markdown

---
description: Transcribe YouTube videos of any length using GPT-4o-mini-transcribe
---
# YouTube Transcriber - High-Quality Video Transcription
## Script Location
**Primary script**: `$YOUTUBE_TRANSCRIBER_DIR/transcribe.py`
## Key Features
- **Parallel processing**: Multiple videos can be transcribed simultaneously
- **Unlimited length**: Auto-chunks videos >10 minutes to prevent API limits
- **Organized output**:
- Transcripts → `output/` directory
- Temp files → `temp/` directory (auto-cleaned)
- **High quality**: Uses GPT-4o-mini-transcribe by default (OpenAI's recommended model)
- **Cost options**: Can use `-m gpt-4o-transcribe` for the full-size model at 2x cost
## Basic Usage
### Single Video
```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "https://youtube.com/watch?v=VIDEO_ID"
```
**Output**: `output/Video_Title_2025-11-10.txt`
### Multiple Videos in Parallel
```bash
cd $YOUTUBE_TRANSCRIBER_DIR
# Launch all in background simultaneously
uv run python transcribe.py "URL1" &
uv run python transcribe.py "URL2" &
uv run python transcribe.py "URL3" &
wait
```
**Why parallel works**: Each transcription uses unique temp files (UUID-based) in `temp/` directory.
### Higher Quality Mode
```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "URL" -m gpt-4o-transcribe
```
**When to use full model**: Noisy audio, critical accuracy requirements. Costs 2x more ($0.006/min vs $0.003/min).
## Command Options
```bash
uv run python transcribe.py [URL] [OPTIONS]
Options:
-o, --output PATH Custom output filename (default: auto-generated in output/)
-m, --model MODEL Transcription model (default: gpt-4o-mini-transcribe)
Options: gpt-4o-mini-transcribe, gpt-4o-transcribe, whisper-1
-p, --prompt TEXT Context prompt for better accuracy
--chunk-duration MINUTES Chunk size for long videos (default: 10 minutes)
--keep-audio Keep temp audio files (default: auto-delete)
```
## Workflow for User Requests
### Single Video Request
1. Change to transcriber directory
2. Run script with URL
3. Report output file location in `output/` directory
### Multiple Video Request
1. Change to transcriber directory
2. Launch all transcriptions in parallel using background processes
3. Wait for all to complete
4. Report all output files in `output/` directory
### Testing/Cost-Conscious Request
1. Default model (`gpt-4o-mini-transcribe`) is already the cheapest GPT-4o option
2. For even cheaper: suggest Groq's Whisper API as an alternative
3. Quality is excellent for most YouTube content
## Technical Details
**How it works**:
1. Downloads audio from YouTube (via yt-dlp)
2. Saves to unique temp file: `temp/download_{UUID}.mp3`
3. Splits long videos (>10 min) into chunks automatically
4. Transcribes with OpenAI API (GPT-4o-mini-transcribe)
5. Saves transcript: `output/Video_Title_YYYY-MM-DD.txt`
6. Cleans up temp files automatically
**Parallel safety**:
- Each process uses UUID-based temp files
- No file conflicts between parallel processes
- Temp files auto-cleaned after completion
**Auto-chunking**:
- Videos >10 minutes: Split into 10-minute chunks
- Context preserved between chunks
- Prevents API response truncation
## Requirements
- OpenAI API key: `$OPENAI_API_KEY` environment variable
- Python 3.10+ with uv package manager
- FFmpeg (for audio processing)
- yt-dlp (for YouTube downloads)
**Check requirements**:
```bash
echo $OPENAI_API_KEY # Should show API key
which ffmpeg # Should show path
```
## Cost Estimates
Default model (`gpt-4o-mini-transcribe` at $0.003/min):
- **5-minute video**: ~$0.015
- **25-minute video**: ~$0.075
- **60-minute video**: ~$0.18
Full model (`gpt-4o-transcribe` at $0.006/min):
- **5-minute video**: ~$0.03
- **25-minute video**: ~$0.15
- **60-minute video**: ~$0.36
## Quick Reference
```bash
# Single video (default quality)
uv run python transcribe.py "URL"
# Single video (higher quality, 2x cost)
uv run python transcribe.py "URL" -m gpt-4o-transcribe
# Multiple videos in parallel
for url in URL1 URL2 URL3; do
uv run python transcribe.py "$url" &
done
wait
# With custom output
uv run python transcribe.py "URL" -o custom_name.txt
# With context prompt
uv run python transcribe.py "URL" -p "Context about video content"
```
## Integration
**With fabric**: Process transcripts after generation
```bash
cat output/Video_Title_2025-11-10.txt | fabric -p extract_wisdom
```
## Notes
- Script requires being in its directory to work correctly
- Always change to `$YOUTUBE_TRANSCRIBER_DIR` first
- Parallel execution is safe and recommended for multiple videos
- Default model (gpt-4o-mini-transcribe) is recommended for most content
- Output files automatically named with video title + date
- Temp files automatically cleaned after transcription