Convert backlog, project-plan, save-doc, youtube-transcriber, and z-image from skills/ to commands/ so they appear as user-invocable slash commands with plugin name prefixes. Update youtube-transcriber: switch default model from gpt-4o-transcribe to gpt-4o-mini-transcribe (OpenAI's current recommendation, half cost) and fix cost estimates that were 4-7x too high. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
165 lines
4.8 KiB
Markdown
165 lines
4.8 KiB
Markdown
---
|
|
description: Transcribe YouTube videos of any length using GPT-4o-mini-transcribe
|
|
---
|
|
|
|
# YouTube Transcriber - High-Quality Video Transcription
|
|
|
|
## Script Location
|
|
**Primary script**: `$YOUTUBE_TRANSCRIBER_DIR/transcribe.py`
|
|
|
|
## Key Features
|
|
- **Parallel processing**: Multiple videos can be transcribed simultaneously
|
|
- **Unlimited length**: Auto-chunks videos >10 minutes to prevent API limits
|
|
- **Organized output**:
|
|
- Transcripts → `output/` directory
|
|
- Temp files → `temp/` directory (auto-cleaned)
|
|
- **High quality**: Uses GPT-4o-mini-transcribe by default (OpenAI's recommended model)
|
|
- **Cost options**: Can use `-m gpt-4o-transcribe` for the full-size model at 2x cost
|
|
|
|
## Basic Usage
|
|
|
|
### Single Video
|
|
```bash
|
|
cd $YOUTUBE_TRANSCRIBER_DIR
|
|
uv run python transcribe.py "https://youtube.com/watch?v=VIDEO_ID"
|
|
```
|
|
|
|
**Output**: `output/Video_Title_2025-11-10.txt`
|
|
|
|
### Multiple Videos in Parallel
|
|
```bash
|
|
cd $YOUTUBE_TRANSCRIBER_DIR
|
|
|
|
# Launch all in background simultaneously
|
|
uv run python transcribe.py "URL1" &
|
|
uv run python transcribe.py "URL2" &
|
|
uv run python transcribe.py "URL3" &
|
|
wait
|
|
```
|
|
|
|
**Why parallel works**: Each transcription uses unique temp files (UUID-based) in `temp/` directory.
|
|
|
|
### Higher Quality Mode
|
|
```bash
|
|
cd $YOUTUBE_TRANSCRIBER_DIR
|
|
uv run python transcribe.py "URL" -m gpt-4o-transcribe
|
|
```
|
|
|
|
**When to use full model**: Noisy audio, critical accuracy requirements. Costs 2x more ($0.006/min vs $0.003/min).
|
|
|
|
## Command Options
|
|
|
|
```bash
|
|
uv run python transcribe.py [URL] [OPTIONS]
|
|
|
|
Options:
|
|
-o, --output PATH Custom output filename (default: auto-generated in output/)
|
|
-m, --model MODEL Transcription model (default: gpt-4o-mini-transcribe)
|
|
Options: gpt-4o-mini-transcribe, gpt-4o-transcribe, whisper-1
|
|
-p, --prompt TEXT Context prompt for better accuracy
|
|
--chunk-duration MINUTES Chunk size for long videos (default: 10 minutes)
|
|
--keep-audio Keep temp audio files (default: auto-delete)
|
|
```
|
|
|
|
## Workflow for User Requests
|
|
|
|
### Single Video Request
|
|
1. Change to transcriber directory
|
|
2. Run script with URL
|
|
3. Report output file location in `output/` directory
|
|
|
|
### Multiple Video Request
|
|
1. Change to transcriber directory
|
|
2. Launch all transcriptions in parallel using background processes
|
|
3. Wait for all to complete
|
|
4. Report all output files in `output/` directory
|
|
|
|
### Testing/Cost-Conscious Request
|
|
1. Default model (`gpt-4o-mini-transcribe`) is already the cheapest GPT-4o option
|
|
2. For even cheaper: suggest Groq's Whisper API as an alternative
|
|
3. Quality is excellent for most YouTube content
|
|
|
|
## Technical Details
|
|
|
|
**How it works**:
|
|
1. Downloads audio from YouTube (via yt-dlp)
|
|
2. Saves to unique temp file: `temp/download_{UUID}.mp3`
|
|
3. Splits long videos (>10 min) into chunks automatically
|
|
4. Transcribes with OpenAI API (GPT-4o-mini-transcribe)
|
|
5. Saves transcript: `output/Video_Title_YYYY-MM-DD.txt`
|
|
6. Cleans up temp files automatically
|
|
|
|
**Parallel safety**:
|
|
- Each process uses UUID-based temp files
|
|
- No file conflicts between parallel processes
|
|
- Temp files auto-cleaned after completion
|
|
|
|
**Auto-chunking**:
|
|
- Videos >10 minutes: Split into 10-minute chunks
|
|
- Context preserved between chunks
|
|
- Prevents API response truncation
|
|
|
|
## Requirements
|
|
- OpenAI API key: `$OPENAI_API_KEY` environment variable
|
|
- Python 3.10+ with uv package manager
|
|
- FFmpeg (for audio processing)
|
|
- yt-dlp (for YouTube downloads)
|
|
|
|
**Check requirements**:
|
|
```bash
|
|
echo $OPENAI_API_KEY # Should show API key
|
|
which ffmpeg # Should show path
|
|
```
|
|
|
|
## Cost Estimates
|
|
|
|
Default model (`gpt-4o-mini-transcribe` at $0.003/min):
|
|
|
|
- **5-minute video**: ~$0.015
|
|
- **25-minute video**: ~$0.075
|
|
- **60-minute video**: ~$0.18
|
|
|
|
Full model (`gpt-4o-transcribe` at $0.006/min):
|
|
|
|
- **5-minute video**: ~$0.03
|
|
- **25-minute video**: ~$0.15
|
|
- **60-minute video**: ~$0.36
|
|
|
|
## Quick Reference
|
|
|
|
```bash
|
|
# Single video (default quality)
|
|
uv run python transcribe.py "URL"
|
|
|
|
# Single video (higher quality, 2x cost)
|
|
uv run python transcribe.py "URL" -m gpt-4o-transcribe
|
|
|
|
# Multiple videos in parallel
|
|
for url in URL1 URL2 URL3; do
|
|
uv run python transcribe.py "$url" &
|
|
done
|
|
wait
|
|
|
|
# With custom output
|
|
uv run python transcribe.py "URL" -o custom_name.txt
|
|
|
|
# With context prompt
|
|
uv run python transcribe.py "URL" -p "Context about video content"
|
|
```
|
|
|
|
## Integration
|
|
|
|
**With fabric**: Process transcripts after generation
|
|
```bash
|
|
cat output/Video_Title_2025-11-10.txt | fabric -p extract_wisdom
|
|
```
|
|
|
|
## Notes
|
|
|
|
- Script requires being in its directory to work correctly
|
|
- Always change to `$YOUTUBE_TRANSCRIBER_DIR` first
|
|
- Parallel execution is safe and recommended for multiple videos
|
|
- Default model (gpt-4o-mini-transcribe) is recommended for most content
|
|
- Output files automatically named with video title + date
|
|
- Temp files automatically cleaned after transcription
|