---
description: Transcribe YouTube videos of any length using GPT-4o-mini-transcribe
---
# YouTube Transcriber - High-Quality Video Transcription
## Script Location

Primary script: `$YOUTUBE_TRANSCRIBER_DIR/transcribe.py`
## Key Features

- **Parallel processing**: Multiple videos can be transcribed simultaneously
- **Unlimited length**: Auto-chunks videos >10 minutes to prevent API limits
- **Organized output**:
  - Transcripts → `output/` directory
  - Temp files → `temp/` directory (auto-cleaned)
- **High quality**: Uses GPT-4o-mini-transcribe by default (OpenAI's recommended model)
- **Cost options**: Can use `-m gpt-4o-transcribe` for the full-size model at 2x cost
## Basic Usage

### Single Video

```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "https://youtube.com/watch?v=VIDEO_ID"
```

Output: `output/Video_Title_2025-11-10.txt`
### Multiple Videos in Parallel

```bash
cd $YOUTUBE_TRANSCRIBER_DIR

# Launch all in background simultaneously
uv run python transcribe.py "URL1" &
uv run python transcribe.py "URL2" &
uv run python transcribe.py "URL3" &
wait
```

Why parallel works: Each transcription uses unique temp files (UUID-based) in the `temp/` directory.
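As a sketch of why parallel runs cannot collide (illustrative Python only — the helper name is hypothetical, not the actual `transcribe.py` internals), each download gets a fresh UUID in its filename:

```python
import uuid
from pathlib import Path


def unique_temp_path(temp_dir: str = "temp") -> Path:
    """Build a collision-free temp filename like temp/download_{UUID}.mp3
    (sketch of the UUID-based naming described above)."""
    Path(temp_dir).mkdir(exist_ok=True)
    return Path(temp_dir) / f"download_{uuid.uuid4().hex}.mp3"


# Two parallel runs always get distinct paths, so they never
# overwrite each other's audio.
a, b = unique_temp_path(), unique_temp_path()
assert a != b
```

Because every process writes to its own UUID-named file, no locking is needed between background jobs.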
### Higher Quality Mode

```bash
cd $YOUTUBE_TRANSCRIBER_DIR
uv run python transcribe.py "URL" -m gpt-4o-transcribe
```

When to use the full model: noisy audio, critical accuracy requirements. Costs 2x more ($0.006/min vs $0.003/min).
## Command Options

```
uv run python transcribe.py [URL] [OPTIONS]

Options:
  -o, --output PATH         Custom output filename (default: auto-generated in output/)
  -m, --model MODEL         Transcription model (default: gpt-4o-mini-transcribe)
                            Options: gpt-4o-mini-transcribe, gpt-4o-transcribe, whisper-1
  -p, --prompt TEXT         Context prompt for better accuracy
  --chunk-duration MINUTES  Chunk size for long videos (default: 10 minutes)
  --keep-audio              Keep temp audio files (default: auto-delete)
```
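For reference, the interface above can be expressed as an `argparse` sketch. This is a hypothetical mirror of the documented options — the real `transcribe.py` may define its parser differently:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argparse mirror of the documented CLI (illustrative sketch)."""
    p = argparse.ArgumentParser(prog="transcribe.py")
    p.add_argument("url", help="YouTube video URL")
    p.add_argument("-o", "--output", help="custom output filename")
    p.add_argument(
        "-m", "--model",
        default="gpt-4o-mini-transcribe",
        choices=["gpt-4o-mini-transcribe", "gpt-4o-transcribe", "whisper-1"],
    )
    p.add_argument("-p", "--prompt", help="context prompt for better accuracy")
    p.add_argument("--chunk-duration", type=int, default=10,
                   help="chunk size in minutes for long videos")
    p.add_argument("--keep-audio", action="store_true",
                   help="keep temp audio files instead of auto-deleting")
    return p


args = build_parser().parse_args(["URL", "-m", "whisper-1"])
```

Note that defaults match the documentation: `gpt-4o-mini-transcribe` and 10-minute chunks.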
## Workflow for User Requests

### Single Video Request

- Change to transcriber directory
- Run script with URL
- Report output file location in `output/` directory
### Multiple Video Request

- Change to transcriber directory
- Launch all transcriptions in parallel using background processes
- Wait for all to complete
- Report all output files in `output/` directory
### Testing/Cost-Conscious Request

- Default model (`gpt-4o-mini-transcribe`) is already the cheapest GPT-4o option
- For even cheaper: suggest Groq's Whisper API as an alternative
- Quality is excellent for most YouTube content
## Technical Details

How it works:

- Downloads audio from YouTube (via yt-dlp)
- Saves to unique temp file: `temp/download_{UUID}.mp3`
- Splits long videos (>10 min) into chunks automatically
- Transcribes with OpenAI API (GPT-4o-mini-transcribe)
- Saves transcript: `output/Video_Title_YYYY-MM-DD.txt`
- Cleans up temp files automatically
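The naming step can be sketched as follows. This is a hypothetical helper — the script's exact title-sanitization rules may differ:

```python
import re
from datetime import date
from pathlib import Path


def output_path(title: str, out_dir: str = "output") -> Path:
    """Build output/Video_Title_YYYY-MM-DD.txt from a video title
    (sketch of the naming scheme described above)."""
    # Collapse any run of non-word characters (spaces, colons, slashes)
    # into a single underscore, then trim stray underscores.
    safe = re.sub(r"[^\w]+", "_", title).strip("_")
    return Path(out_dir) / f"{safe}_{date.today().isoformat()}.txt"


print(output_path("My Video: Part 1"))
```

Sanitizing the title keeps filenames shell-safe while preserving readability.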
Parallel safety:
- Each process uses UUID-based temp files
- No file conflicts between parallel processes
- Temp files auto-cleaned after completion
Auto-chunking:
- Videos >10 minutes: Split into 10-minute chunks
- Context preserved between chunks
- Prevents API response truncation
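The chunk math behind auto-chunking can be sketched as below (illustrative only, not the script's actual code):

```python
def chunk_bounds(duration_s: float, chunk_min: int = 10) -> list[tuple[float, float]]:
    """Split a duration (seconds) into consecutive (start, end) windows
    of chunk_min minutes each — a sketch of the auto-chunking step."""
    step = chunk_min * 60
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + step, duration_s)))
        start += step
    return bounds


# A 25-minute video yields three chunks: 10 + 10 + 5 minutes.
print(chunk_bounds(25 * 60.0))
```

Each chunk is transcribed separately and the results are concatenated, which is why very long videos never hit API response limits.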
## Requirements

- OpenAI API key: `$OPENAI_API_KEY` environment variable
- Python 3.10+ with uv package manager
- FFmpeg (for audio processing)
- yt-dlp (for YouTube downloads)
Check requirements:

```bash
echo $OPENAI_API_KEY  # Should show API key
which ffmpeg          # Should show path
```
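The same checks can be run from Python with `shutil.which` (a portable sketch; the tool names are taken from the requirements list above):

```python
import os
import shutil


def check_requirements() -> list[str]:
    """Return a list of missing prerequisites (empty list = all present)."""
    missing = []
    if not os.environ.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY environment variable")
    for tool in ("ffmpeg", "yt-dlp", "uv"):
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


print(check_requirements())
```

An empty list means everything is in place; otherwise report the missing items to the user before attempting a transcription.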
## Cost Estimates

Default model (`gpt-4o-mini-transcribe` at $0.003/min):
- 5-minute video: ~$0.015
- 25-minute video: ~$0.075
- 60-minute video: ~$0.18
Full model (`gpt-4o-transcribe` at $0.006/min):
- 5-minute video: ~$0.03
- 25-minute video: ~$0.15
- 60-minute video: ~$0.36
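These figures follow directly from the per-minute rates. A minimal estimator, assuming the rates listed above:

```python
# USD per audio minute, per the estimates above.
RATES = {
    "gpt-4o-mini-transcribe": 0.003,
    "gpt-4o-transcribe": 0.006,
}


def estimate_cost(minutes: float, model: str = "gpt-4o-mini-transcribe") -> float:
    """Estimated transcription cost in USD for a video of the given length."""
    return round(minutes * RATES[model], 4)


# Matches the tables above:
assert estimate_cost(60) == 0.18
assert estimate_cost(25, "gpt-4o-transcribe") == 0.15
```

Useful for quoting a cost to the user before launching a batch of long videos.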
## Quick Reference

```bash
# Single video (default quality)
uv run python transcribe.py "URL"

# Single video (higher quality, 2x cost)
uv run python transcribe.py "URL" -m gpt-4o-transcribe

# Multiple videos in parallel
for url in URL1 URL2 URL3; do
  uv run python transcribe.py "$url" &
done
wait

# With custom output
uv run python transcribe.py "URL" -o custom_name.txt

# With context prompt
uv run python transcribe.py "URL" -p "Context about video content"
```
## Integration

With fabric: Process transcripts after generation

```bash
cat output/Video_Title_2025-11-10.txt | fabric -p extract_wisdom
```
## Notes

- Script requires being in its directory to work correctly
- Always change to `$YOUTUBE_TRANSCRIBER_DIR` first
- Parallel execution is safe and recommended for multiple videos
- Default model (`gpt-4o-mini-transcribe`) is recommended for most content
- Output files automatically named with video title + date
- Temp files automatically cleaned after transcription