claude-home/media-tools/CONTEXT.md
Cal Corum ceb4dd36a0 Add docker scripts, media-tools, VM management, and n8n workflow docs
Add CONTEXT.md for docker and VM management script directories.
Add media-tools documentation with Playwright scraping patterns.
Add Tdarr GPU monitor n8n workflow definition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 22:26:10 -06:00

Media Tools

Tools for downloading and managing media from streaming sites.

Overview

This directory contains utilities for:

  • Extracting video URLs from streaming sites using browser automation
  • Downloading videos via yt-dlp
  • Managing download state for resumable operations

Tools

pokeflix_scraper.py

Downloads Pokemon episodes from pokeflix.tv using Playwright for browser automation.

Location: scripts/pokeflix_scraper.py

Features:

  • Extracts episode lists from season pages
  • Handles iframe-embedded video players (Streamtape, Vidoza, etc.)
  • Resumable downloads with state persistence
  • Configurable episode ranges
  • Dry-run mode for testing

Architecture Pattern

These tools follow a common pattern:

┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Playwright     │────▶│  Extract embed   │────▶│  yt-dlp     │
│  (navigate)     │     │  video URLs      │     │  (download) │
└─────────────────┘     └──────────────────┘     └─────────────┘

Why this approach:

  1. Playwright drives a real browser, so it works on JavaScript-heavy sites that block or return incomplete pages to simple HTTP requests
  2. Iframe extraction finds the actual video URL when the site embeds a player from a third-party video host
  3. yt-dlp is the de facto standard for video downloading, with broad host support
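
The three stages above can be sketched roughly as follows. This is a minimal illustration, not the code in pokeflix_scraper.py: the function names, the KNOWN_HOSTS allow-list, and the hostname-matching rule are all assumptions made for the example. Playwright and yt-dlp are imported inside the functions that need them, so the URL filter can be tried without either package installed.

```python
from urllib.parse import urlparse

# Hypothetical allow-list built from the hosts named in this document.
KNOWN_HOSTS = ("streamtape", "vidoza")


def pick_embed_urls(iframe_srcs, known_hosts=KNOWN_HOSTS):
    """Stage 2: keep only iframe sources whose hostname mentions a known video host."""
    picked = []
    for src in iframe_srcs:
        host = urlparse(src).hostname or ""
        if any(name in host for name in known_hosts):
            picked.append(src)
    return picked


def collect_iframe_srcs(page_url):
    """Stage 1: load the page in Chromium and return the src of every iframe."""
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(page_url, wait_until="networkidle")
        # Read the resolved src attribute of each iframe element on the page.
        srcs = page.eval_on_selector_all("iframe", "els => els.map(e => e.src)")
        browser.close()
    return srcs


def download(embed_url, output_template):
    """Stage 3: hand the embed URL to yt-dlp, which selects a matching extractor."""
    from yt_dlp import YoutubeDL  # lazy import

    with YoutubeDL({"outtmpl": output_template}) as ydl:
        ydl.download([embed_url])
```

A caller would chain these as `download(pick_embed_urls(collect_iframe_srcs(url))[0], "out/%(title)s.%(ext)s")`, after checking that the filtered list is not empty.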

Dependencies

# Python packages
pip install playwright yt-dlp

# Playwright browser installation
playwright install chromium

Common Patterns

Anti-Bot Handling

  • Use headed browser mode (visible window) initially
  • Random delays between requests (2-5 seconds)
  • Realistic viewport and user-agent settings
  • Wait for networkidle state after navigation
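
As a rough sketch of how these four points might fit together with Playwright's sync API: the viewport size, the user-agent string, and the function names below are illustrative assumptions, not values taken from the scraper itself.

```python
import random
import time

# Assumed values for illustration; tune them per site.
VIEWPORT = {"width": 1280, "height": 720}
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def polite_delay(min_s=2.0, max_s=5.0, rng=random):
    """Return a random pause length, by default between 2 and 5 seconds."""
    return rng.uniform(min_s, max_s)


def visit_all(urls):
    """Visit each URL in a headed browser, pausing randomly between requests."""
    from playwright.sync_api import sync_playwright  # lazy import

    titles = []
    with sync_playwright() as p:
        # headless=False opens a visible window (headed mode).
        browser = p.chromium.launch(headless=False)
        context = browser.new_context(viewport=VIEWPORT, user_agent=USER_AGENT)
        page = context.new_page()
        for url in urls:
            # Wait until the network has gone quiet before reading the page.
            page.goto(url, wait_until="networkidle")
            titles.append(page.title())
            time.sleep(polite_delay())
        browser.close()
    return titles
```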

State Management

  • JSON state files track downloaded episodes
  • Pass the --resume flag to skip episodes already recorded as downloaded
  • State includes error information for debugging
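
One way such a state file could work is sketched below. The JSON layout (a "completed" list plus an "errors" mapping) and the function names are assumptions for illustration; the actual format of download_state.json may differ.

```python
import json
from pathlib import Path


def load_state(path):
    """Read the state file, or return an empty state if it does not exist yet."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"completed": [], "errors": {}}


def save_state(path, state):
    """Write to a temporary file first, then rename it over the real one."""
    path = Path(path)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def run(episodes, state_path, download, resume=True):
    """Download each episode, skipping completed ones when resume is set."""
    state = load_state(state_path) if resume else {"completed": [], "errors": {}}
    for ep in episodes:
        if ep in state["completed"]:
            continue  # already downloaded in an earlier run
        try:
            download(ep)
        except Exception as exc:
            # Keep the error message so a failed episode can be debugged later.
            state["errors"][ep] = str(exc)
        else:
            state["completed"].append(ep)
            state["errors"].pop(ep, None)
        # Persist after every episode so an interruption loses at most one.
        save_state(state_path, state)
    return state
```

Saving after every episode rather than once at the end is what makes an interrupted run resumable.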

Output Organization

{output_dir}/
├── {Season Name}/
│   ├── E01 - Episode Title.mp4
│   ├── E02 - Episode Title.mp4
│   └── download_state.json
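
A small helper can produce paths in this layout. The sketch below is hypothetical: the function names and the character-replacement rule are assumptions, added because episode titles often contain characters that some filesystems reject.

```python
import re
from pathlib import Path


def safe_name(text):
    """Replace characters that are invalid in filenames on common filesystems."""
    return re.sub(r'[<>:"/\\|?*]', "_", text).strip()


def episode_path(output_dir, season, number, title, ext="mp4"):
    """Build {output_dir}/{Season Name}/E01 - Episode Title.mp4."""
    filename = f"E{number:02d} - {safe_name(title)}.{ext}"
    return Path(output_dir) / safe_name(season) / filename
```

The path object can be converted with `str()` before being passed on as a yt-dlp output template.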

When to Use These Tools

  • Downloading entire seasons of shows for offline viewing
  • Archiving content before it becomes unavailable
  • Building a local media library

These tools are for personal archival use. Respect copyright laws in your jurisdiction.