Cal Corum ceb4dd36a0 Add docker scripts, media-tools, VM management, and n8n workflow docs

Add CONTEXT.md for docker and VM management script directories.
Add media-tools documentation with Playwright scraping patterns.
Add Tdarr GPU monitor n8n workflow definition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-07 22:26:10 -06:00

2.6 KiB


Media Tools Scripts

Operational scripts for media downloading and management.

Scripts

pokeflix_scraper.py

Downloads Pokemon episodes from pokeflix.tv.

Dependencies:

pip install playwright yt-dlp
playwright install chromium

Quick Start:

# Download entire season
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/

# Download episodes 1-10 only
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/ \
    --start 1 --end 10

# Resume interrupted download
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/ \
    --resume

# Dry run (extract URLs, don't download)
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --dry-run --verbose

CLI Options:

Option	Description
`--url, -u`	Season page URL (required)
`--output, -o`	Output directory (default: ~/Downloads/Pokemon)
`--start, -s`	First episode number to download
`--end, -e`	Last episode number to download
`--resume, -r`	Resume from previous state
`--dry-run, -n`	Extract URLs only, no download
`--headless`	Run browser without visible window
`--verbose, -v`	Enable debug logging
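
As a rough illustration, the options table could map onto an argparse definition like the one below. This is a sketch, not the script's actual parser: the `build_parser` name, the `int` and `Path` types, and the example argument list are assumptions. Only the flag names, short forms, and the default output directory come from the table.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options table (types are assumed)."""
    parser = argparse.ArgumentParser(prog="pokeflix_scraper.py")
    parser.add_argument("--url", "-u", required=True, help="Season page URL")
    parser.add_argument(
        "--output", "-o",
        type=Path,
        default=Path("~/Downloads/Pokemon"),  # call .expanduser() before use
        help="Output directory",
    )
    parser.add_argument("--start", "-s", type=int, help="First episode number to download")
    parser.add_argument("--end", "-e", type=int, help="Last episode number to download")
    parser.add_argument("--resume", "-r", action="store_true", help="Resume from previous state")
    parser.add_argument("--dry-run", "-n", action="store_true", help="Extract URLs only, no download")
    parser.add_argument("--headless", action="store_true", help="Run browser without visible window")
    parser.add_argument("--verbose", "-v", action="store_true", help="Enable debug logging")
    return parser


# Parse an explicit example argument list instead of sys.argv.
example_args = build_parser().parse_args(
    ["--url", "https://www.pokeflix.tv/browse/pokemon-indigo-league", "--start", "1", "--end", "10"]
)
print(example_args)
```

Note that argparse exposes `--dry-run` as the attribute `dry_run`.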

Output Structure:

~/Pokemon/
├── Pokemon Indigo League/
│   ├── E01 - Pokemon I Choose You.mp4
│   ├── E02 - Pokemon Emergency.mp4
│   ├── E03 - Ash Catches a Pokemon.mp4
│   └── download_state.json
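
The naming scheme in the tree above can be expressed as a small helper. This is a sketch under assumptions: the `episode_path` function is hypothetical, and the two-digit zero padding is inferred from the `E01` style shown rather than confirmed from the script.

```python
from pathlib import Path


def episode_path(output_dir: Path, season_name: str, number: int, title: str) -> Path:
    """Return the target file path for one episode, following the layout above."""
    # Assumed: episode numbers are zero-padded to at least two digits.
    return output_dir / season_name / f"E{number:02d} - {title}.mp4"


target = episode_path(Path("~/Pokemon").expanduser(), "Pokemon Indigo League", 1, "Pokemon I Choose You")
print(target.name)
```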

State File:

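The download_state.json file, stored in the season directory, tracks per-episode progress so that --resume can continue from a previous run: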

{
  "season_url": "https://...",
  "season_name": "Pokemon Indigo League",
  "episodes": {
    "1": {
      "number": 1,
      "title": "Pokemon I Choose You",
      "page_url": "https://...",
      "video_url": "https://...",
      "downloaded": true,
      "error": null
    }
  },
  "last_updated": "2025-01-22T..."
}
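
Because the state file is plain JSON, it can be inspected outside the scraper. The sketch below assumes only the fields shown above. The helper names `pending_episodes` and `failed_episodes` are hypothetical and not part of pokeflix_scraper.py.

```python
import json
from pathlib import Path


def pending_episodes(state_path: Path) -> list[int]:
    """Return the numbers of episodes not yet marked as downloaded, ascending."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    return sorted(
        ep["number"]
        for ep in state["episodes"].values()
        if not ep.get("downloaded")
    )


def failed_episodes(state_path: Path) -> dict[int, str]:
    """Map episode number to its recorded error, for episodes where error is not null."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    return {
        ep["number"]: ep["error"]
        for ep in state["episodes"].values()
        if ep.get("error")
    }
```

For example, `pending_episodes(Path("~/Pokemon/Pokemon Indigo League/download_state.json").expanduser())` would list what a resumed run still has to fetch.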

Adding New Scrapers

To add a scraper for a new site:

1. Copy the pattern from pokeflix_scraper.py
2. Modify the selectors for episode list extraction
3. Modify the iframe/video URL selectors for the new site's player
4. Test with --dry-run first

Key methods to customize:

get_season_info() - Extract episode list from season page
extract_video_url() - Get video URL from episode page
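
A skeleton for a new scraper might look like the sketch below. Only the two method names come from the list above. The class name, the `Episode` dataclass, the method signatures, and both CSS selectors are hypothetical placeholders, and the real pokeflix_scraper.py may be structured differently. The `page` argument is expected to be a Playwright sync-API `Page`.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin

# Hypothetical selectors: replace with ones matching the new site's markup.
EPISODE_LINK_SELECTOR = "a.episode-link"
PLAYER_IFRAME_SELECTOR = "iframe"


@dataclass
class Episode:
    number: int
    title: str
    page_url: str
    video_url: Optional[str] = None


class NewSiteScraper:
    """Skeleton scraper; `page` is expected to be a Playwright sync-API Page."""

    def __init__(self, page):
        self.page = page

    def get_season_info(self, season_url: str) -> list[Episode]:
        """Extract the episode list from the season page."""
        self.page.goto(season_url)
        episodes: list[Episode] = []
        for link in self.page.query_selector_all(EPISODE_LINK_SELECTOR):
            href = link.get_attribute("href")
            if not href:
                continue  # skip anchors that carry no link target
            episodes.append(
                Episode(
                    number=len(episodes) + 1,  # assumed: page order equals episode order
                    title=link.inner_text().strip(),
                    page_url=urljoin(season_url, href),
                )
            )
        return episodes

    def extract_video_url(self, episode: Episode) -> Optional[str]:
        """Get the video URL from the episode page via the player iframe's src."""
        self.page.goto(episode.page_url)
        frame = self.page.query_selector(PLAYER_IFRAME_SELECTOR)
        src = frame.get_attribute("src") if frame else None
        episode.video_url = urljoin(episode.page_url, src) if src else None
        return episode.video_url


def demo(season_url: str) -> None:
    """Dry-run style usage: print extracted URLs without downloading anything."""
    from playwright.sync_api import sync_playwright  # imported lazily

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        scraper = NewSiteScraper(browser.new_page())
        for episode in scraper.get_season_info(season_url):
            print(episode.number, episode.title, scraper.extract_video_url(episode))
        browser.close()
```

Taking the page as a constructor argument keeps the selector logic testable with a stand-in object, so the extraction code can be checked before launching a real browser.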

Performance Notes

Non-headless mode (the default) is recommended to avoid anti-bot detection
Random delays of 2-5 s between requests reduce the risk of rate limiting
Large seasons (80+ episodes) may take hours; rerun with --resume if a download is interrupted
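
The random delay between requests can be sketched as follows. The `polite_delay` helper is hypothetical and only the 2-5 second range comes from the notes above. The scraper's actual implementation may differ.

```python
import random
import time


def polite_delay(min_seconds: float = 2.0, max_seconds: float = 5.0) -> float:
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


# Typical use: call polite_delay() once between consecutive page loads.
```

A uniform random wait avoids the fixed request rhythm that a constant sleep would produce.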
