Media Tools
Tools for downloading and managing media from streaming sites.
Overview
This directory contains utilities for:
- Extracting video URLs from streaming sites using browser automation
- Downloading videos via yt-dlp
- Managing download state for resumable operations
Tools
pokeflix_scraper.py
Downloads Pokemon episodes from pokeflix.tv using Playwright for browser automation.
Location: scripts/pokeflix_scraper.py
Features:
- Extracts episode lists from season pages
- Handles iframe-embedded video players (Streamtape, Vidoza, etc.)
- Resumable downloads with state persistence
- Configurable episode ranges
- Dry-run mode for testing
Architecture Pattern
These tools follow a common pattern:
┌─────────────────┐ ┌──────────────────┐ ┌─────────────┐
│ Playwright │────▶│ Extract embed │────▶│ yt-dlp │
│ (navigate) │ │ video URLs │ │ (download) │
└─────────────────┘ └──────────────────┘ └─────────────┘
Why this approach:
- Playwright handles JavaScript-heavy sites that block simple HTTP requests
- Iframe extraction works around sites that use third-party video hosts
- yt-dlp is the de facto standard for video downloading, with broad host support
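The three stages above can be sketched roughly as follows. This is a minimal illustration, not the code in pokeflix_scraper.py: the function names, the `KNOWN_HOSTS` tuple, and the choice to shell out to the `yt-dlp` command are all assumptions made for the example.

```python
from __future__ import annotations

import subprocess
from urllib.parse import urlparse

# Hypothetical host keywords for illustration; adjust to the site being scraped.
KNOWN_HOSTS = ("streamtape", "vidoza")


def collect_iframe_srcs(page_url: str) -> list[str]:
    """Stage 1: open the page in Chromium and return every iframe's src."""
    # Imported lazily so the pure helpers below work without Playwright installed.
    from playwright.sync_api import sync_playwright

    srcs = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(page_url, wait_until="networkidle")
        for frame_el in page.query_selector_all("iframe"):
            src = frame_el.get_attribute("src")
            if src:
                srcs.append(src)
        browser.close()
    return srcs


def pick_embed_url(srcs: list[str], known_hosts=KNOWN_HOSTS) -> str | None:
    """Stage 2: return the first iframe URL whose hostname names a known host."""
    for src in srcs:
        host = urlparse(src).hostname or ""
        if any(name in host for name in known_hosts):
            return src
    return None


def build_ytdlp_command(embed_url: str, output_path: str) -> list[str]:
    """Stage 3: build the yt-dlp invocation (-o sets the output template)."""
    return ["yt-dlp", "-o", output_path, embed_url]


def download(page_url: str, output_path: str) -> bool:
    """Run all three stages; return False if no known embed was found."""
    embed_url = pick_embed_url(collect_iframe_srcs(page_url))
    if embed_url is None:
        return False
    subprocess.run(build_ytdlp_command(embed_url, output_path), check=True)
    return True
```

Keeping the URL selection and command construction as pure functions means they can be checked without launching a browser or touching the network.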
Dependencies
# Python packages
pip install playwright yt-dlp
# Playwright browser installation
playwright install chromium
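A quick preflight check can confirm that the packages above are importable and that the `yt-dlp` command is on the PATH. The helper name `missing_dependencies` is hypothetical, and the sketch does not verify that the Chromium browser itself has been installed.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("playwright", "yt_dlp"), commands=("yt-dlp",)):
    """Return labels for any Python modules or CLI commands that cannot be found."""
    missing = [f"module:{m}" for m in modules if importlib.util.find_spec(m) is None]
    missing += [f"command:{c}" for c in commands if shutil.which(c) is None]
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All dependencies found")
```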
Common Patterns
Anti-Bot Handling
- Use headed browser mode (visible window) initially
- Random delays between requests (2-5 seconds)
- Realistic viewport and user-agent settings
- Wait for the networkidle state after navigation
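As a rough sketch, the points above might combine like this with Playwright's sync API. The viewport size, user-agent string, and function names are illustrative assumptions, not values taken from the scraper.

```python
import random
import time

# Hypothetical settings for illustration only.
VIEWPORT = {"width": 1280, "height": 720}
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def polite_delay(min_s=2.0, max_s=5.0, rng=random):
    """Return a random delay in seconds between min_s and max_s."""
    return rng.uniform(min_s, max_s)


def fetch_titles(urls):
    """Visit each URL in a headed browser, pausing randomly between requests."""
    # Imported lazily so polite_delay can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    titles = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # headed mode: visible window
        context = browser.new_context(viewport=VIEWPORT, user_agent=USER_AGENT)
        page = context.new_page()
        for url in urls:
            # Wait until network activity settles before reading the page.
            page.goto(url, wait_until="networkidle")
            titles.append(page.title())
            time.sleep(polite_delay())
        browser.close()
    return titles
```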
State Management
- JSON state files track downloaded episodes
- Enable the --resume flag to skip completed downloads
- State includes error information for debugging
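One possible shape for such a state file is sketched below. The `completed` and `errors` keys and all function names are assumptions made for the example; the actual schema of download_state.json is not documented here.

```python
import json
from pathlib import Path


def load_state(path):
    """Load the state file, or return an empty state if it does not exist."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"completed": [], "errors": {}}


def save_state(path, state):
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def record_result(state, episode, error=None):
    """Mark an episode as completed, or store its error message for debugging."""
    if error is None:
        if episode not in state["completed"]:
            state["completed"].append(episode)
        state["errors"].pop(episode, None)
    else:
        state["errors"][episode] = error


def pending_episodes(episodes, state, resume):
    """With resume enabled, skip episodes already recorded as completed."""
    if not resume:
        return list(episodes)
    done = set(state["completed"])
    return [ep for ep in episodes if ep not in done]
```

Saving after every episode, rather than once at the end, is what makes an interrupted run resumable.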
Output Organization
{output_dir}/
├── {Season Name}/
│ ├── E01 - Episode Title.mp4
│ ├── E02 - Episode Title.mp4
│ └── download_state.json
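A path matching this layout could be built as in the sketch below. The helper names and the set of characters replaced are assumptions for illustration; the scraper may sanitize titles differently.

```python
import re
from pathlib import Path


def safe_name(text):
    """Replace characters that are invalid on common filesystems."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", text)
    return cleaned.strip().rstrip(".")


def episode_path(output_dir, season, number, title, ext="mp4"):
    """Build {output_dir}/{Season Name}/ENN - Episode Title.ext."""
    filename = f"E{number:02d} - {safe_name(title)}.{ext}"
    return Path(output_dir) / safe_name(season) / filename
```

For example, `episode_path("out", "Season 1", 2, "Episode Title")` yields `out/Season 1/E02 - Episode Title.mp4` on a POSIX system.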
When to Use These Tools
- Downloading entire seasons of shows for offline viewing
- Archiving content before it becomes unavailable
- Building a local media library
Legal Considerations
These tools are for personal archival use. Respect copyright laws in your jurisdiction.