Media Tools Scripts
Operational scripts for media downloading and management.
Scripts
pokeflix_scraper.py
Downloads Pokemon episodes from pokeflix.tv.
Dependencies:
pip install playwright yt-dlp
playwright install chromium
Quick Start:
# Download entire season
python pokeflix_scraper.py \
--url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
--output ~/Pokemon/
# Download episodes 1-10 only
python pokeflix_scraper.py \
--url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
--output ~/Pokemon/ \
--start 1 --end 10
# Resume interrupted download
python pokeflix_scraper.py \
--url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
--output ~/Pokemon/ \
--resume
# Dry run (extract URLs, don't download)
python pokeflix_scraper.py \
--url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
--dry-run --verbose
CLI Options:
| Option | Description |
|---|---|
| --url, -u | Season page URL (required) |
| --output, -o | Output directory (default: ~/Downloads/Pokemon) |
| --start, -s | First episode number to download |
| --end, -e | Last episode number to download |
| --resume, -r | Resume from previous state |
| --dry-run, -n | Extract URLs only, no download |
| --headless | Run browser without visible window |
| --verbose, -v | Enable debug logging |
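For reference, the options above could be declared with argparse roughly as follows. This is a sketch reconstructed from the table only; the function name build_parser and the exact types and defaults are assumptions, and the real script may define its parser differently.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the CLI options table (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="pokeflix_scraper.py")
    parser.add_argument("--url", "-u", required=True, help="Season page URL")
    parser.add_argument(
        "--output", "-o", type=Path,
        default=Path.home() / "Downloads" / "Pokemon",
        help="Output directory",
    )
    # Assumption: leaving --start/--end unset means "no lower/upper bound".
    parser.add_argument("--start", "-s", type=int, default=None,
                        help="First episode number to download")
    parser.add_argument("--end", "-e", type=int, default=None,
                        help="Last episode number to download")
    parser.add_argument("--resume", "-r", action="store_true",
                        help="Resume from previous state")
    parser.add_argument("--dry-run", "-n", action="store_true",
                        help="Extract URLs only, no download")
    parser.add_argument("--headless", action="store_true",
                        help="Run browser without visible window")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable debug logging")
    return parser
```

Calling build_parser().parse_args(["--url", "https://example.invalid/season", "-n"]) yields a namespace with dry_run set to True and every other flag left at its default.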
Output Structure:
~/Pokemon/
├── Pokemon Indigo League/
│ ├── E01 - Pokemon I Choose You.mp4
│ ├── E02 - Pokemon Emergency.mp4
│ ├── E03 - Ash Catches a Pokemon.mp4
│ └── download_state.json
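The layout above suggests a simple naming rule: a per-season directory, a zero-padded episode number, then the title. A minimal sketch of that rule follows; the helper name episode_path is hypothetical, and the two-digit padding and .mp4 extension are inferred from the example tree rather than taken from the script.

```python
from pathlib import Path


def episode_path(output_dir: Path, season_name: str, number: int, title: str) -> Path:
    """Return the target file path for one episode, following the tree shown above."""
    # Assumption: titles are already stripped of characters that are unsafe in filenames.
    return output_dir / season_name / f"E{number:02d} - {title}.mp4"
```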
State File:
The download_state.json file in each season directory tracks per-episode progress:
{
"season_url": "https://...",
"season_name": "Pokemon Indigo League",
"episodes": {
"1": {
"number": 1,
"title": "Pokemon I Choose You",
"page_url": "https://...",
"video_url": "https://...",
"downloaded": true,
"error": null
}
},
"last_updated": "2025-01-22T..."
}
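Because the state file is plain JSON, it can be inspected outside the scraper, for example to see which episodes are still pending. The sketch below assumes only the fields shown above; load_state and pending_episodes are hypothetical helper names, not functions from pokeflix_scraper.py.

```python
import json
from pathlib import Path


def load_state(state_path: Path) -> dict:
    """Read download_state.json into a dictionary."""
    return json.loads(state_path.read_text(encoding="utf-8"))


def pending_episodes(state: dict) -> list[int]:
    """Return the numbers of episodes not yet marked as downloaded, ascending."""
    episodes = state.get("episodes", {})
    # Keys are strings in the JSON file, so convert them before sorting numerically.
    return sorted(
        int(key) for key, ep in episodes.items() if not ep.get("downloaded", False)
    )
```

For example, pending_episodes(load_state(Path("download_state.json"))) lists what a later --resume run would still have to fetch, assuming the scraper skips entries whose downloaded flag is true.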
Adding New Scrapers
To add a scraper for a new site:
- Copy the pattern from pokeflix_scraper.py
- Modify the selectors for episode list extraction
- Modify the iframe/video URL selectors for the new site's player
- Test with --dry-run first
Key methods to customize:
- get_season_info() - Extract episode list from season page
- extract_video_url() - Get video URL from episode page
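One way to picture the split between these two methods is a small base class that a new site's scraper subclasses. The sketch below is illustrative only: the class names, method signatures, and the dry_run helper are assumptions, and the stub returns canned data where a real scraper would drive a Playwright browser.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    # Fields mirror the per-episode entries in download_state.json.
    number: int
    title: str
    page_url: str
    video_url: Optional[str] = None
    downloaded: bool = False
    error: Optional[str] = None


class SiteScraper(ABC):
    """Hypothetical base class; the real script may be organised differently."""

    @abstractmethod
    def get_season_info(self, season_url: str) -> list[Episode]:
        """Extract the episode list from the season page."""

    @abstractmethod
    def extract_video_url(self, episode: Episode) -> Optional[str]:
        """Get the video URL from the episode page."""

    def dry_run(self, season_url: str, start: Optional[int] = None,
                end: Optional[int] = None) -> list[Episode]:
        """Resolve video URLs for the selected episodes without downloading."""
        selected = [
            ep for ep in self.get_season_info(season_url)
            if (start is None or ep.number >= start)
            and (end is None or ep.number <= end)
        ]
        for ep in selected:
            try:
                ep.video_url = self.extract_video_url(ep)
            except Exception as exc:  # record the failure and keep going
                ep.error = str(exc)
        return selected


class ExampleSiteScraper(SiteScraper):
    """Stub with canned data, standing in for site-specific selectors."""

    def get_season_info(self, season_url: str) -> list[Episode]:
        return [
            Episode(1, "Pokemon I Choose You", f"{season_url}/1"),
            Episode(2, "Pokemon Emergency", f"{season_url}/2"),
        ]

    def extract_video_url(self, episode: Episode) -> Optional[str]:
        return f"{episode.page_url}/video.mp4"
```

Keeping the site-specific logic inside these two methods means range filtering and error handling stay the same when a new site is added.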
Performance Notes
- Non-headless mode (the default) is recommended because it is less likely to trigger anti-bot detection
- Random delays (2-5s) between requests reduce the risk of rate limiting
- Large seasons (80+ episodes) may take hours - use --resume if interrupted
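The random delay between requests can be sketched as a small generator that pauses between consecutive items. The name with_random_delay and its parameters are hypothetical; only the 2-5 second range comes from the notes above.

```python
import random
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def with_random_delay(
    items: Iterable[T],
    min_s: float = 2.0,
    max_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items, sleeping a random interval between consecutive items."""
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first item; pause before every later one.
            sleep(random.uniform(min_s, max_s))
        yield item
```

Wrapping the episode loop as for episode in with_random_delay(episodes): spaces out requests without touching the download logic. The sleep parameter is injectable so the behaviour can be tested without actually waiting.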