---
title: "Media Tools Scripts Reference"
description: "Usage reference for media download scripts including pokeflix_scraper.py CLI options, output structure, state file format, and guide for adding new scrapers."
type: reference
domain: media-tools
tags: [pokeflix, scraper, yt-dlp, playwright, cli, scripts]
---

# Media Tools Scripts

Operational scripts for media downloading and management.

## Scripts

### pokeflix_scraper.py

Downloads Pokemon episodes from pokeflix.tv.

**Dependencies:**

```bash
pip install playwright yt-dlp
playwright install chromium
```

**Quick Start:**

```bash
# Download entire season
python pokeflix_scraper.py \
  --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
  --output ~/Pokemon/

# Download episodes 1-10 only
python pokeflix_scraper.py \
  --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
  --output ~/Pokemon/ \
  --start 1 --end 10

# Resume interrupted download
python pokeflix_scraper.py \
  --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
  --output ~/Pokemon/ \
  --resume

# Dry run (extract URLs, don't download)
python pokeflix_scraper.py \
  --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
  --dry-run --verbose
```

**CLI Options:**

| Option | Description |
|--------|-------------|
| `--url, -u` | Season page URL (required) |
| `--output, -o` | Output directory (default: ~/Downloads/Pokemon) |
| `--start, -s` | First episode number to download |
| `--end, -e` | Last episode number to download |
| `--resume, -r` | Resume from previous state |
| `--dry-run, -n` | Extract URLs only, no download |
| `--headless` | Run browser without visible window |
| `--verbose, -v` | Enable debug logging |

**Output Structure:**

```
~/Pokemon/
├── Pokemon Indigo League/
│   ├── E01 - Pokemon I Choose You.mp4
│   ├── E02 - Pokemon Emergency.mp4
│   ├── E03 - Ash Catches a Pokemon.mp4
│   └── download_state.json
```

**State File:**

The `download_state.json` tracks progress:

```json
{
  "season_url": "https://...",
  "season_name": "Pokemon Indigo League",
  "episodes": {
    "1": {
      "number": 1,
      "title": "Pokemon I Choose You",
      "page_url": "https://...",
      "video_url": "https://...",
      "downloaded": true,
      "error": null
    }
  },
  "last_updated": "2025-01-22T..."
}
```

## Adding New Scrapers

To add a scraper for a new site:

1. Copy the pattern from `pokeflix_scraper.py`
2. Modify the selectors for episode list extraction
3. Modify the iframe/video URL selectors for the new site's player
4. Test with `--dry-run` first

Key methods to customize:

- `get_season_info()` - Extract episode list from season page
- `extract_video_url()` - Get video URL from episode page

## Performance Notes

- **Non-headless mode** is recommended (default) to avoid anti-bot detection
- Random delays (2-5s) between requests prevent rate limiting
- Large seasons (80+ episodes) may take hours - use `--resume` if interrupted