Add docker scripts, media-tools, VM management, and n8n workflow docs
Add CONTEXT.md for docker and VM management script directories. Add media-tools documentation with Playwright scraping patterns. Add Tdarr GPU monitor n8n workflow definition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
parent b186107b97
commit ceb4dd36a0

92  docker/scripts/CONTEXT.md  Normal file
@@ -0,0 +1,92 @@
# Docker Scripts - Operational Context

## Script Overview

This directory is reserved for operational scripts for Docker container management, orchestration, and automation.

**Current Status**: No operational scripts are currently deployed. This structure is maintained for future Docker automation needs.

## Future Script Categories

### Planned Script Types

**Container Lifecycle Management**
- Start/stop scripts for complex multi-container setups
- Health check and restart automation
- Graceful shutdown procedures for dependent containers

**Maintenance Automation**
- Image cleanup and pruning scripts
- Volume backup and restoration
- Container log rotation and archiving
- Network cleanup and validation

**Monitoring and Alerts**
- Container health monitoring
- Resource usage tracking
- Discord/webhook notifications for container events
- Uptime and availability reporting
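As a hedged sketch of the monitoring-plus-webhook idea (no such script exists yet; `notify`, `check_container`, and `WEBHOOK_URL` are illustrative names, not deployed tooling):

```shell
#!/bin/bash
# Hypothetical sketch: alert via Discord webhook when a container is not running.
# WEBHOOK_URL and the container names are placeholders.
set -euo pipefail

WEBHOOK_URL="${WEBHOOK_URL:-}"   # Discord webhook endpoint (unset = log to stdout only)

notify() {
    local message="$1"
    if [[ -n "$WEBHOOK_URL" ]]; then
        curl -fsS -H 'Content-Type: application/json' \
             -d "{\"content\": \"$message\"}" "$WEBHOOK_URL" >/dev/null
    else
        echo "NOTIFY: $message"
    fi
}

check_container() {
    local name="$1"
    local state
    # docker inspect reports "running", "exited", etc.; treat lookup failure as "missing"
    state=$(docker inspect -f '{{.State.Status}}' "$name" 2>/dev/null || echo "missing")
    if [[ "$state" != "running" ]]; then
        notify "Container $name is $state"
    fi
}
```

A cron entry would then call `check_container` once per service of interest.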

**Deployment Automation**
- CI/CD integration scripts
- Rolling update procedures
- Blue-green deployment automation
- Container migration tools

## Integration Points

### External Dependencies
- **Docker/Podman**: Container runtime
- **Docker Compose**: Multi-container orchestration
- **cron**: System scheduler for automation
- **Discord Webhooks**: Notification integration (when implemented)

### File System Dependencies
- **Container Volumes**: Various locations depending on service
- **Configuration Files**: Service-specific docker-compose.yml files
- **Log Files**: Container and automation logs
- **Backup Storage**: For volume snapshots and exports

### Network Dependencies
- **Docker Networks**: Bridge, host, and custom networks
- **External Services**: APIs and webhooks for monitoring
- **Registry Access**: For image pulls and pushes (when needed)

## Development Guidelines

### When Adding New Scripts

**Documentation Requirements**:
1. Add a script description to this CONTEXT.md under the appropriate category
2. Include usage examples and command-line options
3. Document dependencies and prerequisites
4. Specify the cron schedule if automated
5. Add a troubleshooting section for common issues

**Script Standards**:
```bash
#!/bin/bash
# Script name and purpose
# Dependencies: list required commands/services
# Usage: ./script.sh [options]

set -euo pipefail  # Strict error handling
```

**Testing Requirements**:
- Test with both Docker and Podman where applicable
- Verify error handling and logging
- Document failure modes and recovery procedures
- Include dry-run or test mode where appropriate
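One hedged way to implement the dry-run requirement is to route every mutating command through a wrapper (everything here is illustrative; this sketch defaults to dry-run and only applies changes when `--execute` is passed):

```shell
#!/bin/bash
# Hypothetical sketch of a dry-run-by-default maintenance script.
# Pass --execute to actually run the commands; command names are examples.
set -euo pipefail

DRY_RUN=true
[[ "${1:-}" == "--execute" ]] && DRY_RUN=false

run() {
    if [[ "$DRY_RUN" == true ]]; then
        echo "[dry-run] $*"   # show what would run without doing it
    else
        "$@"
    fi
}

run docker image prune -f
run docker volume prune -f
```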

## Related Documentation

- **Technology Overview**: `/docker/CONTEXT.md`
- **Troubleshooting**: `/docker/troubleshooting.md`
- **Examples**: `/docker/examples/` - Reference configurations and patterns
- **Main Instructions**: `/CLAUDE.md` - Context loading rules

## Notes

This directory structure is maintained to support future Docker automation needs while keeping operational scripts organized and documented according to the technology-first documentation pattern established in the claude-home repository.

When scripts are added, this file should be updated with specific operational context, similar to the comprehensive documentation in `/tdarr/scripts/CONTEXT.md`.
82  media-tools/CONTEXT.md  Normal file
@@ -0,0 +1,82 @@
# Media Tools

Tools for downloading and managing media from streaming sites.

## Overview

This directory contains utilities for:
- Extracting video URLs from streaming sites using browser automation
- Downloading videos via yt-dlp
- Managing download state for resumable operations

## Tools

### pokeflix_scraper.py

Downloads Pokemon episodes from pokeflix.tv using Playwright for browser automation.

**Location:** `scripts/pokeflix_scraper.py`

**Features:**
- Extracts episode lists from season pages
- Handles iframe-embedded video players (Streamtape, Vidoza, etc.)
- Resumable downloads with state persistence
- Configurable episode ranges
- Dry-run mode for testing

## Architecture Pattern

These tools follow a common pattern:

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│   Playwright    │────▶│  Extract embed   │────▶│   yt-dlp    │
│   (navigate)    │     │  video URLs      │     │  (download) │
└─────────────────┘     └──────────────────┘     └─────────────┘
```

**Why this approach:**
1. **Playwright** handles JavaScript-heavy sites that block simple HTTP requests
2. **Iframe extraction** works around sites that use third-party video hosts
3. **yt-dlp** is the de facto standard for video downloading, with broad host support
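The three stages above can be sketched as a small pipeline. This is a minimal illustration, not the scraper's actual code: the browser and downloader stages are stubbed out as callables (a real tool would use Playwright for `fetch_html` and shell out to yt-dlp in `download`), and the function names are assumptions.

```python
import re
from typing import Callable

def scrape(page_url: str,
           fetch_html: Callable[[str], str],
           download: Callable[[str], None]) -> list[str]:
    """Fetch a page, pull iframe embed URLs out of it, hand each to a downloader."""
    html = fetch_html(page_url)
    # Third-party hosts (Streamtape, Vidoza, ...) are embedded as iframes.
    embed_urls = re.findall(r'<iframe[^>]+src="([^"]+)"', html)
    for url in embed_urls:
        download(url)
    return embed_urls
```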

## Dependencies

```bash
# Python packages
pip install playwright yt-dlp

# Playwright browser installation
playwright install chromium
```

## Common Patterns

### Anti-Bot Handling
- Use headed browser mode (visible window) initially
- Random delays between requests (2-5 seconds)
- Realistic viewport and user-agent settings
- Wait for `networkidle` state after navigation
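The randomized-delay step can be a one-liner helper; this sketch (assumed name `polite_sleep`) mirrors the 2-5 second defaults mentioned above:

```python
import asyncio
import random

async def polite_sleep(min_s: float = 2.0, max_s: float = 5.0) -> float:
    """Sleep for a random interval between requests; returns the delay used."""
    delay = random.uniform(min_s, max_s)
    await asyncio.sleep(delay)
    return delay
```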

### State Management
- JSON state files track downloaded episodes
- Enable the `--resume` flag to skip completed downloads
- State includes error information for debugging

### Output Organization
```
{output_dir}/
├── {Season Name}/
│   ├── E01 - Episode Title.mp4
│   ├── E02 - Episode Title.mp4
│   └── download_state.json
```

## When to Use These Tools

- Downloading entire seasons of shows for offline viewing
- Archiving content before it becomes unavailable
- Building a local media library

## Legal Considerations

These tools are for personal archival use. Respect copyright laws in your jurisdiction.
103  media-tools/scripts/CONTEXT.md  Normal file
@@ -0,0 +1,103 @@
# Media Tools Scripts

Operational scripts for media downloading and management.

## Scripts

### pokeflix_scraper.py

Downloads Pokemon episodes from pokeflix.tv.

**Dependencies:**
```bash
pip install playwright yt-dlp
playwright install chromium
```

**Quick Start:**
```bash
# Download an entire season
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/

# Download episodes 1-10 only
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/ \
    --start 1 --end 10

# Resume an interrupted download
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --output ~/Pokemon/ \
    --resume

# Dry run (extract URLs, don't download)
python pokeflix_scraper.py \
    --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" \
    --dry-run --verbose
```

**CLI Options:**

| Option | Description |
|--------|-------------|
| `--url, -u` | Season page URL (required) |
| `--output, -o` | Output directory (default: `~/Downloads/Pokemon`) |
| `--start, -s` | First episode number to download |
| `--end, -e` | Last episode number to download |
| `--resume, -r` | Resume from previous state |
| `--dry-run, -n` | Extract URLs only, no download |
| `--headless` | Run browser without a visible window |
| `--verbose, -v` | Enable debug logging |

**Output Structure:**
```
~/Pokemon/
├── Pokemon Indigo League/
│   ├── E01 - Pokemon I Choose You.mp4
│   ├── E02 - Pokemon Emergency.mp4
│   ├── E03 - Ash Catches a Pokemon.mp4
│   └── download_state.json
```

**State File:**

The `download_state.json` file tracks progress:
```json
{
  "season_url": "https://...",
  "season_name": "Pokemon Indigo League",
  "episodes": {
    "1": {
      "number": 1,
      "title": "Pokemon I Choose You",
      "page_url": "https://...",
      "video_url": "https://...",
      "downloaded": true,
      "error": null
    }
  },
  "last_updated": "2025-01-22T..."
}
```
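Given that layout, a resume run can be sanity-checked with a few lines of Python. This is a hypothetical helper (the name `incomplete_episodes` is not part of the scraper), assuming the state layout documented above:

```python
import json
from pathlib import Path

def incomplete_episodes(state_path: Path) -> list[int]:
    """Return episode numbers that failed or were never downloaded."""
    state = json.loads(state_path.read_text())
    return sorted(
        int(num) for num, ep in state.get("episodes", {}).items()
        if ep.get("error") is not None or not ep.get("downloaded", False)
    )
```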

## Adding New Scrapers

To add a scraper for a new site:

1. Copy the pattern from `pokeflix_scraper.py`
2. Modify the selectors for episode list extraction
3. Modify the iframe/video URL selectors for the new site's player
4. Test with `--dry-run` first

Key methods to customize:
- `get_season_info()` - Extract the episode list from the season page
- `extract_video_url()` - Get the video URL from an episode page
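A skeleton for those two hooks might look like this. It is a sketch only: the class name and regexes are placeholders standing in for the real per-site selectors, and a real scraper would run them against Playwright page content rather than raw HTML strings.

```python
import re
from typing import Optional

class NewSiteScraper:
    """Skeleton scraper: customize the two extraction hooks per site."""
    EPISODE_LINK_RE = re.compile(r'href="(/v/[^"]+)"')        # placeholder selector
    IFRAME_SRC_RE = re.compile(r'<iframe[^>]+src="([^"]+)"')  # placeholder selector

    def get_season_info(self, season_html: str) -> list[str]:
        """Extract episode page links from a season page."""
        return self.EPISODE_LINK_RE.findall(season_html)

    def extract_video_url(self, episode_html: str) -> Optional[str]:
        """Pull the embedded player URL from an episode page."""
        m = self.IFRAME_SRC_RE.search(episode_html)
        return m.group(1) if m else None
```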

## Performance Notes

- **Non-headless mode** is recommended (and the default) to avoid anti-bot detection
- Random delays (2-5 s) between requests prevent rate limiting
- Large seasons (80+ episodes) may take hours - use `--resume` if interrupted
777  media-tools/scripts/pokeflix_scraper.py  Executable file
@@ -0,0 +1,777 @@
#!/usr/bin/env python3
"""
Pokeflix Scraper - Download Pokemon episodes from pokeflix.tv

Pokeflix hosts videos directly on their CDN (v1.pkflx.com). This scraper:
1. Extracts the episode list from a season browse page
2. Visits each episode page to detect its CDN episode number
3. Downloads videos directly from the CDN via yt-dlp

Usage:
    # Download entire season
    python pokeflix_scraper.py --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" --output ~/Pokemon/

    # Download specific episode range
    python pokeflix_scraper.py --url "..." --start 1 --end 10 --output ~/Pokemon/

    # Resume interrupted download
    python pokeflix_scraper.py --url "..." --output ~/Pokemon/ --resume

    # Dry run (extract URLs only)
    python pokeflix_scraper.py --url "..." --dry-run

    # Choose quality
    python pokeflix_scraper.py --url "..." --quality 720p --output ~/Pokemon/

Dependencies:
    pip install playwright
    playwright install chromium
    # yt-dlp must be installed: pip install yt-dlp

Author: Cal Corum (with Jarvis assistance)
"""

import argparse
import asyncio
import json
import logging
import random
import re
import subprocess
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime
from pathlib import Path
from typing import Optional

try:
    from playwright.async_api import async_playwright, Page, Browser
except ImportError:
    print("ERROR: playwright not installed. Run: pip install playwright && playwright install chromium")
    sys.exit(1)

# ============================================================================
# Data Classes
# ============================================================================

@dataclass
class Episode:
    """Represents a single episode with its metadata and download status."""
    cdn_number: int  # The actual episode number on the CDN
    title: str
    page_url: str
    slug: str  # URL slug, e.g. "01-pokemon-i-choose-you"
    video_url: Optional[str] = None
    downloaded: bool = False
    error: Optional[str] = None

    @property
    def filename(self) -> str:
        """Generate a safe filename for the episode."""
        safe_title = re.sub(r'[<>:"/\\|?*]', '', self.title)
        safe_title = safe_title.strip()
        return f"E{self.cdn_number:02d} - {safe_title}.mp4"


@dataclass
class Season:
    """Represents a season/series with all its episodes."""
    name: str
    url: str
    cdn_slug: str  # e.g. "01-indigo-league" - used for CDN URLs
    episodes: list[Episode] = field(default_factory=list)

    @property
    def safe_name(self) -> str:
        """Generate a safe directory name for the season."""
        safe = re.sub(r'[<>:"/\\|?*]', '', self.name)
        return safe.strip()


@dataclass
class DownloadState:
    """Persistent state for resumable downloads."""
    season_url: str
    season_name: str
    cdn_slug: str
    episodes: dict[int, dict] = field(default_factory=dict)  # cdn_number -> episode dict
    episode_urls: list[str] = field(default_factory=list)  # All episode page URLs
    last_updated: str = ""

    def save(self, path: Path) -> None:
        """Save state to a JSON file."""
        self.last_updated = datetime.now().isoformat()
        with open(path, 'w') as f:
            json.dump(asdict(self), f, indent=2)

    @classmethod
    def load(cls, path: Path) -> Optional['DownloadState']:
        """Load state from a JSON file."""
        if not path.exists():
            return None
        with open(path) as f:
            data = json.load(f)
        return cls(**data)

# ============================================================================
# Logging Setup
# ============================================================================

def setup_logging(verbose: bool = False) -> logging.Logger:
    """Configure logging with console output."""
    logger = logging.getLogger('pokeflix_scraper')
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)

    if not logger.handlers:
        console = logging.StreamHandler()
        console.setLevel(logging.DEBUG if verbose else logging.INFO)
        console.setFormatter(logging.Formatter(
            '%(asctime)s [%(levelname)s] %(message)s',
            datefmt='%H:%M:%S'
        ))
        logger.addHandler(console)

    return logger

# ============================================================================
# Scraper Class
# ============================================================================

class PokeflixScraper:
    """
    Scrapes pokeflix.tv for video URLs and downloads them.

    Pokeflix hosts videos on their CDN with URLs like:
        https://v1.pkflx.com/hls/{season-slug}/{ep-num}/{ep-num}_{quality}.mp4

    The episode number must be detected by visiting each episode page,
    as the browse page URL slugs don't contain episode numbers.
    """

    BASE_URL = "https://www.pokeflix.tv"
    CDN_URL = "https://v1.pkflx.com/hls"

    # Map browse page URL slugs to CDN slugs
    SEASON_SLUG_MAP = {
        'pokemon-indigo-league': '01-indigo-league',
        'pokemon-adventures-in-the-orange-islands': '02-orange-islands',
        'pokemon-the-johto-journeys': '03-johto-journeys',
        'pokemon-johto-league-champions': '04-johto-league-champions',
        'pokemon-master-quest': '05-master-quest',
        'pokemon-advanced': '06-advanced',
        'pokemon-advanced-challenge': '07-advanced-challenge',
        'pokemon-advanced-battle': '08-advanced-battle',
        'pokemon-battle-frontier': '09-battle-frontier',
        'pokemon-diamond-and-pearl': '10-diamond-and-pearl',
        'pokemon-dp-battle-dimension': '11-battle-dimension',
        'pokemon-dp-galactic-battles': '12-galactic-battles',
        'pokemon-dp-sinnoh-league-victors': '13-sinnoh-league-victors',
        'pokemon-black-white': '14-black-and-white',
        'pokemon-bw-rival-destinies': '15-rival-destinies',
        'pokemon-bw-adventures-in-unova': '16-adventures-in-unova',
        'pokemon-xy': '17-xy',
        'pokemon-xy-kalos-quest': '18-kalos-quest',
        'pokemon-xyz': '19-xyz',
        'pokemon-sun-moon': '20-sun-and-moon',
        'pokemon-sun-moon-ultra-adventures': '21-ultra-adventures',
        'pokemon-sun-moon-ultra-legends': '22-ultra-legends',
        'pokemon-journeys': '23-journeys',
        'pokemon-master-journeys': '24-master-journeys',
        'pokemon-ultimate-journeys': '25-ultimate-journeys',
        'pokemon-horizons': '26-horizons',
    }

    def __init__(
        self,
        output_dir: Path,
        headless: bool = False,
        dry_run: bool = False,
        verbose: bool = False,
        quality: str = "1080p"
    ):
        self.output_dir = output_dir
        self.headless = headless
        self.dry_run = dry_run
        self.quality = quality
        self.logger = setup_logging(verbose)
        self.browser: Optional[Browser] = None
        self._context = None

    async def __aenter__(self):
        """Async context manager entry - launch browser."""
        playwright = await async_playwright().start()
        self.browser = await playwright.chromium.launch(
            headless=self.headless,
            args=['--disable-blink-features=AutomationControlled']
        )
        self._playwright = playwright
        # Create a persistent context to reuse
        self._context = await self.browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
        )
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Async context manager exit - close browser."""
        if self._context:
            await self._context.close()
        if self.browser:
            await self.browser.close()
        await self._playwright.stop()

    async def _new_page(self) -> Page:
        """Create a new page using the shared context."""
        return await self._context.new_page()

    async def _random_delay(self, min_sec: float = 1.0, max_sec: float = 3.0):
        """Random delay to avoid detection."""
        delay = random.uniform(min_sec, max_sec)
        await asyncio.sleep(delay)

    async def _wait_for_cloudflare(self, page: Page, timeout: int = 60):
        """Wait for a Cloudflare challenge to be solved by the user."""
        try:
            # Check if we're on a Cloudflare challenge page
            is_cf = await page.query_selector('#challenge-running, .cf-browser-verification, [id*="challenge"]')
            if is_cf:
                self.logger.warning("Cloudflare challenge detected - please solve it in the browser window")
                self.logger.info("Waiting up to 60 seconds for challenge completion...")

                # Wait for the challenge to be solved (challenge element disappears)
                for _ in range(timeout):
                    await asyncio.sleep(1)
                    is_cf = await page.query_selector('#challenge-running, .cf-browser-verification, [id*="challenge"]')
                    if not is_cf:
                        self.logger.info("Cloudflare challenge completed!")
                        await asyncio.sleep(2)  # Wait for page to fully load
                        return True

                self.logger.error("Cloudflare challenge timeout - please try again")
                return False
        except Exception:
            pass
        return True

    def _get_cdn_slug(self, browse_url: str) -> Optional[str]:
        """Extract the CDN slug from a browse page URL."""
        match = re.search(r'/browse/([^/]+)', browse_url)
        if match:
            page_slug = match.group(1)
            if page_slug in self.SEASON_SLUG_MAP:
                return self.SEASON_SLUG_MAP[page_slug]
            self.logger.warning(f"Unknown season slug: {page_slug}, will try to detect from page")
        return None

    def _construct_video_url(self, cdn_slug: str, ep_num: int) -> str:
        """Construct a direct CDN video URL."""
        return f"{self.CDN_URL}/{cdn_slug}/{ep_num:02d}/{ep_num:02d}_{self.quality}.mp4"

    def _slug_to_title(self, slug: str) -> str:
        """Convert a URL slug to a human-readable title."""
        # Remove the numeric prefix like "01-"
        title_slug = re.sub(r'^\d+-', '', slug)
        # Convert to title case
        title = title_slug.replace('-', ' ').title()
        # Restore the accented brand name
        title = re.sub(r'\bPokemon\b', 'Pokémon', title)
        return title

    async def get_episode_list(self, season_url: str) -> tuple[str, str, list[tuple[str, str]]]:
        """
        Get the list of episode URLs from a season browse page.

        Returns:
            Tuple of (season_name, cdn_slug, list of (page_url, slug) tuples)
        """
        self.logger.info(f"Fetching season page: {season_url}")

        cdn_slug = self._get_cdn_slug(season_url)

        page = await self._new_page()
        try:
            await page.goto(season_url, wait_until='networkidle', timeout=60000)
            await self._wait_for_cloudflare(page)
            await self._random_delay(2, 4)

            # Extract season title
            title_elem = await page.query_selector('h1, .season-title, .series-title')
            if not title_elem:
                title_elem = await page.query_selector('title')
            season_name = await title_elem.inner_text() if title_elem else "Unknown Season"
            season_name = season_name.replace('Pokéflix - Watch ', '').replace(' for free online!', '').strip()

            self.logger.info(f"Season: {season_name}")

            # Find all episode links with the /v/ pattern
            links = await page.query_selector_all('a[href^="/v/"]')
            self.logger.info(f"Found {len(links)} episode links")

            # If we don't have a CDN slug yet, detect it from the first episode
            if not cdn_slug and links:
                first_href = await links[0].get_attribute('href')
                cdn_slug = await self._detect_cdn_slug(first_href)

            if not cdn_slug:
                self.logger.error("Could not determine CDN slug for this season")
                return season_name, "unknown", []

            self.logger.info(f"CDN slug: {cdn_slug}")

            # Collect all episode URLs
            episode_data = []
            seen_urls = set()

            for link in links:
                href = await link.get_attribute('href')
                if not href or href in seen_urls:
                    continue
                seen_urls.add(href)

                # Extract the slug from the URL
                slug_match = re.search(r'/v/(.+)', href)
                if slug_match:
                    slug = slug_match.group(1)
                    full_url = self.BASE_URL + href
                    episode_data.append((full_url, slug))

            return season_name, cdn_slug, episode_data

        finally:
            await page.close()

    async def _detect_cdn_slug(self, episode_href: str) -> Optional[str]:
        """Visit an episode page to detect the CDN slug from network requests."""
        self.logger.info("Detecting CDN slug from episode page...")

        detected_slug = None

        async def capture_request(request):
            nonlocal detected_slug
            if 'v1.pkflx.com/hls/' in request.url:
                match = re.search(r'hls/([^/]+)/', request.url)
                if match:
                    detected_slug = match.group(1)

        page = await self._new_page()
        page.on('request', capture_request)

        try:
            await page.goto(self.BASE_URL + episode_href, timeout=60000)
            await self._wait_for_cloudflare(page)
            await asyncio.sleep(5)
            return detected_slug
        finally:
            await page.close()

    async def get_episode_cdn_number(self, page_url: str) -> Optional[int]:
        """
        Visit an episode page and detect its CDN episode number.

        Returns:
            The episode number used in CDN URLs, or None if not detected
        """
        detected_num = None

        async def capture_request(request):
            nonlocal detected_num
            if 'v1.pkflx.com/hls/' in request.url:
                match = re.search(r'/(\d+)/\d+_', request.url)
                if match:
                    detected_num = int(match.group(1))

        page = await self._new_page()
        page.on('request', capture_request)

        try:
            await page.goto(page_url, timeout=60000)
            await self._wait_for_cloudflare(page)

            # Wait for initial load
            await asyncio.sleep(2)

            # Try to trigger video playback by clicking a play button or the video area
            play_selectors = [
                'button[aria-label*="play" i]',
                '.play-button',
                '[class*="play"]',
                'video',
                '.video-player',
                '.player',
                '#player',
            ]

            for selector in play_selectors:
                try:
                    elem = await page.query_selector(selector)
                    if elem:
                        await elem.click()
                        await asyncio.sleep(0.5)
                        if detected_num:
                            break
                except Exception:
                    pass

            # Wait for video requests after click attempts
            for _ in range(10):  # Wait up to 5 seconds
                if detected_num:
                    break
                await asyncio.sleep(0.5)

            # If still not detected, look in the page source
            if not detected_num:
                content = await page.content()
                match = re.search(r'v1\.pkflx\.com/hls/[^/]+/(\d+)/', content)
                if match:
                    detected_num = int(match.group(1))

            return detected_num
        finally:
            await page.close()

    def download_video(self, video_url: str, output_path: Path) -> bool:
        """
        Download a video using yt-dlp.

        Args:
            video_url: Direct CDN URL to the video
            output_path: Full path for the output file

        Returns:
            True if the download succeeded
        """
        if self.dry_run:
            self.logger.info(f"  [DRY RUN] Would download: {video_url}")
            self.logger.info(f"  To: {output_path}")
            return True

        self.logger.info(f"  Downloading: {output_path.name}")

        cmd = [
            'yt-dlp',
            '--no-warnings',
            '-o', str(output_path),
            '--no-part',
            video_url
        ]

        try:
            result = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                timeout=1800
            )

            if result.returncode == 0:
                self.logger.info("  Download complete!")
                return True
            else:
                self.logger.error(f"  yt-dlp error: {result.stderr}")
                return False

        except subprocess.TimeoutExpired:
            self.logger.error("  Download timed out after 30 minutes")
            return False
        except FileNotFoundError:
            self.logger.error("  yt-dlp not found. Install with: pip install yt-dlp")
            return False

    async def download_direct(
        self,
        season_url: str,
        start_ep: int,
        end_ep: int,
        resume: bool = False
    ) -> None:
        """
        Direct download mode - download episodes by number without visiting pages.

        This is faster and more reliable when you know the episode range.
        """
        # Get the CDN slug from the URL
        cdn_slug = self._get_cdn_slug(season_url)
        if not cdn_slug:
            self.logger.error("Unknown season - direct mode requires a known season URL")
            self.logger.info("Known seasons: " + ", ".join(self.SEASON_SLUG_MAP.keys()))
            return

        # In direct mode, output_dir is the final destination (no subfolder created)
        season_dir = self.output_dir
        season_dir.mkdir(parents=True, exist_ok=True)

        self.logger.info(f"Direct download mode: Episodes {start_ep}-{end_ep}")
        self.logger.info(f"CDN slug: {cdn_slug}")
        self.logger.info(f"Quality: {self.quality}")
        self.logger.info(f"Output: {season_dir}")

        downloaded = 0
        skipped = 0
        failed = 0

        for ep_num in range(start_ep, end_ep + 1):
            output_path = season_dir / f"E{ep_num:02d}.mp4"

            # Skip episodes that already exist when resuming
            if output_path.exists() and resume:
                self.logger.info(f"E{ep_num:02d}: Skipping (file exists)")
                skipped += 1
                continue

            video_url = self._construct_video_url(cdn_slug, ep_num)
            self.logger.info(f"E{ep_num:02d}: Downloading...")

            success = self.download_video(video_url, output_path)

            if success:
                downloaded += 1
            else:
                failed += 1

            if not self.dry_run:
                await self._random_delay(0.5, 1.5)

        self.logger.info(f"\nComplete! Downloaded: {downloaded}, Skipped: {skipped}, Failed: {failed}")

async def download_season(
|
||||||
|
self,
|
||||||
|
season_url: str,
|
||||||
|
start_ep: Optional[int] = None,
|
||||||
|
end_ep: Optional[int] = None,
|
||||||
|
resume: bool = False,
|
||||||
|
direct: bool = False
|
||||||
|
) -> None:
|
||||||
|
"""
|
||||||
|
Download all episodes from a season.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
season_url: URL to the season browse page
|
||||||
|
start_ep: First episode number to download (inclusive)
|
||||||
|
end_ep: Last episode number to download (inclusive)
|
||||||
|
resume: Whether to resume from previous state
|
||||||
|
direct: If True, skip page visits and download by episode number
|
||||||
|
"""
|
||||||
|
# Direct mode - just download by episode number
|
||||||
|
if direct:
|
||||||
|
if start_ep is None or end_ep is None:
|
||||||
|
self.logger.error("Direct mode requires --start and --end episode numbers")
|
||||||
|
return
|
||||||
|
await self.download_direct(season_url, start_ep, end_ep, resume)
|
||||||
|
return
|
||||||
|
|
||||||
|
# Get episode list
|
||||||
|
season_name, cdn_slug, episode_data = await self.get_episode_list(season_url)
|
||||||
|
|
||||||
|
if not episode_data:
|
||||||
|
self.logger.error("No episodes found!")
|
||||||
|
return
|
||||||
|
|
||||||
|
# Create output directory
|
||||||
|
season_dir = self.output_dir / re.sub(r'[<>:"/\\|?*]', '', season_name).strip()
|
||||||
|
season_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# State file for resume
|
||||||
|
state_path = season_dir / 'download_state.json'
|
||||||
|
state = None
|
||||||
|
|
||||||
|
if resume:
|
||||||
|
state = DownloadState.load(state_path)
|
||||||
|
if state:
|
||||||
|
self.logger.info(f"Resuming from previous state ({state.last_updated})")
|
||||||
|
|
||||||
|
if not state:
|
||||||
|
state = DownloadState(
|
||||||
|
season_url=season_url,
|
||||||
|
season_name=season_name,
|
||||||
|
cdn_slug=cdn_slug,
|
||||||
|
episode_urls=[url for url, _ in episode_data]
|
||||||
|
)
|
||||||
|
|
||||||
|
self.logger.info(f"Processing {len(episode_data)} episodes (quality: {self.quality})")
|
||||||
|
|
||||||
|
# Process each episode
|
||||||
|
downloaded_count = 0
|
||||||
|
skipped_count = 0
|
||||||
|
failed_count = 0
|
||||||
|
|
||||||
|
for page_url, slug in episode_data:
|
||||||
|
title = self._slug_to_title(slug)
|
||||||
|
|
||||||
|
# Check if we already have this episode in state (by URL)
|
||||||
|
existing_ep = None
|
||||||
|
for ep_data in state.episodes.values():
|
||||||
|
if ep_data.get('page_url') == page_url:
|
||||||
|
existing_ep = ep_data
|
||||||
|
break
|
||||||
|
|
||||||
|
if existing_ep and existing_ep.get('downloaded') and resume:
|
||||||
|
self.logger.info(f"Skipping: {title} (already downloaded)")
|
||||||
|
skipped_count += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Get the CDN episode number by visiting the page
|
||||||
|
self.logger.info(f"Checking: {title}")
|
||||||
|
cdn_num = await self.get_episode_cdn_number(page_url)
|
||||||
|
|
||||||
|
if cdn_num is None:
|
||||||
|
self.logger.error(f" Could not detect episode number, skipping")
|
||||||
|
failed_count += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Check if within requested range
|
||||||
|
if start_ep is not None and cdn_num < start_ep:
|
||||||
|
self.logger.info(f" Episode {cdn_num} before start range, skipping")
|
||||||
|
continue
|
||||||
|
if end_ep is not None and cdn_num > end_ep:
|
||||||
|
self.logger.info(f" Episode {cdn_num} after end range, skipping")
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Check if file already exists
|
||||||
|
output_path = season_dir / f"E{cdn_num:02d} - {title}.mp4"
|
||||||
|
if output_path.exists() and resume:
|
||||||
|
self.logger.info(f" File exists, skipping")
|
||||||
|
state.episodes[str(cdn_num)] = {
|
||||||
|
'cdn_number': cdn_num,
|
||||||
|
'title': title,
|
||||||
|
'page_url': page_url,
|
||||||
|
'slug': slug,
|
||||||
|
'video_url': self._construct_video_url(cdn_slug, cdn_num),
|
||||||
|
'downloaded': True,
|
||||||
|
'error': None
|
||||||
|
}
|
||||||
|
state.save(state_path)
|
||||||
|
skipped_count += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Construct video URL and download
|
||||||
|
video_url = self._construct_video_url(cdn_slug, cdn_num)
|
||||||
|
self.logger.info(f" Episode {cdn_num}: {title}")
|
||||||
|
|
||||||
|
success = self.download_video(video_url, output_path)
|
||||||
|
|
||||||
|
# Save state
|
||||||
|
state.episodes[str(cdn_num)] = {
|
||||||
|
'cdn_number': cdn_num,
|
||||||
|
'title': title,
|
||||||
|
'page_url': page_url,
|
||||||
|
'slug': slug,
|
||||||
|
'video_url': video_url,
|
||||||
|
'downloaded': success,
|
||||||
|
'error': None if success else "Download failed"
|
||||||
|
}
|
||||||
|
state.save(state_path)
|
||||||
|
|
||||||
|
if success:
|
||||||
|
downloaded_count += 1
|
||||||
|
else:
|
||||||
|
failed_count += 1
|
||||||
|
|
||||||
|
# Delay between episodes
|
||||||
|
if not self.dry_run:
|
||||||
|
await self._random_delay(1, 2)
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
self.logger.info(f"\nComplete!")
|
||||||
|
self.logger.info(f" Downloaded: {downloaded_count}")
|
||||||
|
self.logger.info(f" Skipped: {skipped_count}")
|
||||||
|
self.logger.info(f" Failed: {failed_count}")
|
||||||
|
self.logger.info(f" Output: {season_dir}")
|
||||||
|
|
||||||
|
|
||||||
|
# ============================================================================
|
||||||
|
# CLI
|
||||||
|
# ============================================================================
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description='Download Pokemon episodes from pokeflix.tv',
|
||||||
|
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||||
|
epilog="""
|
||||||
|
Examples:
|
||||||
|
%(prog)s --url "https://www.pokeflix.tv/browse/pokemon-indigo-league" --output ~/Pokemon/
|
||||||
|
%(prog)s --url "..." --start 1 --end 10 --output ~/Pokemon/
|
||||||
|
%(prog)s --url "..." --output ~/Pokemon/ --resume
|
||||||
|
%(prog)s --url "..." --quality 720p --output ~/Pokemon/
|
||||||
|
%(prog)s --url "..." --dry-run
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
'--url', '-u',
|
||||||
|
required=True,
|
||||||
|
help='URL of the season/series page'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--output', '-o',
|
||||||
|
type=Path,
|
||||||
|
default=Path.home() / 'Downloads' / 'Pokemon',
|
||||||
|
help='Output directory (default: ~/Downloads/Pokemon)'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--start', '-s',
|
||||||
|
type=int,
|
||||||
|
help='Start episode number (CDN number)'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--end', '-e',
|
||||||
|
type=int,
|
||||||
|
help='End episode number (CDN number)'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--quality', '-q',
|
||||||
|
choices=['1080p', '720p', '360p'],
|
||||||
|
default='1080p',
|
||||||
|
help='Video quality (default: 1080p)'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--resume', '-r',
|
||||||
|
action='store_true',
|
||||||
|
help='Resume from previous download state'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--dry-run', '-n',
|
||||||
|
action='store_true',
|
||||||
|
help='Extract URLs only, do not download'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--headless',
|
||||||
|
action='store_true',
|
||||||
|
help='Run browser in headless mode (may trigger anti-bot)'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--verbose', '-v',
|
||||||
|
action='store_true',
|
||||||
|
help='Enable verbose logging'
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
'--direct',
|
||||||
|
action='store_true',
|
||||||
|
help='Direct download mode - skip page visits, just download episode range by number'
|
||||||
|
)
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
async def run():
|
||||||
|
async with PokeflixScraper(
|
||||||
|
output_dir=args.output,
|
||||||
|
headless=args.headless,
|
||||||
|
dry_run=args.dry_run,
|
||||||
|
verbose=args.verbose,
|
||||||
|
quality=args.quality
|
||||||
|
) as scraper:
|
||||||
|
await scraper.download_season(
|
||||||
|
season_url=args.url,
|
||||||
|
start_ep=args.start,
|
||||||
|
end_ep=args.end,
|
||||||
|
resume=args.resume,
|
||||||
|
direct=args.direct
|
||||||
|
)
|
||||||
|
|
||||||
|
asyncio.run(run())
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
195
media-tools/troubleshooting.md
Normal file
@ -0,0 +1,195 @@
# Media Tools Troubleshooting

## Common Issues

### Playwright Issues

#### "playwright not installed" Error
```
ERROR: playwright not installed. Run: pip install playwright && playwright install chromium
```

**Solution:**
```bash
pip install playwright
playwright install chromium
```

#### Browser Launch Fails
```
Error: Executable doesn't exist at /home/user/.cache/ms-playwright/chromium-xxx/chrome-linux/chrome
```

**Solution:**
```bash
playwright install chromium
```

#### Timeout Errors
```
TimeoutError: Timeout 30000ms exceeded
```

**Causes:**
- Slow network connection
- Site is blocking automated access
- Page structure has changed

**Solutions:**
1. Increase the timeout in the script
2. Try without the `--headless` flag
3. Check if the site is up manually
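
The first fix can be combined with a retry: give the slow operation progressively more time before giving up. A hedged sketch, not code from the scraper - `with_retry` and the stub `slow_op` are illustrative, and a real call would wrap something like `page.goto(url)`:

```python
import asyncio

async def with_retry(op, attempts=3, timeout=30.0):
    """Run an async operation, doubling the timeout on each retry."""
    for attempt in range(attempts):
        try:
            # e.g. 30s, then 60s, then 120s
            return await asyncio.wait_for(op(), timeout=timeout * (2 ** attempt))
        except asyncio.TimeoutError:
            if attempt == attempts - 1:
                raise

# Stub standing in for a Playwright call such as page.goto(url)
async def slow_op():
    await asyncio.sleep(0.01)
    return "ok"

print(asyncio.run(with_retry(slow_op, timeout=0.05)))
```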

---

### yt-dlp Issues

#### "yt-dlp not found" Error
```
yt-dlp not found. Install with: pip install yt-dlp
```

**Solution:**
```bash
pip install yt-dlp
```

#### Download Fails for Specific Host
```
ERROR: Unsupported URL: https://somehost.com/...
```

**Solution:**
```bash
# Update yt-dlp to latest version
pip install -U yt-dlp
```

If it still fails, the host may be unsupported. Check [yt-dlp supported sites](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md).

#### Slow Downloads
**Causes:**
- Video host throttling
- Network issues

**Solutions:**
- Downloads are typically limited by the source server
- Try at different times of day

---

### Scraping Issues

#### No Episodes Found
```
No episodes found!
```

**Causes:**
- Site structure has changed
- Page requires authentication
- Cloudflare protection triggered

**Solutions:**
1. Run without `--headless` to see what's happening
2. Check that the URL is correct and accessible manually
3. The site may have updated its HTML structure - check the selectors in the script

#### Video URL Not Found
```
No video URL found for episode X
```

**Causes:**
- Video is on an unsupported host
- Page uses a non-standard embedding method
- Anti-bot protection on the video player

**Solutions:**
1. Run with `--verbose` to see which URLs are being tried
2. Open the episode manually and check the Network tab for video requests
3. May need to add new iframe selectors for the specific host
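
Iframe sources can also be pulled straight from the page HTML. A simplified stdlib sketch for inspection purposes - real pages may lazy-load iframes via `data-src` or JavaScript, which this regex will miss:

```python
import re

def iframe_sources(html: str) -> list[str]:
    """Return the src of every <iframe> found in an HTML string (simplified)."""
    return re.findall(r'<iframe[^>]+src=["\']([^"\']+)["\']', html, re.IGNORECASE)

html = '<div><iframe src="https://example-host.test/embed/123"></iframe></div>'
print(iframe_sources(html))
```

Each extracted URL can then be handed to `yt-dlp` as described below.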

#### 403 Forbidden on Site
**Cause:** Site is blocking automated requests

**Solutions:**
1. Ensure you're NOT using `--headless`
2. Increase the random delays
3. Clear browser cache/cookies (restart the script)
4. Try from a different IP

---

### Resume Issues

#### Resume Not Working
```
# Should skip downloaded episodes but re-downloads them
```

**Check:**
1. Ensure `download_state.json` exists in the output directory
2. Verify the `--resume` flag is being used
3. Check that episode numbers match between runs

#### Corrupt State File
```
JSONDecodeError: ...
```

**Solution:**
```bash
# Delete the state file to start fresh
rm /path/to/season/download_state.json
```
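
A loader can also fall back to a fresh state automatically instead of crashing. A hedged sketch - the actual script uses a `DownloadState` class, and `load_state` here is a simplified stand-in:

```python
import json
import tempfile
from pathlib import Path

def load_state(path: Path) -> dict:
    """Return the saved state, or an empty dict if the file is missing or corrupt."""
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

# Demo: a corrupt state file falls back cleanly instead of raising
with tempfile.TemporaryDirectory() as d:
    state_file = Path(d) / "download_state.json"
    state_file.write_text("{not valid json")
    print(load_state(state_file))
```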

---

## Debug Mode

Run with verbose output:
```bash
python pokeflix_scraper.py --url "..." --output ~/Pokemon/ --verbose
```

Run a dry run to test URL extraction:
```bash
python pokeflix_scraper.py --url "..." --dry-run --verbose
```

Watch the browser (non-headless):
```bash
python pokeflix_scraper.py --url "..." --output ~/Pokemon/
# (headless is off by default)
```

---

## Manual Workarounds

### If Automated Extraction Fails

1. **Browser DevTools method:**
   - Open the episode in a browser
   - F12 → Network tab → filter "m3u8" or "mp4"
   - Play the video, copy the stream URL
   - Download manually: `yt-dlp "URL"`

2. **Check the iframe manually:**
   - Right-click the video player → Inspect
   - Find the `<iframe>` element
   - Copy its `src` attribute
   - Use that URL with yt-dlp

### Known Video Hosts

These hosts are typically supported by yt-dlp:
- Streamtape
- Vidoza
- Mp4upload
- Doodstream
- Filemoon
- Voe.sx

If the video is on an unsupported host, check if there's an alternative server/quality option on the episode page.
184
productivity/n8n/workflows/tdarr-gpu-monitor.json
Normal file
@ -0,0 +1,184 @@
{
  "name": "Tdarr GPU Monitor",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "minutes",
              "minutesInterval": 5
            }
          ]
        }
      },
      "id": "schedule-trigger",
      "name": "Every 5 Minutes",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.2,
      "position": [0, 0]
    },
    {
      "parameters": {
        "command": "ssh -i /root/.ssh/n8n_to_claude -o BatchMode=yes -o ConnectTimeout=10 cal@10.10.0.226 'docker exec tdarr-node nvidia-smi --query-gpu=name --format=csv,noheader 2>&1'"
      },
      "id": "check-gpu",
      "name": "Check GPU Access",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [220, 0]
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": false,
            "leftValue": "",
            "typeValidation": "loose"
          },
          "conditions": [
            {
              "id": "gpu-error-check",
              "leftValue": "={{ $json.stdout + $json.stderr }}",
              "rightValue": "NVML|CUDA_ERROR|Unknown Error|no CUDA-capable|Failed to initialize",
              "operator": {
                "type": "string",
                "operation": "regex"
              }
            }
          ],
          "combinator": "or"
        }
      },
      "id": "check-error",
      "name": "GPU Error Detected?",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2.2,
      "position": [440, 0]
    },
    {
      "parameters": {
        "command": "ssh -i /root/.ssh/n8n_to_claude -o BatchMode=yes -o ConnectTimeout=10 cal@10.10.0.226 'cd /home/cal/docker/tdarr && docker compose restart tdarr-node 2>&1'"
      },
      "id": "restart-container",
      "name": "Restart tdarr-node",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [660, -100]
    },
    {
      "parameters": {
        "command": "ssh -i /root/.ssh/n8n_to_claude -o BatchMode=yes -o ConnectTimeout=30 cal@10.10.0.226 'sleep 5 && docker exec tdarr-node nvidia-smi --query-gpu=name --format=csv,noheader 2>&1'"
      },
      "id": "verify-gpu",
      "name": "Verify GPU Restored",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [880, -100]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://discord.com/api/webhooks/1451783909409816763/O9PMDiNt6ZIWRf8HKocIZ_E4vMGV_lEwq50aAiZ9HVFR2UGwO6J1N9_wOm82p0MetIqT",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n \"embeds\": [{\n \"title\": \"Tdarr GPU Recovery\",\n \"description\": \"GPU access was lost and container was automatically restarted.\",\n \"color\": {{ $('Verify GPU Restored').item.json.stdout.includes('GTX') || $('Verify GPU Restored').item.json.stdout.includes('NVIDIA') ? 3066993 : 15158332 }},\n \"fields\": [\n {\n \"name\": \"Original Error\",\n \"value\": \"```{{ $('Check GPU Access').item.json.stdout.substring(0, 200) }}{{ $('Check GPU Access').item.json.stderr.substring(0, 200) }}```\",\n \"inline\": false\n },\n {\n \"name\": \"Recovery Status\",\n \"value\": \"{{ $('Verify GPU Restored').item.json.stdout.includes('GTX') || $('Verify GPU Restored').item.json.stdout.includes('NVIDIA') ? '✅ GPU access restored' : '❌ GPU still not accessible - manual intervention needed' }}\",\n \"inline\": false\n },\n {\n \"name\": \"Post-Restart GPU\",\n \"value\": \"```{{ $('Verify GPU Restored').item.json.stdout.substring(0, 200) }}```\",\n \"inline\": false\n }\n ],\n \"timestamp\": \"{{ new Date().toISOString() }}\",\n \"footer\": {\n \"text\": \"Tdarr GPU Monitor | ubuntu-manticore\"\n }\n }]\n}",
        "options": {}
      },
      "id": "discord-notify",
      "name": "Discord Notification",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [1100, -100]
    },
    {
      "parameters": {},
      "id": "no-action",
      "name": "GPU OK - No Action",
      "type": "n8n-nodes-base.noOp",
      "typeVersion": 1,
      "position": [660, 100]
    }
  ],
  "connections": {
    "Every 5 Minutes": {
      "main": [
        [
          {
            "node": "Check GPU Access",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check GPU Access": {
      "main": [
        [
          {
            "node": "GPU Error Detected?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "GPU Error Detected?": {
      "main": [
        [
          {
            "node": "Restart tdarr-node",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "GPU OK - No Action",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Restart tdarr-node": {
      "main": [
        [
          {
            "node": "Verify GPU Restored",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Verify GPU Restored": {
      "main": [
        [
          {
            "node": "Discord Notification",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  },
  "staticData": null,
  "tags": [
    {
      "name": "homelab"
    },
    {
      "name": "monitoring"
    },
    {
      "name": "tdarr"
    }
  ],
  "triggerCount": 1,
  "pinData": {}
}
385
vm-management/scripts/CONTEXT.md
Normal file
@ -0,0 +1,385 @@
# VM Management Scripts - Operational Context

## Script Overview
This directory contains active operational scripts for VM provisioning, LXC container creation, Docker configuration in containers, and system migration.

## Core Scripts

### VM Post-Installation Provisioning
**Script**: `vm-post-install.sh`
**Purpose**: Automated provisioning of existing VMs with security hardening, SSH keys, and Docker

**Key Features**:
- System updates and essential package installation
- SSH key deployment (primary + emergency keys)
- SSH security hardening (disable password authentication)
- Docker and Docker Compose installation
- User environment setup with bash aliases
- Automatic security updates configuration

**Usage**:
```bash
./vm-post-install.sh <vm-ip> [ssh-user]

# Example
./vm-post-install.sh 10.10.0.100 cal
```

**Requirements**:
- Target VM must have SSH access enabled initially
- Homelab SSH keys must exist: `~/.ssh/homelab_rsa` and `~/.ssh/emergency_homelab_rsa`
- Initial connection may require password authentication (disabled after provisioning)

**Post-Provision Verification**:
```bash
# Test SSH with key
ssh cal@<vm-ip>

# Verify Docker
docker --version
docker run --rm hello-world

# Check security
sudo sshd -T | grep -E "(passwordauth|pubkeyauth)"
```

### Cloud-Init Automated Provisioning
**File**: `cloud-init-user-data.yaml`
**Purpose**: Fully automated VM provisioning from first boot using Proxmox cloud-init

**Features**:
- User creation with sudo privileges
- SSH keys pre-installed (no password auth needed)
- Automatic package updates
- Docker and Docker Compose installation
- Security hardening from first boot
- Useful bash aliases and environment setup
- Welcome message with system status

**Usage in Proxmox**:
1. Create a new VM with cloud-init support
2. Go to the Cloud-Init tab in the VM settings
3. Copy the contents of `cloud-init-user-data.yaml`
4. Paste into the "User Data" field
5. Start the VM - it is fully provisioned automatically
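
The features above map onto standard cloud-config keys. A trimmed illustration of the general shape of such a file - not the actual contents of `cloud-init-user-data.yaml`, and package names vary by distribution:

```yaml
#cloud-config
users:
  - name: cal
    groups: [sudo, docker]
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-rsa AAAA...   # contents of homelab_rsa.pub and the emergency key
package_update: true
package_upgrade: true
packages:
  - docker.io
ssh_pwauth: false
```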

**Benefits**:
- Zero-touch provisioning
- Consistent configuration across all VMs
- No password authentication ever enabled
- Faster deployment than the post-install script

### Docker AppArmor Fix for LXC
**Script**: `fix-docker-apparmor.sh`
**Purpose**: Add AppArmor=unconfined to docker-compose.yml files for LXC compatibility

**Why Needed**: Docker containers inside LXC containers require AppArmor to be disabled. Without this fix, containers may fail to start or have permission issues.

**Usage**:
```bash
./fix-docker-apparmor.sh <LXC_IP> [COMPOSE_DIR]

# Example - use default directory (/home/cal/container-data)
./fix-docker-apparmor.sh 10.10.0.214

# Example - specify custom directory
./fix-docker-apparmor.sh 10.10.0.214 /home/cal/docker
```

**What It Does**:
1. SSHs into the LXC container
2. Finds all docker-compose.yml files in the specified directory
3. Adds `security_opt: ["apparmor=unconfined"]` to each service
4. Creates `.bak` backups of original files before modification
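
The core edit in step 3 can be sketched with plain text processing. A simplified illustration, not the actual script: it assumes each service has an `image:` line and inserts the option at the same indentation level.

```python
import re

def add_apparmor_unconfined(compose_text: str) -> str:
    """Insert security_opt after every service-level 'image:' line (simplified)."""
    out = []
    for line in compose_text.splitlines():
        out.append(line)
        m = re.match(r'^(\s+)image:', line)
        if m:
            indent = m.group(1)
            out.append(f'{indent}security_opt: ["apparmor=unconfined"]')
    return '\n'.join(out)

compose = "services:\n  app:\n    image: nginx:alpine\n"
print(add_apparmor_unconfined(compose))
```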
|
||||||
|
|
||||||
|
**Safety Features**:
|
||||||
|
- Creates backups before modifications
|
||||||
|
- Color-coded output for easy monitoring
|
||||||
|
- Error handling with detailed logging
|
||||||
|
- Validates SSH connectivity before proceeding
|
||||||
|
|
||||||
|
### LXC Container Creation with Docker
|
||||||
|
**Script**: `lxc-docker-create.sh`
|
||||||
|
**Purpose**: Create new LXC containers pre-configured for Docker hosting
|
||||||
|
|
||||||
|
**Key Features**:
|
||||||
|
- Automated LXC container creation in Proxmox
|
||||||
|
- Docker and Docker Compose pre-installed
|
||||||
|
- AppArmor configuration for container compatibility
|
||||||
|
- Network configuration
|
||||||
|
- Security settings optimized for Docker hosting
|
||||||
|
|
||||||
|
**Usage**:
|
||||||
|
```bash
|
||||||
|
./lxc-docker-create.sh [options]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Common Use Cases**:
|
||||||
|
- Creating Docker hosts for specific services (n8n, gitea, etc.)
|
||||||
|
- Rapid deployment of containerized applications
|
||||||
|
- Consistent LXC configuration across infrastructure
|
||||||
|
|
||||||
|
### LXC Migration Guide
|
||||||
|
**Document**: `LXC-MIGRATION-GUIDE.md`
|
||||||
|
**Purpose**: Step-by-step procedures for migrating LXC containers between hosts
|
||||||
|
|
||||||
|
**Covers**:
|
||||||
|
- Pre-migration planning and backups
|
||||||
|
- LXC configuration export/import
|
||||||
|
- Storage migration strategies
|
||||||
|
- Network reconfiguration
|
||||||
|
- Post-migration validation
|
||||||
|
- Rollback procedures
|
||||||
|
|
||||||
|
**When to Use**:
|
||||||
|
- Moving containers to new Proxmox host
|
||||||
|
- Hardware upgrades
|
||||||
|
- Load balancing across nodes
|
||||||
|
- Disaster recovery scenarios
|
||||||
|
|
||||||
|
## Operational Patterns
|
||||||
|
|
||||||
|
### VM Provisioning Workflow
|
||||||
|
|
||||||
|
**Option 1: New VMs (Preferred)**
|
||||||
|
```bash
|
||||||
|
# 1. Create VM in Proxmox with cloud-init support
|
||||||
|
# 2. Copy cloud-init-user-data.yaml to User Data field
|
||||||
|
# 3. Start VM
|
||||||
|
# 4. Verify provisioning completed:
|
||||||
|
ssh cal@<vm-ip>
|
||||||
|
docker --version
|
||||||
|
```
|
||||||
|
|
||||||
|
**Option 2: Existing VMs**
|
||||||
|
```bash
|
||||||
|
# 1. Ensure VM has SSH enabled and accessible
|
||||||
|
# 2. Run post-install script:
|
||||||
|
./vm-post-install.sh 10.10.0.100 cal
|
||||||
|
|
||||||
|
# 3. Verify provisioning:
|
||||||
|
ssh cal@10.10.0.100
|
||||||
|
docker --version
|
||||||
|
```
|
||||||
|
|
||||||
|
### LXC Docker Deployment Workflow
|
||||||
|
|
||||||
|
**Creating New LXC for Docker**:
|
||||||
|
```bash
|
||||||
|
# 1. Create LXC container
|
||||||
|
./lxc-docker-create.sh --id 220 --hostname docker-app --ip 10.10.0.220
|
||||||
|
|
||||||
|
# 2. Deploy docker-compose configurations
|
||||||
|
scp -r ./app-config/ cal@10.10.0.220:/home/cal/container-data/
|
||||||
|
|
||||||
|
# 3. Fix AppArmor compatibility
|
||||||
|
./fix-docker-apparmor.sh 10.10.0.220
|
||||||
|
|
||||||
|
# 4. Start containers
|
||||||
|
ssh cal@10.10.0.220 "cd /home/cal/container-data/app-config && docker compose up -d"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Existing LXC with Docker Issues**:
|
||||||
|
```bash
|
||||||
|
# If containers failing to start in LXC:
|
||||||
|
./fix-docker-apparmor.sh <LXC_IP>
|
||||||
|
|
||||||
|
# Restart affected containers
|
||||||
|
ssh cal@<LXC_IP> "cd /home/cal/container-data && docker compose restart"
|
||||||
|
```
|
||||||
|
|
||||||
|
### SSH Key Integration
|
||||||
|
|
||||||
|
**Both provisioning methods use**:
|
||||||
|
- **Primary Key**: `~/.ssh/homelab_rsa` - Daily use authentication
|
||||||
|
- **Emergency Key**: `~/.ssh/emergency_homelab_rsa` - Backup access
|
||||||
|
|
||||||
|
**Security Configuration**:
|
||||||
|
- Password authentication completely disabled after provisioning
|
||||||
|
- Only key-based SSH access allowed
|
||||||
|
- Emergency keys provide backup access if primary key fails
|
||||||
|
- Automatic security updates enabled
|
||||||
|
|
||||||
|
**Key Management Integration**:
|
||||||
|
- Keys managed by `/networking/scripts/ssh_key_maintenance.sh`
|
||||||
|
- Monthly backups of all SSH keys
|
||||||
|
- Rotation recommendations for keys > 365 days old
|
||||||
|
|
||||||
|
## Configuration Dependencies
|
||||||
|
|
||||||
|
### Required Local Files
|
||||||
|
- `~/.ssh/homelab_rsa` - Primary SSH private key
|
||||||
|
- `~/.ssh/homelab_rsa.pub` - Primary SSH public key
|
||||||
|
- `~/.ssh/emergency_homelab_rsa` - Emergency SSH private key
|
||||||
|
- `~/.ssh/emergency_homelab_rsa.pub` - Emergency SSH public key
|
||||||
|
|
||||||
|
### Target VM Requirements
|
||||||
|
- **For post-install script**: SSH enabled, initial authentication method available
|
||||||
|
- **For cloud-init**: Proxmox cloud-init support, fresh VM
|
||||||
|
- **For LXC**: Proxmox host with LXC support
|
||||||
|
|
||||||
|
### Network Requirements
|
||||||
|
- VMs/LXCs on 10.10.0.0/24 network (homelab subnet)
|
||||||
|
- SSH access (port 22) to target systems
|
||||||
|
- Internet access on target systems for package installation
|
||||||
|
|
||||||
|
## Troubleshooting Context
|
||||||
|
|
||||||
|
### Common Issues
|
||||||
|
|
||||||
|
**1. vm-post-install.sh Connection Failures**
|
||||||
|
```bash
|
||||||
|
# Verify VM is accessible
|
||||||
|
ping <vm-ip>
|
||||||
|
nc -z <vm-ip> 22
|
||||||
|
|
||||||
|
# Check SSH service on target
|
||||||
|
ssh <vm-ip> "systemctl status sshd"
|
||||||
|
|
||||||
|
# Verify SSH keys exist locally
|
||||||
|
ls -la ~/.ssh/homelab_rsa*
|
||||||
|
```

**2. Cloud-Init Not Working**

```bash
# On Proxmox host, check the rendered cloud-init config
qm cloudinit dump <vmid> user

# On VM, check cloud-init logs
sudo cloud-init status --long
sudo cat /var/log/cloud-init.log
```

**3. Docker Containers Fail in LXC**

```bash
# Symptom: containers won't start, permission errors
# Solution: run the AppArmor fix
./fix-docker-apparmor.sh <LXC_IP>

# Verify security_opt was added
ssh cal@<LXC_IP> "grep -r 'security_opt' ~/container-data/"

# Check Docker logs
ssh cal@<LXC_IP> "docker compose logs"
```
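
For context, the `security_opt` entry the fix script checks for corresponds to a compose fragment along these lines (illustrative only — service name and image are placeholders, and the exact option form may differ in the repo's templates):

```yaml
services:
  app:                           # placeholder service name
    image: nginx:alpine          # placeholder image
    security_opt:
      - apparmor=unconfined      # needed for Docker inside LXC
```
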

**4. SSH Key Authentication Fails After Provisioning**

```bash
# Verify key permissions
ls -la ~/.ssh/homelab_rsa
chmod 600 ~/.ssh/homelab_rsa

# Check authorized_keys on target
ssh <vm-ip> "cat ~/.ssh/authorized_keys"

# Test with verbose output
ssh -v cal@<vm-ip>
```

**5. Docker Installation Issues**

```bash
# Check internet connectivity on VM
ssh <vm-ip> "ping -c 3 8.8.8.8"

# Verify Docker GPG key (apt-key is deprecated on newer releases;
# check /etc/apt/keyrings/ there instead)
ssh <vm-ip> "apt-key list | grep -i docker"

# Check Docker service
ssh <vm-ip> "systemctl status docker"

# Manual Docker install if needed
ssh <vm-ip> "curl -fsSL https://get.docker.com | sh"
```

### Diagnostic Commands

```bash
# Post-provisioning validation
ssh cal@<vm-ip> "groups"  # Should include: sudo docker
ssh cal@<vm-ip> "docker run --rm hello-world"
ssh cal@<vm-ip> "sudo sshd -T | grep passwordauth"  # Should be "no"

# Cloud-init status check
ssh cal@<vm-ip> "cloud-init status"
ssh cal@<vm-ip> "cloud-init query -f '{{ds.meta_data.hostname}}'"

# Docker in LXC verification
ssh cal@<LXC_IP> "docker info | grep -i apparmor"
ssh cal@<LXC_IP> "docker compose config"  # Validate compose files

# SSH key connectivity test
ssh -o ConnectTimeout=5 cal@<vm-ip> "echo 'SSH OK'"
```

## Integration Points

### External Dependencies
- **Proxmox VE**: For VM/LXC creation and cloud-init support
- **SSH**: For remote provisioning and management
- **Docker**: Installed on target systems
- **Cloud-init**: For automated VM provisioning
- **AppArmor**: Security framework (configured for LXC compatibility)

### File System Dependencies

- **Script Directory**: `/mnt/NV2/Development/claude-home/vm-management/scripts/`
- **SSH Keys**: `~/.ssh/homelab_rsa*`, `~/.ssh/emergency_homelab_rsa*`
- **LXC Compose Directories**: Typically `/home/cal/container-data/` on target
- **Backup Files**: `.bak` files created by the AppArmor fix script

### Network Dependencies

- **Management Network**: 10.10.0.0/24 subnet
- **Internet Access**: Required for package installation
- **Proxmox API**: For LXC creation operations
- **DNS**: For hostname resolution

## Security Considerations

### SSH Security
- Password authentication disabled after provisioning
- Only key-based authentication allowed
- Emergency keys provide backup access
- Root login disabled

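On a provisioned host these settings correspond to an `sshd_config` roughly like the following (illustrative fragment):

```
# /etc/ssh/sshd_config — relevant lines after provisioning
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
```
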
### Docker Security

- User in docker group (no sudo needed for docker commands)
- AppArmor unconfined in LXC (required for functionality)
- Containers run as non-root when possible
- Network isolation via Docker networks

### VM/LXC Security

- Automatic security updates enabled
- Minimal package installation (only essentials)
- Firewall configuration recommended post-provisioning
- Regular key rotation via SSH key maintenance

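On Debian/Ubuntu guests (the assumed distro), automatic security updates are typically driven by a file like this (the provisioning scripts may write it differently):

```
# /etc/apt/apt.conf.d/20auto-upgrades — typical content
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```
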
## Performance Considerations

### Cloud-Init vs Post-Install
- **Cloud-init**: Faster (zero-touch), no manual SSH needed, better for multiple VMs
- **Post-install**: More flexible, works with existing VMs, easier debugging

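The zero-touch path boils down to user-data along these lines (an illustrative sketch — the username follows the conventions above, the key is a placeholder; see `/vm-management/examples/` for the real templates):

```yaml
#cloud-config
users:
  - name: cal
    groups: [sudo, docker]
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-rsa AAAA...placeholder... homelab
ssh_pwauth: false
package_update: true
```
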
### LXC vs VM

- **LXC**: Lower overhead, faster startup, shared kernel
- **VM**: Better isolation, GPU passthrough support, different kernels possible

### Docker in LXC

- **Performance**: Near-native, minimal overhead with AppArmor disabled
- **I/O**: Use local storage for best performance, NFS for shared data
- **Networking**: Bridge mode for simplicity, macvlan for direct network access

## Related Documentation

- **Technology Overview**: `/vm-management/CONTEXT.md`
- **Troubleshooting**: `/vm-management/troubleshooting.md`
- **Examples**: `/vm-management/examples/` - Configuration templates
- **SSH Management**: `/networking/scripts/ssh_key_maintenance.sh`
- **Docker Patterns**: `/docker/CONTEXT.md`
- **Main Instructions**: `/CLAUDE.md` - Context loading rules

## Notes

These scripts form the foundation of the homelab VM and LXC provisioning strategy. They ensure consistent configuration, security hardening, and Docker compatibility across all virtualized infrastructure.

The cloud-init approach is preferred for new deployments due to zero-touch provisioning, while the post-install script provides flexibility for existing systems or troubleshooting scenarios.

AppArmor configuration is critical for Docker-in-LXC deployments and should be applied to all LXC containers running Docker to prevent container startup failures.