The source website uses <span class='energy-text energy-text--type-fire'> to render inline energy icons. BeautifulSoup's get_text() was stripping these spans, losing the energy type information and causing merged text like 'Discard aEnergy' instead of 'Discard a Fire Energy'.

Changes:
- Add ENERGY_TEXT_TYPES mapping for inline energy references
- Add replace_energy_text_spans() to convert spans to text before extraction
- Add extract_effect_text() helper with proper text joining (separator=' ')
- Update parse_attack(), parse_ability(), _parse_trainer_details() to use it
- Fix JSON encoding in convert_cards.py to use UTF-8 (ensure_ascii=False)

Before: 'Discard an Energy from this Pokémon'
After: 'Discard a Fire Energy from this Pokémon'

Re-scraped all 372 cards and regenerated 382 definitions.
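The span replacement described above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the function names are taken from the commit message, but their bodies, the contents of ENERGY_TEXT_TYPES, the PREFIX constant, and the sample HTML are all assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Assumed mapping from the CSS modifier suffix to a display name; the real
# ENERGY_TEXT_TYPES table in the scraper may use different keys or values.
ENERGY_TEXT_TYPES = {
    "fire": "Fire",
    "water": "Water",
    "grass": "Grass",
}

# Hypothetical constant: the class prefix seen in the span quoted above.
PREFIX = "energy-text--type-"


def replace_energy_text_spans(node):
    """Replace each energy icon span with its energy name as plain text."""
    for span in node.find_all("span", class_="energy-text"):
        for css_class in span.get("class", []):
            if css_class.startswith(PREFIX):
                key = css_class[len(PREFIX):]
                # Fall back to the capitalized suffix for unmapped types.
                name = ENERGY_TEXT_TYPES.get(key, key.capitalize())
                span.replace_with(f" {name} ")
                break


def extract_effect_text(node):
    """Expand energy icons, then join text nodes and collapse whitespace."""
    replace_energy_text_spans(node)
    return " ".join(node.get_text(separator=" ").split())


# Hypothetical markup modeled on the merged-text case described above.
html = (
    "<div>Discard a<span class='energy-text energy-text--type-fire'></span>"
    "Energy from this Pokémon.</div>"
)

print(BeautifulSoup(html, "html.parser").div.get_text())
print(extract_effect_text(BeautifulSoup(html, "html.parser").div))
```

The first print shows the merged text that plain get_text() produces for an empty icon span; the second shows the text after the span has been replaced with its energy name. Padding the replacement with spaces and collapsing whitespace afterwards keeps the result stable whether or not the source markup already has spaces around the span.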
Raw Scraped Data
Scraped from pokemon-zone.com. Reference only - do not edit.
Run scripts/scrape_pokemon_pocket.py to update this data.
For authoritative card definitions used by the game engine, see ../definitions/.
Structure
raw/
├── _index.json    # Index of all scraped cards
├── a1/            # Genetic Apex set
│   └── *.json     # Individual card files
└── a1a/           # Mythical Island set
    └── *.json     # Individual card files
Notes
- This data is the raw output from the scraper
- Schema may differ from the game engine's CardDefinition model
- Use scripts/convert_cards.py to transform this data into definitions
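
Because card text contains non-ASCII characters such as the "é" in "Pokémon", the encoding used when writing definitions matters. The sketch below shows the effect of `ensure_ascii=False` in Python's standard `json` module. The record layout (`effect` key) and the `write_definition` helper are hypothetical and are not taken from convert_cards.py.

```python
import json

# Hypothetical record; the real raw and definition schemas may differ.
card = {"effect": "Discard a Fire Energy from this Pokémon."}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(card)

# With ensure_ascii=False the characters are kept as-is in the output.
readable = json.dumps(card, ensure_ascii=False)

print(escaped)
print(readable)


def write_definition(path, data):
    """Write one definition as UTF-8 JSON (hypothetical helper)."""
    # The file must be opened as UTF-8 so the unescaped characters
    # are encoded the same way on every platform.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, ensure_ascii=False, indent=2)
```

Both forms decode to the same data with `json.loads`; the difference is only in how readable the files are in diffs and editors.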