claude-memory/graph/solutions/bestiary-pdf-extraction-workflow-ba57b5.md

---
id: ba57b560-92e1-4b6a-91b9-fb690ac44c96
type: solution
title: "Bestiary PDF extraction workflow"
tags: [vagabond-rpg, foundryvtt, pdf-extraction, bestiary, python]
importance: 0.7
confidence: 0.8
created: "2025-12-18T19:28:22.217828+00:00"
updated: "2025-12-18T19:28:22.217828+00:00"
---

Extracted 143 new creatures from the Vagabond RPG PDF using `pdftotext -raw` and a custom Python parser. Key learnings:

1. `pdftotext -raw` produces cleaner output than `-layout` for two-column PDFs.
2. Actor compendium entries need the `_key` prefix `!actors!`, not `!items!`.
3. ID collisions are resolved by appending numeric suffixes.
4. Basic stats (HD, HP, TL, Speed, Armor, Morale, Zone) parse reliably; actions and abilities are captured as raw text for manual refinement.
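
The learnings above can be sketched in Python. This is a minimal illustration, not the parser that was actually used: the function names (`pdftotext_cmd`, `unique_id`, `actor_key`, `parse_stats`), the `Name: value` stat layout, and the suffix format (`goblin`, `goblin2`, ...) are all assumptions made for the example. Only the `-raw` flag and the `!actors!` key prefix come from the note itself.

```python
import re


def pdftotext_cmd(pdf_path, txt_path):
    """Build the pdftotext call; -raw keeps content-stream order,
    which reads better than -layout on two-column pages."""
    return ["pdftotext", "-raw", pdf_path, txt_path]


def unique_id(base, seen):
    """Return base, or base plus a numeric suffix if base is taken.
    The suffix format is hypothetical; the note only says 'numeric suffixes'."""
    candidate = base
    n = 2
    while candidate in seen:
        candidate = f"{base}{n}"
        n += 1
    seen.add(candidate)
    return candidate


def actor_key(doc_id):
    """Actor compendium entries are keyed with !actors!, not !items!."""
    return f"!actors!{doc_id}"


# Assumed stat layout: "HD: 2, HP: 9, ..." (the real PDF text may differ).
STAT_RE = re.compile(r"\b(HD|HP|TL|Speed|Armor|Morale|Zone):\s*([^,;\n]+)")


def parse_stats(block):
    """Collect the basic stats from a creature's text block as strings."""
    return {name: value.strip() for name, value in STAT_RE.findall(block)}
```

The command list from `pdftotext_cmd` could be passed to `subprocess.run(..., check=True)`. The extracted text would then be split per creature, with `parse_stats` applied to each block and everything it does not match kept as raw text for manual cleanup.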