feat: add migration tracking system (#81) #96

Open
Claude wants to merge 1 commits from issue/81-add-migration-tracking-system into main
Collaborator

Closes #81

What

Adds a lightweight migration tracking system to prevent double-application and missed migrations across dev/prod environments.

Changes

  • migrations/2026-03-27_add_schema_versions_table.sql — creates schema_versions(filename PK, applied_at TIMESTAMP) table
  • migrations.py — standalone runner: creates schema_versions if absent, sorts all migrations/*.sql files alphabetically, applies any not yet recorded, inserts each filename on success
  • .gitignore — removed migrations.py entry (it was listed as a legacy root-level file to ignore; the new script should be tracked)

Usage

python migrations.py

Reads POSTGRES_PASSWORD (required), POSTGRES_DB, POSTGRES_USER, POSTGRES_HOST, POSTGRES_PORT — same env vars as the API.

First-run behaviour on existing databases

The first run will apply all existing migration files (they weren't previously tracked). This is safe — every existing migration uses CREATE TABLE IF NOT EXISTS, ALTER TABLE … ADD COLUMN IF NOT EXISTS, or CREATE INDEX IF NOT EXISTS.

Other observations

  • .gitignore has duplicate db_engine.py entries (lines 56, 59, 61) — minor cleanup opportunity, not changed here.
Closes #81 ## What Adds a lightweight migration tracking system to prevent double-application and missed migrations across dev/prod environments. ## Changes - **`migrations/2026-03-27_add_schema_versions_table.sql`** — creates `schema_versions(filename PK, applied_at TIMESTAMP)` table - **`migrations.py`** — standalone runner: creates `schema_versions` if absent, sorts all `migrations/*.sql` files alphabetically, applies any not yet recorded, inserts each filename on success - **`.gitignore`** — removed `migrations.py` entry (it was listed as a legacy root-level file to ignore; the new script should be tracked) ## Usage ```bash python migrations.py ``` Reads `POSTGRES_PASSWORD` (required), `POSTGRES_DB`, `POSTGRES_USER`, `POSTGRES_HOST`, `POSTGRES_PORT` — same env vars as the API. ## First-run behaviour on existing databases The first run will apply all existing migration files (they weren't previously tracked). This is safe — every existing migration uses `CREATE TABLE IF NOT EXISTS`, `ALTER TABLE … ADD COLUMN IF NOT EXISTS`, or `CREATE INDEX IF NOT EXISTS`. ## Other observations - `.gitignore` has duplicate `db_engine.py` entries (lines 56, 59, 61) — minor cleanup opportunity, not changed here.
Claude added 1 commit 2026-03-27 10:34:31 +00:00
feat: add migration tracking system (#81)
All checks were successful
Build Docker Image / build (pull_request) Successful in 2m11s
eccf4d1441
Adds schema_versions table and migrations.py runner to prevent
double-application and missed migrations across dev/prod environments.

- migrations/2026-03-27_add_schema_versions_table.sql: creates tracking table
- migrations.py: applies pending .sql files in sorted order, records each in schema_versions
- .gitignore: untrack migrations.py (was incorrectly ignored as legacy root file)

First run on an existing DB will apply all migrations (safe — all use IF NOT EXISTS).

Closes #81

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude added the
ai-reviewing
label 2026-03-27 10:45:51 +00:00
Claude reviewed 2026-03-27 10:47:37 +00:00
Claude left a comment
Author
Collaborator

AI Code Review

Files Reviewed

  • .gitignore (modified)
  • migrations.py (added)
  • migrations/2026-03-27_add_schema_versions_table.sql (added)

Findings

Correctness

  • Transaction model is sound. _CREATE_SCHEMA_VERSIONS runs in its own with conn: block (commits on exit). The SELECT filename FROM schema_versions runs outside a with conn: block, starting an implicit transaction. When the first migration's with conn: block commits, that implicit transaction is committed along with the migration SQL and INSERT — harmless since the SELECT is read-only.
  • No-pending-migrations path: If not pending:, the early return leaves an implicit SELECT transaction open. finally: conn.close() rolls it back silently. Correct behaviour for a read-only transaction.
  • Multi-statement migration files: cur.execute(sql) with no parameters uses psycopg2's PQexec path, which supports multiple semicolon-separated statements. The existing index migrations (PR #95) with multiple CREATE INDEX statements will run correctly.
  • applied snapshot is safe: The set is computed once before the loop. Since migration SQL files never insert into schema_versions themselves, no re-query is needed between iterations.
  • schema_versions created twice on first run: The inline _CREATE_SCHEMA_VERSIONS creates the table, then 2026-03-27_add_schema_versions_table.sql runs the same DDL as a pending migration. CREATE TABLE IF NOT EXISTS makes the second execution a no-op. Intentional and harmless.
  • Alphabetical sort order: Undated files (add_custom_commands_tables.sql, add_help_commands_table.sql) sort before dated files (2026-…) since '2' < 'a' lexicographically — correct, older migrations run first.

Security

  • No SQL injection risk: parameterized INSERT uses %s with a filesystem filename. Migration SQL is operator-controlled (not user input).
  • No hardcoded credentials — POSTGRES_PASSWORD is required, RuntimeError raised at startup if absent. Same pattern as PR #84.
  • int(os.environ.get("POSTGRES_PORT", "5432")) will raise ValueError on a malformed env var — caught by the outer except Exception and exits with code 1. Acceptable for an ops script.

Style & Conventions

  • psycopg2 directly (not Peewee ORM) is the right choice for a standalone runner that bootstraps before the ORM is configured.
  • Module-level docstring, Path for file handling, named constants — all clean.
  • .gitignore removal of migrations.py is correct; the PR body accurately explains the legacy context.

Suggestions

  • Non-blocking: Wrapping the SELECT in its own with conn: block would make the transaction boundaries more explicit and avoid the implicit-transaction subtlety:
    with conn:
        with conn.cursor() as cur:
            cur.execute("SELECT filename FROM schema_versions")
            applied = {row[0] for row in cur.fetchall()}
    
    Functionally equivalent but clearer intent.
  • Non-blocking: The .gitignore duplicate db_engine.py entries (lines 56, 59, 61) noted in the PR body — a follow-up cleanup issue would be worth filing.

Verdict: APPROVED

Clean, well-structured implementation. Transaction model is correct, idempotent on existing databases, fails fast on missing credentials, and stops cleanly on migration failure. The inline table-creation + migration-file approach is a sensible bootstrap pattern.


Automated review by Claude PR Reviewer

## AI Code Review ### Files Reviewed - `.gitignore` (modified) - `migrations.py` (added) - `migrations/2026-03-27_add_schema_versions_table.sql` (added) ### Findings #### Correctness - **Transaction model is sound.** `_CREATE_SCHEMA_VERSIONS` runs in its own `with conn:` block (commits on exit). The `SELECT filename FROM schema_versions` runs outside a `with conn:` block, starting an implicit transaction. When the first migration's `with conn:` block commits, that implicit transaction is committed along with the migration SQL and INSERT — harmless since the SELECT is read-only. - **No-pending-migrations path:** If `not pending:`, the early `return` leaves an implicit SELECT transaction open. `finally: conn.close()` rolls it back silently. Correct behaviour for a read-only transaction. - **Multi-statement migration files:** `cur.execute(sql)` with no parameters uses psycopg2's `PQexec` path, which supports multiple semicolon-separated statements. The existing index migrations (PR #95) with multiple `CREATE INDEX` statements will run correctly. - **`applied` snapshot is safe:** The set is computed once before the loop. Since migration SQL files never insert into `schema_versions` themselves, no re-query is needed between iterations. - **`schema_versions` created twice on first run:** The inline `_CREATE_SCHEMA_VERSIONS` creates the table, then `2026-03-27_add_schema_versions_table.sql` runs the same DDL as a pending migration. `CREATE TABLE IF NOT EXISTS` makes the second execution a no-op. Intentional and harmless. - **Alphabetical sort order:** Undated files (`add_custom_commands_tables.sql`, `add_help_commands_table.sql`) sort before dated files (`2026-…`) since `'2' < 'a'` lexicographically — correct, older migrations run first. #### Security - No SQL injection risk: parameterized INSERT uses `%s` with a filesystem filename. Migration SQL is operator-controlled (not user input). - No hardcoded credentials — `POSTGRES_PASSWORD` is required, `RuntimeError` raised at startup if absent. Same pattern as PR #84. - `int(os.environ.get("POSTGRES_PORT", "5432"))` will raise `ValueError` on a malformed env var — caught by the outer `except Exception` and exits with code 1. Acceptable for an ops script. #### Style & Conventions - `psycopg2` directly (not Peewee ORM) is the right choice for a standalone runner that bootstraps before the ORM is configured. - Module-level docstring, `Path` for file handling, named constants — all clean. - `.gitignore` removal of `migrations.py` is correct; the PR body accurately explains the legacy context. #### Suggestions - **Non-blocking:** Wrapping the `SELECT` in its own `with conn:` block would make the transaction boundaries more explicit and avoid the implicit-transaction subtlety: ```python with conn: with conn.cursor() as cur: cur.execute("SELECT filename FROM schema_versions") applied = {row[0] for row in cur.fetchall()} ``` Functionally equivalent but clearer intent. - **Non-blocking:** The `.gitignore` duplicate `db_engine.py` entries (lines 56, 59, 61) noted in the PR body — a follow-up cleanup issue would be worth filing. ### Verdict: APPROVED Clean, well-structured implementation. Transaction model is correct, idempotent on existing databases, fails fast on missing credentials, and stops cleanly on migration failure. The inline table-creation + migration-file approach is a sensible bootstrap pattern. --- *Automated review by Claude PR Reviewer*
Claude added
ai-reviewed
and removed
ai-reviewing
labels 2026-03-27 10:47:57 +00:00
All checks were successful
Build Docker Image / build (pull_request) Successful in 2m11s
This pull request doesn't have enough approvals yet. 0 of 1 approvals granted.
This branch is out-of-date with the base branch
You are not authorized to merge this pull request.

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin issue/81-add-migration-tracking-system:issue/81-add-migration-tracking-system
git checkout issue/81-add-migration-tracking-system
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: cal/major-domo-database#96
No description provided.