Raxx · internal docs

internal · gated ↑ index

Raptor SQLite → Postgres Migration — Design Doc

Epic: #1556
Status: Design locked 2026-05-10 UTC
Milestone: #6 — raxx.app v1 (due 2026-05-23 UTC)
Target branch: design/raptor-postgres-migration


1. Context

Raptor (backend_v2) has used sqlite3.connect since inception. The audit-v2, RBAC-v2, and Queue designs (ADRs 0058–0068, PRs #1465, #1501, #1510) all assumed a Heroku Postgres backend that did not exist for Raptor. SC-A1 (PR #1501) exposed this: the raptor_app credential role, GRANT/REVOKE append-only enforcement, and RLS policies are all inert against SQLite.

The operator locked Path B on 2026-05-10 UTC: migrate Raptor to Postgres as part of v1, before launch. Staging Postgres (Standard-0, addon postgresql-adjacent-27271, credential raptor_app) is already provisioned. Prod is still on Essential-0 and needs an upgrade.

This document covers the schema port strategy, connection layer, local-dev path, test fixture selection, alembic adoption, and migration sequencing.


2. Invariants

The following non-negotiable constraints govern every decision in this design:


3. Schema Port Strategy

3.1 Type Map

SQLite type Postgres type Notes
INTEGER PRIMARY KEY AUTOINCREMENT BIGSERIAL PRIMARY KEY All id PKs
TEXT (general) TEXT Postgres TEXT is unbounded; no performance difference
TEXT (ISO-8601 datetime) TIMESTAMPTZ See §3.2
TEXT (JSON blob) JSONB See §3.3
REAL (price/amount) DOUBLE PRECISION financial display precision acceptable per ADR-0058
NUMERIC(10,4) NUMERIC(10,4) Postgres supports natively; no change
INTEGER (boolean 0/1) BOOLEAN Alembic migration uses sa.Boolean()
BLOB (public_key in webauthn_credentials) BYTEA Binary safe
INTEGER (general counts/IDs) BIGINT Future-proof; no cost on Postgres

3.2 Datetime Columns

Every column currently stored as TEXT ISO-8601 becomes TIMESTAMPTZ in Postgres. Alembic migration sets server_default=sa.func.now() to replace SQLite CURRENT_TIMESTAMP. Application code that inserts explicit UTC strings must be updated to pass Python datetime objects (SQLAlchemy handles the conversion).

Columns affected across all tables:

3.3 JSONB Columns

Columns that store serialized JSON in SQLite become JSONB in Postgres.

Table Column Rationale
customer_audit_events target_resource, before_state, after_state Diff state, query-able
trace_events context_json, data_mask_json Sanitized context blobs
onboarding_state completed_steps, skipped_steps, step_data Per-step JSON arrays
vendor_billing_snapshots raw_payload_json Vendor response escape hatch
postmark_delivery_events payload_json Full webhook payload
*_history row_data Shadow row snapshots
dsr_log tables_affected JSON array of table names

Note: columns in trace_events currently annotated -- TIMESCALE: use TIMESTAMPTZ are covered by §3.2. The TIMESCALE hypertable promotion (SC-11) is out of scope for this migration.

3.4 Boolean Columns

SQLite stores booleans as INTEGER (0/1). Postgres has native BOOLEAN. The Alembic migration maps these directly. Application-layer code that compares against 1 or 0 must be updated to use True / False after the port.

Affected: tradable (symbols), backup_eligible, backup_state (webauthn_credentials), enabled (vendor_billing_alert_config), success (support_audit_log).

3.5 Partial Indexes

SQLite supports WHERE clauses on indexes. Postgres also supports partial indexes but uses different planner statistics. All partial indexes from the SQLite migrations must be recreated verbatim in the Alembic migration. No behavior changes — the partial index DDL translates directly.

3.6 Table Inventory

Every table created in migrations 001–019, plus schema_migrations:

Migration Tables
001 historical_bars
002 symbols
003 users, webauthn_credentials
004a bootstrap_tokens_used, audit_log
004b customer_sessions, audit_log
005a postmark_inbound_dedup
005b support_customer_map
006 support_audit_log
007 support_pending_submissions
008a onboarding_state
008b postmark_delivery_events
008c trace_events, trace_workflows
009 bootstrap_tokens_pending, bootstrap_tokens_revoked
010 ALTER users (add columns)
011 vendor_billing_snapshots, vendor_billing_fixed
012 vendor_billing_alert_config
013 vendor_billing_alert_log
014 ALTER customer_sessions, login_challenges
015 Postgres role setup (raptor_app) — operator runbook only
016 customer_audit_events, freescout_ticket_cache, *_history × 6, audit_archival_runs, audit_integrity_log, dsr_log
017 backup_codes
018 email_verifications, ALTER users (add email_verified_at_utc)
019 quebec_waitlist

Total: ~35 distinct tables/views plus schema_migrations (retired after alembic adoption).

Note: migrations 004a and 004b both try to create audit_log. The Alembic baseline must deduplicate — only one audit_log table exists at runtime (the 004b version is the one applied since it comes later and uses IF NOT EXISTS).


4. Connection Layer

4.1 Driver: psycopg2-binary

Choice: psycopg2-binary (not psycopg3 / psycopg).

Rationale:

See ADR-0069 for the full driver comparison.

4.2 SQLAlchemy Version

Choice: SQLAlchemy 2.x (2.0.x series).

The connection layer is new — no legacy callsites to break. SQLAlchemy 2.x create_engine() + Connection.execute(text(...)) is the target API. The existing callsite pattern (with db.connect() as conn: conn.execute(...)) maps directly to SQLAlchemy's connection-as-context-manager idiom.

4.3 New database.py Contract

The new module (backend_v2/db/database.py) replaces the current sqlite3-based Database class with a SQLAlchemy-backed engine wrapper.

Public surface (unchanged from callers' perspective):

class Database:
    def connect(self) -> ContextManager[sqlalchemy.engine.Connection]:
        ...
    def apply_migrations(self) -> None:
        # For SQLite local dev: runs .sql files via text() (backward compat)
        # For Postgres: no-op (alembic runs in release dyno)
        ...

The connect() method returns a SQLAlchemy Connection context manager. Callers already use with db.connect() as conn: conn.execute(...) — no callsite signature changes are needed, but the parameter style changes from SQLite ? placeholders to SQLAlchemy text("... :param") named params.

4.4 Local Dev Path

DATABASE_URL drives everything:

The init_persistence() function in backend_v2/db/__init__.py reads app.config["DATABASE_URL"] (resolved from env) and constructs the engine. SQLite pragma wiring (WAL, cache_size, etc.) applies only when the dialect is sqlite. This is detected via engine.dialect.name == "sqlite".

No local-dev developer needs Postgres. The test suite defaults to SQLite unless DATABASE_URL points elsewhere.

4.5 Alembic Adoption

Raptor gets its own alembic env, parallel to Console's:

backend_v2/
  alembic.ini
  alembic/
    env.py
    script.py.mako
    versions/
      0001_raptor_baseline.py   ← full schema from migrations 001-019
      0002_...                  ← future incremental

The alembic/env.py reads DATABASE_URL from the environment. On Heroku, Alembic runs in the release dyno using the owner credential (DATABASE_URL), never RAPTOR_APP_DATABASE_URL (the restricted role).

schema_migrations (the old hand-rolled table) is not migrated to Postgres. It exists in SQLite only. Alembic's alembic_version table is the canonical migration-state record on Postgres.

The baseline migration (0001_raptor_baseline.py) creates all 35 tables in one shot from the Postgres-typed schema. On a fresh Postgres DB, only this migration runs. On the already-provisioned staging DB, Alembic stamps the baseline version (no DDL run because tables already exist — operator applies alembic stamp as part of the cutover SOP).

Migration naming: NNNN_<slug>.py (4-digit zero-padded), independent of the old NNN_<slug>.sql files which remain in db/migrations/ for SQLite local-dev backward compat during the transition window.


5. Test Fixtures

Choice: pytest-postgresql (ephemeral Postgres, no container daemon).

Rationale:

See ADR-0070 for the full comparison.


6. Sequence Diagram — Cutover Flow

sequenceDiagram
    participant Dev as feature-dev
    participant CI as CI (GitHub Actions)
    participant Heroku as Heroku
    participant PG as Postgres (staging)

    Note over Dev,PG: Phase 1 — Alembic + SQLAlchemy lands
    Dev->>CI: PR: alembic init + engine module
    CI->>CI: run tests (SQLite path)
    CI->>Heroku: merge → release dyno
    Heroku->>PG: alembic upgrade head (baseline schema)
    Note over Heroku,PG: tables created on staging Postgres

    Note over Dev,PG: Phase 2 — Callsite ports land (one PR per service)
    Dev->>CI: PR: port auth routes
    CI->>CI: run tests (SQLite + Postgres)
    CI->>Heroku: merge → staging

    Note over Dev,PG: Phase 3 — Staging cutover
    Dev->>Heroku: heroku config:set DATABASE_URL=postgresql://... -a raxx-api-staging
    Heroku->>Heroku: dyno restart (picks up DATABASE_URL)
    Heroku->>PG: alembic upgrade head (no-op if baseline already ran)
    Dev->>Dev: smoke tests pass

    Note over Dev,PG: Phase 3b — Prod cutover
    Dev->>Heroku: upgrade raxx-api-prod to Standard-0
    Heroku->>Heroku: pg:credentials:create raptor_app
    Dev->>Heroku: heroku config:set DATABASE_URL=postgresql://... -a raxx-api-prod >/dev/null 2>&1
    Heroku->>PG: alembic upgrade head (fresh Postgres — all migrations run)
    Dev->>Dev: smoke tests pass

    Note over Dev,PG: Phase 4 — SC-A1 reactivation
    Dev->>Heroku: GRANT/REVOKE on all tables TO raptor_app
    Dev->>Heroku: heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 >/dev/null 2>&1

7. Migrations

Phase 0 (this PR): Design only. No schema changes.

Phase 1: Alembic baseline + engine module

One Alembic migration: 0001_raptor_baseline.py. Creates all tables in Postgres-typed DDL (see §3 type map). Idempotent — running against an empty DB creates everything; running against the staging DB where tables already exist requires alembic stamp head (operator SOP step, documented in migration-plan.md).

Rollback: alembic downgrade base drops all tables. Safe pre-launch because SQLite data is the live path until DATABASE_URL is pointed to Postgres.

Phase 2: Callsite ports

No schema changes. Pure Python — replace conn.execute("SELECT ...", (?, ?)) with conn.execute(text("SELECT ... :param"), {"param": val}). Each PR is independently rollbackable by reverting to SQLite URL.

Phase 3: Cutover

No schema changes. Environment variable flip (DATABASE_URL → Postgres). Rollback: revert DATABASE_URL to SQLite path (pre-launch, no data loss risk).

Phase 4: raptor_app role + flag flip

Post-migration DDL: GRANT/REVOKE per SC-A1 runbook. Rollback: set FLAG_RAPTOR_APP_ROLE_SEPARATION=0. The restricted role still exists but is not used.


8. Rollout Plan

dark    → Phase 1 lands; SQLite still active locally and on staging
         (DATABASE_URL not set to Postgres yet)
flag    → Phase 2 callsite ports land; staging DATABASE_URL flipped to Postgres
         (staging soak: 72 h)
beta    → Phase 3 prod cutover; prod DATABASE_URL set
ga      → Phase 4 raptor_app role + audit-v2 / RBAC-v2 / Queue unblock

No feature flag gates the migration itself — the switch is the DATABASE_URL value. The callsite PRs are safe to merge before the URL is flipped because the new SQLAlchemy code handles both dialects.


9. Security Considerations

9.1 PII

The migration carries forward all existing PII (email addresses, ip_prefix fields, session hashes). No new PII columns are added. Retention windows are unchanged. GDPR deletion paths (ON DELETE CASCADE chains) remain intact — FK constraints now enforced at the Postgres layer rather than application layer.

9.2 Credentials

9.3 Audit Trail

alembic_version replaces schema_migrations as the migration-state record on Postgres. Both are revoked from raptor_app — the restricted runtime role cannot modify migration state.

Every state change to money-affecting or permission-affecting tables remains audited via customer_audit_events (SC-A3, post-Phase 4).

9.4 Kill Switch

If the Postgres cutover causes regressions: revert DATABASE_URL to sqlite+pysqlite:///./raptor.db. Pre-launch, no customer data is lost (SQLite and Postgres are functionally equivalent at zero rows). The Heroku config rollback takes effect on the next dyno restart (~30 s).

9.5 Breach Notification

No new breach surface introduced. Postgres-at-rest encryption is included in Standard-0 (AES-256 EBS). Postgres TLS in transit is enforced by Heroku's sslmode=require default in DATABASE_URL.

9.6 Secrets Rotatable Without Redeploy

Postgres password rotation: heroku pg:credentials:rotate DATABASE -a raxx-api-* rotates the owner credential; heroku pg:credentials:rotate DATABASE --name raptor_app rotates the restricted credential. Both update the Heroku config var automatically. App picks up on next dyno restart.


10. Open Questions

  1. audit_log table dedup: Migrations 004a and 004b both CREATE TABLE IF NOT EXISTS audit_log with slightly different schemas (004a adds ip_prefix; 004b does not). At runtime only one version exists (whichever ran second). The Alembic baseline must reconcile these into a single canonical schema. Feature-dev must inspect the live SQLite schema before writing the baseline migration. Recommend: use the union of all columns (both schemas merged) marked nullable where added later.

  2. webauthn_service.py uses conn.cursor() directly: The service uses cursor = db_connection.cursor() and cursor.execute(). SQLAlchemy connections do not have a .cursor() method in the 2.x API. The callsite port (RM-4, auth routes) must rewrite these to conn.execute(text(...)). This is the highest-touch change in Phase 2 — flag for careful review.

  3. Timescale hypertable promotion (SC-11): trace_events and trace_workflows have -- TIMESCALE: annotations throughout. SC-11 targets these tables. This migration does NOT enable Timescale. SC-11 should be sequenced after the Raptor Postgres migration lands (SC-11 cannot run on SQLite at all). Confirm SC-11 is sequenced post-Phase 3.