Raptor SQLite → Postgres Migration — Design Doc
Epic: #1556
Status: Design locked 2026-05-10 UTC
Milestone: #6 — raxx.app v1 (due 2026-05-23 UTC)
Target branch: design/raptor-postgres-migration
1. Context
Raptor (backend_v2) has used sqlite3.connect since inception. The audit-v2,
RBAC-v2, and Queue designs (ADRs 0058–0068, PRs #1465, #1501, #1510) all assumed
a Heroku Postgres backend that did not exist for Raptor. SC-A1 (PR #1501) exposed
this: the raptor_app credential role, GRANT/REVOKE append-only enforcement, and
RLS policies are all inert against SQLite.
The operator locked Path B on 2026-05-10 UTC: migrate Raptor to Postgres as part
of v1, before launch. Staging Postgres (Standard-0, addon
postgresql-adjacent-27271, credential raptor_app) is already provisioned.
Prod is still on Essential-0 and needs an upgrade.
This document covers the schema port strategy, connection layer, local-dev path, test fixture selection, alembic adoption, and migration sequencing.
2. Invariants
The following non-negotiable constraints govern every decision in this design:
- No stored credentials. Postgres connection strings live in Heroku config vars or SSM — never in source files.
- No stored passwords even in the migration itself. Role creation uses
heroku pg:credentials:create(Heroku's RDS-aware path). NeverCREATE ROLE … WITH LOGIN PASSWORD '…'via psql — RDS rejects it. - Audit trail for every state change. The migration itself is audited via
schema_migrations(SQLite) transitioning toalembic_version(Postgres). Both tables are revoked fromraptor_app. - Paper-first gating is unaffected. The migration changes the wire; it does not touch paper/live decision logic.
- GDPR by default. PII columns (email, session hashes, ip prefixes) are carried forward with the same retention semantics. No new PII columns are introduced by the migration.
- Standard-0 tier required. Essential-tier Postgres does not support
pg:credentials:create. The SC-A1 raptor_app role-separation design is inoperable on Essential-tier. Prod must be upgraded before cutover. - Test suites must continue to pass against SQLite in CI during the transition, and against Postgres after Phase 2 lands. The abstraction must not break the existing test harness.
3. Schema Port Strategy
3.1 Type Map
| SQLite type | Postgres type | Notes |
|---|---|---|
INTEGER PRIMARY KEY AUTOINCREMENT |
BIGSERIAL PRIMARY KEY |
All id PKs |
TEXT (general) |
TEXT |
Postgres TEXT is unbounded; no performance difference |
TEXT (ISO-8601 datetime) |
TIMESTAMPTZ |
See §3.2 |
TEXT (JSON blob) |
JSONB |
See §3.3 |
REAL (price/amount) |
DOUBLE PRECISION |
financial display precision acceptable per ADR-0058 |
NUMERIC(10,4) |
NUMERIC(10,4) |
Postgres supports natively; no change |
INTEGER (boolean 0/1) |
BOOLEAN |
Alembic migration uses sa.Boolean() |
BLOB (public_key in webauthn_credentials) |
BYTEA |
Binary safe |
INTEGER (general counts/IDs) |
BIGINT |
Future-proof; no cost on Postgres |
3.2 Datetime Columns
Every column currently stored as TEXT ISO-8601 becomes TIMESTAMPTZ in
Postgres. Alembic migration sets server_default=sa.func.now() to replace
SQLite CURRENT_TIMESTAMP. Application code that inserts explicit UTC strings
must be updated to pass Python datetime objects (SQLAlchemy handles the
conversion).
Columns affected across all tables:
created_at,updated_at,applied_at— present in nearly every tableat_utc(customer_audit_events)ts_emitted,ts_received(trace_events)ts_started,ts_ended(trace_workflows)fetched_at,period_start,period_end— vendor_billing_snapshots (period_start/end stayDATEnot TIMESTAMPTZ; they are date-only values)sent_at(vendor_billing_alert_log)used_at,invalidated_at,generated_at(backup_codes)expires_at,consumed_at,last_used_at(login_challenges, webauthn_credentials)revoked_at,last_seen_at(customer_sessions)minted_at,revoked_at(bootstrap_tokens_*)
3.3 JSONB Columns
Columns that store serialized JSON in SQLite become JSONB in Postgres.
| Table | Column | Rationale |
|---|---|---|
customer_audit_events |
target_resource, before_state, after_state |
Diff state, query-able |
trace_events |
context_json, data_mask_json |
Sanitized context blobs |
onboarding_state |
completed_steps, skipped_steps, step_data |
Per-step JSON arrays |
vendor_billing_snapshots |
raw_payload_json |
Vendor response escape hatch |
postmark_delivery_events |
payload_json |
Full webhook payload |
*_history |
row_data |
Shadow row snapshots |
dsr_log |
tables_affected |
JSON array of table names |
Note: columns in trace_events currently annotated -- TIMESCALE: use TIMESTAMPTZ
are covered by §3.2. The TIMESCALE hypertable promotion (SC-11) is out of scope
for this migration.
3.4 Boolean Columns
SQLite stores booleans as INTEGER (0/1). Postgres has native BOOLEAN. The
Alembic migration maps these directly. Application-layer code that compares
against 1 or 0 must be updated to use True / False after the port.
Affected: tradable (symbols), backup_eligible, backup_state
(webauthn_credentials), enabled (vendor_billing_alert_config), success
(support_audit_log).
3.5 Partial Indexes
SQLite supports WHERE clauses on indexes. Postgres also supports partial
indexes but uses different planner statistics. All partial indexes from the
SQLite migrations must be recreated verbatim in the Alembic migration. No
behavior changes — the partial index DDL translates directly.
3.6 Table Inventory
Every table created in migrations 001–019, plus schema_migrations:
| Migration | Tables |
|---|---|
| 001 | historical_bars |
| 002 | symbols |
| 003 | users, webauthn_credentials |
| 004a | bootstrap_tokens_used, audit_log |
| 004b | customer_sessions, audit_log |
| 005a | postmark_inbound_dedup |
| 005b | support_customer_map |
| 006 | support_audit_log |
| 007 | support_pending_submissions |
| 008a | onboarding_state |
| 008b | postmark_delivery_events |
| 008c | trace_events, trace_workflows |
| 009 | bootstrap_tokens_pending, bootstrap_tokens_revoked |
| 010 | ALTER users (add columns) |
| 011 | vendor_billing_snapshots, vendor_billing_fixed |
| 012 | vendor_billing_alert_config |
| 013 | vendor_billing_alert_log |
| 014 | ALTER customer_sessions, login_challenges |
| 015 | Postgres role setup (raptor_app) — operator runbook only |
| 016 | customer_audit_events, freescout_ticket_cache, *_history × 6, audit_archival_runs, audit_integrity_log, dsr_log |
| 017 | backup_codes |
| 018 | email_verifications, ALTER users (add email_verified_at_utc) |
| 019 | quebec_waitlist |
Total: ~35 distinct tables/views plus schema_migrations (retired after alembic
adoption).
Note: migrations 004a and 004b both try to create audit_log. The Alembic
baseline must deduplicate — only one audit_log table exists at runtime (the
004b version is the one applied since it comes later and uses IF NOT EXISTS).
4. Connection Layer
4.1 Driver: psycopg2-binary
Choice: psycopg2-binary (not psycopg3 / psycopg).
Rationale:
- SQLAlchemy's primary Postgres dialect is
psycopg2. The existing Console (raxx-console-*) already shipspsycopg2-binary— consistent dep footprint. psycopg2-binarybundles libpq, so no system-level libpq install is required on Heroku dynos or CI.- psycopg3 is stable but SQLAlchemy 2.x asyncio-first patterns are not needed
here (Raptor is a WSGI Flask app). Migrating to psycopg3 later is a one-line
URL swap (
postgresql+psycopg://). psycopg2-binaryis MIT-compatible. No licensing conflict with the TradeMasterAPI MIT license.
See ADR-0069 for the full driver comparison.
4.2 SQLAlchemy Version
Choice: SQLAlchemy 2.x (2.0.x series).
The connection layer is new — no legacy callsites to break. SQLAlchemy 2.x
create_engine() + Connection.execute(text(...)) is the target API. The
existing callsite pattern (with db.connect() as conn: conn.execute(...)) maps
directly to SQLAlchemy's connection-as-context-manager idiom.
4.3 New database.py Contract
The new module (backend_v2/db/database.py) replaces the current
sqlite3-based Database class with a SQLAlchemy-backed engine wrapper.
Public surface (unchanged from callers' perspective):
class Database:
def connect(self) -> ContextManager[sqlalchemy.engine.Connection]:
...
def apply_migrations(self) -> None:
# For SQLite local dev: runs .sql files via text() (backward compat)
# For Postgres: no-op (alembic runs in release dyno)
...
The connect() method returns a SQLAlchemy Connection context manager.
Callers already use with db.connect() as conn: conn.execute(...) — no
callsite signature changes are needed, but the parameter style changes from
SQLite ? placeholders to SQLAlchemy text("... :param") named params.
4.4 Local Dev Path
DATABASE_URL drives everything:
- Not set (default local dev):
DATABASE_URLis absent → engine usessqlite+pysqlite:///./dev.db(SQLite file in project root). No Postgres required for local dev. - Set to Postgres URL (CI, staging, prod): SQLAlchemy routes to psycopg2.
- Set to SQLite URL explicitly:
sqlite+pysqlite:///./test.db— used by test fixtures.
The init_persistence() function in backend_v2/db/__init__.py reads
app.config["DATABASE_URL"] (resolved from env) and constructs the engine.
SQLite pragma wiring (WAL, cache_size, etc.) applies only when the dialect is
sqlite. This is detected via engine.dialect.name == "sqlite".
No local-dev developer needs Postgres. The test suite defaults to SQLite
unless DATABASE_URL points elsewhere.
4.5 Alembic Adoption
Raptor gets its own alembic env, parallel to Console's:
backend_v2/
alembic.ini
alembic/
env.py
script.py.mako
versions/
0001_raptor_baseline.py ← full schema from migrations 001-019
0002_... ← future incremental
The alembic/env.py reads DATABASE_URL from the environment. On Heroku,
Alembic runs in the release dyno using the owner credential (DATABASE_URL),
never RAPTOR_APP_DATABASE_URL (the restricted role).
schema_migrations (the old hand-rolled table) is not migrated to Postgres.
It exists in SQLite only. Alembic's alembic_version table is the canonical
migration-state record on Postgres.
The baseline migration (0001_raptor_baseline.py) creates all 35 tables in one
shot from the Postgres-typed schema. On a fresh Postgres DB, only this migration
runs. On the already-provisioned staging DB, Alembic stamps the baseline version
(no DDL run because tables already exist — operator applies alembic stamp as
part of the cutover SOP).
Migration naming: NNNN_<slug>.py (4-digit zero-padded), independent of the
old NNN_<slug>.sql files which remain in db/migrations/ for SQLite
local-dev backward compat during the transition window.
5. Test Fixtures
Choice: pytest-postgresql (ephemeral Postgres, no container daemon).
Rationale:
pytest-postgresqlspins up a temporary PostgreSQL process per test session using the locally installedpg_ctl. No Docker daemon required in CI.- GitHub Actions
ubuntu-latestimages ship with PostgreSQL;pg_ctlis available aftersudo apt-get install postgresql(one CI step). testcontainersrequires Docker-in-Docker which adds complexity and pull-rate limits in CI. Heroku PR-app approach requires a Heroku add-on per PR (cost + provisioning latency — unacceptable for a 13-day sprint).- The existing SQLite-based tests continue to work:
conftest.pydetectsDATABASE_URL— if absent, uses SQLite fixtures; if pointing to Postgres, usespytest-postgresqlfixtures. Same test bodies run in both modes. - Card #1552 (Postgres-aware migration CI) is the sub-card that wires this up.
See ADR-0070 for the full comparison.
6. Sequence Diagram — Cutover Flow
sequenceDiagram
participant Dev as feature-dev
participant CI as CI (GitHub Actions)
participant Heroku as Heroku
participant PG as Postgres (staging)
Note over Dev,PG: Phase 1 — Alembic + SQLAlchemy lands
Dev->>CI: PR: alembic init + engine module
CI->>CI: run tests (SQLite path)
CI->>Heroku: merge → release dyno
Heroku->>PG: alembic upgrade head (baseline schema)
Note over Heroku,PG: tables created on staging Postgres
Note over Dev,PG: Phase 2 — Callsite ports land (one PR per service)
Dev->>CI: PR: port auth routes
CI->>CI: run tests (SQLite + Postgres)
CI->>Heroku: merge → staging
Note over Dev,PG: Phase 3 — Staging cutover
Dev->>Heroku: heroku config:set DATABASE_URL=postgresql://... -a raxx-api-staging
Heroku->>Heroku: dyno restart (picks up DATABASE_URL)
Heroku->>PG: alembic upgrade head (no-op if baseline already ran)
Dev->>Dev: smoke tests pass
Note over Dev,PG: Phase 3b — Prod cutover
Dev->>Heroku: upgrade raxx-api-prod to Standard-0
Heroku->>Heroku: pg:credentials:create raptor_app
Dev->>Heroku: heroku config:set DATABASE_URL=postgresql://... -a raxx-api-prod >/dev/null 2>&1
Heroku->>PG: alembic upgrade head (fresh Postgres — all migrations run)
Dev->>Dev: smoke tests pass
Note over Dev,PG: Phase 4 — SC-A1 reactivation
Dev->>Heroku: GRANT/REVOKE on all tables TO raptor_app
Dev->>Heroku: heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 >/dev/null 2>&1
7. Migrations
Phase 0 (this PR): Design only. No schema changes.
Phase 1: Alembic baseline + engine module
One Alembic migration: 0001_raptor_baseline.py. Creates all tables in
Postgres-typed DDL (see §3 type map). Idempotent — running against an empty DB
creates everything; running against the staging DB where tables already exist
requires alembic stamp head (operator SOP step, documented in
migration-plan.md).
Rollback: alembic downgrade base drops all tables. Safe pre-launch because
SQLite data is the live path until DATABASE_URL is pointed to Postgres.
Phase 2: Callsite ports
No schema changes. Pure Python — replace conn.execute("SELECT ...", (?, ?)) with
conn.execute(text("SELECT ... :param"), {"param": val}). Each PR is
independently rollbackable by reverting to SQLite URL.
Phase 3: Cutover
No schema changes. Environment variable flip (DATABASE_URL → Postgres). Rollback: revert DATABASE_URL to SQLite path (pre-launch, no data loss risk).
Phase 4: raptor_app role + flag flip
Post-migration DDL: GRANT/REVOKE per SC-A1 runbook. Rollback: set
FLAG_RAPTOR_APP_ROLE_SEPARATION=0. The restricted role still exists but is not
used.
8. Rollout Plan
dark → Phase 1 lands; SQLite still active locally and on staging
(DATABASE_URL not set to Postgres yet)
flag → Phase 2 callsite ports land; staging DATABASE_URL flipped to Postgres
(staging soak: 72 h)
beta → Phase 3 prod cutover; prod DATABASE_URL set
ga → Phase 4 raptor_app role + audit-v2 / RBAC-v2 / Queue unblock
No feature flag gates the migration itself — the switch is the DATABASE_URL
value. The callsite PRs are safe to merge before the URL is flipped because the
new SQLAlchemy code handles both dialects.
9. Security Considerations
9.1 PII
The migration carries forward all existing PII (email addresses, ip_prefix fields, session hashes). No new PII columns are added. Retention windows are unchanged. GDPR deletion paths (ON DELETE CASCADE chains) remain intact — FK constraints now enforced at the Postgres layer rather than application layer.
9.2 Credentials
DATABASE_URL(owner credential) lives in Heroku config only. Never in files.RAPTOR_APP_DATABASE_URL(restricted credential) lives in Heroku config only. Created viaheroku pg:credentials:create(notCREATE ROLE … WITH PASSWORD).- Neither URL appears in source, CI logs, or alembic migration files.
9.3 Audit Trail
alembic_version replaces schema_migrations as the migration-state record on
Postgres. Both are revoked from raptor_app — the restricted runtime role cannot
modify migration state.
Every state change to money-affecting or permission-affecting tables remains
audited via customer_audit_events (SC-A3, post-Phase 4).
9.4 Kill Switch
If the Postgres cutover causes regressions: revert DATABASE_URL to
sqlite+pysqlite:///./raptor.db. Pre-launch, no customer data is lost (SQLite
and Postgres are functionally equivalent at zero rows). The Heroku config
rollback takes effect on the next dyno restart (~30 s).
9.5 Breach Notification
No new breach surface introduced. Postgres-at-rest encryption is included in
Standard-0 (AES-256 EBS). Postgres TLS in transit is enforced by Heroku's
sslmode=require default in DATABASE_URL.
9.6 Secrets Rotatable Without Redeploy
Postgres password rotation: heroku pg:credentials:rotate DATABASE -a raxx-api-*
rotates the owner credential; heroku pg:credentials:rotate DATABASE
--name raptor_app rotates the restricted credential. Both update the Heroku
config var automatically. App picks up on next dyno restart.
10. Open Questions
-
audit_logtable dedup: Migrations 004a and 004b bothCREATE TABLE IF NOT EXISTS audit_logwith slightly different schemas (004a addsip_prefix; 004b does not). At runtime only one version exists (whichever ran second). The Alembic baseline must reconcile these into a single canonical schema. Feature-dev must inspect the live SQLite schema before writing the baseline migration. Recommend: use the union of all columns (both schemas merged) marked nullable where added later. -
webauthn_service.pyusesconn.cursor()directly: The service usescursor = db_connection.cursor()andcursor.execute(). SQLAlchemy connections do not have a.cursor()method in the 2.x API. The callsite port (RM-4, auth routes) must rewrite these toconn.execute(text(...)). This is the highest-touch change in Phase 2 — flag for careful review. -
Timescale hypertable promotion (SC-11):
trace_eventsandtrace_workflowshave-- TIMESCALE:annotations throughout. SC-11 targets these tables. This migration does NOT enable Timescale. SC-11 should be sequenced after the Raptor Postgres migration lands (SC-11 cannot run on SQLite at all). Confirm SC-11 is sequenced post-Phase 3.