Raptor Postgres Roles Runbook
System: Raptor (raxx-api-staging, raxx-api-prod)
Owner: operator / sre-agent
Card: #1481 (SC-A Phase 0), #1455 (append-only invariant enforcement, pre-launch-blocker)
Last incident: 2026-05-15 UTC (prod Essential-0 tier blocks pg:credentials:create — see #1455 + escalation issue)
Last reviewed: 2026-05-15 UTC
Why a separate role
Raptor's DATABASE_URL credential is the Postgres owner. Running application
queries under the owner credential means a bug (or a compromised process) can
silently modify any table, alter schema, or wipe migration state — there is no
database-level enforcement boundary.
The raptor_app role solves this:
- Least-privilege DML only — SELECT / INSERT / UPDATE / DELETE on application tables; no DDL, no ownership transfer.
- Audit attribution — when later phases REVOKE UPDATE/DELETE from
customer_audit_events, that REVOKE is meaningful because the application already runs underraptor_app, not the owner. - Migration state isolation —
alembic_versionandschema_migrationsare explicitly REVOKEd, so a compromisedraptor_appsession cannot stamp a migration as applied.
This role is a prerequisite for every SC-A card from Phase 1 onward. Without it, REVOKE-based append-only enforcement is a no-op.
Prerequisites
- Heroku Postgres must be Standard-0 or higher.
Essential-tier databases do NOT support custom credentials:
heroku pg:credentials:createreturnsError: You can't create a custom credential on Essential-tier databases.Verify before proceeding:
bash
heroku pg:info -a raxx-api-staging
# Expect: Plan: Standard 0
psqlclient must be available locally for the GRANT/REVOKE step. Heroku CLI installs it when you runheroku plugins:install heroku-pg-extrasorbrew install postgresql. Verify:psql --version.
How to tell it's broken
heroku logs -a raxx-api-staging | grep "role.*does not exist"—raptor_approle was dropped or never created.heroku logs -a raxx-api-staging | grep "permission denied for table"— GRANT drift; role exists but is missing DML on one or more tables.heroku config:get RAPTOR_APP_DATABASE_URL -a raxx-api-stagingreturns empty — env var was never set or was removed; Raptor is falling back toDATABASE_URL(owner credential).heroku config:get FLAG_RAPTOR_APP_ROLE_SEPARATION -a raxx-api-stagingreturns1but Raptor logs showDATABASE_URLconnections — code-path bug, not a role issue; escalate to feature-developer.
How to diagnose (in order)
- Verify the role exists in Postgres:
bash
heroku pg:psql -a raxx-api-staging -c "SELECT rolname, rolcanlogin FROM pg_catalog.pg_roles WHERE rolname = 'raptor_app';"
# Expected: raptor_app | t
# If empty: role was never created — run the provisioning procedure below.
- Verify grants are intact:
bash
heroku pg:psql -a raxx-api-staging -c "\dp historical_bars" | grep raptor_app
# Expect: raptor_app=arwd/... (a=INSERT, r=SELECT, w=UPDATE, d=DELETE)
# If absent: run the "Re-apply grants if drift occurs" procedure below.
- Verify the env var is set and non-empty:
bash
heroku config:get RAPTOR_APP_DATABASE_URL -a raxx-api-staging | cut -c1-30
# Expect: postgres://raptor_app:...
# If empty: run Step 3 of the provisioning procedure below.
- Verify the feature flag:
bash
heroku config:get FLAG_RAPTOR_APP_ROLE_SEPARATION -a raxx-api-staging
# Expected (once soak is complete): 1
# Expected (during soak period): 0
Known failure modes
Failure mode A: rds_password error on CREATE ROLE
Symptom:
ERROR: must be a member of rds_password to alter passwords
Cause: Heroku Postgres runs on AWS RDS. The DATABASE_URL owner is NOT in
the rds_password group. Direct CREATE ROLE ... WITH LOGIN PASSWORD '...'
fails because only rds_password members can set passwords.
Fix: Do NOT use raw SQL to create the role. Use
heroku pg:credentials:create which goes through Heroku's RDS-aware path.
See provisioning procedure below.
Verification: heroku pg:credentials:url DATABASE --name raptor_app -a <app>
returns a valid postgres:// URL without error.
Failure mode B: Essential-tier block
Symptom:
Error: You can't create a custom credential on Essential-tier databases.
Cause: Essential-0/1/2 plans do not support the pg:credentials:create
command at all. Role separation is impossible on Essential-tier.
Fix: Upgrade the Heroku Postgres addon to Standard-0 before attempting any role-separation work. Standard-0 is the minimum tier that supports this.
heroku addons:upgrade postgresql-adjacent-27271 heroku-postgresql:standard-0 -a raxx-api-staging
Verification: heroku pg:info -a raxx-api-staging shows Plan: Standard 0.
Failure mode C: Grant drift (new table missing grants)
Symptom: permission denied for table <new_table> in Raptor logs after a
migration that added a new table.
Cause: GRANT ... ON ALL TABLES applies only to tables that exist at the
time the statement runs. Tables created in later migrations do not inherit
permissions automatically; the migration must include an explicit GRANT.
Fix: Run the GRANT block in the re-apply procedure below (it uses
ON ALL TABLES IN SCHEMA public which re-covers newly added tables).
Then: update the migration file for the new table to include:
GRANT SELECT, INSERT, UPDATE, DELETE ON <new_table> TO raptor_app;
-- If the table has a sequence:
GRANT USAGE ON SEQUENCE <new_table_id_seq> TO raptor_app;
Verification: \dp <new_table> in heroku pg:psql shows raptor_app=arwd.
2026-06-01 catch-up (PR #2172): Migrations 0011-0020 created 6 tables
without explicit raptor_app grants. Migration 0026_raptor_app_grant_catchup.py
backfills the missing grants. After running alembic upgrade head on staging
and prod, run \dp order_strategy_links (and the other 5 tables) to confirm.
The regression test backend_v2/tests/integration/test_raptor_app_grant_catchup_2172.py
can be run against a live Postgres to verify the full matrix.
The lint gate scripts/ci/lint_migration_grants.sh was expanded (#2172) to
enforce that every op.create_table() call in a future Alembic migration
includes a raptor_app GRANT. New migrations that create tables without grants
will now fail CI before reaching staging.
Failure mode D: RAPTOR_APP_DATABASE_URL unset after dyno restart
Symptom: Raptor logs show connections from the DATABASE_URL owner, not
raptor_app. heroku config:get RAPTOR_APP_DATABASE_URL is empty.
Cause: Heroku config vars persist across restarts. If this is empty, the
var was never set or was explicitly removed (e.g., an operator ran
heroku config:unset).
Fix: Re-provision via heroku pg:credentials:url and re-set the config var.
See Step 3 of the provisioning procedure below.
Verification: heroku config:get RAPTOR_APP_DATABASE_URL -a <app> returns a
non-empty postgres://raptor_app:... URL.
Provisioning procedure (staging-first)
Run on staging. Repeat for prod only after 72 h staging soak with no errors.
Step 1 — Verify Postgres tier
heroku pg:info -a raxx-api-staging
# Expect: Plan: Standard 0
# If Essential-tier: STOP — upgrade first (see Failure mode B above)
Step 2 — Create the role via Heroku CLI
heroku pg:credentials:create DATABASE --name raptor_app -a raxx-api-staging
DATABASE is the literal addon name (the positional argument to the CLI).
--name raptor_app is the credential/role name. Do not swap them — the CLI
will return Unknown database: raptor_app if the role name goes in the
positional slot.
pg:credentials:create creates the Postgres role, generates a password via
RDS, and grants CONNECT + SELECT on all tables automatically.
Step 3 — Retrieve and store the connection URL
heroku pg:credentials:url DATABASE --name raptor_app -a raxx-api-staging
# Copy the "Connection URL:" value (postgres://raptor_app:...) into the next step.
Set in Heroku config (stdout silenced — URL contains the password):
heroku config:set RAPTOR_APP_DATABASE_URL="<URL from above>" -a raxx-api-staging >/dev/null 2>&1
Verify the var is set without revealing the value:
heroku config:get RAPTOR_APP_DATABASE_URL -a raxx-api-staging | cut -c1-30
# Expect: postgres://raptor_app:...
Step 4 — Apply the audit role grant matrix (migration 031)
This step applies the full privilege matrix from
backend_v2/db/migrations/031_audit_role_split.sql. The matrix has two tiers:
- Normal application tables — SELECT, INSERT, UPDATE, DELETE
- Append-only audit / history tables — INSERT + SELECT only. UPDATE and DELETE are explicitly REVOKEd to enforce the KMS audit-chain invariant.
Open a psql session as the DATABASE owner:
heroku pg:psql -a raxx-api-staging
Run each block in order (all blocks are idempotent):
-- Block A: reset to clean slate
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM raptor_app;
REVOKE ALL ON ALL SEQUENCES IN SCHEMA public FROM raptor_app;
REVOKE ALL ON TABLE alembic_version FROM raptor_app;
-- Block B: full DML on non-audit application tables
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE
audit_log, backup_codes,
bootstrap_tokens_pending, bootstrap_tokens_revoked, bootstrap_tokens_used,
customer_sessions, email_verifications, freescout_ticket_cache,
historical_bars, login_challenges, onboarding_state, paper_orders,
postmark_delivery_events, postmark_inbound_dedup, quebec_waitlist,
support_customer_map, support_pending_submissions, symbols,
trace_events, trace_workflows, users,
vendor_billing_alert_config, vendor_billing_alert_log,
vendor_billing_fixed, vendor_billing_snapshots, webauthn_credentials
TO raptor_app;
-- Block C: INSERT + SELECT only on audit/history tables (append-only invariant)
GRANT INSERT, SELECT ON TABLE
audit_archival_runs, audit_integrity_log, customer_audit_events,
dsr_log, passkeys_history, positions_history, sessions_history,
subscriptions_history, support_audit_log, trades_history, users_history
TO raptor_app;
-- Safety guard: explicit REVOKE of UPDATE/DELETE on audit tables
REVOKE UPDATE, DELETE ON TABLE
audit_archival_runs, audit_integrity_log, customer_audit_events,
dsr_log, passkeys_history, positions_history, sessions_history,
subscriptions_history, support_audit_log, trades_history, users_history
FROM raptor_app;
-- Block D: sequence access for all INSERT-capable tables
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO raptor_app;
After applying, verify the invariant holds:
SELECT
t.relname AS table_name,
has_table_privilege('raptor_app', t.relname, 'INSERT') AS ins,
has_table_privilege('raptor_app', t.relname, 'SELECT') AS sel,
has_table_privilege('raptor_app', t.relname, 'UPDATE') AS upd,
has_table_privilege('raptor_app', t.relname, 'DELETE') AS del
FROM pg_class t
JOIN pg_namespace n ON n.oid = t.relnamespace
WHERE n.nspname = 'public' AND t.relkind = 'r'
ORDER BY t.relname;
-- For audit/history tables: ins=t sel=t upd=f del=f
-- For application tables: ins=t sel=t upd=t del=t
-- For alembic_version: ins=f sel=f upd=f del=f
Step 5 — Verify the role
Still in psql:
SELECT rolname, rolcanlogin FROM pg_catalog.pg_roles WHERE rolname = 'raptor_app';
-- Expected: raptor_app | t
-- Spot-check DML on an application table
\dp historical_bars
-- Expected: raptor_app=arwd/... (a=INSERT r=SELECT w=UPDATE d=DELETE)
-- Confirm migration tables are locked out
\dp alembic_version
-- Expected: no raptor_app entry (or raptor_app= with no permissions)
Step 6 — Enable the flag on staging
heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 -a raxx-api-staging >/dev/null 2>&1
Restart and tail logs:
heroku restart -a raxx-api-staging
heroku logs -a raxx-api-staging --tail
# Watch for: no "permission denied" errors; normal request processing resumes
Run the smoke suite:
scripts/ci/run_smoke.sh --env=staging
Step 7 — Soak (72 h before prod)
Leave staging running for 72 hours. Check at 24 / 48 / 72 h:
- [ ] Sentry (project: raxx-backend) shows zero new
OperationalErrororpermission deniedevents from staging - [ ]
heroku logs -a raxx-api-staging --since 24h | grep -i "permission denied"returns nothing - [ ] Manual smoke at each mark:
curl -fsS https://api-staging.raxx.app/healthz
Step 8 — Prod cutover (operator action, RM-10 gate)
Repeat Steps 1–6 for raxx-api-prod. Prod cutover is tracked as RM-10 (#1568).
Do NOT perform prod cutover without operator sign-off.
heroku pg:credentials:create DATABASE --name raptor_app -a raxx-api-prod
heroku pg:credentials:url DATABASE --name raptor_app -a raxx-api-prod
heroku config:set RAPTOR_APP_DATABASE_URL="<URL>" -a raxx-api-prod >/dev/null 2>&1
heroku pg:psql -a raxx-api-prod
# (run the same GRANT/REVOKE block from Step 4)
heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 -a raxx-api-prod >/dev/null 2>&1
heroku restart -a raxx-api-prod
scripts/ci/run_smoke.sh --env=prod
Re-apply grants if drift occurs
If grants are missing (e.g. after a new migration added tables without explicit GRANTs), re-run the GRANT block from Step 4 against the affected environment:
heroku pg:psql -a raxx-api-staging
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO raptor_app;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO raptor_app;
REVOKE ALL ON TABLE alembic_version FROM raptor_app;
REVOKE ALL ON TABLE schema_migrations FROM raptor_app;
This is idempotent. Running it when grants are already correct is safe.
Adding a new table after Phase 0
Every migration that creates a new table must include explicit grants:
CREATE TABLE IF NOT EXISTS my_new_table (
id BIGSERIAL PRIMARY KEY,
...
);
-- Required: explicit grant for raptor_app on new tables
GRANT SELECT, INSERT, UPDATE, DELETE ON my_new_table TO raptor_app;
GRANT USAGE ON SEQUENCE my_new_table_id_seq TO raptor_app;
Omitting this grant causes permission denied for table my_new_table on the
first application request that touches the new table when
FLAG_RAPTOR_APP_ROLE_SEPARATION=1.
Environment variable reference
| Var | App | Description |
|---|---|---|
DATABASE_URL |
raxx-api-* |
Owner credential (Heroku-managed). Used for migrations (alembic upgrade) and emergency fallback. Never used by Raptor application code when flag is 1. |
RAPTOR_APP_DATABASE_URL |
raxx-api-* |
Restricted application credential. Set by operator from heroku pg:credentials:url output. Raptor reads this at startup when FLAG_RAPTOR_APP_ROLE_SEPARATION=1. |
FLAG_RAPTOR_APP_ROLE_SEPARATION |
raxx-api-* |
Feature flag. Set to 1 after role is provisioned and verified. Set to 0 to fall back to DATABASE_URL without a code deploy. |
Rollback
The flag is the fast rollback mechanism. No code deploy required.
heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=0 -a raxx-api-staging >/dev/null 2>&1
heroku restart -a raxx-api-staging
# Verify Raptor is back up and using DATABASE_URL
heroku logs -a raxx-api-staging -n 20
The raptor_app role and its grants do not need to be removed for rollback.
Leave them in place so re-enable is a single config:set.
Emergency stop
Take Raptor off the restricted role instantly (no data impact, no migrations):
heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=0 -a raxx-api-prod >/dev/null 2>&1
heroku restart -a raxx-api-prod
Escalation
Escalate to the operator when:
- Heroku Postgres tier upgrade is needed (cost decision)
- The
raptor_approle needs to be dropped (data integrity risk) pg:credentials:createfails for a reason not covered by this runbook- Prod cutover is being executed (RM-10 requires operator sign-off)
Connects to
- Migration:
backend_v2/db/migrations/015_raptor_app_role.sql - Audit role grant matrix:
backend_v2/db/migrations/031_audit_role_split.sql - Staging cutover SOP:
docs/ops/runbooks/raptor-postgres-staging-cutover.md - Issue: #1481 (SC-A Phase 0), #1455 (append-only invariant, pre-launch-blocker)
- Prod cutover tracking: #1568 (RM-10) — prod blocked on Essential-0 → Standard-0 upgrade
- Feedback:
feedback_heroku_pg_rds_password_gotcha.md - Phase 1 (SC-A2): append-only enforcement on
customer_audit_events— depends onraptor_appbeing the active application role - Prod escalation: Essential-0 upgrade must precede prod role creation
(
pg:credentials:createis unsupported on Essential-tier)