Raxx · internal docs

internal · gated ↑ index

Raptor Postgres Roles Runbook

System: Raptor (raxx-api-staging, raxx-api-prod) Owner: operator / sre-agent Card: #1481 (SC-A Phase 0) Last incident: 2026-05-10 UTC (Essential-tier block + rds_password error — see feedback) Last reviewed: 2026-05-12 UTC


Why a separate role

Raptor's DATABASE_URL credential is the Postgres owner. Running application queries under the owner credential means a bug (or a compromised process) can silently modify any table, alter schema, or wipe migration state — there is no database-level enforcement boundary.

The raptor_app role solves this:

This role is a prerequisite for every SC-A card from Phase 1 onward. Without it, REVOKE-based append-only enforcement is a no-op.


Prerequisites

bash heroku pg:info -a raxx-api-staging # Expect: Plan: Standard 0


How to tell it's broken


How to diagnose (in order)

  1. Verify the role exists in Postgres:

bash heroku pg:psql -a raxx-api-staging -c "SELECT rolname, rolcanlogin FROM pg_catalog.pg_roles WHERE rolname = 'raptor_app';" # Expected: raptor_app | t # If empty: role was never created — run the provisioning procedure below.

  1. Verify grants are intact:

bash heroku pg:psql -a raxx-api-staging -c "\dp historical_bars" | grep raptor_app # Expect: raptor_app=arwd/... (a=INSERT, r=SELECT, w=UPDATE, d=DELETE) # If absent: run the "Re-apply grants if drift occurs" procedure below.

  1. Verify the env var is set and non-empty:

bash heroku config:get RAPTOR_APP_DATABASE_URL -a raxx-api-staging | cut -c1-30 # Expect: postgres://raptor_app:... # If empty: run Step 3 of the provisioning procedure below.

  1. Verify the feature flag:

bash heroku config:get FLAG_RAPTOR_APP_ROLE_SEPARATION -a raxx-api-staging # Expected (once soak is complete): 1 # Expected (during soak period): 0


Known failure modes

Failure mode A: rds_password error on CREATE ROLE

Symptom:

ERROR: must be a member of rds_password to alter passwords

Cause: Heroku Postgres runs on AWS RDS. The DATABASE_URL owner is NOT in the rds_password group. Direct CREATE ROLE ... WITH LOGIN PASSWORD '...' fails because only rds_password members can set passwords.

Fix: Do NOT use raw SQL to create the role. Use heroku pg:credentials:create which goes through Heroku's RDS-aware path. See provisioning procedure below.

Verification: heroku pg:credentials:url DATABASE --name raptor_app -a <app> returns a valid postgres:// URL without error.


Failure mode B: Essential-tier block

Symptom:

Error: You can't create a custom credential on Essential-tier databases.

Cause: Essential-0/1/2 plans do not support the pg:credentials:create command at all. Role separation is impossible on Essential-tier.

Fix: Upgrade the Heroku Postgres addon to Standard-0 before attempting any role-separation work. Standard-0 is the minimum tier that supports this.

heroku addons:upgrade postgresql-adjacent-27271 heroku-postgresql:standard-0 -a raxx-api-staging

Verification: heroku pg:info -a raxx-api-staging shows Plan: Standard 0.


Failure mode C: Grant drift (new table missing grants)

Symptom: permission denied for table <new_table> in Raptor logs after a migration that added a new table.

Cause: GRANT ... ON ALL TABLES applies only to tables that exist at the time the statement runs. Tables created in later migrations do not inherit permissions automatically; the migration must include an explicit GRANT.

Fix: Run the GRANT block in the re-apply procedure below (it uses ON ALL TABLES IN SCHEMA public which re-covers newly added tables).

Then: update the migration file for the new table to include:

GRANT SELECT, INSERT, UPDATE, DELETE ON <new_table> TO raptor_app;
-- If the table has a sequence:
GRANT USAGE ON SEQUENCE <new_table_id_seq> TO raptor_app;

Verification: \dp <new_table> in heroku pg:psql shows raptor_app=arwd.


Failure mode D: RAPTOR_APP_DATABASE_URL unset after dyno restart

Symptom: Raptor logs show connections from the DATABASE_URL owner, not raptor_app. heroku config:get RAPTOR_APP_DATABASE_URL is empty.

Cause: Heroku config vars persist across restarts. If this is empty, the var was never set or was explicitly removed (e.g., an operator ran heroku config:unset).

Fix: Re-provision via heroku pg:credentials:url and re-set the config var. See Step 3 of the provisioning procedure below.

Verification: heroku config:get RAPTOR_APP_DATABASE_URL -a <app> returns a non-empty postgres://raptor_app:... URL.


Provisioning procedure (staging-first)

Run on staging. Repeat for prod only after 72 h staging soak with no errors.

Step 1 — Verify Postgres tier

heroku pg:info -a raxx-api-staging
# Expect: Plan: Standard 0
# If Essential-tier: STOP — upgrade first (see Failure mode B above)

Step 2 — Create the role via Heroku CLI

heroku pg:credentials:create DATABASE --name raptor_app -a raxx-api-staging

DATABASE is the literal addon name (the positional argument to the CLI). --name raptor_app is the credential/role name. Do not swap them — the CLI will return Unknown database: raptor_app if the role name goes in the positional slot.

pg:credentials:create creates the Postgres role, generates a password via RDS, and grants CONNECT + SELECT on all tables automatically.

Step 3 — Retrieve and store the connection URL

heroku pg:credentials:url DATABASE --name raptor_app -a raxx-api-staging
# Copy the "Connection URL:" value (postgres://raptor_app:...) into the next step.

Set in Heroku config (stdout silenced — URL contains the password):

heroku config:set RAPTOR_APP_DATABASE_URL="<URL from above>" -a raxx-api-staging >/dev/null 2>&1

Verify the var is set without revealing the value:

heroku config:get RAPTOR_APP_DATABASE_URL -a raxx-api-staging | cut -c1-30
# Expect: postgres://raptor_app:...

Step 4 — Add DML grants and revoke migration-table access

Open a psql session as the DATABASE owner:

heroku pg:psql -a raxx-api-staging

Run this block (copy-paste as one unit):

-- Full DML on all application tables (SELECT already granted by pg:credentials:create)
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO raptor_app;

-- Sequence access (needed for INSERT into tables with SERIAL / BIGSERIAL columns)
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO raptor_app;

-- Deny all access to migration-management tables — raptor_app must not stamp migrations
REVOKE ALL ON TABLE alembic_version FROM raptor_app;
REVOKE ALL ON TABLE schema_migrations FROM raptor_app;

Step 5 — Verify the role

Still in psql:

SELECT rolname, rolcanlogin FROM pg_catalog.pg_roles WHERE rolname = 'raptor_app';
-- Expected: raptor_app | t

-- Spot-check DML on an application table
\dp historical_bars
-- Expected: raptor_app=arwd/... (a=INSERT r=SELECT w=UPDATE d=DELETE)

-- Confirm migration tables are locked out
\dp alembic_version
-- Expected: no raptor_app entry (or raptor_app= with no permissions)

Step 6 — Enable the flag on staging

heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 -a raxx-api-staging >/dev/null 2>&1

Restart and tail logs:

heroku restart -a raxx-api-staging
heroku logs -a raxx-api-staging --tail
# Watch for: no "permission denied" errors; normal request processing resumes

Run the smoke suite:

scripts/ci/run_smoke.sh --env=staging

Step 7 — Soak (72 h before prod)

Leave staging running for 72 hours. Check at 24 / 48 / 72 h:

Step 8 — Prod cutover (operator action, RM-10 gate)

Repeat Steps 1–6 for raxx-api-prod. Prod cutover is tracked as RM-10 (#1568). Do NOT perform prod cutover without operator sign-off.

heroku pg:credentials:create DATABASE --name raptor_app -a raxx-api-prod
heroku pg:credentials:url DATABASE --name raptor_app -a raxx-api-prod
heroku config:set RAPTOR_APP_DATABASE_URL="<URL>" -a raxx-api-prod >/dev/null 2>&1
heroku pg:psql -a raxx-api-prod
# (run the same GRANT/REVOKE block from Step 4)
heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=1 -a raxx-api-prod >/dev/null 2>&1
heroku restart -a raxx-api-prod
scripts/ci/run_smoke.sh --env=prod

Re-apply grants if drift occurs

If grants are missing (e.g. after a new migration added tables without explicit GRANTs), re-run the GRANT block from Step 4 against the affected environment:

heroku pg:psql -a raxx-api-staging
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO raptor_app;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO raptor_app;
REVOKE ALL ON TABLE alembic_version FROM raptor_app;
REVOKE ALL ON TABLE schema_migrations FROM raptor_app;

This is idempotent. Running it when grants are already correct is safe.


Adding a new table after Phase 0

Every migration that creates a new table must include explicit grants:

CREATE TABLE IF NOT EXISTS my_new_table (
  id BIGSERIAL PRIMARY KEY,
  ...
);

-- Required: explicit grant for raptor_app on new tables
GRANT SELECT, INSERT, UPDATE, DELETE ON my_new_table TO raptor_app;
GRANT USAGE ON SEQUENCE my_new_table_id_seq TO raptor_app;

Omitting this grant causes permission denied for table my_new_table on the first application request that touches the new table when FLAG_RAPTOR_APP_ROLE_SEPARATION=1.


Environment variable reference

Var App Description
DATABASE_URL raxx-api-* Owner credential (Heroku-managed). Used for migrations (alembic upgrade) and emergency fallback. Never used by Raptor application code when flag is 1.
RAPTOR_APP_DATABASE_URL raxx-api-* Restricted application credential. Set by operator from heroku pg:credentials:url output. Raptor reads this at startup when FLAG_RAPTOR_APP_ROLE_SEPARATION=1.
FLAG_RAPTOR_APP_ROLE_SEPARATION raxx-api-* Feature flag. Set to 1 after role is provisioned and verified. Set to 0 to fall back to DATABASE_URL without a code deploy.

Rollback

The flag is the fast rollback mechanism. No code deploy required.

heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=0 -a raxx-api-staging >/dev/null 2>&1
heroku restart -a raxx-api-staging
# Verify Raptor is back up and using DATABASE_URL
heroku logs -a raxx-api-staging -n 20

The raptor_app role and its grants do not need to be removed for rollback. Leave them in place so re-enable is a single config:set.


Emergency stop

Take Raptor off the restricted role instantly (no data impact, no migrations):

heroku config:set FLAG_RAPTOR_APP_ROLE_SEPARATION=0 -a raxx-api-prod >/dev/null 2>&1
heroku restart -a raxx-api-prod

Escalation

Escalate to the operator when:


Connects to