Raxx · internal docs

internal · gated

Console review apps runbook

System: raxx-console-pr- (Heroku Review Apps) Owner: operator / sre-agent Last incident: none (new) Last reviewed: 2026-05-12

Overview

Heroku Review Apps spin up a per-PR Console instance when a pull request touches console/**. Each review app is an isolated Heroku app (raxx-console-pr-<N>) with its own Postgres database and stub vault credentials. It is created on PR open/push and torn down automatically when the PR is closed.

This system is Console-only. Raptor (API backend) does not use review apps.

Architecture

PR opened / pushed (touches console/**)
  |
  v
.github/workflows/review-app-console.yml
  |
  +-- Build Tailwind CSS
  +-- Copy feature_flags.yaml
  +-- git subtree split --prefix=console HEAD
  +-- heroku apps:create raxx-console-pr-<N>   (first push only)
  +-- heroku addons:create heroku-postgresql:essential-0  (first push only)
  +-- heroku config:set REVIEW_APP=true + stub vars
  +-- git push heroku <subtree-sha>:refs/heads/main
  +-- Heroku release phase: bash scripts/migrate.sh
  +-- Health check: GET /health (6 retries x 10s)
  +-- Sticky PR comment with URL + status

PR closed (merged or abandoned):

  +-- heroku apps:destroy raxx-console-pr-<N>
  +-- Sticky PR comment updated to "torn down"

Isolation guarantees

Resource Review app behavior
Postgres Dedicated Essential-0 per PR — never shared with staging or prod
Vault / Infisical Stub credentials (stub-review-app) — no vault reads
Raptor API Points to raxx-api-staging — never prod
WebAuthn RP_ID Set to raxx-console-pr-<N>.herokuapp.com — passkeys are throwaway
Feature flags Copied from backend_v2/api/feature_flags.yaml at build time (same as staging/prod)
Sentry SENTRY_DSN_CONSOLE is not set — errors are not reported to Sentry from review apps

One-time operator bootstrap (prerequisite)

Review apps will not deploy until the Heroku pipeline has the feature enabled. This is a one-time manual action by the operator; CI cannot do it.

Step 1 — Confirm the pipeline exists

heroku pipelines:info raxx-console

Expected output includes two stages: staging (raxx-console-staging) and production (raxx-console-prod).

If the pipeline does not exist, create it:

heroku pipelines:create raxx-console \
  --app raxx-console-staging \
  --stage staging
heroku pipelines:add raxx-console \
  --app raxx-console-prod \
  --stage production

Step 2 — Enable Review Apps on the pipeline

Via the Heroku Dashboard:

https://dashboard.heroku.com/pipelines/raxx-console
  1. Click "Enable Review Apps" in the Review Apps section.
  2. Under "Create new review apps for new pull requests automatically" — leave this OFF. The workflow creates apps programmatically; auto-creation would duplicate the deploy.
  3. Under "Destroy stale review apps automatically" — set to 5 days. This is a safety net if the workflow teardown step fails (e.g., a force-closed PR that didn't fire the closed event).
  4. Set the app.json source to console/app.json. Heroku uses this to resolve the addon list and postdeploy script for pipeline-initiated review apps. The workflow-managed path also honours this file.
  5. Save.

Step 3 — Verify HEROKU_API_KEY scope

The HEROKU_API_KEY secret in GitHub must belong to an account with Owner or Collaborator access on the raxx-console-staging and raxx-console-prod apps. The same token used for deploy-console.yml is sufficient.

Check:

HEROKU_API_KEY=<token> heroku apps:info --app raxx-console-staging

Expected: app info output (no "You do not have access" error).

Step 4 — Smoke-test with a draft PR

Open a draft PR that touches console/README.md (trivial change). Confirm:

How to tell it's broken

How to diagnose (in order)

  1. Check workflow run logs: gh run list --workflow=review-app-console.yml --limit 10 Look for the PR number in the run title.

  2. Check whether the Heroku app exists: heroku apps:info --app raxx-console-pr-<N> If it doesn't exist and the PR is open, the create step failed.

  3. Check Heroku release phase logs (migration failures surface here): heroku releases --app raxx-console-pr-<N> heroku logs --app raxx-console-pr-<N> --source app --dyno release

  4. Check dyno state: heroku ps --app raxx-console-pr-<N> Expected: web.1: up. If crashed or starting, check app logs.

  5. Check app logs: heroku logs --tail --app raxx-console-pr-<N> Look for import errors, missing env vars, or migration failures.

  6. Verify the pipeline has Review Apps enabled: heroku pipelines:info raxx-console The output should include a "review_app_config" section.

Known failure modes

Failure mode A: Pipeline not enabled — apps:create succeeds but Heroku ignores app.json

Symptom: Review app is created but scripts/migrate.sh does not run (the release phase is absent from heroku releases output).

Cause: Heroku only honours app.json scripts.postdeploy and environments.review when the app is part of a pipeline with Review Apps enabled. A standalone heroku apps:create without pipeline membership does not read app.json.

Fix: Complete the one-time bootstrap in the "Operator prerequisite" section above. Then destroy and recreate the review app:

heroku apps:destroy --app raxx-console-pr-<N> --confirm raxx-console-pr-<N>
# Re-run the workflow: push an empty commit to the PR branch.
git commit --allow-empty -m "chore: trigger review-app rebuild"
git push

Verification: heroku releases --app raxx-console-pr-<N> shows a Deploy entry followed by a Released entry from the release phase.


Failure mode B: HEROKU_API_KEY expired or wrong scope

Symptom: Workflow fails at "Create or update Heroku review app" with Error: Couldn't find that app. or 401 Unauthorized.

Cause: The API key in the HEROKU_API_KEY GitHub secret has expired or belongs to an account without access to the raxx-console pipeline.

Fix: Rotate the key and update the GitHub secret:

# 1. Generate a new token.
heroku authorizations:create --description "raxx-console-review-apps-ci" --scope=global

# 2. Update the GitHub secret.
gh secret set HEROKU_API_KEY --body "<new-token>"

See runbook docs/ops/runbooks/heroku-api-key-drift-recovery.md for the full rotation procedure.

Verification: Re-run the failed workflow from the GitHub Actions UI.


Failure mode C: Postgres Essential-0 provisioning timeout

Symptom: Workflow fails at "Attaching Postgres Essential-0 ..." with Timed out waiting for addon or similar.

Cause: Heroku Essential-0 provisioning occasionally takes >60s (Heroku infrastructure issue). The --wait flag in heroku addons:create blocks until the addon is ready but does not retry on timeout.

Fix: Destroy the partially-created app and re-run the workflow:

heroku apps:destroy --app raxx-console-pr-<N> --confirm raxx-console-pr-<N>
# Re-run from GitHub Actions UI.

Verification: heroku addons --app raxx-console-pr-<N> shows heroku-postgresql (essential-0) in state created.


Failure mode D: Subtree split fails — no commits touching console/

Symptom: Workflow fails at "Stage deploy-time generated files" or git subtree split with Working tree has modifications.

Cause: The PR branch has uncommitted changes that prevent the staging commit, or the console/ subtree has never had any commits (highly unlikely but possible on fresh repos).

Fix: Inspect the workflow logs. If it is a dirty-worktree issue, the prior build steps left unclean state. This is a workflow bug — file a reliability issue.


Failure mode E: Stale review app left running after PR close

Symptom: heroku apps lists raxx-console-pr-<N> for a PR that has been closed for more than 5 days.

Cause: The teardown step in the workflow failed (network error, expired token, or the closed event was not delivered by GitHub).

Fix (manual teardown):

heroku apps:destroy --app raxx-console-pr-<N> --confirm raxx-console-pr-<N>

Prevention: The "Destroy stale review apps automatically" setting in the pipeline (set to 5 days during bootstrap) is the safety net for this case.


Emergency stop — tear down all review apps

If review apps are accumulating charges unexpectedly:

# List all review apps.
heroku apps --json | python3 -c "
import json, sys
apps = json.load(sys.stdin)
for a in apps:
    if a['name'].startswith('raxx-console-pr-'):
        print(a['name'])
"

# Destroy each one.
for app in raxx-console-pr-1234 raxx-console-pr-5678; do
  heroku apps:destroy --app "$app" --confirm "$app"
done

Cost model

Resource Cost
Eco dyno ~$0/mo (included in Heroku Eco plan)
Essential-0 Postgres ~$0/mo while under free row limit; ~$5/mo if heavy usage
Network egress negligible for operator-only review traffic

Review apps are destroyed on PR close, so cost accrues only for the lifetime of the PR. Open PRs with active review apps cost ~$5/mo each if the database row limit is exceeded (unlikely for operator review sessions).

Escalation

Wake the operator when: - More than 3 review apps are left running for closed PRs (billing risk) - The bootstrap step fails with a permissions error that cannot be resolved with token rotation - A review app is serving unexpected live data (isolation breach)

Contact: Kristerpher via Slack DM (D0AJ7K184TV).

References