SOP — Heroku Rack Apps Bootstrap (Eco Tier, Pre-Launch)
Owner: Operator (Kristerpher) + agent
Last updated: 2026-05-31 UTC
Refs: #1401 (Reasonator service scaffold), #1398 (Pro/Pro+ sentiment surface), docs/architecture/reasonator/design.md, docs/architecture/reasonator/adr/0054-rack-deployment-target.md
Terminology note
The service is documented internally as Reasonator (per project_codenames.md and docs/architecture/reasonator/design.md). The Heroku app name retains the original codename rack (raxx-rack-staging, raxx-rack-prod) per the operator decision 2026-05-31 — the existing parent card (#1401) and the ADR-0054 deployment target both reference rack-raxx / rack-raxx-staging (the proposed names at scaffold time), but the operator's app-naming convention per project_heroku_app_names.md is raxx-<surface>-<env>. This runbook uses raxx-rack-staging and raxx-rack-prod accordingly. The internal feature flag is FLAG_REASONATOR regardless of the Heroku-side naming.
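As a quick guard against the two naming schemes getting mixed, the convention can be checked mechanically. This is a minimal sketch only: the function name is hypothetical, and it assumes staging and prod are the only environments, as in this runbook.

```shell
# Hypothetical helper: succeeds only for names shaped like raxx-<surface>-<env>.
# Assumes env is either "staging" or "prod", the two environments in this runbook.
is_convention_name() {
  case "$1" in
    raxx-?*-staging|raxx-?*-prod) return 0 ;;
    *) return 1 ;;
  esac
}

for name in raxx-rack-staging raxx-rack-prod rack-raxx-staging; do
  if is_convention_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: does not match raxx-<surface>-<env>"
  fi
done
```

The third name is the scaffold-time form from #1401 and is reported as a mismatch.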
When to run this
You need this runbook the first time the Reasonator service is deployed to Heroku — once for staging, once for prod. Both apps are created at Eco tier ($5/mo each) per the operator decision 2026-05-31 UTC and stay there until the first paid customer of the Reasonator-backed surface signs up; after that they upgrade to Standard-1X (or Standard-2X per ADR-0054's original spec — see "Upgrade trigger" below).
This runbook covers:
- Creating both Heroku apps with the correct naming convention
- Buildpack + add-on selection at Eco tier
- Heroku config wiring (silenced per memory rule)
- Initial feature-flag posture (FLAG_REASONATOR=0; off until first sentiment event is ready)
- Domain attach (rack.raxx.app + rack-staging.raxx.app); CF DNS update is a separate operator step
- Pre-launch cost envelope: ~$25-30/mo combined
Pre-conditions
- Heroku CLI logged in as the operator (heroku auth:whoami returns kris@moosequest.net).
- Operator is on the Heroku org with permissions to create apps (any Pro-plan member).
- Infisical vault is reachable with the agent's CF Access service token (INFISICAL_TOKEN exported).
- The /rack/ folder exists in Infisical (or will be created in Step 4; it must exist before any secret write per feedback_vault_folder_must_exist.md).
- Reasonator code is committed to the repo at rack/ (per #1401 scaffold) and the Procfile is in place.
- The Cloudflare DNS update for rack.raxx.app + rack-staging.raxx.app is a separate operator step (Step 6); coordinate timing so DNS is not pointed at empty apps for >15 minutes.
Step 1 — Create the apps
heroku apps:create raxx-rack-staging --region us --team mooseQuest >/dev/null 2>&1
heroku apps:create raxx-rack-prod --region us --team mooseQuest >/dev/null 2>&1
If --team mooseQuest errors with "team not found," check the exact team slug with heroku teams first. The default region is us; do NOT use eu — the Reasonator scoring is for US-market news, latency to the US is what matters, and there's no GDPR data-residency obligation per the EU geo-block decision (project_eu_geoblock_decision.md).
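Both heroku apps:create calls are silenced because the create command echoes the app's HTTPS URL and git remote; both remain available via heroku apps:info afterward and don't need to be in shell history. Because 2>&1 also hides error messages (including "team not found"), check the exit status ($?) after each call and rerun without the redirect if it is non-zero.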
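Since the redirect hides both output and errors, one way to keep the silence but still notice a failure is to branch on the exit status. The following is a sketch under assumptions: create_app_quiet is a hypothetical helper, and HEROKU_BIN is a hypothetical override that exists only so the function can be dry-run without the real CLI.

```shell
# HEROKU_BIN is a hypothetical override for dry runs; it defaults to the real CLI.
HEROKU_BIN="${HEROKU_BIN:-heroku}"

# Create one app silently, but report success or failure from the exit status.
create_app_quiet() {
  app="$1"
  if "$HEROKU_BIN" apps:create "$app" --region us --team mooseQuest >/dev/null 2>&1; then
    echo "created: $app"
  else
    echo "FAILED: $app (rerun without the redirect to see the error)" >&2
    return 1
  fi
}
```

Usage would be create_app_quiet raxx-rack-staging followed by create_app_quiet raxx-rack-prod; nothing from the CLI itself reaches the terminal or shell history.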
Step 2 — Buildpack + runtime
Reasonator is Python 3.11 per docs/architecture/reasonator/design.md §packaging and the runtime.txt checked into rack/. Heroku detects a Python app from requirements.txt and reads the Python version from runtime.txt automatically, but explicitly set the buildpack to avoid surprises:
heroku buildpacks:set heroku/python --app raxx-rack-staging >/dev/null 2>&1
heroku buildpacks:set heroku/python --app raxx-rack-prod >/dev/null 2>&1
The transformers + torch (CPU build) wheel install is the load-bearing one; verify requirements.txt pins torch==2.x.x+cpu and --extra-index-url https://download.pytorch.org/whl/cpu is set, otherwise Heroku pulls the GPU build and the slug exceeds the 500 MB limit.
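The pin check described above can be run before a deploy rather than discovered from a rejected slug. This is a minimal sketch: check_torch_cpu_pin is a hypothetical helper, and the default path rack/requirements.txt is an assumption based on the rack/ layout.

```shell
# Hypothetical pre-deploy check. Argument: path to requirements.txt
# (default rack/requirements.txt is an assumed location).
check_torch_cpu_pin() {
  req="${1:-rack/requirements.txt}"
  # The CPU wheel index must be declared...
  if ! grep -Fq -e '--extra-index-url https://download.pytorch.org/whl/cpu' "$req"; then
    echo "missing CPU extra-index-url"
    return 1
  fi
  # ...and torch must be pinned to a 2.x release carrying the +cpu local version tag.
  if ! grep -Eq '^torch==2\.[0-9]+\.[0-9]+\+cpu' "$req"; then
    echo "torch is not pinned to a 2.x +cpu build"
    return 1
  fi
  echo "torch CPU pin looks correct"
}
```

A non-zero return means one of the two conditions is missing, which is the case where the GPU build would be pulled.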
Step 3 — Dyno tier + add-ons (Eco)
Per the operator decision 2026-05-31 UTC: stay at Eco until the first paid customer of the Reasonator-backed surface signs up. The Eco tier sleeps after 30 minutes of inactivity; the keep-alive cron from #1401 (*/10 * * * * ping on /v1/health) is what keeps the dyno warm during operator-testing.
# Eco dyno scale (one web dyno per app)
heroku ps:scale web=1 --app raxx-rack-staging >/dev/null 2>&1
heroku ps:scale web=1 --app raxx-rack-prod >/dev/null 2>&1
# Eco dyno type
heroku ps:type eco --app raxx-rack-staging >/dev/null 2>&1
heroku ps:type eco --app raxx-rack-prod >/dev/null 2>&1
Add-ons at Eco / Mini tier:
# Postgres Mini ($5/mo each) — Reasonator is stateless per design.md §state,
# but Mini Postgres is required if cards under #1401 add a job-queue table
# (e.g., batch_score_jobs). Provision at bootstrap to avoid a downtime swap later.
heroku addons:create heroku-postgresql:mini --app raxx-rack-staging >/dev/null 2>&1
heroku addons:create heroku-postgresql:mini --app raxx-rack-prod >/dev/null 2>&1
# Redis Mini — provision ONLY if the implementation needs it for the
# batch queue. Reasonator's design.md §queue uses an in-memory queue at v1
# (acceptable per I-3 graceful-degradation). Hold Redis provision until
# the first card actually needs a persistent queue.
# heroku addons:create heroku-redis:mini --app raxx-rack-staging >/dev/null 2>&1
# heroku addons:create heroku-redis:mini --app raxx-rack-prod >/dev/null 2>&1
Heroku Postgres Mini gotcha — pg:credentials:create, not CREATE ROLE WITH PASSWORD. Per feedback_heroku_pg_rds_password_gotcha.md, RDS-backed Heroku Postgres reserves password operations to rds_password members and the DATABASE_URL owner is NOT in that group. Any role-creation script must use heroku pg:credentials:create --name <name> --app raxx-rack-prod, not direct SQL.
Cost envelope (Eco):
| Line item | Monthly cost |
|---|---|
| Heroku Eco dyno × 2 (staging + prod) | $10 |
| Heroku Postgres Mini × 2 | $10 |
| Heroku Redis Mini × 2 (if needed) | $6 |
| Combined pre-launch (no Redis) | ~$20/mo |
| Combined pre-launch (with Redis) | ~$26/mo |
The no-Redis total lands below the operator's stated $25-30/mo envelope; the with-Redis total falls within it.
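The two totals follow directly from the per-item prices in the table. A small arithmetic sketch, using only those table figures (the variable names are illustrative):

```shell
# Monthly prices in USD, taken from the cost table.
ECO_DYNO=5
PG_MINI=5
REDIS_MINI=3
APPS=2   # staging + prod

no_redis=$(( APPS * (ECO_DYNO + PG_MINI) ))
with_redis=$(( APPS * (ECO_DYNO + PG_MINI + REDIS_MINI) ))

echo "pre-launch, no Redis:   \$${no_redis}/mo"
echo "pre-launch, with Redis: \$${with_redis}/mo"
```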
Step 4 — Vault bootstrap
Create the /rack/ folder first (per feedback_vault_folder_must_exist.md). Use environment-scoped writes — Infisical separates staging and prod:
# --fail makes curl exit non-zero on an HTTP error; check $? because the output is silenced.
curl --fail -X POST "https://app.infisical.com/api/v1/folders" \
-H "Authorization: Bearer $INFISICAL_TOKEN" \
-H "Content-Type: application/json" \
-d '{"workspaceId":"<wsid>","environment":"prod","name":"rack","path":"/"}' \
>/dev/null 2>&1
curl --fail -X POST "https://app.infisical.com/api/v1/folders" \
-H "Authorization: Bearer $INFISICAL_TOKEN" \
-H "Content-Type: application/json" \
-d '{"workspaceId":"<wsid>","environment":"staging","name":"rack","path":"/"}' \
>/dev/null 2>&1
Then write the secrets per #1401:
| Secret name | Path | Value |
|---|---|---|
| RACK_SERVICE_TOKEN | /rack/ (env: prod) | 32-byte random base64url; mint with openssl rand -base64 32 \| tr -d '=' \| tr '/+' '_-' |
| RACK_SERVICE_TOKEN | /rack/ (env: staging) | Separate value from prod; never reuse |
| SENTRY_DSN | /rack/ (env: prod) | From Sentry project Reasonator → Settings → Client Keys |
| SENTRY_DSN | /rack/ (env: staging) | Separate DSN per env; do NOT cross-mix |
| FINBERT_MODEL_SHA | /rack/ (env: prod) | TODO: per #1401 OQ-2, awaiting operator confirmation. Use unknown placeholder until resolved |
Never paste any value into a chat, PR, or commit (per feedback_no_inline_secrets_in_repo.md).
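The minting one-liner from the table can be split into a reusable filter and a mint step. This is a sketch: b64url and mint_token are hypothetical helper names, and the filter additionally strips the trailing newline so the result can be captured cleanly.

```shell
# Convert base64 on stdin to unpadded base64url
# (the same character mapping as the table's one-liner, plus newline removal).
b64url() {
  tr -d '=\n' | tr '/+' '_-'
}

# Mint a token: 32 random bytes, base64-encoded, then made URL-safe.
mint_token() {
  openssl rand -base64 32 | b64url
}
```

Capture the result in a variable (for example token=$(mint_token)) and pass it straight to the vault write; do not echo it. A 32-byte value encodes to 43 characters once the padding is stripped.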
Step 5 — Heroku config
The Reasonator app reads everything from vault at boot via the agent's CF-Access-headered REST calls (per feedback_secrets_in_vault_sop.md). Only three groups of Heroku config keys are needed: the bootstrap credentials (Infisical and CF Access), the env label, and the feature flag.
heroku config:set \
INFISICAL_PROJECT_ID=<from-vault> \
INFISICAL_CLIENT_ID=<from-vault> \
INFISICAL_CLIENT_SECRET=<from-vault> \
CF_ACCESS_CLIENT_ID=<from-vault> \
CF_ACCESS_CLIENT_SECRET=<from-vault> \
ENV=production \
FLAG_REASONATOR=0 \
--app raxx-rack-prod >/dev/null 2>&1
heroku config:set \
INFISICAL_PROJECT_ID=<from-vault-staging> \
INFISICAL_CLIENT_ID=<from-vault-staging> \
INFISICAL_CLIENT_SECRET=<from-vault-staging> \
CF_ACCESS_CLIENT_ID=<from-vault-staging> \
CF_ACCESS_CLIENT_SECRET=<from-vault-staging> \
ENV=staging \
FLAG_REASONATOR=0 \
--app raxx-rack-staging >/dev/null 2>&1
FLAG_REASONATOR=0 means the customer-facing surfaces gated by Reasonator (Pro retrospective panel, Pro+ real-time chip — #1398) stay off until the operator flips it. Per feedback_new_flag_needs_b1_migration_same_pr.md, the PR that lands this flag must also include the console_flag_promotions migration; this runbook only documents the Heroku-side value.
Every heroku config:set is silenced per feedback_heroku_config_set_echoes_secrets.md — even bootstrap IDs that aren't secret-grade are cleaner kept out of shell history.
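To verify (length-check, never value-dump):
heroku config:get FLAG_REASONATOR --app raxx-rack-prod
# Expected: 0
heroku config:get RACK_SERVICE_TOKEN --app raxx-rack-prod | wc -c
# Expected: 44 if the token is mirrored into Heroku config
# (43 unpadded base64url chars + the trailing newline).
# Step 5 keeps RACK_SERVICE_TOKEN vault-only, so an empty result here is normal;
# in that case run the same length check against the vault copy.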
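To make the length check repeatable without ever printing the value, the comparison can be wrapped in a filter that reads the secret on stdin and emits only a verdict. This is a sketch; length_ok is a hypothetical helper name.

```shell
# length_ok EXPECTED: read a value on stdin and report only whether its length matches.
# The value itself is never printed.
length_ok() {
  expected="$1"
  # Strip the trailing newline that the CLI appends, then count the remaining bytes.
  actual=$(tr -d '\n' | wc -c | tr -d ' ')
  if [ "$actual" -eq "$expected" ]; then
    echo "ok"
  else
    echo "unexpected length: $actual"
    return 1
  fi
}
```

For example, piping a config:get or vault read into length_ok 43 confirms an unpadded 32-byte base64url token without exposing it.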
Step 6 — Domain attach (Heroku + Cloudflare)
Attach the custom domain to each app, then update Cloudflare DNS to point at the Heroku DNS target.
heroku domains:add rack-staging.raxx.app --app raxx-rack-staging
heroku domains:add rack.raxx.app --app raxx-rack-prod
Capture the DNS target each command emits (looks like xxxxxx.herokudns.com). NOTE: these commands DO need to emit their output — the DNS target is needed for Cloudflare. They are not silenced.
Cloudflare DNS update (separate operator step — UI navigation or wrangler / cf-api):
- Cloudflare Dashboard → DNS → Records for raxx.app
- Add CNAME rack → <prod-heroku-dns-target>.herokudns.com, proxied (orange cloud)
- Add CNAME rack-staging → <staging-heroku-dns-target>.herokudns.com, proxied (orange cloud)
- Confirm SSL on both records (Cloudflare Universal SSL covers this by default)
- If Cloudflare Access is enabled on *.raxx.app (per feedback_cf_access_does_not_bypass_bot_fight_mode.md), pair the Access policy with the WAF skip rule keyed on CF-Access-Client-Id for the Raptor → Reasonator call path
Wait for DNS propagation (1-5 minutes for orange-cloud records). Verify:
curl -i https://rack.raxx.app/v1/health
# Expected: HTTP/2 503 with {"status":"warming_up","model_loaded":false} body
# (-i prints the response headers plus the body; -I sends a HEAD request and shows no body)
# (FinBERT is ~400 MB; first boot takes ~30s; 503 is correct until model load completes)
Step 7 — Deploy workflow + keep-alive
Per #1401, two GitHub Actions workflows ship in the same scaffold PR:
- .github/workflows/deploy-rack.yml: Heroku deploy following ADR-0053
- .github/workflows/rack-keep-alive.yml: */10 * * * * cron pinging /v1/health on staging and prod
The deploy workflow needs the Heroku API key in the repo secrets:
- Repo secret HEROKU_API_KEY: minted via heroku authorizations:create and copied to the GitHub secret. Do NOT use a long-lived personal API key; the authorization-created token can be scoped + revoked.
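The keep-alive workflow is critical at Eco tier: without it, the dyno sleeps after 30 minutes and the first request after sleep takes ~30s for FinBERT to reload. The */10 * * * * schedule fires around the clock, so the dyno stays warm during US-market hours and outside them. Each always-warm app therefore consumes Eco dyno hours continuously; confirm the account's Eco hour allowance covers both apps before relying on the cron all month.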
Pre-launch digest framing per feedback_pre_launch_digest_notifications.md: keep-alive failures + deploy failures route to the daily digest, not individual Slack pings. Only health-endpoint 5xx persisting >3 successive cron windows (30 minutes) escalates to per-event alerting.
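The escalation rule above (5xx persisting for more than 3 successive cron windows) amounts to counting the current unbroken run of 5xx results. A minimal sketch, assuming the health status codes from recent cron windows are available oldest-first; route_alert is a hypothetical name and not part of either workflow.

```shell
# route_alert CODE...: status codes from successive keep-alive windows, oldest first.
# Prints "escalate" when the current unbroken run of 5xx exceeds 3 windows, else "digest".
route_alert() {
  streak=0
  for code in "$@"; do
    case "$code" in
      5??) streak=$((streak + 1)) ;;   # another consecutive 5xx window
      *)   streak=0 ;;                 # any non-5xx result resets the run
    esac
  done
  if [ "$streak" -gt 3 ]; then
    echo "escalate"
  else
    echo "digest"
  fi
}
```

With the 10-minute cron, three failing windows stay in the digest and the fourth consecutive failure escalates.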
Step 8 — Initial smoke
After Step 6 propagates:
# Health endpoint: returns 503 warming_up immediately,
# then 200 ok within ~60s after the model loads.
# The first call omits -f so the 503 body is printed; with -f, curl hides it and exits 22.
curl -s https://rack-staging.raxx.app/v1/health
sleep 60
curl -sf https://rack-staging.raxx.app/v1/health | jq .
# Expected: {"status":"ok","model_loaded":true,"model_sha":"...","queue_depth":{...},"uptime_seconds":N,"version":"1.0.0"}
# Same for prod
curl -s https://rack.raxx.app/v1/health
sleep 60
curl -sf https://rack.raxx.app/v1/health | jq .
If the prod app's health endpoint returns 200 with model_loaded: true, Step 8 is complete and the bootstrap is done.
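Instead of a fixed sleep 60, the smoke check can poll until the endpoint answers 200 or a budget runs out. This is a sketch under assumptions: probe and wait_healthy are hypothetical helpers, the defaults (12 tries, 10 s apart) are chosen to match the 2-minute threshold used in the troubleshooting list, and only the status code is checked, not the model_loaded field.

```shell
# probe URL: print the HTTP status code (curl prints 000 if the connection fails).
probe() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# wait_healthy URL [TRIES] [DELAY_SECONDS]: poll until the endpoint answers 200.
wait_healthy() {
  url="$1"; tries="${2:-12}"; delay="${3:-10}"
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    code=$(probe "$url")
    if [ "$code" = "200" ]; then
      echo "ready after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "not ready after $tries attempt(s), last status $code"
  return 1
}
```

Usage would be wait_healthy https://rack-staging.raxx.app/v1/health, then the same for prod; a non-zero return is the cue to start the checks below.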
If health endpoint returns persistent 503 after >2 minutes, check:
- Slug size: heroku apps:info --app raxx-rack-prod (see the Slug Size field). If it is at or near 500 MB, the GPU torch wheel slipped in; revert requirements.txt to pin the CPU build.
- Boot log: heroku logs --tail --app raxx-rack-prod | grep -iE "(error|Traceback|finbert)". The most common boot failure is the FinBERT model fetch from HuggingFace timing out; if seen, set WEB_CONCURRENCY=1 and raise the gunicorn --timeout to 180.
- Vault reachability: heroku run --app raxx-rack-prod python -c "from rack.config import vault_health; print(vault_health())". If this fails, the CF Access headers are stale per project_session_env_staleness.md; rotate them.
Upgrade trigger (Eco → Standard-1X)
When to upgrade:
- The first paid customer of the Reasonator-backed surface signs up (Pro or Pro+ tier with sentiment access enabled).
- OR: dyno-sleep latency starts impacting operator-testing — measured by health-endpoint p99 > 2s sustained for 24h.
Upgrade command (no downtime, dyno restarts in place):
heroku ps:type standard-1x --app raxx-rack-prod >/dev/null 2>&1
ADR-0054 originally specced Standard-2X. Re-evaluate the 1X vs 2X choice based on actual queue depth and p99 latency in the first 30 days of paid traffic — the design's "min 1 dyno always running" requirement is satisfied at 1X, the 2X step is a memory-headroom margin that may or may not be needed.
When Standard-1X is in place, also upgrade Postgres from Mini ($5) to Basic ($9) before the row count on the audit / score-event tables reaches 10K (Mini's row limit).
Per project_oncall_severity_routing.md, the upgrade is a SEV3 routine ops change — agent-autonomous; no PagerDuty needed.
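The two upgrade triggers can be expressed as a single decision so the check is applied the same way each time. A sketch only: should_upgrade is a hypothetical helper, and the caller is assumed to supply the paid-customer count, the health-endpoint p99 in milliseconds, and how many hours that p99 has been sustained.

```shell
# should_upgrade PAID_CUSTOMERS P99_MS HOURS_SUSTAINED
# Mirrors the two triggers: first paid customer, OR p99 > 2s sustained for 24h.
should_upgrade() {
  paid="$1"; p99_ms="$2"; hours="$3"
  if [ "$paid" -ge 1 ]; then
    echo "upgrade: first paid customer"
  elif [ "$p99_ms" -gt 2000 ] && [ "$hours" -ge 24 ]; then
    echo "upgrade: p99 above 2s for 24h"
  else
    echo "stay on eco"
  fi
}
```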
Rollback / teardown (rare)
If the operator decides to retire Reasonator pre-launch (e.g., Alpaca display-rights review per docs/blr/2026-05-31-alpaca-display-rights-memo.md returns "no display permitted" and the surface is dropped):
heroku config:set FLAG_REASONATOR=0 --app raxx-rack-prod >/dev/null 2>&1
heroku ps:scale web=0 --app raxx-rack-prod >/dev/null 2>&1
# Do NOT destroy the app — keeping it at scale=0 retains the config + add-ons
# for forensic and audit-log purposes. Destroy only after a 90-day window
# with operator written approval.
Same pattern for staging if needed.
Common pitfalls
- Eco dyno sleeps without keep-alive. The keep-alive cron is the load-bearing piece; without it operator-testing latency is awful. Verify the cron is scheduled before Step 8.
- Slug size blows past 500 MB. PyTorch GPU build is the usual culprit. Pin CPU build with --extra-index-url https://download.pytorch.org/whl/cpu in requirements.txt.
- FLAG_REASONATOR=0 does not gate Reasonator's own deployment. The flag gates the customer-facing sentiment surfaces in Antlers / Raptor (#1398). The Reasonator service itself runs regardless; flipping the flag to 0 does not stop the dyno, only the consumer.
- Heroku Postgres Mini row limit (10K) is small. Once the audit table exceeds 10K rows, Heroku can revoke INSERT privileges on the database, so writes fail until the plan is upgraded or rows are deleted. Monitor via heroku pg:info --app raxx-rack-prod. Plan the upgrade to Basic before this hits in practice.
- CF DNS update is a separate operator step. Step 6 has both Heroku-side domains:add and Cloudflare-side DNS. The Heroku side is in this runbook; the Cloudflare side is in the operator's manual flow. Do not assume Cloudflare auto-updates from heroku domains:add.
- raxx-rack-* vs rack-raxx-* naming inconsistency. Per project_heroku_app_names.md the operator's convention is raxx-<surface>-<env>. #1401 was written before this convention was locked and uses rack-raxx-*. This runbook uses the convention-correct names. If the ops-deploy workflow in .github/workflows/deploy-rack.yml still uses the older names, update it in the same PR that lands the bootstrap.
- Per feedback_pre_launch_digest_notifications.md, Reasonator health-check failures route to the daily digest pre-launch, not per-event Slack pings. Verify the alert routing before Step 8 to avoid noise.
Refs
- docs/architecture/reasonator/design.md: service design
- docs/architecture/reasonator/adr/0054-rack-deployment-target.md: deployment target ADR
- docs/architecture/reasonator/adr/0055-reasonator-api-contract-rest.md: REST API contract
- docs/architecture/reasonator/adr/0056-reasonator-service-auth.md: service auth
- docs/architecture/reasonator/adr/0057-reasonator-rescoring-model-sha-provenance.md: model SHA tracking
- docs/architecture/reasonator/cost-model.md: cost analysis
- feedback_heroku_config_set_echoes_secrets.md: silence every config:set
- feedback_heroku_pg_rds_password_gotcha.md: pg:credentials:create, never CREATE ROLE
- feedback_no_inline_secrets_in_repo.md: vault-only secrets
- feedback_secrets_in_vault_sop.md: vault is the runtime source
- feedback_vault_folder_must_exist.md: create /rack/ before any write
- feedback_new_flag_needs_b1_migration_same_pr.md: flag + promotion migration in same PR
- feedback_pre_launch_digest_notifications.md: pre-launch digest routing
- project_heroku_app_names.md: raxx-<surface>-<env> naming
- project_codenames.md: Reasonator (internal) vs rack (Heroku app name)
- project_session_env_staleness.md: CF Access token drift
- #1398: Pro/Pro+ sentiment surface (consumer)
- #1401: Reasonator scaffold (this runbook's parent card)
- #204: Founders Promo epic (Reasonator-backed surface is a Pro+ feature)