Console deploy — manual break-glass runbook
System: raxx-console-prod (Heroku)
Owner: operator / sre-agent
Last incident: 2026-05-15 (see #2201: dispatch timed out, 926-minute silent failure)
Last reviewed: 2026-05-15
Context
console/ is a subdirectory of the monorepo. Heroku expects to receive a repo
whose root IS the app. The canonical CI path (deploy-console.yml) handles this
with git subtree split and a direct git push. This runbook documents the
equivalent shell procedure for use when the CI workflow itself is unavailable
(e.g. runner outage, GitHub Actions incident, break-glass scenario).
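To see what the subtree split produces, the sketch below builds a throwaway repository with a console/ subdirectory and splits it. The temporary repository, the file names, and the demo identity are illustrative assumptions and not the real monorepo; it also assumes git subtree is installed alongside git.

```shell
# Throwaway demo repo; nothing here touches the real monorepo.
repo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$repo"
mkdir "$repo/console"
echo 'print("ok")' > "$repo/console/app.py"
echo 'monorepo root' > "$repo/README"
git -C "$repo" add .
git -C "$repo" commit -q -m "initial commit"

# Split console/ into a synthetic commit; the SHA is printed on stdout.
split_sha=$(git -C "$repo" subtree split --prefix=console HEAD 2>/dev/null)

# The synthetic commit's root tree is the content of console/.
git -C "$repo" ls-tree --name-only "$split_sha"
```

The listing shows app.py at the root of the split commit and no README, which is the layout Heroku expects to receive.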
When to use this runbook
Use this only when deploy-console.yml workflow_dispatch is unavailable or
has failed in a way that cannot be fixed quickly. Normal deploys go through CI.
Prerequisites
- Heroku account email with access to raxx-console-prod
- Heroku API token (from Infisical: /MooseQuest/heroku/HEROKU_API_KEY)
- Local clone of the monorepo with main up to date: git fetch origin && git checkout main && git pull
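The prerequisites can be checked mechanically before starting. The following is a minimal sketch, assuming it is run from the root of the monorepo clone after git fetch origin; the function names require_cmd and preflight are hypothetical and not part of any existing tooling.

```shell
# Hypothetical preflight helpers; names are illustrative only.

# require_cmd NAME: succeed only if NAME is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing tool: $1" >&2; return 1; }
}

# preflight: check tools, branch, and that local main matches origin/main.
preflight() {
  for tool in git curl python3; do
    require_cmd "$tool" || return 1
  done

  branch=$(git rev-parse --abbrev-ref HEAD) || return 1
  if [ "$branch" != "main" ]; then
    echo "on '$branch', expected main" >&2
    return 1
  fi

  # Assumes `git fetch origin` has already refreshed origin/main.
  if [ "$(git rev-parse HEAD)" != "$(git rev-parse origin/main)" ]; then
    echo "local main differs from origin/main" >&2
    return 1
  fi
}
```

Running preflight && echo "ready" before step 1 gives a quick go/no-go signal without changing any state.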
Manual deploy procedure
# 1. Confirm you are on the correct commit.
git log --oneline -5
# 2. Fetch your Heroku token from vault (or export from environment).
# Never paste the raw token in a script stored in the repo.
HEROKU_EMAIL="kris@moosequest.net"
HEROKU_API_KEY="<token from Infisical /MooseQuest/heroku/HEROKU_API_KEY>"
# 3. Write a .netrc for authentication.
# NOTE: this overwrites any existing ~/.netrc; back it up first if you have one.
# umask 077 ensures the file is created with 600 permissions.
umask 077
cat > "$HOME/.netrc" <<EOF
machine git.heroku.com
login ${HEROKU_EMAIL}
password ${HEROKU_API_KEY}
EOF
chmod 600 "$HOME/.netrc"
# 4. Produce a synthetic root commit whose tree IS the console/ subtree.
SUBTREE_SHA=$(git subtree split --prefix=console HEAD 2>/dev/null)
echo "Subtree SHA: $SUBTREE_SHA"
# 5. Push to Heroku.
git push --force \
"https://git.heroku.com/raxx-console-prod.git" \
"${SUBTREE_SHA}:refs/heads/main"
# 6. Verify the deploy completed.
# Wait ~60 s for dyno restart, then:
curl -fsS https://console.raxx.app/health | python3 -m json.tool
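# Expected: HTTP 200, {"status":"ok"}. A 302 redirect to the CF Access login is
# also acceptable, but json.tool will then likely report a parse error because
# the response body is not JSON.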
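Instead of waiting a fixed 60 seconds in step 6, the health endpoint can be polled until it answers. This is a sketch only: the helper names probe and wait_for_health, the default of 12 attempts, and the 10-second delay are assumptions, and it treats both 200 and 302 as healthy, matching the expectation stated in step 6.

```shell
# probe URL: print the HTTP status code (000 if the request itself fails).
probe() {
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$1" 2>/dev/null || true
}

# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url=$1
  attempts=${2:-12}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    code=$(probe "$url")
    case "$code" in
      200|302)
        echo "healthy after $i attempt(s): HTTP $code"
        return 0
        ;;
    esac
    echo "attempt $i/$attempts: HTTP $code" >&2
    i=$((i + 1))
    # Sleep only if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, wait_for_health https://console.raxx.app/health 12 10 polls for up to about two minutes and returns non-zero if the dyno never becomes healthy.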
How to tell it's broken
- Symptom: deploy-console.yml workflow_dispatch fails with process.stdin.setRawMode is not a function (akhileshns action on Node 20+)
- Symptom: git push returns Invalid credentials (wrong email or expired token)
- Symptom: git subtree split produces no output / exits non-zero (the console/ path does not exist in the current ref; verify the ref is correct)
- Symptom: health check returns 503 after push (dyno crashed on boot; check heroku logs --tail --app raxx-console-prod)
Known failure modes
Failure mode A: akhileshns setRawMode crash (Node 20+)
Symptom: CI step Deploy to Heroku fails with process.stdin.setRawMode is not a function
Cause: akhileshns/heroku-deploy uses an interactive TTY API that does not exist in CI runners
Fix: The CI workflow now uses the netrc + git-push pattern (PR #776). If this runbook is being used it means CI itself is down — proceed with the manual steps above.
Verification: curl -fsS https://console.raxx.app/health returns 200
Failure mode B: Heroku git auth rejection
Symptom: git push returns error: authentication failed or Do not authenticate with username and password using git
Cause: The .netrc login is not the registered Heroku account email, OR the token is expired/revoked
Fix:
1. Confirm the email matches the Heroku account: heroku auth:whoami (if CLI available)
2. Rotate the token: Heroku dashboard -> Account -> API Key -> Regenerate
3. Update the secret in Infisical and in the GitHub Environment secrets
Verification: Retry the push; it should proceed without auth errors
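When the CLI is not available, it can help to confirm which login the .netrc actually holds for git.heroku.com without echoing the token. The helper below is a hypothetical sketch (the name netrc_login is an assumption); it handles the simple whitespace-separated layout written in step 3, not every .netrc variant.

```shell
# netrc_login MACHINE [FILE]: print the login stored for MACHINE.
# The password is never printed.
netrc_login() {
  awk -v m="$1" '
    {
      for (i = 1; i <= NF; i++) {
        if ($i == "machine") { cur = $(i + 1) }
        if ($i == "login" && cur == m) { print $(i + 1); exit }
      }
    }
  ' "${2:-$HOME/.netrc}"
}
```

Calling netrc_login git.heroku.com and comparing the result with the registered Heroku account email rules out the first cause before rotating the token.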
Failure mode C: run_id_not_found_within_window — dispatch accepted but no run appears
Symptom: Console deploy modal shows "Deploy timed out — GitHub Actions run could not be matched within 90 seconds." Failure stage shows DISPATCH. All downstream stages (Smoke gate / Freeze check / Deploy / Health check) show "not reached." Log pane is empty.
Cause: workflow_dispatch returned HTTP 204 (accepted) but no GH Actions run materialized in the API within the reconciler's 30-minute window. Known triggers:
1. Transient GH runner availability gap — run entered queued and disappeared before first reconciler poll.
2. Smoke gate failed before the building callback fired: the run exists but never called back, so github_run_id was not back-filled and the console row times out.
3. Token scope insufficient — token lacks actions:write permission; dispatch silently accepts but queues nothing.
Diagnose:
# 1. Check if a run started at all in the dispatch window
gh run list \
--repo raxx-app/TradeMasterAPI \
--workflow=deploy-console.yml \
--created '2026-05-15T06:20:00Z..2026-05-15T06:30:00Z' \
--json databaseId,displayTitle,conclusion,createdAt,event
# 2. If a run exists but failed — check which job failed
gh run view <run_id> --repo raxx-app/TradeMasterAPI --json jobs
# 3. If no run exists — check GH status for active incidents
# https://www.githubstatus.com/
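# 4. Verify the token is valid and inspect its scopes (needs actions:write for dispatch).
# Classic PATs list their scopes in the X-OAuth-Scopes response header; the
# response body does not show them. Fine-grained tokens do not send this
# header, so check their Actions permission in the GitHub token settings.
curl -sS -D - -o /dev/null \
-H "Authorization: Bearer $GITHUB_API_DISPATCH_TOKEN" \
https://api.github.com/rate_limit | grep -i '^x-oauth-scopes'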
Fix: If no run started due to GH transient issue, retry the dispatch from the console UI. If smoke failed, fix the smoke failure first (check the failing run logs above), then retry. No rollback is needed because no Heroku push occurred.
Verification: Retry dispatch succeeds and progresses past DISPATCH stage in the modal within 2 minutes.
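The --created window in diagnose step 1 can be derived from the dispatch time rather than typed by hand. This sketch assumes GNU date (the -d flag behaves differently on macOS/BSD); the function name created_window and the default padding of 5 minutes are hypothetical.

```shell
# created_window TIMESTAMP [PAD_MINUTES]
# Prints "FROM..TO" in UTC, padded on both sides of TIMESTAMP.
# Assumes GNU date; converts to epoch seconds first to avoid
# ambiguous relative-time parsing.
created_window() {
  ts=$1
  pad=${2:-5}
  epoch=$(date -u -d "$ts" +%s) || return 1
  from=$(date -u -d "@$((epoch - pad * 60))" +%Y-%m-%dT%H:%M:%SZ)
  to=$(date -u -d "@$((epoch + pad * 60))" +%Y-%m-%dT%H:%M:%SZ)
  echo "${from}..${to}"
}
```

The result can be passed directly, for example --created "$(created_window 2026-05-15T06:25:00Z)", which yields the same window used in the diagnose step.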
Failure mode D: subtree split fails (no output)
Symptom: git subtree split --prefix=console HEAD exits 0 but prints nothing; push target SHA is empty
Cause: The console/ directory does not exist at HEAD, or the git history has no commits touching console/
Fix: Verify the working directory and ref:
git log --oneline --follow -- console/app.py | head -3
ls console/
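If console/ is present but subtree split still fails, try with --rejoin. Note that --rejoin also merges the split history back into the current branch as a new merge commit, so run it on a throwaway branch rather than on main: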
git subtree split --prefix=console --rejoin HEAD
Verification: SUBTREE_SHA is a 40-character SHA
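The verification above can be enforced before the push so that an empty SHA never reaches git push. A minimal sketch, assuming a SHA-1 repository (40 lowercase hex characters); the helper name is_full_sha is hypothetical.

```shell
# is_full_sha VALUE: succeed only if VALUE is exactly 40 lowercase hex chars.
is_full_sha() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# Intended use between steps 4 and 5 of the manual procedure:
#   is_full_sha "$SUBTREE_SHA" || { echo "subtree split failed" >&2; exit 1; }
```

Placing the guard between steps 4 and 5 turns the silent failure into an explicit abort.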
Enabling runtime-dyno-metadata (one-time setup)
Required for HEROKU_SLUG_COMMIT to be available in the dyno environment
(used by the version-footer commit link per #775). Run once per app:
heroku labs:enable runtime-dyno-metadata --app raxx-console-prod
Verify after the next deploy:
heroku run "printenv | grep HEROKU_SLUG_COMMIT" --app raxx-console-prod
Emergency stop
To take raxx-console-prod offline cleanly (scale the web dynos to 0):
heroku ps:scale web=0 --app raxx-console-prod
To bring it back:
heroku ps:scale web=1 --app raxx-console-prod
Escalation
Wake the operator when:
- The Heroku API token is revoked and cannot be rotated without account access
- The console/ subtree has a corrupted git history that blocks subtree split
- The dyno crashes on boot after a successful push (application-level bug, not deploy-level)
Contact: Kristerpher via Slack DM (D0AJ7K184TV) or kris@moosequest.net
References
- CI workflow: .github/workflows/deploy-console.yml
- Pattern origin: PR #730 (netrc auth applied to deploy-heroku.yml)
- Root cause issue: #776
- Heroku git auth docs: https://devcenter.heroku.com/articles/git#http-git-authentication
- Heroku runtime-dyno-metadata: https://devcenter.heroku.com/articles/dyno-metadata