Raxx · internal docs


Deploy Freeze — Operator Runbook

Issue #635. Part of prod-deploy gating (#617).


What is the deploy freeze?

A global operator-controlled gate that blocks all prod deploy workflows when activated. It is the outermost gate — it does not replace per-deploy workflow confirmations, but it fires before them.

Every toggle (freeze/release) is written to the console audit log (console.deploy.freeze.flip).


When to activate (SEV1 / SEV2 incidents)

Freeze deploys when:

  - A SEV1 or SEV2 incident is in progress and you need to prevent any additional changes from landing in production while the investigation is underway.
  - You need a maintenance window in which no automated or manual deploys should fire.
  - An on-call operator requests a hold while a hotfix is being prepared and reviewed.

Do NOT freeze for routine work. The freeze is a blunt instrument — use it only when you need a hard stop.


How to freeze (primary path — console UI)

  1. Navigate to https://console.raxx.app/console/deploy-freeze (superadmin session required).
  2. The current state card shows whether deploys are active or frozen.
  3. Click Freeze Deploys.
  4. In the confirmation modal, enter a concise reason (e.g. SEV1 — API regression in prod, investigation in progress). The reason is stored and surfaced in workflow failure messages.
  5. Click Freeze Deploys in the modal to confirm.
  6. The page reloads and the status card turns red: FROZEN — all prod deploys blocked.

Every prod deploy workflow that runs after this will fail at the Deploy freeze check job with:

DEPLOY FROZEN — reason: <your reason> — toggle off at https://console.raxx.app/console/deploy-freeze

How to release (primary path)

  1. Navigate to https://console.raxx.app/console/deploy-freeze.
  2. Click Release Freeze.
  3. Confirm in the modal.
  4. Prod deploys resume on the next workflow run.

Break-glass: console is down

If the console is unreachable (for example, the console itself is down as part of the incident) and you need to freeze deploys, set the override secret with the GitHub CLI:

gh secret set DEPLOY_FREEZE_OVERRIDE --repo raxx-app/TradeMasterAPI --body 1

This sets a GitHub Actions repo secret. On the next workflow run, the check-deploy-freeze action will see DEPLOY_FREEZE_OVERRIDE=1 and fail the job with:

DEPLOY FROZEN (break-glass) — DEPLOY_FREEZE_OVERRIDE=1 is set.
Clear the repo secret to release.

To release the break-glass freeze:

gh secret delete DEPLOY_FREEZE_OVERRIDE --repo raxx-app/TradeMasterAPI

Or overwrite it with 0 (the check also treats an empty value as released):

gh secret set DEPLOY_FREEZE_OVERRIDE --repo raxx-app/TradeMasterAPI --body 0

Once the console recovers, also toggle the UI freeze off if it was set there too (check the audit log — see below).
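To confirm whether the break-glass secret currently exists on the repo, the output of gh secret list can be filtered by name. The following is a minimal sketch, not part of the shipped tooling: the helper name secret_is_listed is hypothetical, and it assumes the secret name appears in the first column of the listing. GitHub never returns secret values, so a secret that was overwritten with 0 is still listed; only a deleted secret is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `gh secret list` output on stdin and succeeds
# (exit 0) if the named secret appears in the first column, else exits 1.
secret_is_listed() {
  local name="$1"
  awk -v n="$name" '$1 == n { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage (needs an authenticated gh with access to the repo):
#   gh secret list --repo raxx-app/TradeMasterAPI \
#     | secret_is_listed DEPLOY_FREEZE_OVERRIDE \
#     && echo "override secret exists" \
#     || echo "override secret absent"
```

Because presence alone does not reveal the value, prefer gh secret delete when releasing so that "absent" unambiguously means "no break-glass freeze".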


Fail-open behaviour (when both signals are absent)

If the console API is unreachable AND DEPLOY_FREEZE_OVERRIDE is not set, the check-deploy-freeze action logs a ::warning:: and PROCEEDS. This is intentional: fail-closed on every console hiccup would brick CI. The audit trail means you can always reconstruct what deployed while the console was down.
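The combined behaviour of the two signals can be summarised as a small decision function. This is an illustrative sketch only, not the source of the check-deploy-freeze action: the function name, its arguments, the wording of the warning, and the assumption that the override is evaluated before the console state are all hypothetical. The two failure messages are copied from this runbook.

```shell
#!/usr/bin/env bash
# Sketch of the freeze decision described above (not the real action).
# Arguments (hypothetical names):
#   $1 override  value of DEPLOY_FREEZE_OVERRIDE ("" when unset)
#   $2 console   "frozen", "active", or "unreachable"
#   $3 reason    freeze reason reported by the console (may be empty)
# Returns 1 when the deploy must be blocked, 0 when it may proceed.
freeze_decision() {
  local override="$1" console="$2" reason="$3"

  # Break-glass secret blocks deploys regardless of console state.
  if [ "$override" = "1" ]; then
    echo "DEPLOY FROZEN (break-glass) — DEPLOY_FREEZE_OVERRIDE=1 is set."
    return 1
  fi

  case "$console" in
    frozen)
      echo "DEPLOY FROZEN — reason: ${reason} — toggle off at https://console.raxx.app/console/deploy-freeze"
      return 1
      ;;
    unreachable)
      # Fail open: emit a workflow warning and let the deploy continue.
      echo "::warning::deploy-freeze state unknown (console unreachable); proceeding"
      return 0
      ;;
    *)
      echo "deploys active"
      return 0
      ;;
  esac
}
```

For example, freeze_decision "" unreachable "" prints the warning and returns 0, while freeze_decision 1 unreachable "" returns 1, which is why the break-glass secret still works while the console is down.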


Audit log location

All freeze flips are written to console_audit_log with action console.deploy.freeze.flip.

To review post-incident:

  1. Console superadmin session → check audit log if a UI view is available.
  2. Direct DB query (Heroku Postgres):

     SELECT created_at, payload_redacted
     FROM audit_log
     WHERE action = 'console.deploy.freeze.flip'
     ORDER BY created_at DESC
     LIMIT 20;
  3. The GitHub Actions logs for the failed workflow runs show the freeze reason inline.
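The direct query in step 2 can be run from a terminal through the Heroku CLI. The sketch below is one possible way to do that: the helper name freeze_audit_sql and its row-limit argument are hypothetical, and it assumes the audit table lives in the Postgres database attached to raxx-console-prod, which this runbook does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the audit query from step 2 with a
# configurable row limit (default 20).
freeze_audit_sql() {
  local limit="${1:-20}"
  # Accept digits only, so nothing else is interpolated into the SQL.
  case "$limit" in
    ''|*[!0-9]*) echo "limit must contain digits only" >&2; return 2 ;;
  esac
  printf "SELECT created_at, payload_redacted FROM audit_log WHERE action = 'console.deploy.freeze.flip' ORDER BY created_at DESC LIMIT %s;\n" "$limit"
}

# Usage (assumption: the database is attached to raxx-console-prod):
#   heroku pg:psql --app raxx-console-prod -c "$(freeze_audit_sql 50)"
```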

Two secrets to provision (operator action, one-time setup)

The check-deploy-freeze composite action authenticates to the console internal API using a Cloudflare Access service token. Provision and set these two repo secrets:

Secret name                    | Where to find the value
CF_DEPLOY_FREEZE_CLIENT_ID     | CF Zero Trust → Access → Service Tokens → create deploy-freeze-ci → Client ID
CF_DEPLOY_FREEZE_CLIENT_SECRET | Same token → Client Secret

These are distinct from CF_ACCESS_CLIENT_ID/SECRET (the vault bootstrap pair). Store them in vault at /MooseQuest/console/ for rotation reference.

Also ensure the corresponding env vars are set on the console Heroku app:

heroku config:set CF_DEPLOY_FREEZE_CLIENT_ID=<value> CF_DEPLOY_FREEZE_CLIENT_SECRET=<value> --app raxx-console-prod

Future scope (out of this card)