Gatekeeper — develop to release runbook
System: .github/workflows/gatekeeper-develop-to-release.yml
Owner: sre-agent / operator
Last incident: 2026-06-30 (key leak + 403 push failure — see security note below)
Last reviewed: 2026-06-30
Security note — 2026-06-30 — raxx-ops-bot App private key exposed in logs
Incident: The initial run of cut-release-candidate.yml logged the raxx-ops-bot GitHub App RSA private key in cleartext (base64 PEM lines visible in step output).
Root cause: load-vault-secrets/action.yml used echo "::add-mask::${VALUE}" to register secrets for masking. GitHub Actions' ::add-mask:: command only processes the FIRST line of its argument; subsequent lines of a multiline PEM key appeared unmasked in the step log.
Fix applied (this PR): Replaced the single-echo mask with a per-line loop in load-vault-secrets/action.yml so every line of every fetched secret is individually masked before the value is written to GITHUB_ENV.
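The per-line masking described above can be sketched roughly as follows. This is an illustration, not the code in load-vault-secrets/action.yml: the helper name mask_secret is hypothetical, and the sketch assumes the secret value is passed as a single argument.

```shell
# Hypothetical helper: emit one ::add-mask:: workflow command per
# non-empty line, so every line of a multi-line secret (for example a
# PEM key) is registered for masking, not only the first one.
mask_secret() {
  # printf '%s\n' guarantees a trailing newline, so the last line is read too.
  printf '%s\n' "$1" | while IFS= read -r line; do
    # Skip blank lines: there is nothing to mask on them.
    [ -n "$line" ] || continue
    printf '::add-mask::%s\n' "$line"
  done
}
```

In a workflow step, the masking calls would need to run before the value is written to GITHUB_ENV or echoed anywhere.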
Operator action required — rotate the raxx-ops-bot App private key: The private key was briefly visible in GitHub Actions step logs. Although Actions logs are restricted to repo collaborators, the key must be treated as compromised and rotated.
# 1. Generate a new private key in the GitHub App settings:
# GitHub → Settings → Developer settings → GitHub Apps → raxx-ops-bot
# → Private keys → Generate a private key → download the .pem file
# 2. Update the key in Infisical vault:
# Vault path: /MooseQuest/raxx-ops-bot/PRIVATE_KEY_PEM
# Update via the Infisical UI or API with the new PEM content.
# 3. Revoke the old private key in the GitHub App settings.
Until the key is rotated, treat any operations authenticated as raxx-ops-bot during the exposure window as potentially unauthorized.
Model change — 2026-06-30
The gatekeeper was redesigned from a push-triggered model to a tag-triggered model.
Old model (removed): on: push: branches: [develop] — fired on every push to develop, polled ci.yml for up to 30 minutes per push. On a rapid merge-train this fired repeatedly, churned cancelled runs, and one run hit the 30-min timeout and failed.
New model: on: push: tags: ['release-*'] — fires only when a release-* tag is pushed. The tag is the explicit "this develop commit is release-worthy" signal. Fixes accumulate on develop freely and roll up under the next release tag.
The cut-release-candidate workflow (cut-release-candidate.yml) is the standard entry point for creating a release tag.
What the gatekeeper does (new model)
When a release-* tag is pushed to the repository:
- Asserts the tagged commit is on develop: refuses to promote a tag that is not an ancestor of origin/develop.
- Verifies the tagged SHA's ci.yml run concluded success: a short bounded check (10 probes × 30 s = 5 min max). Fails fast if CI is not green.
- Idempotency: if the tagged commit is already in release, exits cleanly with no action.
- Merges develop → release with --no-ff (merge commit per ADR-0115 §Merge strategy).
- Pushes release: the push fires the existing staging deploy workflows automatically (deploy-heroku.yml, deploy-antlers-next-staging.yml, deploy-console.yml, deploy-queue.yml, deploy-velvet.yml). No deploy logic lives in the gatekeeper.
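The bounded CI check (10 probes × 30 s) follows a common retry pattern. A minimal sketch, assuming a hypothetical helper named bounded_check that takes the probe count, the interval in seconds, and the command to run; the workflow's real step may be structured differently:

```shell
# Hypothetical helper: run a check up to max_probes times, sleeping
# interval seconds between attempts. Returns 0 on the first success and
# 1 if every probe fails, so the total wait is bounded.
bounded_check() {
  local max_probes="$1" interval="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_probes" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed probe.
    if [ "$attempt" -lt "$max_probes" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example call (ci_is_green is a placeholder for the real probe):
#   bounded_check 10 30 ci_is_green "$TAG_SHA"
```

Keeping the probe command as an argument lets the same loop be exercised with a stub and a zero interval when testing.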
What cut-release-candidate does
cut-release-candidate.yml is a workflow_dispatch workflow that:
- Checks out develop HEAD.
- Verifies ci.yml is green for that commit (non-polling: fails immediately if not complete).
- Computes the release tag name (release-YYYY.MM.DD, or release-YYYY.MM.DD-<sha7> if a tag for today already exists).
- Creates an annotated tag on develop HEAD and pushes it.
- The tag push fires gatekeeper-develop-to-release.yml automatically.
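The tag naming rule in the list above can be illustrated with a small function. This is a sketch under assumptions: compute_release_tag is a hypothetical name, and "a tag for today already exists" is interpreted here as any existing tag equal to release-YYYY.MM.DD or starting with that name followed by a hyphen.

```shell
# Hypothetical helper. Arguments: date (YYYY.MM.DD), commit SHA, and a
# whitespace-separated list of existing tag names.
compute_release_tag() {
  local day="$1" sha="$2" existing="$3"
  local base="release-${day}" tag taken=0
  for tag in $existing; do
    case "$tag" in
      # Same-day tag, with or without a short-SHA suffix.
      "$base" | "$base"-*) taken=1 ;;
    esac
  done
  if [ "$taken" -eq 1 ]; then
    # Disambiguate with the first 7 characters of the SHA.
    printf '%s-%s\n' "$base" "$(printf '%s' "$sha" | cut -c1-7)"
  else
    printf '%s\n' "$base"
  fi
}

# Possible call site:
#   compute_release_tag "$(date -u +%Y.%m.%d)" "$(git rev-parse HEAD)" \
#     "$(git tag --list 'release-*')"
```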
Going to production (release → main)
The release → main boundary is governed by a semver tag (v*.*.*). Pushing a semver tag triggers promote-release-to-main.yml, which runs the heavy scan suite and on success merges release → main (→ prod deploy). See docs/ops/runbooks/promote-release-to-main.md (or the workflow header) for the full operator procedure.
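Because a pushed tag triggers the prod promotion, it may help to sanity-check the tag name locally before pushing. The following is a sketch only: is_semver_tag is a hypothetical helper, and it is deliberately stricter than the v*.*.* pattern (numeric vMAJOR.MINOR.PATCH only, no pre-release or build suffix), so the workflow's actual filter may accept names this check rejects.

```shell
# Hypothetical pre-push check: succeed only for tags of the form
# vMAJOR.MINOR.PATCH with numeric parts and no leading zeros.
is_semver_tag() {
  printf '%s\n' "$1" |
    grep -Eq '^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}

# Example:
#   is_semver_tag v1.4.0 && git push origin v1.4.0
```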
Operator commands
Cut a release candidate and promote develop → staging:
# Standard path — let cut-release-candidate.yml do the work:
gh workflow run cut-release-candidate.yml
# Monitor the tag creation:
gh run list --workflow=cut-release-candidate.yml --limit=3
# Monitor the gatekeeper promotion:
gh run list --workflow=gatekeeper-develop-to-release.yml --limit=3
# Monitor staging deploys:
gh run list --workflow=deploy-heroku.yml --branch=release --limit=5
Ship a prod release (release → main via semver tag):
# Apply a semver tag to the tip of the release branch:
git fetch origin release
git tag -a v1.4.0 origin/release -m "Release v1.4.0"
git push origin v1.4.0
# Monitor the heavy-scan + promotion:
gh run list --workflow=promote-release-to-main.yml --limit=3
# Monitor prod deploys:
gh run list --workflow=deploy-heroku.yml --branch=main --limit=5
Manual gatekeeper trigger (after a freeze or when re-tagging):
# Re-push an existing tag (force) to re-fire the gatekeeper on the same SHA:
git fetch origin --tags
git push origin refs/tags/release-2026.06.30 --force
# Or cut a new tag for the same SHA (preferred — creates an audit record):
git tag -a release-2026.06.30-abc1234 <sha> -m "Re-cut RC"
git push origin release-2026.06.30-abc1234
How to tell the gatekeeper is broken
- A release-* tag exists in the repo but the release branch did not advance within ~10 min.
- No staging deploy fired after the tag was pushed.
- The workflow run in GitHub Actions → "Gatekeeper — develop to release" shows a failed run.
How to diagnose (in order)
1. Find the relevant gatekeeper run:
gh run list --workflow=gatekeeper-develop-to-release.yml --limit=5
2. View the logs for the failed run:
gh run view <run-id> --log
3. Identify which step failed:
   - "Load raxx-ops-bot credentials": vault unreachable or INFISICAL_* secrets stale.
   - "Mint raxx-ops-bot installation token": GitHub App credentials invalid or expired.
   - "Assert tagged commit is on develop": the tag was applied to a commit not on develop.
   - "Verify ci.yml is green for tagged commit": ci.yml failed or not yet completed for the tagged SHA.
   - "Merge develop into release": merge conflict or push permission error.
4. For CI gate failures, find the specific ci.yml run:
gh run list --workflow=ci.yml --branch=develop --limit=5
gh run view <run-id> --log-failed
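The step-to-cause mapping used during diagnosis can be captured as a small lookup, which may be handy in an alerting script. This is a sketch: failure_mode_for_step is a hypothetical helper, and the letters refer to the failure modes listed in this runbook for the tag-triggered model.

```shell
# Hypothetical helper: map a failed step name to the failure mode(s) in
# this runbook. Returns non-zero for a step name it does not recognize.
failure_mode_for_step() {
  case "$1" in
    "Load raxx-ops-bot credentials")
      echo "E: vault unreachable" ;;
    "Mint raxx-ops-bot installation token")
      echo "no lettered mode: GitHub App credentials invalid or expired" ;;
    "Assert tagged commit is on develop")
      echo "B: tagged commit not on develop" ;;
    "Verify ci.yml is green for tagged commit")
      echo "A: ci.yml not green for tagged commit" ;;
    "Merge develop into release")
      echo "C, F or G: merge conflict, GH006 branch protection, or 403" ;;
    *)
      echo "unknown step"
      return 1 ;;
  esac
}
```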
Known failure modes
Failure mode A: ci.yml not green for tagged commit
Symptom: Step "Verify ci.yml is green for tagged commit" exits with ci.yml concluded 'failure' or times out after 5 min.
Cause A1 (failure conclusion): The tagged commit's ci.yml run failed. This is the gate working correctly — a red commit was tagged by mistake.
Fix: Fix the failing check on develop, then cut a new release tag:
# After the fix PR merges to develop:
gh workflow run cut-release-candidate.yml
Cause A2 (timeout — no completed run within 5 min): ci.yml was never triggered for this commit, OR it is queued behind other runs. The 5-min bounded check is designed to be short precisely because we only tag commits we believe are already green.
Fix: Verify ci.yml ran for the tagged SHA:
TAG_SHA=$(git rev-parse refs/tags/<tag-name>)
gh api "repos/raxx-app/TradeMasterAPI/actions/workflows/ci.yml/runs?head_sha=${TAG_SHA}&per_page=5" \
--jq '.workflow_runs[] | {id, status, conclusion, html_url}'
If ci.yml succeeded but the API returned it slowly, re-push the tag to re-fire the gatekeeper:
git push origin refs/tags/<tag-name> --force
Verification: gh run list --workflow=gatekeeper-develop-to-release.yml --limit=3
Failure mode B: tagged commit not on develop
Symptom: Step "Assert tagged commit is on develop" exits with "NOT on origin/develop."
Cause: The release-* tag was applied to a commit that is not in the develop branch history — possibly a hotfix branch commit, a detached HEAD, or the wrong SHA.
Fix: Delete the erroneous tag and re-apply it to the correct develop commit:
# Delete the bad tag locally and remotely:
git push origin :refs/tags/<tag-name>
git tag -d <tag-name>
# Apply to the correct develop commit:
git fetch origin develop
git tag -a <tag-name> origin/develop -m "RC <tag-name>"
git push origin <tag-name>
Failure mode C: merge conflict on promote step
Symptom: Step "Merge develop into release" exits with a git merge conflict error.
Cause: Someone pushed directly to release outside this workflow (e.g. a manual hotfix merge), creating a divergence from develop.
This is an escalation. Do NOT force-push to resolve it. The conflict represents a real divergence that requires human triage.
Fix (operator action):
1. Identify what is on release but not on develop:
git log origin/develop..origin/release --oneline
2. Cherry-pick the hotfix commits to develop:
git fetch origin develop
git checkout develop
git cherry-pick <hotfix-sha>
git push origin develop
3. Cut a new release tag once develop is clean:
gh workflow run cut-release-candidate.yml
See ADR-0115 §Emergency hotfix path for the full hotfix procedure.
Failure mode D: YAML syntax error in workflow file (run duration = 0 s)
Symptom: Every gatekeeper run fails instantly — start_time equals end_time, no step logs are produced.
Cause: A change to gatekeeper-develop-to-release.yml introduced a YAML syntax error. Common trigger: a multi-line string inside a run: | literal block where inner lines are at column 0. YAML treats those as the end of the block.
This was the failure mode that triggered the original redesign (recurring workflow failures, 2026-06-30).
Fix: Validate the workflow YAML before pushing:
python3 -c "import yaml; yaml.safe_load(open('.github/workflows/gatekeeper-develop-to-release.yml').read()); print('OK')"
actionlint .github/workflows/gatekeeper-develop-to-release.yml
Failure mode E: vault unreachable
Symptom: Step "Load raxx-ops-bot credentials" fails with a vault auth error.
Fix: See docs/ops/runbooks/vault-access.md. Once vault is restored, re-push the tag to re-fire the gatekeeper:
git push origin refs/tags/<tag-name> --force
Failure mode F: release branch protection blocks the push (GH006)
Symptom: Step "Merge develop into release" exits with GH006 (protected branch update failed).
Cause: The release branch protection requires status checks or a reviewer for direct pushes, which conflicts with the bot push from the gatekeeper.
Correct release protection state for the tag-gated model:
- required_status_checks: null (none)
- required_pull_request_reviews: null (none)
- allow_force_pushes: false
- allow_deletions: false
- enforce_admins: false
The CI gate lives inside the gatekeeper workflow itself. The heavy PR-gate scans (ZAP, Playwright, etc.) are on the release → main boundary (promote-release-to-main.yml), not here.
Fix:
gh api -X PUT repos/raxx-app/TradeMasterAPI/branches/release/protection \
--input - <<'JSON'
{
"required_status_checks": null,
"required_pull_request_reviews": null,
"restrictions": null,
"enforce_admins": false,
"allow_force_pushes": false,
"allow_deletions": false
}
JSON
# Verify
gh api repos/raxx-app/TradeMasterAPI/branches/release/protection \
--jq '{has_checks: (.required_status_checks != null), has_reviews: (.required_pull_request_reviews != null)}'
# → {"has_checks":false,"has_reviews":false}
# Re-push the tag to re-fire the gatekeeper
git push origin refs/tags/<tag-name> --force
How to pause auto-promotion (maintenance freeze)
gh workflow disable gatekeeper-develop-to-release.yml
While disabled, no promotions fire regardless of how many release-* tags are pushed. When ready to resume:
gh workflow enable gatekeeper-develop-to-release.yml
# Cut a new tag to trigger promotion of the current develop HEAD:
gh workflow run cut-release-candidate.yml
Failure mode G: push fails with 403 — App token lacks Contents:write
Symptom: Step "Merge develop into release" (or "Create and push annotated release tag" in cut-release-candidate.yml) exits with a 403. The run log shows either:
- remote: Permission to org/repo.git denied to raxx-ops-bot[bot].
- The ::warning:: line: App token push failed — raxx-ops-bot App needs Contents:write on this repo.
Cause: The raxx-ops-bot GitHub App installation does not have Contents: Read & Write permission on this repository. The App token mints successfully but the push operation is denied.
Note on GITHUB_TOKEN: GITHUB_TOKEN is NOT a valid fix here. GitHub Actions docs specify that pushes authenticated with GITHUB_TOKEN do not trigger other workflow runs — using it would silently break the downstream deploy trigger chain.
Permanent fix (preferred least-privilege end state):
GitHub → Settings → Developer settings → GitHub Apps → raxx-ops-bot
→ Permissions → Repository permissions → Contents: Read & Write → Save
After saving, re-run the failed workflow:
gh run rerun <failed-run-id>
Required App permissions (full end state):
| Permission | Level | Required by |
|---|---|---|
| Contents | Read & Write | Push tags + push branch merge commits |
| Pull requests | Read & Write | release.yml (release-please opens/updates PRs) |
Immediate fix (while App permission is pending):
Add the RAXX_OPS_BOT_PAT repo secret (classic GitHub PAT with repo scope):
GitHub → Repo settings → Secrets and variables → Actions → New repository secret
Name: RAXX_OPS_BOT_PAT
Value: <classic PAT from the raxx-ops-bot machine user account, scope: repo>
Once set, re-run the failed workflow — the push step falls back to the PAT automatically and emits a ::warning:: annotation. The warning persists until the App is granted Contents:write, at which point the PAT fallback is unused.
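The App-token-then-PAT fallback described above might look roughly like the sketch below. The function name push_with_fallback and the pluggable push command are assumptions made for illustration, and the ::error:: text is invented for the example; only the ::warning:: text is taken from this runbook.

```shell
# Hypothetical sketch. Arguments: a push command (called with a token as
# its only argument), the App installation token, and an optional PAT.
push_with_fallback() {
  local push_cmd="$1" app_token="$2" pat="${3:-}"
  # Preferred path: push with the App token.
  if "$push_cmd" "$app_token"; then
    return 0
  fi
  if [ -z "$pat" ]; then
    echo "::error::App token push failed and no PAT fallback is configured." >&2
    return 1
  fi
  # Fallback path: annotate the run, then retry with the PAT.
  echo "::warning::App token push failed — raxx-ops-bot App needs Contents:write on this repo."
  "$push_cmd" "$pat"
}
```

Passing the push command in as an argument keeps the fallback logic testable with a stub instead of a real git push.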
Manual promote (when gatekeeper is disabled or paused)
git fetch origin develop release
git checkout -B release origin/release
git merge --no-ff origin/develop -m "chore(release): manual promote develop->release $(date -u +%Y.%m.%d)"
git push origin release
Concurrency model
The gatekeeper uses concurrency: group: gatekeeper-develop-to-release, cancel-in-progress: false.
- At most one run is in progress and one is pending at any time.
- Multiple release-* tags pushed rapidly: an in-progress run is never cancelled and later runs wait behind it. GitHub keeps only one pending run per concurrency group, so a newer pending run replaces an older pending one.
- An in-flight merge push is never interrupted (git push is atomic, but interruption mid-run is still undesirable for auditability).
Emergency stop
# Cancel an in-flight run:
gh run cancel $(gh run list --workflow=gatekeeper-develop-to-release.yml \
--status=in_progress --json databaseId --jq '.[0].databaseId')
If the push to release completed before cancellation, staging will have received a deploy. Roll back staging if needed:
gh workflow run deploy-heroku.yml -f environment=staging -f ref=<previous-good-sha>
Escalation
Wake the operator when:
- Failure mode C (merge conflict) — requires human decision on the divergent hotfix.
- The gatekeeper has failed 3+ times in a row.
- Vault is unreachable for > 30 min.
- The release branch has been force-pushed or its protection removed.
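The "3+ failures in a row" condition can be checked mechanically. A minimal sketch, assuming a hypothetical helper consecutive_failures that receives run conclusions newest first:

```shell
# Hypothetical helper: count how many runs at the head of a newest-first
# list of conclusions are failures, stopping at the first non-failure.
consecutive_failures() {
  local count=0 conclusion
  for conclusion in "$@"; do
    [ "$conclusion" = "failure" ] || break
    count=$((count + 1))
  done
  echo "$count"
}

# Possible call site (gh run list returns the most recent runs first):
#   consecutive_failures $(gh run list \
#     --workflow=gatekeeper-develop-to-release.yml --limit=5 \
#     --json conclusion --jq '.[].conclusion')
```

A result of 3 or more would correspond to the escalation threshold above. Runs that are still in progress have an empty conclusion and are dropped by word splitting in the call site shown, so they are not counted and do not break the streak.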
References
- ADR: docs/architecture/adr/0115-develop-release-main-branching-model.md (§Tag-gated promotion model)
- Workflow: .github/workflows/gatekeeper-develop-to-release.yml
- RC entry point: .github/workflows/cut-release-candidate.yml
- Prod promotion: .github/workflows/promote-release-to-main.yml
- Related workflows: ci.yml, deploy-heroku.yml, deploy-antlers-next-staging.yml
How to tell the gatekeeper is broken
- A merge landed on develop but release did not advance after ~20 min.
- A merge landed on develop but no staging deploy fired.
- The workflow run in GitHub Actions → Workflows → "Gatekeeper — develop to release" shows a failed or skipped run.
- The expected tag (release-YYYY.MM.DD-<sha7>) is absent from the git tag list.
How to diagnose (in order)
1. Find the relevant gatekeeper run:
gh run list --workflow=gatekeeper-develop-to-release.yml --limit=5
2. View the logs for the failed run:
gh run view <run-id> --log
3. Identify which step failed:
   - "Load raxx-ops-bot credentials": vault unreachable or INFISICAL_* secrets stale.
   - "Mint raxx-ops-bot installation token": GitHub App credentials invalid or expired.
   - "Wait for CI — develop": ci.yml failed or timed out (see ci.yml run for details).
   - "Push release tag": tag collision or push permission error.
   - "Promote develop → release": merge conflict or push permission error.
4. For CI gate failures, find the specific ci.yml run:
gh run list --workflow=ci.yml --branch=develop --limit=5
gh run view <run-id> --log-failed
5. For push permission errors, verify the raxx-ops-bot GitHub App has contents: write on this repository.
Known failure modes
Failure mode A: CI gate timeout (30 min)
Symptom: Step "Wait for CI — develop" exits with "Timed out after 1800s."
Cause: ci.yml is taking more than 30 min, OR runner queue pressure delayed the ci.yml run, OR the ci.yml run was never created (possible if workflow file is broken).
Fix:
# Confirm ci.yml ran for the SHA
SHA=$(git rev-parse origin/develop)
gh api "repos/raxx-app/TradeMasterAPI/actions/workflows/ci.yml/runs?head_sha=${SHA}&per_page=5" \
--jq '.workflow_runs[] | {id, status, conclusion, html_url}'
If ci.yml ran and succeeded, re-trigger the gatekeeper manually:
gh workflow run gatekeeper-develop-to-release.yml --ref develop
If ci.yml is genuinely slow (> 30 min), increase MAX_WAIT in the workflow (file a type:reliability ticket first).
Verification: gh run list --workflow=gatekeeper-develop-to-release.yml --limit=3
Failure mode B: CI gate fails (develop CI red)
Symptom: Step "Wait for CI — develop" exits with "CI — develop failure for
Cause: This is the gate working as intended. A merge landed on develop with a failing check.
Fix: Fix the failing check on develop (not on a feature branch — the merge is already on develop). Options:
- If the failure is a flaky test: manually re-run the ci.yml run, then re-run the gatekeeper.
- If the failure is a real bug: open a PR targeting develop with the fix; the gatekeeper fires again after that merge.
Do NOT bypass the gate by manually merging develop → release. The gate exists to keep staging green.
Verification: Staging should not receive a deploy until the fix lands on develop and the gatekeeper passes.
Failure mode C: tag collision
Symptom: Step "Push release tag" exits with "Tag release-YYYY.MM.DD-sha7 already exists but points to a different SHA."
Cause: A tag with this name was manually created on a different commit. Should not occur in normal operation.
Fix:
# Identify the commit the conflicting tag points to
# (git tag -v only verifies a signature and fails on unsigned tags)
git fetch origin --tags
git rev-list -n 1 release-YYYY.MM.DD-<sha7>
# Delete the conflicting tag (operator action — confirm correct SHA first)
git push origin :refs/tags/release-YYYY.MM.DD-<sha7>
# Re-run the gatekeeper
gh workflow run gatekeeper-develop-to-release.yml --ref develop
Verification: git ls-remote origin 'refs/tags/release-*'
Failure mode D: merge conflict on promote step
Symptom: Step "Promote develop → release" exits with a git merge conflict error.
Cause: Someone pushed directly to release outside this workflow (e.g. a manual hotfix merge), creating a divergence from develop.
This is an escalation. Do NOT force-push to resolve it. The conflict represents a real divergence that requires human triage.
Fix (operator action):
1. Identify what is on release but not on develop:
git log origin/develop..origin/release --oneline
2. Cherry-pick the hotfix commits to develop:
git checkout develop
git cherry-pick <hotfix-sha>
git push origin develop
3. The gatekeeper fires again after the cherry-pick merge, and the next merge will succeed.
See ADR-0115 §Emergency hotfix path for the full hotfix procedure.
Failure mode F: YAML syntax error in workflow file (run duration = 0s)
Symptom: Every gatekeeper run fails instantly — start_time equals end_time, no step logs are produced. GitHub Actions may report "This workflow is invalid" or "ScannerError: while scanning a simple key."
Cause: A change to gatekeeper-develop-to-release.yml introduced a YAML syntax error. Common trigger: a multi-line string inside a run: | literal block where inner lines are at column 0. YAML treats those as the end of the block, and any : in subsequent lines is parsed as a YAML key, producing a ScannerError. This class of error is invisible to GitHub's PR preview but fails at runner startup.
This is the failure mode that occurred on 2026-06-30 (16-day gap in promotions).
Fix: Validate the workflow YAML before pushing:
python3 -c "import yaml; yaml.safe_load(open('.github/workflows/gatekeeper-develop-to-release.yml').read()); print('OK')"
If actionlint is available (CI check — see ci-hygiene.md):
actionlint .github/workflows/gatekeeper-develop-to-release.yml
Fix the indentation/quoting error and push a corrected commit to develop.
Verification: A new gatekeeper run appears for the fix commit and progresses past the YAML parse phase (at least one step log is produced).
Failure mode E: vault unreachable
Symptom: Step "Load raxx-ops-bot credentials" fails with a vault auth error.
Cause: Infisical vault is down, or the CF Access service token for CI has expired.
Fix: See docs/ops/runbooks/vault-access.md. Once vault is restored, re-run the gatekeeper:
gh workflow run gatekeeper-develop-to-release.yml --ref develop
Failure mode G: release branch protection blocks the push (GH006)
Symptom: Step "Promote develop → release" exits with:
remote: error: GH006: Protected branch update failed for refs/heads/release
remote: error: Required status checks are expected.
or
remote: error: At least 1 approving review is required ...
Cause: The release branch protection was configured with required status checks
and/or a required PR review — settings appropriate for a PR-based gate but NOT
compatible with the gatekeeper's direct-push model. The gatekeeper checks develop CI
itself before pushing; requiring additional checks or reviews on release directly
blocks the bot push and defeats the automation.
Root cause documented: 2026-06-30 — cutover (ADR-0115 Phase 6) applied the PR-gate
spec from the ADR to release, which required 9 status checks + 1 reviewer. This
contradicted the gatekeeper model where develop CI is the gate.
Correct release protection state:
- required_status_checks: null (none)
- required_pull_request_reviews: null (none)
- allow_force_pushes: false (protected)
- allow_deletions: false (protected)
- enforce_admins: false
The gate on develop CI lives INSIDE the gatekeeper workflow (step "Wait for CI —
develop to complete"). Heavy scans (ZAP, Queue Docker, Playwright e2e, etc.) gate the
release → main boundary (prod promotion), NOT the develop → release boundary.
Fix:
# Verify current state (should have no required_status_checks or required_pull_request_reviews)
gh api repos/raxx-app/TradeMasterAPI/branches/release/protection
# If either field is present, clear them:
gh api -X PUT repos/raxx-app/TradeMasterAPI/branches/release/protection \
--input - <<'JSON'
{
"required_status_checks": null,
"required_pull_request_reviews": null,
"restrictions": null,
"enforce_admins": false,
"allow_force_pushes": false,
"allow_deletions": false
}
JSON
# Verify (response should NOT contain required_status_checks or required_pull_request_reviews keys)
gh api repos/raxx-app/TradeMasterAPI/branches/release/protection
# Re-run the last failed gatekeeper run
FAILED_RUN=$(gh run list --workflow=gatekeeper-develop-to-release.yml \
-R raxx-app/TradeMasterAPI --json databaseId,conclusion \
--jq '[.[] | select(.conclusion == "failure")] | .[0].databaseId')
gh run rerun "$FAILED_RUN" -R raxx-app/TradeMasterAPI
Note: The gatekeeper does NOT have a workflow_dispatch trigger; use gh run rerun
to re-trigger without a new develop push.
Note: After gh run rerun, the rerun's push to release may not fire staging
deploy webhooks (GitHub suppresses push events for workflow reruns). The staging deploys
will fire correctly on the NEXT normal develop push that triggers the gatekeeper.
To force staging deploys immediately after a fix:
# Dispatch each staging deploy workflow manually for the current release HEAD
gh workflow run deploy-heroku.yml -R raxx-app/TradeMasterAPI \
-f environment=staging -f ref=release
Verification:
# Confirm gatekeeper passed
gh run view "$FAILED_RUN" --json conclusion
# → {"conclusion":"success"}
# Confirm release advanced
gh api repos/raxx-app/TradeMasterAPI/git/ref/heads/release --jq '.object.sha'
# Confirm protection is still correct (no required checks crept back in)
gh api repos/raxx-app/TradeMasterAPI/branches/release/protection \
--jq '{has_checks: (.required_status_checks != null), has_reviews: (.required_pull_request_reviews != null)}'
# → {"has_checks":false,"has_reviews":false}
How to pause auto-promotion (maintenance freeze)
Option 1 — Disable the workflow (preferred):
GitHub → Actions → Workflows → "Gatekeeper — develop to release" → ... → Disable workflow
Or via CLI:
gh workflow disable gatekeeper-develop-to-release.yml
While disabled, no promotions fire. When ready to resume:
gh workflow enable gatekeeper-develop-to-release.yml
After re-enabling, trigger a manual run to promote any commits that landed during the freeze:
gh workflow run gatekeeper-develop-to-release.yml --ref develop
Option 2 — Block the CI gate (temporary hold on a specific commit): If you want to hold promotion for a specific commit without disabling all future promotions, let the gatekeeper's CI gate naturally block until you're ready. The ci.yml run on that commit will time out after 30 min. For a longer hold, disable the workflow (Option 1).
Option 3 — Branch protection on release (emergency):
Set release branch protection to require a human reviewer for direct pushes. This blocks the bot push from the gatekeeper without disabling the workflow, and allows a manual merge when ready.
Manual promote (when gatekeeper is disabled or paused)
To manually promote develop → release when the gatekeeper is disabled:
git fetch origin develop release
git checkout -B release origin/release
git merge --no-ff origin/develop -m "chore(release): manual promote develop → release $(date -u +%Y.%m.%d)"
git push origin release
This pushes to release, which fires the staging deploy workflows automatically.
Concurrency model
The gatekeeper uses concurrency: group: gatekeeper-develop, cancel-in-progress: false. This means:
- At most one gatekeeper run is in-progress and one is pending at any time.
- If two merges land rapidly on develop, the second run is queued, not cancelled.
- If three merges land rapidly, the second pending run is dropped (replaced by the third). The middle commit's code is present in the third commit anyway — nothing is lost.
ci.yml uses the same cancel-in-progress: false model, so a running gatekeeper always polls the ci.yml run for its own SHA (not a superseded run).
Emergency stop
To stop a promotion that is currently in progress:
# Cancel the in-flight run
gh run cancel $(gh run list --workflow=gatekeeper-develop-to-release.yml --status=in_progress --json databaseId --jq '.[0].databaseId')
The cancellation will leave release in whatever state it was in before the cancelled merge push (git push is atomic). If the push completed before cancellation, staging will still deploy. In that case, roll back staging via the staging deploy workflow dispatch:
gh workflow run deploy-heroku.yml -f environment=staging -f ref=<previous-good-sha>
Escalation
Wake the operator when:
- Failure mode D (merge conflict) — requires human decision on the divergent hotfix.
- The gatekeeper has failed 3+ times in a row on the same SHA.
- Vault is unreachable for > 30 min.
- The release branch has been force-pushed or its protection has been removed.
References
- ADR: docs/architecture/adr/0115-develop-release-main-branching-model.md §Gatekeeper automation
- Workflow: .github/workflows/gatekeeper-develop-to-release.yml
- Related workflows: ci.yml, deploy-heroku.yml, deploy-antlers-next-staging.yml
- Emergency hotfix path: ADR-0115 §Emergency hotfix path