Slack notification policy runbook

System: GitHub Actions Slack notification gate
Owner: sre-agent
Last reviewed: 2026-06-18

Policy summary

Slack notifications from CI/CD workflows are gated by a single repo-level Actions variable: SLACK_NOTIFY_LEVEL.

Variable value	Effect
(not set / any value except `all`)	Only loud-tier paths fire (see below)
`all`	Every Slack step fires as originally authored

Since the gate was deployed, the variable is intentionally left unset, so by default the silenced tier posts nothing to Slack.
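The policy table above can be read as a small decision function. The following is a minimal sketch only, not the gate's actual implementation: the tier names `loud` and `silenced` and the function `should_post` are hypothetical labels chosen for illustration. It uses an exact string match on `all`; GitHub Actions expressions compare strings case-insensitively, so a step-level `if:` may behave differently for values such as `ALL`.

```python
from typing import Optional

# Hypothetical tier labels, mirroring the "loud tier" and "silenced tier"
# sections of this runbook.
LOUD = "loud"
SILENCED = "silenced"


def should_post(tier: str, notify_level: Optional[str]) -> bool:
    """Return True if a Slack step in the given tier should fire.

    notify_level is the value of SLACK_NOTIFY_LEVEL, or None when unset.
    """
    if tier == LOUD:
        # Loud-tier paths are never gated.
        return True
    # Silenced-tier paths fire only when the variable is exactly "all".
    return notify_level == "all"


if __name__ == "__main__":
    for level in (None, "errors", "all"):
        print(level, should_post(LOUD, level), should_post(SILENCED, level))
```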

How to revert (re-enable all notifications)

1. Go to the repo's Settings → Secrets and variables → Actions → Variables tab.
2. Create or update SLACK_NOTIFY_LEVEL with the value `all`.
3. The change takes effect on the next workflow trigger; no deploy is needed.

To silence again: delete the variable or set it to any other value.
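If the settings UI is not convenient, the same change could in principle be scripted against the GitHub REST API's repository variables endpoints. The sketch below only builds the request and does not send it. The owner, repo, and token values are placeholders, `build_set_level_request` is a hypothetical helper, and the token is assumed to have permission to manage repository variables.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"
VARIABLE_NAME = "SLACK_NOTIFY_LEVEL"


def build_set_level_request(owner: str, repo: str, token: str,
                            value: str = "all",
                            create: bool = False) -> urllib.request.Request:
    """Build (but do not send) a request that sets SLACK_NOTIFY_LEVEL.

    create=True  -> POST  .../actions/variables         (variable does not exist yet)
    create=False -> PATCH .../actions/variables/{name}  (variable already exists)
    """
    base = f"{API_ROOT}/repos/{owner}/{repo}/actions/variables"
    url = base if create else f"{base}/{VARIABLE_NAME}"
    body = json.dumps({"name": VARIABLE_NAME, "value": value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST" if create else "PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; replace before use.
    req = build_set_level_request("example-org", "example-repo", "TOKEN")
    print(req.get_method(), req.full_url)
    # To actually apply the change: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the call easy to inspect before it touches a live repo, which matters for a variable that changes alerting behavior.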

Loud tier (always fires, gate does NOT apply)

These paths are never gated and remain loud regardless of SLACK_NOTIFY_LEVEL:

Workflow	What it posts	Channel / target
`nightly-security-scan.yml`	Alert when the scan job itself fails, is skipped, or is cancelled (operational gap: alerts about a dark scan must not be silenced)	`SLACK_WEBHOOK_URL`
`security-zap.yml`	Alert when ZAP scan job fails or is skipped	`SLACK_WEBHOOK_URL`
`synth-probe-waitlist.yml`	SEV2 alert when the prod waitlist API probe fails (customer-facing lead capture)	`SLACK_WEBHOOK_URL` → `#raxx-ops-alert-sev2`
`bcp-vault-snapshot-daily.yml`	Alert when the vault snapshot (BCP Win 3) fails — explicitly marked pageable in the workflow	`SLACK_WEBHOOK_URL`
`deploy-console.yml` (health check step only)	Immediate DM when the console prod health check fails post-deploy	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`

Silenced tier (gated by `SLACK_NOTIFY_LEVEL == 'all'`)

Workflow	What was being posted	Channel / target
`slack-notify.yml`	CI failure on main	`SLACK_WEBHOOK_URL`
`ci-digest-cron.yml`	Daily CI digest	`SLACK_WEBHOOK_URL`
`bcp-smoke-monthly.yml`	Monthly BCP smoke failure	`SLACK_WEBHOOK_URL`
`daily-bot-token-smoke.yml`	Bot token smoke failure	`SLACK_WEBHOOK_URL`
`daily-card-groomer.yml`	Groomer completion DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`deploy-antlers-cutover.yml`	Cutover + rollback outcome	`SLACK_WEBHOOK_URL`
`deploy-antlers-next-prod.yml`	Antlers prod deploy failure	`SLACK_WEBHOOK_URL`
`deploy-antlers-next-staging.yml`	Antlers staging deploy failure	`SLACK_WEBHOOK_URL`
`deploy-console.yml` (notify job, DM step)	Console prod deploy outcome DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`deploy-customer-docs.yml`	Docs health check failure DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`deploy-failure-streak-alert.yml`	Deploy failure streak alert	`SLACK_WEBHOOK_URL` → `#raxx-ops-alert-sev2-5`
`deploy-heroku.yml`	Staging SEV-3 alert	`SLACK_WEBHOOK_URL`
`deploy-queue-failure-monitor.yml`	Queue deploy streak alert	`SLACK_WEBHOOK_URL` → `#raxx-ops-alert-sev2-5`
`deploy-queue.yml`	Queue deploy outcome DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`deploy-velvet.yml`	Velvet deploy outcome DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`drift-orchestrator-cron.yml`	Drift reconciler failure DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`flag-drift-check.yml`	Flag drift detection (embedded in Python)	`SLACK_WEBHOOK_URL` / `SLACK_BOT_TOKEN`
`freescout-apply.yml`	Terraform apply failure DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`freescout-backup.yml`	FreeScout backup failure	`SLACK_WEBHOOK_URL`
`heroku-config-health-nightly.yml`	Config health critical findings	`SLACK_WEBHOOK_URL`
`queue-zero-dyno-monitor.yml`	Zero-dyno alert	`SLACK_WEBHOOK_URL` → `#raxx-ops-alert-sev2-5`
`synthetic-gate.yml`	Staging synthetic check failure	`SLACK_WEBHOOK_URL`
`terraform-email-delivery-stack.yml`	TF plan/apply failure DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`
`tickets-e2e-smoke.yml`	Tickets e2e smoke failure	`SLACK_WEBHOOK_URL`
`waf-synthetic-probe.yml`	WAF probe failure DM	`SLACK_BOT_TOKEN` → `D0AJ7K184TV`

Workflows with no Slack posting (no change needed)

Workflow	Note
`alembic-version-cron.yml`	Comment reference only — no Slack post step
`console-degraded-auto-file.yml`	GH issues only — no Slack post
`launch-readiness-check.yml`	Step summary only — no Slack post
`mbt-assignment-at-expiry-cron.yml`	Sentry + digest only — no Slack post
`mbt-drift-daily.yml`	Digest only — no Slack post
`mbt-drift-per-symbol-weekly.yml`	Digest only — no Slack post
`mbt-resting-orders-cron.yml`	Sentry + digest only — no Slack post
`freescout-apply.yml`	`TF_VAR_license_slack` is a Terraform input, not a Slack post; the workflow's failure DM is a separate step listed under the silenced tier
`ci-pr.yml`	Reference to slack-notify workflow path in step summary only

Implementation notes

The gate is applied in the step-level `if:` condition (not at the job level) wherever possible, to preserve job-level transitive-skip guards (feedback_gh_actions_transitive_skip).
`flag-drift-check.yml` posts to Slack from an embedded Python call, so its gate is applied by passing SLACK_NOTIFY_LEVEL to the step as an env var and checking it in the Python code.
Workflow schedules and logic are unchanged; only the Slack posting side effect is gated.
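For the embedded-Python case, the check might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in `flag-drift-check.yml`: the helper names `slack_gate_open` and `notify_if_enabled` are hypothetical, and the actual Slack call is passed in as a callable so the gate can be exercised without network access. An unset variable passed through a step's `env:` block is assumed to arrive as an empty string, which this check treats the same as any other non-`all` value.

```python
import os
from typing import Callable, Mapping, Optional


def slack_gate_open(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when SLACK_NOTIFY_LEVEL is exactly 'all'."""
    env = os.environ if env is None else env
    return env.get("SLACK_NOTIFY_LEVEL", "") == "all"


def notify_if_enabled(message: str,
                      post: Callable[[str], None],
                      env: Optional[Mapping[str, str]] = None) -> bool:
    """Call post(message) if the gate is open; otherwise skip and log why.

    Returns True if the message was posted, False if it was gated.
    """
    if not slack_gate_open(env):
        # The surrounding check still runs; only the Slack side effect is skipped.
        print("SLACK_NOTIFY_LEVEL is not 'all'; skipping Slack post")
        return False
    post(message)
    return True


if __name__ == "__main__":
    # Stand-in for the real Slack call (hypothetical).
    notify_if_enabled("flag drift detected", post=print)
```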

Escalation

If a silenced path represents a genuine prod incident, escalate via:

1. GH Actions run URL (always visible in the Actions tab)
2. ops@raxx.app (Postmark alerts remain active where configured)
3. Set SLACK_NOTIFY_LEVEL=all to re-enable all paths during a high-tempo incident window