Raxx · internal docs

Postmark runbook

System: Postmark (email delivery for raxx.app)
Owner: operator
Last incident: 2026-05-13 (see docs/incidents/2026-05-13-postmark-bounce-alert-misfire.md)
Last reviewed: 2026-05-13 UTC


Architecture

Postmark is the transactional email provider for raxx.app. Two parallel notification paths exist:

Path A — Postmark native notifications (currently active):

Email event (Bounce/SpamComplaint)
  -> Postmark internal notification engine
    -> Postmark dashboard: Server -> Settings -> Notifications -> Slack webhook
      -> TradeMasterAPI-Notify Slack channel

This path is active and has no minimum-denominator floor or dedup window.

Path B — Raptor in-process delivery monitor (currently dormant):

Postmark -> POST /webhooks/postmark/delivery (Raptor)
  -> postmark_delivery_events DB table
    -> _check_alert_thresholds() -> Slack DM (with 60-min suppression + min-denominator floor)

This path is dormant: FLAG_POSTMARK_DELIVERY_MONITOR=false and POSTMARK_SERVER_TOKEN is empty on raxx-api-prod. Enabling it is blocked by CF Access on the webhook endpoint (issue #669).

Pre-launch posture: Path A produces per-event pings, whereas feedback_pre_launch_digest_notifications.md calls for routine CI/cron Slack pings to go into a daily digest. Remove or reconfigure the Postmark Slack webhook when Path B (Raptor monitor) is enabled post-launch.


Vault and credentials

Secret | Vault path | Heroku config var
Server API token (transactional send) | /MooseQuest/postmark/POSTMARK_SERVER_API_KEY | POSTMARK_SERVER_TOKEN
Account API token (admin) | /MooseQuest/postmark/POSTMARK_ACCOUNT_API_KEY | not set on Heroku
Delivery webhook secret | /MooseQuest/postmark/POSTMARK_DELIVERY_WEBHOOK_SECRET | POSTMARK_DELIVERY_WEBHOOK_SECRET

Fetch the server token:

# From vault (if vault is accessible), vault path /MooseQuest/postmark/POSTMARK_SERVER_API_KEY:
export POSTMARK_SERVER_TOKEN=<token-from-vault>

# Or directly from Heroku (currently empty — must set first):
heroku config --app raxx-api-prod | grep POSTMARK_SERVER_TOKEN

How to tell it's broken


How to diagnose (in order)

  1. Check Postmark dashboard: sign in at https://account.postmarkapp.com/ → select raxx.app server
     - Activity tab: filter by Bounced. Any recent bounces?
     - Suppressions: any addresses stuck in the suppression list?

  2. Run diagnostic script:
       export POSTMARK_SERVER_TOKEN=<token-from-vault>
       python3 scripts/ops/postmark_bounce_check.py

  3. Check Raptor delivery monitor status (post-launch, when enabled):
       curl -H "CF-Access-Client-Id: $CF_ACCESS_CLIENT_ID" \
            -H "CF-Access-Client-Secret: $CF_ACCESS_CLIENT_SECRET" \
            "https://api.raxx.app/api/_internal/postmark/recent-deliveries"

  4. Check alert path: the Postmark Slack webhook is configured in Postmark dashboard → Server → Settings → Notifications. If alerts are reaching Slack but the Raptor monitor shows nothing, Path A (Postmark native) is the source.

  5. Check DNS authentication:
       dig TXT _dmarc.raxx.app
       dig TXT pm._domainkey.raxx.app
       dig TXT google._domainkey.raxx.app
       # All should return records
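The DNS check in step 5 can be scripted. The sketch below is illustrative only: the record names come from the dig commands above, while the expected markers (`v=DMARC1` for DMARC, a `p=` public-key tag for DKIM) and the helper names are assumptions, not part of the existing tooling.

```python
import subprocess

# Record names come from the runbook; the markers are assumptions based on
# the usual DMARC and DKIM TXT record formats.
EXPECTED_MARKERS = {
    "_dmarc.raxx.app": "v=DMARC1",
    "pm._domainkey.raxx.app": "p=",
    "google._domainkey.raxx.app": "p=",
}


def parse_txt_answer(dig_output: str) -> list[str]:
    """Turn `dig +short TXT` output into one unquoted string per line."""
    records = []
    for line in dig_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Long TXT values are printed as several quoted chunks on one line.
        records.append(line.replace('" "', "").strip('"'))
    return records


def record_ok(name: str, dig_output: str) -> bool:
    """True if any returned record contains the marker expected for `name`."""
    marker = EXPECTED_MARKERS[name]
    return any(marker in record for record in parse_txt_answer(dig_output))


def check_all() -> dict[str, bool]:
    """Run dig for every record name (requires dig and network access)."""
    results = {}
    for name in EXPECTED_MARKERS:
        proc = subprocess.run(
            ["dig", "+short", "TXT", name],
            capture_output=True, text=True, check=False,
        )
        results[name] = record_ok(name, proc.stdout)
    return results
```

Calling `check_all()` from an operator shell returns a name-to-boolean map; any `False` entry points at the record to inspect by hand.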


Known failure modes

Failure mode A: Low-denominator alert misfire (pre-launch / low-volume)

Symptom: Repeated Slack pings "Bounce rate 100.0% (1/1)", "50.0% (1/2)", "33.3% (1/3)"
Cause: One hard-bounce event (typically ops@raxx.app suppressed pre-provisioning) stays in Postmark's trailing window while new successful deliveries increment the denominator. Postmark fires a notification on each new threshold crossing. There is no dedup window.
Fix:
  1. Clear the suppression list (see "Reactivate suppressed addresses" below)
  2. Remove or pause the Postmark Slack notification webhook in the dashboard
  3. (Code) Raptor delivery monitor now has a minimum-denominator floor and won't fire at N<10
Verification: No new Slack pings after clearing suppressions + sending a test email
Incident: docs/incidents/2026-05-13-postmark-bounce-alert-misfire.md
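The arithmetic behind this misfire is easy to reproduce. The sketch below is a worked illustration, not the Raptor or Postmark implementation: the function names are hypothetical, and the 1% threshold and floor of 10 are taken from the alert threshold reference in this runbook.

```python
def bounce_rate_series(bounces: int, max_sent: int) -> list[tuple[int, float]]:
    """Bounce rate (percent) as sends grow while the bounce count stays fixed."""
    return [
        (sent, round(100.0 * bounces / sent, 1))
        for sent in range(bounces, max_sent + 1)
    ]


def would_alert(bounces: int, sent: int,
                threshold_pct: float = 1.0, min_denominator: int = 10) -> bool:
    """Apply a minimum-denominator floor before comparing against the threshold."""
    if sent < min_denominator:
        return False  # too few sends for the rate to be meaningful
    return 100.0 * bounces / sent > threshold_pct


for sent, rate in bounce_rate_series(1, 3):
    # One stuck bounce yields 100.0%, 50.0%, 33.3% as sends go 1, 2, 3.
    print(f"Bounce rate {rate}% (1/{sent}) alert_with_floor={would_alert(1, sent)}")
```

Under these assumptions the floor silences the 1/1 through 1/9 pings, but `would_alert(1, 10)` is still true (10% exceeds 1%), so clearing the stuck suppression remains the first fix.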

Failure mode B: ops@raxx.app hard-bounce (address suppressed)

Symptom: Every automated email to ops@raxx.app generates a bounce notification
Cause: ops@raxx.app was emailed before the Google Group was provisioned (pre-2026-05-06), resulting in a hard-bounce record in Postmark's suppression list
Fix:
  export POSTMARK_SERVER_TOKEN=<token>
  python3 scripts/ops/postmark_bounce_check.py --reactivate ops@raxx.app
  # Repeat for billing@raxx.app and no-reply@raxx.app if suppressed
Verification: python3 scripts/ops/postmark_bounce_check.py --suppressions-only shows no ops@ entry

Failure mode C: DKIM signing not active in Google Workspace

Symptom: Sends from Google-signed @raxx.app addresses show dkim=fail in headers; DMARC failures at p=quarantine cause some receiving servers to bounce mail
Cause: google._domainkey.raxx.app DNS record is live but "Start authentication" was not clicked in the Workspace Admin Console
Fix:
  1. https://admin.google.com → Apps → Google Workspace → Gmail → Authenticate email
  2. Select raxx.app → confirm status is "Authenticating email" (green)
  3. If not, click "Start authentication"
Verification: Send test email from no-reply@raxx.app → check headers for dkim=pass
Incident: docs/ops/sre-reports/2026-05-07-postmark-bounce-alerts.md

Failure mode D: Postmark account suspended / deactivated

Symptom: All sends fail with 401 or 422 from Postmark API; Activity tab shows "Server inactive"
Cause: Account spam ratio exceeded Postmark's platform threshold; account reviewed and restricted
Fix: Contact Postmark support at https://account.postmarkapp.com/support
Verification: Test send succeeds; server status shows "Active"

Failure mode E: CF Access blocks Postmark delivery webhook

Symptom: FLAG_POSTMARK_DELIVERY_MONITOR=1 is set but no events appear in recent-deliveries; Postmark delivery webhook Activity shows "Failed" with HTTP 302
Cause: /webhooks/postmark/delivery is behind Cloudflare Access; Postmark's IPs cannot authenticate
Fix: Add CF Access bypass rule for Postmark IPs on the /webhooks/postmark/delivery path:
  - CF Zero Trust → Access → Applications → api.raxx.app → add Policy: Bypass for Postmark IP ranges
  - Postmark IP list: https://postmarkapp.com/support/article/800-ips-for-postmark-servers
Verification: POST https://api.raxx.app/webhooks/postmark/delivery with valid token returns 200
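When reviewing a bypass policy it can help to confirm that a source IP seen in the Postmark webhook Activity log falls inside the ranges entered in CF Access. The sketch below is a minimal helper under stated assumptions: the ranges shown are RFC 5737 documentation placeholders, not Postmark's real IPs, and must be replaced with the list from the Postmark article linked above.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT Postmark's real IPs.
# Substitute the ranges published in the Postmark support article.
EXAMPLE_WEBHOOK_RANGES = ["203.0.113.0/24", "198.51.100.7/32"]


def is_allowed_source(ip: str, ranges: list[str]) -> bool:
    """Return True if `ip` falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


print(is_allowed_source("203.0.113.42", EXAMPLE_WEBHOOK_RANGES))
print(is_allowed_source("192.0.2.1", EXAMPLE_WEBHOOK_RANGES))
```

A source IP that returns `False` against the configured ranges would be redirected by CF Access, which matches the HTTP 302 symptom described above.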


Reactivate suppressed addresses

export POSTMARK_SERVER_TOKEN=<token-from-vault>

# See the full suppression list
python3 scripts/ops/postmark_bounce_check.py --suppressions-only

# Reactivate specific addresses
python3 scripts/ops/postmark_bounce_check.py --reactivate ops@raxx.app
python3 scripts/ops/postmark_bounce_check.py --reactivate billing@raxx.app
python3 scripts/ops/postmark_bounce_check.py --reactivate no-reply@raxx.app

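# Or via the Postmark Suppressions API directly
# (assumes the default transactional stream ID "outbound"; adjust if the server uses another stream):
curl -s -X POST \
  -H "Accept: application/json" \
  -H "X-Postmark-Server-Token: $POSTMARK_SERVER_TOKEN" \
  -H "Content-Type: application/json" \
  "https://api.postmarkapp.com/message-streams/outbound/suppressions/delete" \
  -d '{"Suppressions": [{"EmailAddress": "ops@raxx.app"}]}'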

Enable Raptor delivery monitor (post-launch)

Prerequisites (all must be true before enabling):
  1. CF Access bypass in place for Postmark IP ranges on /webhooks/postmark/delivery path
  2. POSTMARK_SERVER_TOKEN set on raxx-api-prod
  3. POSTMARK_DELIVERY_WEBHOOK_SECRET set on raxx-api-prod
  4. Postmark dashboard → Delivery webhook URL configured to https://api.raxx.app/webhooks/postmark/delivery
  5. Postmark Slack native notification webhook removed (or it will double-alert)

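# Set the webhook secret first, then flip the flag.
# Output is discarded so the secret value is not echoed to the terminal.
heroku config:set POSTMARK_DELIVERY_WEBHOOK_SECRET=<value-from-vault> --app raxx-api-prod >/dev/null 2>&1
heroku config:set FLAG_POSTMARK_DELIVERY_MONITOR=1 --app raxx-api-prod >/dev/null 2>&1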

Verify:

curl -H "X-Postmark-Webhook-Token: $POSTMARK_DELIVERY_WEBHOOK_SECRET" \
     -H "Content-Type: application/json" \
     -d '{"RecordType":"Delivery","MessageID":"test-001","Recipient":"kris@moosequest.net"}' \
     https://api.raxx.app/webhooks/postmark/delivery
# Expect: {"ok": true, "event_type": "Delivery", ...}
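The verify call above only succeeds if the handler accepts the shared token. As a rough sketch of what such a check might look like (the real Raptor handler is not shown in this runbook, so the function name and fail-closed behaviour are assumptions), a constant-time comparison on the `X-Postmark-Webhook-Token` header could be written as:

```python
import hmac

# Header name taken from the curl example in this runbook.
WEBHOOK_HEADER = "X-Postmark-Webhook-Token"


def token_is_valid(headers: dict[str, str], expected_secret: str) -> bool:
    """Compare the supplied webhook token to the configured secret."""
    if not expected_secret:
        return False  # fail closed when the secret is not configured
    # HTTP header names are case-insensitive, so match on lowercase.
    supplied = next(
        (value for name, value in headers.items()
         if name.lower() == WEBHOOK_HEADER.lower()),
        "",
    )
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())
```

If the verify curl returns 401 or 403 rather than the expected JSON, a mismatch between the Heroku config var and the vault value is one candidate cause to check.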

Alert threshold reference (Raptor in-process monitor)

Alert | Threshold | Window | Minimum denominator | Suppression
Bounce rate | >1% | 1h | 10 (configurable) | 60 min in-memory
Spam complaint rate | >0.1% | 24h | 25 (configurable) | 60 min in-memory
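To make the interaction between threshold, floor, and suppression concrete, here is a minimal sketch of the decision logic. It is not the code behind _check_alert_thresholds(); the class and field names are hypothetical, and the caller is assumed to pass counts already restricted to the rule's window (1h or 24h).

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    name: str
    threshold_pct: float   # fire only when the rate is strictly above this
    min_denominator: int   # floor: ignore rates computed on fewer sends


# Values from the table above.
BOUNCE = AlertRule("bounce", 1.0, 10)
SPAM = AlertRule("spam_complaint", 0.1, 25)
SUPPRESSION_SECONDS = 60 * 60


@dataclass
class AlertGate:
    """In-memory record of when each rule last fired."""
    last_fired: dict[str, float] = field(default_factory=dict)

    def should_fire(self, rule: AlertRule, events: int, total: int,
                    now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if total < rule.min_denominator:
            return False  # below the minimum-denominator floor
        if 100.0 * events / total <= rule.threshold_pct:
            return False  # rate is within tolerance
        last = self.last_fired.get(rule.name)
        if last is not None and now - last < SUPPRESSION_SECONDS:
            return False  # still inside the 60-minute suppression window
        self.last_fired[rule.name] = now
        return True
```

Because the suppression state in this sketch lives in process memory, it resets on every restart, which is consistent with the "in-memory" note in the table.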

Override minimum denominators without redeploy:

heroku config:set POSTMARK_ALERT_MIN_DENOMINATOR_BOUNCE=50 --app raxx-api-prod >/dev/null 2>&1
heroku config:set POSTMARK_ALERT_MIN_DENOMINATOR_SPAM=100 --app raxx-api-prod >/dev/null 2>&1

Emergency stop

To stop all Postmark-originated Slack pings immediately:

Option A: remove the Postmark Slack webhook (recommended)
  1. https://account.postmarkapp.com/ → raxx.app server → Settings → Notifications
  2. Remove the Slack webhook entry

Option B: disable the Raptor delivery monitor (if it's active)

heroku config:set FLAG_POSTMARK_DELIVERY_MONITOR=0 --app raxx-api-prod >/dev/null 2>&1

Escalation

Wake the operator when:
  - Spam complaint is from an external (non-raxx.app) address (sender reputation at risk)
  - Postmark account is suspended or restricted
  - Hard-bounce rate exceeds 5% with a denominator above 100
  - Any email to the operator's personal address (kris@moosequest.net) bounces
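The escalation criteria can be encoded as a single predicate, for example in an on-call helper. The sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, and only the four rules and the two addresses come from this runbook.

```python
from typing import Optional

# Values from the escalation list above.
INTERNAL_DOMAIN = "raxx.app"
OPERATOR_PERSONAL_ADDRESS = "kris@moosequest.net"


def should_wake_operator(*,
                         spam_complaint_from: Optional[str] = None,
                         account_restricted: bool = False,
                         hard_bounces: int = 0,
                         total_sent: int = 0,
                         bounced_recipient: Optional[str] = None) -> bool:
    """Return True if any escalation rule from the runbook matches."""
    # Rule 1: spam complaint from an external (non-raxx.app) address.
    if spam_complaint_from is not None:
        domain = spam_complaint_from.rsplit("@", 1)[-1].lower()
        if domain != INTERNAL_DOMAIN:
            return True
    # Rule 2: Postmark account suspended or restricted.
    if account_restricted:
        return True
    # Rule 3: hard-bounce rate above 5% with a denominator above 100.
    if total_sent > 100 and 100.0 * hard_bounces / total_sent > 5.0:
        return True
    # Rule 4: a bounce on the operator's personal address.
    if (bounced_recipient is not None
            and bounced_recipient.lower() == OPERATOR_PERSONAL_ADDRESS):
        return True
    return False
```

Keyword-only arguments keep call sites explicit about which signal triggered the check.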