Raxx · internal docs

internal · gated

RCA — Queue billing schema absent from prod: sqitch.plan format + release-phase pipeline mismatch

Incident ID: 2026-06-17-queue-billing-sqitch-plan-format Date: 2026-06-17 Severity: SEV-2 Duration: ~3.5h total (13:20 UTC first signal → 15:07 UTC first 2xx webhook verified) Blast radius: raxx-queue-prod had no billing tables; every Stripe webhook with a valid signature resulted in a DB 500 after signature verification passed; Queue billing was non-functional at go-live. Author: sre-agent

Summary

Two independent bugs prevented billing schema from being applied to the prod database on every Container Registry deploy since Queue launched. First, queue/migrations/sqitch/sqitch.plan had timestamps wrapped in brackets ([2026-05-11T00:00:00Z]) — sqitch 1.3.1 parses [...] after a change name as a dependency list, not a timestamp. This caused "Invalid name" syntax error and sqitch exited 0 with no changes deployed. Second, the heroku.yml release phase command is not honored when deploying via heroku container:release — the Container Registry and heroku.yml pipelines are mutually exclusive; the release command was silently skipped on every deploy. PR #3620 (merged 2026-06-17) correctly copied migrations into the Docker image, which was a necessary prerequisite, but the sqitch.plan format bug and release-phase pipeline gap meant migrations still did not apply. Resolution: sqitch.plan timestamps corrected (brackets removed), deploy workflow updated to run heroku run sqitch deploy explicitly after container:release, and migrations applied manually via heroku run on 2026-06-17 to restore prod.

Timeline (all times UTC)

Impact

What went well

What didn't go well

Root cause analysis

Detection

Resolution

Action items

# Action Owner Due Issue
1 Merge PR #3635 (sqitch.plan format + workflow sqitch step) operator 2026-06-18 #3635
2 Add post-deploy \dt billing_* verification step to deploy-queue.yml sre-agent (in #3635) 2026-06-18 #3635
3 Add sqitch plan format validation to queue-docker-smoke.yml (parse sqitch.plan and verify all change names have timestamps) sre-agent 2026-06-20 TBD
4 Document Heroku Container Registry + heroku.yml limitation in queue runbook sre-agent 2026-06-20 TBD
5 Verify staging billing schema via same heroku run path (raxx-queue-staging) sre-agent 2026-06-18 TBD

References