Queue CF Edge Protection — Design

Status: Accepted
Date: 2026-05-11 UTC
ADR: ADR-0078
Parent card: #1733
WAF threat model: docs/security/waf-threat-model-2026-05-11.md (HIGH-WAF-2)
Sister doc: docs/architecture/waf-strategy.md (ADR-0077, just merged)


1. Context

Queue is the centralized identity / RBAC / customer / audit service (ADR-0076: C++, Drogon, running on raxx-queue-{prod,staging}.herokuapp.com). As of 2026-05-11 UTC the Heroku origin URL is directly routable from the internet: no Cloudflare proxy sits in front of it, so none of the Cloudflare edge controls described in §4 (WAF rulesets, rate limits, Bot Fight Mode) apply to it.

The WAF threat model (docs/security/waf-threat-model-2026-05-11.md §S11) classified this as HIGH-WAF-2. Endpoints reachable in the current state include the Phase 1 billing CRUD endpoints and, once Phase 2 ships, auth paths that are password-equivalent (backup-codes/redeem).

This design closes that gap using the same layered pattern applied to Raptor (FLAG_ENFORCE_CF_ORIGIN, cloudflare_origin_guard.py) extended for a C++ / Drogon service.


2. Invariants (non-negotiable)

  1. No stored credentials. This design does not change how credentials are handled. CF origin guard headers contain no user credential material.
  2. Passkeys / WebAuthn only. Not affected by this design. Edge protection sits in front of auth, not inside it.
  3. Audit trail for every state change affecting money, permissions, or data access. The origin guard's reject events are logged at WARN with direct_origin_blocked structured key for audit-trail searchability.
  4. Credentials into infra, not into code. CLOUDFLARE_ZONE_ID, CLOUDFLARE_API_TOKEN, and all Heroku domain-attachment secrets are loaded from Infisical at operator runtime; none are committed.
  5. Paper-first gating. Not directly relevant here; order-execution paths are in Raptor, not Queue Phase 1.
  6. Kill-switch. FLAG_ENFORCE_QUEUE_CF_ORIGIN defaults false; origin guard can be disabled without redeploy (the flag is read at startup, so the change takes effect on dyno restart).
  7. GDPR. The origin guard logs only path and remote_addr; it logs no PII from request bodies. Retention follows Queue's audit log policy (2 years, DPA).

3. Decision Matrix

Three deployment patterns for a Heroku app behind Cloudflare:

| Dimension | Option A: Direct CNAME (chosen) | Option B: CF Tunnel | Option C: Cloudflare Pages proxy |
|---|---|---|---|
| DNS path | queue.raxx.app CNAME → raxx-queue-prod.herokuapp.com, proxied=true | CF Tunnel (cloudflared daemon) inside Heroku dyno | Not applicable: Queue is a dynamic API, not a static site |
| Heroku custom domain | Required: operator attaches queue.raxx.app to Heroku app | Not needed (tunnel outbound) | N/A |
| WAF coverage | Full: all traffic transits CF edge | Full | N/A |
| Operational complexity | Low: same pattern as api.raxx.app, console.raxx.app | High: cloudflared binary in Dockerfile, tunnel provisioning, tunnel ID in config | N/A |
| SSL | CF-managed (Universal SSL or ACM) + Heroku SNI cert | Tunnel terminates TLS at edge | N/A |
| False-positive risk on Heroku-to-Heroku traffic | Real (see §6 FM-7): Raptor → Queue calls transit CF | None (private tunnel) | N/A |
| Bot Fight Mode | Applies to all traffic including service-to-service | Not applicable (tunnel is authenticated) | N/A |
| Verdict | Chosen. Matches established platform pattern. Lowest operator burden. Heroku-to-Heroku cross-CF traffic is an accepted constraint (see §6 FM-7). | Rejected: significant ops complexity; cloudflared binary adds a new non-audited dependency; tunnel credentials add a new secret path. | Rejected: not applicable for an API service. |

Origin guard approach: CF-Connecting-IP header presence check. Cloudflare injects this on every proxied request. Requests arriving at the Heroku origin URL directly carry no CF-Connecting-IP. This is the same proof used by cloudflare_origin_guard.py in Raptor.
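The presence check can be reasoned about without Drogon. The sketch below is a framework-free model of the decision, assuming header names have already been lower-cased and that an empty header value counts as absent. `GuardDecision`, `Headers`, and `decide` are hypothetical names for illustration and are not part of the Queue codebase.

```cpp
#include <map>
#include <string>
#include <unordered_set>

// Hypothetical model of the origin guard decision; not the Drogon filter itself.
enum class GuardDecision { Pass, Block };

// Assumption: header names are stored lower-cased.
using Headers = std::map<std::string, std::string>;

inline GuardDecision decide(bool enforcementEnabled,
                            const std::string& path,
                            const Headers& headers) {
    static const std::unordered_set<std::string> kAllowlisted{"/health"};

    // Kill-switch off: every request passes.
    if (!enforcementEnabled) return GuardDecision::Pass;

    // Allowlisted paths stay reachable without CF-Connecting-IP.
    if (kAllowlisted.count(path) > 0) return GuardDecision::Pass;

    // Proof of CF transit: the header is present and non-empty (assumption).
    const auto it = headers.find("cf-connecting-ip");
    const bool viaCloudflare = it != headers.end() && !it->second.empty();
    return viaCloudflare ? GuardDecision::Pass : GuardDecision::Block;
}
```

With enforcement on, `decide(true, "/api/internal/billing/mirror-sync", {})` yields `Block`, while the same call with a `cf-connecting-ip` entry, or with path `/health`, yields `Pass`.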

Service-to-service auth: Existing InternalAuthFilter (shipped in QP-C7 / #1717) uses X-Queue-Service-Token / Authorization: Bearer <token> on /api/internal/* routes. This is sufficient for in-band service identity. CF Access service tokens are NOT added for service-to-service. Rationale: CF Access service tokens introduce a paired WAF skip rule requirement (see feedback_cf_access_does_not_bypass_bot_fight_mode.md, incident 2026-05-12). Instead, the WAF skip rule is scoped more precisely: it skips Bot Fight Mode when a request to an /api/internal/* path carries a non-empty Authorization header (see §4 Layer 1 note). The edge checks only that the header is present; InternalAuthFilter validates the token.

CF Access posture for Queue: Queue has no human callers in Phase 1. All callers are machine services (Raptor, Console, Velvet) using Bearer tokens, or Stripe webhook delivery. CF Access is NOT applied to Queue in Phase 1. The gate for /api/internal/* is the existing InternalAuthFilter. When Phase 2 ships admin endpoints for Console operators to manage identity, a CF Access policy is added at that time (open question OQ-2).


4. Layered Defense — Queue in the 5-Layer Model

Queue fits into the WAF strategy's 5-layer model (ADR-0077) as follows:

LAYER 1 — CF Edge (WAF + Bot Fight Mode)
  Queue receives:
    ├── Cloudflare Free Managed Ruleset → Block
    ├── OWASP CRS → Log (Phase 3 soak), then Block
    ├── Auth endpoint rate limits (Phase 2 auth endpoints, not in Phase 1)
    │     POST /api/v1/auth/webauthn/login/begin    → 10 req/min/IP → Block
    │     POST /api/v1/auth/webauthn/register/begin → 5 req/min/IP → Block
    │     POST /api/v1/auth/backup-codes/redeem     → 5 req/min/IP → Block
    ├── Billing webhook rate limit (Phase 1)
    │     POST /api/v1/billing/webhook              → 60 req/min/IP → Block
    │     (Stripe webhook IPs are known; stricter than general traffic)
    ├── Bot Fight Mode → ON for all paths
    └── WAF skip rule for service-to-service on /api/internal/*:
          Expression: starts_with(http.request.uri.path, "/api/internal")
                      AND len(http.request.headers["authorization"]) gt 0
          Action: skip (bic, hot, rateLimit, securityLevel, uaBlock)
          Rationale: internal callers (Raptor, Console, Velvet) run on Heroku dynos
                     (AWS ASN) which trip Bot Fight Mode. Skip only on /api/internal
                     where Bearer token is required. The InternalAuthFilter (Layer 4)
                     validates the token before any handler runs.

LAYER 2 — CF Access
  Phase 1: NOT APPLIED. Queue has no human callers.
  Phase 2+: Apply CF Access policy on /admin/* paths when Console operators
            call Queue directly. See OQ-2.

LAYER 3 — Heroku Origin Guard (FLAG_ENFORCE_QUEUE_CF_ORIGIN)
  New: queue/src/middleware/cf_origin_guard.cpp (see §4.3)
  Allowlisted paths: /health (Heroku liveness probe)
  Reject path: CF-Connecting-IP absent → 403 {"error":"direct_origin_blocked"}
  Default: false (enabled on staging in Phase 3, on prod in Phase 4, each after soak)

LAYER 4 — Service Token Middleware (InternalAuthFilter, already shipped)
  /api/internal/* → Bearer token from env-loaded allowlist
  /api/v1/billing/webhook → Stripe HMAC-SHA256 signature verification (QP-C5)
  /api/v1/billing/* → service-to-service Bearer token

LAYER 5 — RBAC / Row-Level (Phase 2+)
  Not in Phase 1 billing scope.

4.1 WAF Rule Priority Order for Queue Surface

Cloudflare processes rules in priority order (lower number = higher priority):

| Priority | Rule | Action | Phase |
|---|---|---|---|
| 1 | Skip Bot Fight Mode on /api/internal/* with Authorization header | Skip | Phase 1 |
| 2 | Block requests to *.herokuapp.com hostnames (belt-and-suspenders) | Block | Phase 1 (no-op for CF-proxied traffic; catches misconfigured callers) |
| 3 | Rate limit POST /api/v1/billing/webhook → 60/min/IP | Block | Phase 1 |
| 4 | Rate limit POST /api/v1/auth/webauthn/login/begin → 10/min/IP | Block | Phase 2 |
| 5 | Rate limit POST /api/v1/auth/backup-codes/redeem → 5/min/IP | Block | Phase 2 |
| 6 | Global rate limit → 300/min/IP | Managed Challenge | Phase 1 |
| 7 | CF Managed Ruleset (OWASP et al.) | Block | Phase 1 |
| 8 | OWASP CRS | Log → Block after 7-day soak | Phase 3 |

4.2 Bot Fight Mode and Service-to-Service Traffic

Cloudflare Bot Fight Mode (BFM) evaluates BEFORE CF Access service tokens are inspected (incident documented in feedback_cf_access_does_not_bypass_bot_fight_mode.md, 2026-05-12). Heroku dynos egress from AWS ASNs (AS14618 / AS16509), which Cloudflare flags as bot-origin.

The WAF skip rule (Priority 1) covers only /api/internal/* paths, the exact paths guarded by InternalAuthFilter. This is tighter than disabling BFM zone-wide: the skip applies only where InternalAuthFilter (Layer 4) validates the Bearer token before any handler runs.

For billing CRUD and webhook endpoints (/api/v1/billing/*), Raptor and Console call these as authenticated callers. Stripe's webhook delivery IPs are AWS-based and will trigger BFM. The billing webhook path needs its own skip rule:

| Priority | Rule | Action |
|---|---|---|
| 1b | Skip BFM on /api/v1/billing/webhook with Stripe-Signature header | Skip |

This header is present on all Stripe webhook deliveries. The HMAC verification in the webhook handler (QP-C5) validates the signature before processing.
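To make the scope of the two skip rules concrete, here is a minimal C++ model of their match conditions. It is a sketch for reasoning and unit tests only: the real evaluation happens in Cloudflare's rules engine, the host condition is omitted, and `HeaderMap`, `matchesInternalSkip`, and `matchesStripeSkip` are hypothetical names.

```cpp
#include <map>
#include <string>

// Assumption: header names are stored lower-cased.
using HeaderMap = std::map<std::string, std::string>;

inline bool hasNonEmptyHeader(const HeaderMap& headers, const std::string& name) {
    const auto it = headers.find(name);
    return it != headers.end() && !it->second.empty();
}

// Models Priority 1: path starts with "/api/internal" and an
// Authorization header is present. The token itself is NOT validated here.
inline bool matchesInternalSkip(const std::string& path, const HeaderMap& headers) {
    static const std::string kPrefix = "/api/internal";
    return path.compare(0, kPrefix.size(), kPrefix) == 0 &&
           hasNonEmptyHeader(headers, "authorization");
}

// Models Priority 1b: exact webhook path and a Stripe-Signature header
// is present. Signature verification happens later, in the handler.
inline bool matchesStripeSkip(const std::string& path, const HeaderMap& headers) {
    return path == "/api/v1/billing/webhook" &&
           hasNonEmptyHeader(headers, "stripe-signature");
}
```

Two properties fall out of the model. The prefix match also accepts a path such as `/api/internal-other`, and either rule is satisfied by any non-empty header value, so both skips rely on the origin-side checks (InternalAuthFilter, Stripe HMAC) for actual authentication.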

4.3 C++ Origin Guard Middleware Spec

Feature-developer implements queue/src/middleware/cf_origin_guard.cpp with this interface:

// cf_origin_guard.h
#pragma once

#include <drogon/HttpFilter.h>
#include <string>
#include <unordered_set>

namespace queue {

/**
 * CfOriginGuard — Drogon HttpFilter.
 *
 * Rejects requests that did not transit Cloudflare when
 * FLAG_ENFORCE_QUEUE_CF_ORIGIN is set to "true" or "1".
 *
 * Proof: Cloudflare injects CF-Connecting-IP on every proxied request.
 * Direct-to-origin requests via raxx-queue-{prod,staging}.herokuapp.com
 * carry no CF-Connecting-IP header.
 *
 * Invariants:
 *   I-ALLOW: /health is always permitted (Heroku dyno liveness probe)
 *   I-FLAG:  FLAG_ENFORCE_QUEUE_CF_ORIGIN read at construction; restart
 *            required to change. Default = false (safe during rollout soak).
 *   I-LOG:   Blocked requests log path + remote_addr at WARN level with
 *            key "direct_origin_blocked". No PII from request body is logged.
 *   I-RAII:  No raw new/delete.
 *
 * Registration:
 *   HttpFilter<T> subclasses self-register with Drogon under their class
 *   name, so no explicit registration call is needed in main.cpp.
 *   Attach per route via ADD_METHOD_TO(..., "queue::CfOriginGuard"),
 *   OR apply the same check to every request from a global hook such as
 *   app().registerPreRoutingAdvice(...).
 */
class CfOriginGuard : public drogon::HttpFilter<CfOriginGuard> {
public:
    CfOriginGuard();

    void doFilter(const drogon::HttpRequestPtr& req,
                  drogon::FilterCallback&&      callback,
                  drogon::FilterChainCallback&& chainCallback) override;

private:
    bool enforcement_enabled_;

    // Paths that must remain reachable regardless of CF-Connecting-IP.
    // /health: Heroku platform liveness probe (not routed via CF).
    static const std::unordered_set<std::string> kAllowlistedPaths;

    static drogon::HttpResponsePtr make403();
};

} // namespace queue
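The header comment says enforcement is on when FLAG_ENFORCE_QUEUE_CF_ORIGIN is "true" or "1" and is read once at construction. A minimal sketch of that parsing follows, assuming exact, case-sensitive matching and that an unset variable means off. `parseEnforceFlag` and `enforcementEnabledFromEnv` are hypothetical helper names.

```cpp
#include <cstdlib>
#include <string>

// Returns true only for the exact values "true" or "1".
// Assumption: matching is case-sensitive; nullptr (unset) means disabled.
inline bool parseEnforceFlag(const char* raw) {
    if (raw == nullptr) return false;
    const std::string value(raw);
    return value == "true" || value == "1";
}

// Reads the flag from the process environment. Intended to be called once,
// for example from the filter constructor, so a restart is needed to change it.
inline bool enforcementEnabledFromEnv() {
    return parseEnforceFlag(std::getenv("FLAG_ENFORCE_QUEUE_CF_ORIGIN"));
}
```

Defaulting to off for anything unrecognised keeps a typo in the config var from enabling enforcement during the rollout soak.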

Key implementation notes for feature-developer:


5. Concrete Terraform Design

Module layout

terraform/queue/
├── versions.tf         # Cloudflare provider ~> 4.0, S3 backend
├── variables.tf        # cloudflare_zone_id, heroku_* vars
├── dns.tf              # cloudflare_record for queue.raxx.app + queue-staging.raxx.app
├── waf.tf              # cloudflare_ruleset for rate limits + WAF rules + skip rules
├── outputs.tf          # record IDs, zone info
├── terraform.tfvars.example
└── README.md

dns.tf sketch

# ---------------------------------------------------------------------------
# Cloudflare DNS — Queue identity service
# ---------------------------------------------------------------------------
# Heroku custom domain must be attached BEFORE DNS is proxied.
# Operator action: see README.md §Operator Bootstrap.
#
# queue.raxx.app → raxx-queue-prod.herokuapp.com (proxied)
# queue-staging.raxx.app → raxx-queue-staging.herokuapp.com (proxied)

resource "cloudflare_record" "queue_prod" {
  zone_id = var.cloudflare_zone_id
  name    = "queue"
  type    = "CNAME"
  value   = "raxx-queue-prod.herokuapp.com"
  proxied = true

  comment = "Queue identity service prod — managed by Terraform"
}

resource "cloudflare_record" "queue_staging" {
  zone_id = var.cloudflare_zone_id
  name    = "queue-staging"
  type    = "CNAME"
  value   = "raxx-queue-staging.herokuapp.com"
  proxied = true

  comment = "Queue identity service staging — managed by Terraform"
}

waf.tf sketch (CF ruleset)

# ---------------------------------------------------------------------------
# CF WAF ruleset for queue.raxx.app + queue-staging.raxx.app
# ---------------------------------------------------------------------------
# Phase 1 rules (billing surface only).
# Phase 2 auth rules activate when WebAuthn endpoints ship (auth endpoints
# are gated behind FLAG_QUEUE_AUTH which is off in Phase 1).

resource "cloudflare_ruleset" "queue_prod_waf" {
  zone_id = var.cloudflare_zone_id
  name    = "Queue prod WAF rules"
  kind    = "zone"
  phase   = "http_request_firewall_custom"

  # Priority 1: Skip Bot Fight Mode for service-to-service /api/internal/*
  # Bearer header presence is a pre-filter; InternalAuthFilter does the real
  # validation. Without this skip, Heroku-egress IPs (AWS ASN) trigger BFM.
  # Note: the v4 provider names this block "rules", and the Cloudflare rules
  # language uses lowercase logical operators ("and", "or").
  rules {
    action      = "skip"
    expression  = <<-EOT
      (http.host eq "queue.raxx.app" and
       starts_with(http.request.uri.path, "/api/internal") and
       len(http.request.headers["authorization"]) gt 0)
    EOT
    description = "Skip Bot Fight Mode for service-to-service calls to /api/internal/*"
    action_parameters {
      phases   = ["http_ratelimit", "http_request_firewall_managed"]
      products = ["bic", "hot", "uaBlock"]
    }
    enabled = true
  }

  # Priority 1b: Skip Bot Fight Mode for Stripe webhook delivery
  rules {
    action      = "skip"
    expression  = <<-EOT
      (http.host eq "queue.raxx.app" and
       http.request.uri.path eq "/api/v1/billing/webhook" and
       len(http.request.headers["stripe-signature"]) gt 0)
    EOT
    description = "Skip Bot Fight Mode for Stripe webhook delivery"
    action_parameters {
      phases   = ["http_ratelimit", "http_request_firewall_managed"]
      products = ["bic", "hot", "uaBlock"]
    }
    enabled = true
  }
}

# Rate limiting rules run in the http_ratelimit phase, so they live in their
# own ruleset rather than in http_request_firewall_custom.
resource "cloudflare_ruleset" "queue_prod_ratelimit" {
  zone_id = var.cloudflare_zone_id
  name    = "Queue prod rate limits"
  kind    = "zone"
  phase   = "http_ratelimit"

  # Priority 3: Rate limit billing webhook
  rules {
    action     = "block"
    expression = <<-EOT
      (http.host eq "queue.raxx.app" and
       http.request.uri.path eq "/api/v1/billing/webhook" and
       http.request.method eq "POST")
    EOT
    description = "Rate limit Stripe webhook endpoint"
    ratelimit {
      # cf.colo.id is listed alongside ip.src, as in Cloudflare's examples.
      characteristics      = ["cf.colo.id", "ip.src"]
      period               = 60
      requests_per_period  = 60
      mitigation_timeout   = 60
    }
    enabled = true
  }

  # Priority 6: Global rate limit
  rules {
    action     = "managed_challenge"
    expression = "(http.host eq \"queue.raxx.app\")"
    description = "Global rate limit: Queue all paths"
    ratelimit {
      characteristics      = ["cf.colo.id", "ip.src"]
      period               = 60
      requests_per_period  = 300
      mitigation_timeout   = 60
    }
    enabled = true
  }
}

# Staging mirrors prod with the host expression set to queue-staging.raxx.app.
# A zone has one entry-point ruleset per phase, so if both hosts share a zone
# the staging rules belong in these same rulesets (for example via locals or
# dynamic blocks) rather than in a second zone-level ruleset for the phase.
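To sanity-check thresholds such as 60 requests per 60 seconds with a 60-second mitigation timeout before applying them, a small local model can help. The sketch below is a simple fixed-window counter keyed by IP. It is an assumption-laden approximation for reasoning and tests, not Cloudflare's actual counting algorithm, and `FixedWindowLimiter` is a hypothetical name.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Illustrative fixed-window limiter: at most `limit` requests per `periodSec`
// per key; once exceeded, the key is rejected for `timeoutSec`.
class FixedWindowLimiter {
public:
    FixedWindowLimiter(std::uint32_t limit, std::uint64_t periodSec,
                       std::uint64_t timeoutSec)
        : limit_(limit), periodSec_(periodSec), timeoutSec_(timeoutSec) {}

    // Returns true if the request is allowed at time `nowSec`.
    bool allow(const std::string& ip, std::uint64_t nowSec) {
        State& s = state_[ip];
        if (nowSec < s.blockedUntil) return false;  // still mitigated
        if (!s.started || nowSec - s.windowStart >= periodSec_) {
            s.started = true;                       // open a fresh window
            s.windowStart = nowSec;
            s.count = 0;
        }
        if (s.count >= limit_) {                    // threshold exceeded
            s.blockedUntil = nowSec + timeoutSec_;
            return false;
        }
        ++s.count;
        return true;
    }

private:
    struct State {
        bool started = false;
        std::uint64_t windowStart = 0;
        std::uint32_t count = 0;
        std::uint64_t blockedUntil = 0;
    };

    std::uint32_t limit_;
    std::uint64_t periodSec_;
    std::uint64_t timeoutSec_;
    std::unordered_map<std::string, State> state_;
};
```

For the webhook rule this would be constructed as `FixedWindowLimiter(60, 60, 60)`; feeding it a recorded burst of Stripe deliveries from one source address shows whether the threshold would have triggered.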

Heroku custom domain attachment

This Terraform stack does not manage Heroku custom domains (the CF TF stack carries no Heroku provider). Domain attachment is an operator-side CLI action (Phase 0):

# Attach custom domain — prod
heroku domains:add queue.raxx.app --app raxx-queue-prod >/dev/null
heroku domains:wait --app raxx-queue-prod

# Attach custom domain — staging
heroku domains:add queue-staging.raxx.app --app raxx-queue-staging >/dev/null
heroku domains:wait --app raxx-queue-staging

Heroku generates an ACM cert automatically after domain attachment. The Heroku DNS target returned by heroku domains --app raxx-queue-prod is in the format <hash>.herokudns.com. The Terraform CNAME value must use raxx-queue-prod.herokuapp.com (the canonical Heroku origin URL), NOT the herokudns.com target. Cloudflare resolves the CNAME and proxies it correctly.


6. Failure Modes

| # | Failure | Trigger | Impact | Detection | Recovery |
|---|---|---|---|---|---|
| FM-1 | CF outage (full zone offline) | CF network incident | Origin guard becomes load-bearing. Phase 1: billing CRUD callers (Raptor, Console, Velvet) call Queue directly; since FLAG_ENFORCE_QUEUE_CF_ORIGIN=false is the default until Phase 4, billing continues. Phase 4 (guard ON): all non-allowlisted traffic returns 403. | CF status page, Sentry alert on 5xx spike | Operator: heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=false --app raxx-queue-prod >/dev/null && heroku restart --app raxx-queue-prod to temporarily disable guard. |
| FM-2 | Heroku custom domain DNS drift (cert expiry or domain detach) | Heroku ACM cert not renewed, or operator accidentally runs heroku domains:remove | CF CNAME resolves to dead origin; queue.raxx.app returns 521/522. Billing updates stop. | Sentry alert on 5xx at queue.raxx.app; Raptor mirror-sync lag | Reattach domain: heroku domains:add queue.raxx.app --app raxx-queue-prod >/dev/null. If cert issue: heroku certs:auto:enable --app raxx-queue-prod. |
| FM-3 | Existing callers using raxx-queue-*.herokuapp.com directly break when origin guard flips | Phase 4 flag flip before caller migration (Phase 2) completes | Raptor mirror-sync returns 403; Stripe webhook delivery blocked at origin guard | 403 log entries with direct_origin_blocked in Heroku logs + Sentry alert | Disable guard flag. Audit callers (SC-Q-CF-4). Re-flip after migration complete. |
| FM-4 | Raptor → Queue internal calls trip Bot Fight Mode | BFM evaluates before WAF skip rule; skip rule misconfigured | /api/internal/billing/mirror-sync calls return 403 from CF BFM before reaching Queue | Sentry 403 alert on mirror-sync; billing mirror staleness alert | Verify WAF skip rule Priority 1 is applied in Cloudflare dashboard. Check that Authorization header is present on Raptor's outbound calls. |
| FM-5 | Stripe webhook delivery blocked by BFM | Stripe webhook skip rule missing or misconfigured | POST /api/v1/billing/webhook returns 403 from CF; Stripe marks endpoint degraded; events queue in Stripe | Stripe dashboard webhook failure alert; Queue Sentry 403 alert | Add/verify WAF skip rule Priority 1b for Stripe-Signature header. Check CF WAF logs. |
| FM-6 | mTLS desired but unavailable on Heroku | WAF threat model §S11 suggests mTLS for service-to-service | mTLS would prevent token exfiltration even if CF is bypassed. Heroku does not support mutual TLS at the dyno layer (no client cert enforcement at the TLS termination point). | Architectural constraint, not a runtime failure | Accepted constraint. Compensating control: InternalAuthFilter + short-lived service tokens (≤90 day rotation per Velvet). mTLS revisit: if Queue migrates off Heroku to Fly.io or AWS ECS, mTLS is feasible. |
| FM-7 | Heroku-to-Heroku calls traverse CF (latency overhead) | All services on Heroku; Queue behind CF proxy | Raptor → queue.raxx.app → CF edge (US/EU PoP) → Heroku US → Queue. Adds ~5–20ms round-trip vs. direct. Auth hot path (Phase 2 JWT mint) is affected. | Latency spike visible in Sentry performance traces | Accepted in Phase 1 (billing, not auth hot path). Phase 2 mitigation: Raptor verifies JWTs offline (JWKS cache); does NOT call Queue per request. Only token refresh and explicit revocation calls go via Queue. |
| FM-8 | CF-Connecting-IP header spoofed by external caller | Attacker directly hits Heroku origin URL with a forged CF-Connecting-IP header | Origin guard passes the spoofed request | CF publishes its IP ranges; Phase 2 of origin guard validates source IP against CF CIDR list | Short-term: accept header-only check (consistent with Raptor pattern). Phase 2 hardening: validate that the connecting IP is in the CF IP CIDR list (published at cloudflare.com/ips-v4). On Heroku the dyno's peer address is the Heroku router, so request->getPeerAddr() alone is likely insufficient; the connecting IP that the router appends to X-Forwarded-For is the probable source. |
| FM-9 | Terraform apply destroys DNS record in wrong order | terraform destroy or erroneous resource rename removes CNAME before Heroku domain is detached | queue.raxx.app goes NXDOMAIN; all callers fail | N/A (post-hoc) | terraform import cloudflare_record.queue_prod <zone_id>/<record_id> to restore. |
| FM-10 | FLAG_ENFORCE_QUEUE_CF_ORIGIN=true flipped on prod before staging soak completes | Phase 4 runbook not followed | Legitimate callers with the old Heroku URL hard-coded receive 403 | Sentry 403 spike; Raptor billing mirror staleness | heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=false --app raxx-queue-prod >/dev/null && heroku restart --app raxx-queue-prod |
| FM-11 | Bot Fight Mode false-positive on /health endpoint | Heroku platform sends health probe from AWS-range IP | Dyno reported unhealthy → restart loop | Heroku dyno restart alerts | /health is allowlisted in CfOriginGuard. BFM skip rule does not cover /health. Add /health to WAF skip rule if BFM triggers on Heroku health check probes. |
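The FM-8 hardening step (checking the source address against Cloudflare's published ranges) comes down to an IPv4 CIDR membership test. The sketch below shows one way to do that in plain C++. The function names are hypothetical, IPv6 is not handled, and the range used in the usage note is a documentation placeholder (TEST-NET-2) rather than a real Cloudflare range, which would have to be loaded from the published list. Which address to feed in (socket peer versus a forwarded header) depends on how the Heroku router presents the connection and is left open here.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Parses a dotted-quad IPv4 address into a host-order 32-bit value.
inline std::optional<std::uint32_t> parseIpv4(const std::string& text) {
    std::uint32_t value = 0;
    int octets = 0;
    std::size_t pos = 0;
    while (pos <= text.size() && octets < 4) {
        std::size_t dot = text.find('.', pos);
        if (dot == std::string::npos) dot = text.size();
        const std::string part = text.substr(pos, dot - pos);
        if (part.empty() || part.size() > 3) return std::nullopt;
        int n = 0;
        for (char c : part) {
            if (c < '0' || c > '9') return std::nullopt;
            n = n * 10 + (c - '0');
        }
        if (n > 255) return std::nullopt;
        value = (value << 8) | static_cast<std::uint32_t>(n);
        ++octets;
        pos = dot + 1;
    }
    // Exactly four octets and no trailing characters.
    if (octets != 4 || pos != text.size() + 1) return std::nullopt;
    return value;
}

// True if `ip` falls inside `cidr` (for example "198.51.100.0/24").
inline bool inCidr(const std::string& ip, const std::string& cidr) {
    const std::size_t slash = cidr.find('/');
    if (slash == std::string::npos) return false;
    const auto addr = parseIpv4(ip);
    const auto base = parseIpv4(cidr.substr(0, slash));
    const std::string suffix = cidr.substr(slash + 1);
    if (!addr || !base || suffix.empty() || suffix.size() > 2) return false;
    int bits = 0;
    for (char c : suffix) {
        if (c < '0' || c > '9') return false;
        bits = bits * 10 + (c - '0');
    }
    if (bits > 32) return false;
    const std::uint32_t mask =
        bits == 0 ? 0u : (~std::uint32_t{0} << (32 - bits));
    return (*addr & mask) == (*base & mask);
}

// True if `ip` is inside any of the given ranges.
inline bool inAnyRange(const std::string& ip,
                       const std::vector<std::string>& ranges) {
    for (const auto& range : ranges) {
        if (inCidr(ip, range)) return true;
    }
    return false;
}
```

With a placeholder list of `{"198.51.100.0/24"}`, `inAnyRange("198.51.100.42", ranges)` is true and `inAnyRange("203.0.113.7", ranges)` is false. Malformed input fails closed, returning false.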

7. Migrations

No database schema changes in this design. The origin guard is pure middleware.

The only state changes are:

  - Cloudflare DNS record creation (Terraform apply, reversible with destroy)
  - Heroku custom domain attachment (CLI, reversible with heroku domains:remove)
  - Feature flag flip (env var, reversible with heroku config:unset)

Rollback for each phase:

| Phase | Rollback |
|---|---|
| Phase 0 (domain attach) | heroku domains:remove queue.raxx.app --app raxx-queue-prod >/dev/null |
| Phase 1 (DNS record) | terraform destroy -target cloudflare_record.queue_prod |
| Phase 2 (caller migration) | Revert env var QUEUE_API_BASE_URL to raxx-queue-prod.herokuapp.com in callers |
| Phase 3 (flag on staging) | heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=false --app raxx-queue-staging >/dev/null |
| Phase 4 (flag on prod) | heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=false --app raxx-queue-prod >/dev/null |

8. Rollout Plan

Phase 0 — Operator actions (no enforcement, no risk)

Target: 2026-05-11 UTC (immediate)

  1. Attach custom domain to Heroku apps:
       heroku domains:add queue.raxx.app --app raxx-queue-prod >/dev/null
       heroku domains:wait --app raxx-queue-prod
       heroku domains:add queue-staging.raxx.app --app raxx-queue-staging >/dev/null
       heroku domains:wait --app raxx-queue-staging
  2. Note the ACM DNS targets returned by heroku domains --app raxx-queue-prod for verification.
  3. Verify heroku certs:auto --app raxx-queue-prod shows ACM cert provisioning in progress.

No traffic change. The Heroku origin URL remains directly reachable. This just prepares the custom domain cert.

Phase 1 — CF proxied DNS live (no enforcement)

Precondition: Phase 0 complete, Heroku ACM cert provisioned.

  1. terraform apply in terraform/queue/ — creates CNAME records, proxied=true.
  2. Verify: curl -i https://queue.raxx.app/health returns 200 (CF-proxied).
  3. Verify: curl -si https://queue.raxx.app/health | grep -i cf-ray. The presence of a CF-RAY header confirms CF is in the path (grep -i because header-name case varies with the HTTP version).
  4. FLAG_ENFORCE_QUEUE_CF_ORIGIN remains false (default). No traffic impact.
  5. Soak 48 hours. Monitor heroku logs --tail --app raxx-queue-prod for unexpected traffic patterns.
  6. Update surface matrix in docs/security/web-surface-posture.md to reflect Queue now proxied.

Phase 2 — Caller migration audit

Precondition: Phase 1 soak clean.

Feature-developer runs SC-Q-CF-4: audit all callers for hard-coded raxx-queue-*.herokuapp.com URLs. Callers:

| Service | Config variable | Action |
|---|---|---|
| Raptor (backend_v2/) | QUEUE_API_BASE_URL env var | Confirm set to https://queue.raxx.app; rotate on staging first |
| Console | QUEUE_API_BASE_URL env var | Same |
| Velvet | QUEUE_API_BASE_URL env var | Same |
| iOS client | QUEUE_BASE_URL build config | Confirm; TestFlight build for staging |
| GH Actions cron jobs | Hard-coded URL in workflow? | Audit .github/workflows/ for any hard-coded Heroku URLs |

All callers should be calling queue.raxx.app (or queue-staging.raxx.app) before Phase 3.

Phase 3 — Origin guard ON (staging)

Precondition: Phase 2 migration complete on staging.

  1. Feature-developer ships cf_origin_guard.cpp (SC-Q-CF-3).
  2. heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=true --app raxx-queue-staging >/dev/null && heroku restart --app raxx-queue-staging
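  3. Verify: curl -i https://raxx-queue-staging.herokuapp.com/api/internal/billing/mirror-sync → 403 direct_origin_blocked. Use a non-allowlisted path for this check: /health is allowlisted and still returns 200 on the direct origin URL.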
  4. Verify: curl -i https://queue-staging.raxx.app/health → 200 (CF-proxied, allowlisted path).
  5. Run full integration test suite against queue-staging.raxx.app (not the Heroku URL).
  6. Soak 48 hours. Watch Sentry for unexpected 403s.

Phase 4 — Origin guard ON (prod)

Precondition: Phase 3 soak clean, Phase 2 migration complete on prod.

  1. heroku config:set FLAG_ENFORCE_QUEUE_CF_ORIGIN=true --app raxx-queue-prod >/dev/null && heroku restart --app raxx-queue-prod
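  2. Verify: curl -i https://raxx-queue-prod.herokuapp.com/api/internal/billing/mirror-sync → 403 direct_origin_blocked (a non-allowlisted path; /health is allowlisted and still returns 200 direct).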
  3. Verify: curl -i https://queue.raxx.app/health → 200.
  4. Update docs/security/web-surface-posture.md surface matrix row for Queue.
  5. Close #1733.

Total elapsed time from Phase 0 start to Phase 4 completion: minimum 96 hours (two 48h soaks). Realistically 5–7 days including caller migration.


9. Security Considerations

| Question | Answer |
|---|---|
| What PII does this collect? | Origin guard logs only path and remote_addr (IP address). No request body, no auth tokens, no user identifiers are logged by the guard itself. |
| What is the retention period? | Guard reject events are structured log lines in Heroku's Logplex. Logplex retention is 1,500 lines (~1 hour). Persistent retention: when Heroku log drain to Papertrail or S3 is configured, guard events follow Queue's audit log retention (2 years, DPA). |
| How is it deleted on DSR? | IP addresses in Heroku logs are not directly linked to a data subject record. If a DSR requests log deletion, follow Queue's audit log erasure runbook. Guard logs contain no PII beyond IP. |
| What is logged for audit? | Blocked request: direct_origin_blocked, path, remote_addr. Allowed request: no additional log line (Drogon access log handles request logging). |
| Does any part store a credential in a form that could be replayed? | No. The guard reads no credentials; it inspects only the presence of CF-Connecting-IP. |
| What happens on breach? | CF edge protection does not store data. A breach of Queue's database follows Queue's existing breach response (GDPR Art. 33 72h notification). The guard itself is not a breach surface. |
| Where are secrets? | CLOUDFLARE_API_TOKEN for Terraform apply: Infisical /MooseQuest/cloudflare/. FLAG_ENFORCE_QUEUE_CF_ORIGIN: Heroku config var (not a secret; a boolean). No new secret paths introduced. |
| Is there a kill-switch for live execution paths? | FLAG_ENFORCE_QUEUE_CF_ORIGIN=false disables the guard without redeploy. Raptor's billing mirror-sync will fail-open (continue using stale mirror) per existing Raptor logic. |
| Are secrets rotatable without redeploy? | N/A: no new secrets introduced by this design. The feature flag can be toggled via heroku config:set without redeploy (requires dyno restart). |
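The audit answers above pin down what a blocked-request log line may contain: the direct_origin_blocked key, the path, and remote_addr, at WARN level. A minimal sketch of a formatter that stays within those fields follows. The key=value layout, the function names, and the defensive stripping of any query string are assumptions for illustration; the source specifies only the key and the two fields.

```cpp
#include <string>

// Drops everything from '?' onward so query-string values never reach the log.
inline std::string pathOnly(const std::string& target) {
    return target.substr(0, target.find('?'));
}

// Builds one structured line for a blocked request.
// Assumption: logfmt-style key=value pairs; only path and remote_addr are emitted.
inline std::string formatBlockedLog(const std::string& target,
                                    const std::string& remoteAddr) {
    return "level=WARN event=direct_origin_blocked path=" + pathOnly(target) +
           " remote_addr=" + remoteAddr;
}
```

Keeping the formatter as a pure function makes the "no body, no tokens, no identifiers" claim easy to assert in a unit test.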

10. Open Questions (require operator decisions)

OQ-1 — Subdomain: queue.raxx.app or api-queue.raxx.app?

queue.raxx.app is clean and matches the service codename. api-queue.raxx.app is more descriptive but breaks the short-subdomain pattern (console.raxx.app, api.raxx.app). Recommended: queue.raxx.app. Feature-developer cannot proceed with Phase 0 until this is confirmed.

OQ-2 — CF Access on Queue's admin endpoints (Phase 2+)?

When Phase 2 ships WebAuthn admin endpoints or Console-operator-to-Queue admin paths (e.g., GET /api/admin/customers, POST /api/admin/rbac/grant), should these be gated by CF Access? Recommended: yes — apply CF Access with the operator allowlist policy, matching the pattern for console.raxx.app. This blocks anyone who discovers the queue.raxx.app URL from probing admin endpoints even if they have a stolen service token. Operator decision needed before Phase 2 sub-cards are groomed.

OQ-3 — Heroku custom domain SSL: automatic (ACM) or BYO cert?

Heroku's Automated Certificate Management (ACM) provisions a free cert automatically on custom domain attachment. This is sufficient. BYO cert adds operational burden. Recommended: ACM. No action needed unless operator has a specific cert requirement.

OQ-4 — Bot Fight Mode policy for Queue auth endpoints: on or super?

super (Cloudflare Super Bot Fight Mode) provides fingerprint-based detection in addition to ASN-based detection. More effective against sophisticated bots but can generate more false-positives. Recommended for Phase 2 auth endpoints: start with on during soak, evaluate false-positive rate on WebAuthn flows, then consider super. Operator decides the final posture.

OQ-5 — WAF log ingestion pipeline?

Cloudflare WAF logs are available via Logpush (Enterprise) or in the dashboard (all plans). For systematic threat monitoring, WAF logs should flow to the same sink as Sentry. This is noted in ADR-0077 D2 (shared open question). Not blocking for Phase 0–4; needed before Queue is customer-facing (Phase 2+).


11. Sequence Diagram

sequenceDiagram
    participant Caller as Caller<br/>(Raptor/Console/Velvet)
    participant CF as Cloudflare Edge<br/>(WAF + BFM)
    participant Origin as Queue Origin<br/>(Heroku dyno)
    participant Guard as CfOriginGuard<br/>(Layer 3)
    participant Auth as InternalAuthFilter<br/>(Layer 4)
    participant Handler as Route Handler

    Caller->>CF: POST queue.raxx.app/api/internal/billing/mirror-sync<br/>Authorization: Bearer <token>
    CF->>CF: Priority 1 skip rule matches<br/>(path starts /api/internal, Authorization present)<br/>BFM skipped
    CF->>CF: WAF managed rules evaluated<br/>Rate limits evaluated
    CF->>Origin: Forward request<br/>CF-Connecting-IP: <client IP><br/>Authorization: Bearer <token>
    Origin->>Guard: doFilter()
    Note over Guard: enforcement_enabled_=true (Phase 4)<br/>CF-Connecting-IP present → pass
    Guard->>Auth: chainCallback()
    Note over Auth: extract Bearer token<br/>lookup in token_map_<br/>caller=raptor → pass
    Auth->>Handler: chainCallback()
    Handler-->>Caller: 200 OK

    Note over Caller,Handler: --- Direct origin bypass attempt (Phase 4) ---

    Caller->>Origin: POST raxx-queue-prod.herokuapp.com/api/internal/billing/mirror-sync
    Note over CF: Traffic never transits CF<br/>(direct Heroku URL)
    Origin->>Guard: doFilter()
    Note over Guard: CF-Connecting-IP absent<br/>path not allowlisted<br/>enforcement_enabled_=true
    Guard-->>Caller: 403 {"error":"direct_origin_blocked"}
