ADR-0067 — Queue: Signed JWT for Session Tokens (Offline Verification)
Date: 2026-05-09 UTC
Status: Accepted — values locked 2026-05-09 UTC by security-agent research
Deciders: software-architect (design), security-agent (TTL + rotation + revocation values)
Context: docs/architecture/queue/design.md §Session Token Strategy
Research: docs/architecture/queue/jwt-ttl-research-2026-05-09.md
Context
Every authenticated Raptor request must verify a session token issued by Queue. Two models:
Option A — Opaque token, online verification: Queue issues a random opaque token stored in queue_sessions. Raptor calls GET /api/v1/sessions/current/status on Queue for every request to validate the token. Revocation is instant.
Option B — Signed JWT, offline verification: Queue issues a short-lived RS256 JWT embedding customer_id, session_id, roles, paper_first_gate, fresh_until. Raptor verifies the signature offline using QUEUE_JWT_PUBLIC_KEY. No Queue call per request. Revocation window = JWT TTL (max 15 minutes).
Decision
Option B — Signed JWT (RS256, offline verification).
Access-token values
| Parameter | Value | Source |
|---|---|---|
| Algorithm | RS256 only; algorithms allowlist hardcoded | OWASP JWT Cheat Sheet; alg:none prevention |
| TTL | 15 minutes | Security-agent research 2026-05-09; OWASP, Schwab Trader API, industry consensus |
| aud claim | raxx-raptor-v1 (pinned) | Prevents cross-service replay |
| jti claim | UUID per token | Required for optional incident blocklist |
| Clock skew tolerance | 30 seconds (leeway) | OWASP max recommendation |
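The policy values above can be sketched as a check that runs alongside signature verification. This is a minimal illustration, not Raptor's actual code: `check_token_policy` and `TokenRejected` are hypothetical names, the issuer value is taken from the claims listing in this ADR, and the RS256 signature check itself is out of scope here (it would normally be delegated to a JWT library).

```python
from typing import Any, Mapping

# Values from the access-token table; the allowlist is hardcoded on purpose.
ALLOWED_ALGORITHMS = frozenset({"RS256"})
EXPECTED_AUDIENCE = "raxx-raptor-v1"
EXPECTED_ISSUER = "raxx-queue-v1"
LEEWAY_SECONDS = 30


class TokenRejected(Exception):
    """Hypothetical error type: the token failed a policy check."""


def check_token_policy(header: Mapping[str, Any],
                       claims: Mapping[str, Any],
                       now: int) -> None:
    """Raise TokenRejected unless header and claims satisfy the pinned policy.

    `now` is a Unix timestamp in seconds. Signature verification is assumed
    to happen elsewhere; this only covers the allowlist and claim checks.
    """
    # Rejects alg:none, HS256, and a missing alg in one comparison.
    if header.get("alg") not in ALLOWED_ALGORITHMS:
        raise TokenRejected("algorithm not allowed")
    if claims.get("aud") != EXPECTED_AUDIENCE:
        raise TokenRejected("wrong audience")
    if claims.get("iss") != EXPECTED_ISSUER:
        raise TokenRejected("wrong issuer")
    if not claims.get("jti"):
        raise TokenRejected("missing jti")
    exp = claims.get("exp")
    # Accept the token for at most LEEWAY_SECONDS past its stated expiry.
    if not isinstance(exp, int) or now >= exp + LEEWAY_SECONDS:
        raise TokenRejected("expired")
```

Checking the algorithm against a constant, rather than trusting the token header to choose the verifier, is what closes the alg:none path.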
JWT claims:
{
"sub": "<customer_id>",
"sid": "<session_id>",
"jti": "<uuid>",
"aud": "raxx-raptor-v1",
"iss": "raxx-queue-v1",
"tier": "free|founders|pro",
"roles": ["antlers-user"],
"paper_first_gate": false,
"fresh_until": "<iso8601_utc>",
"iat": <unix_ts>,
"exp": <unix_ts>
}
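As an illustration of how Queue might assemble this payload before signing, the sketch below fills in the claims with a 15-minute expiry. The function name `build_access_claims` and its parameters are assumptions made for this example; signing with the RSA private key is not shown.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List

ACCESS_TOKEN_TTL_SECONDS = 15 * 60  # 15-minute TTL from the access-token table


def build_access_claims(customer_id: str,
                        session_id: str,
                        tier: str,
                        roles: List[str],
                        paper_first_gate: bool,
                        fresh_until: datetime,
                        now: datetime) -> Dict[str, Any]:
    """Return the unsigned claims dict (hypothetical helper).

    `fresh_until` and `now` must be timezone-aware datetimes.
    """
    iat = int(now.timestamp())
    return {
        "sub": customer_id,
        "sid": session_id,
        "jti": str(uuid.uuid4()),  # one fresh UUID per token
        "aud": "raxx-raptor-v1",
        "iss": "raxx-queue-v1",
        "tier": tier,
        "roles": list(roles),
        "paper_first_gate": paper_first_gate,
        # Normalised to UTC so every consumer parses the same offset.
        "fresh_until": fresh_until.astimezone(timezone.utc).isoformat(),
        "iat": iat,
        "exp": iat + ACCESS_TOKEN_TTL_SECONDS,
    }
```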
Refresh-token values
| Parameter | Value | Operator working value | Notes |
|---|---|---|---|
| TTL | 7 days | 30 days | Counter-recommended; see research doc §2 |
| Rotation | On every use | (not specified) | RFC 6819 §5.2.2.3; Okta SPA default |
| Rotation grace period | 30 seconds | (not specified) | Allows mobile/slow-network retries |
| Reuse detection | Full session-chain invalidation | (not specified) | Simultaneous attacker + victim lockout |
| Storage (browser) | HttpOnly, Secure, SameSite=Strict cookie | (not specified) | RT never readable by Antlers JS |
| Storage (iOS, future) | iOS Keychain (platform WebAuthn layer) | (not specified) | To be confirmed when iOS auth work begins |
If the operator retains the 30-day RT TTL, rotation-on-every-use is mandatory (not optional), and the ADR must explicitly document the 23-day residual risk window (the 30-day operator value minus the recommended 7 days).
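The rotation, grace-period, and reuse-detection rows can be read together as a small state machine. The sketch below is one possible interpretation, using an in-memory store in place of queue_sessions. `RefreshTokenStore`, `SessionChain`, and `RefreshRejected` are hypothetical names. It assumes a retry inside the grace window receives the already-rotated token rather than triggering another rotation, which this ADR does not specify.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

GRACE_SECONDS = 30  # rotation grace period from the table


class RefreshRejected(Exception):
    """Hypothetical error type: the refresh token was not accepted."""


@dataclass
class SessionChain:
    current: str
    previous: Optional[str] = None
    rotated_at: float = 0.0
    revoked: bool = False


class RefreshTokenStore:
    """In-memory stand-in for a real session store (illustration only)."""

    def __init__(self) -> None:
        self._chains: Dict[str, SessionChain] = {}  # session_id -> chain
        self._owner: Dict[str, str] = {}            # every issued token -> session_id

    def issue(self, session_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._chains[session_id] = SessionChain(current=token)
        self._owner[token] = session_id
        return token

    def use(self, token: str, now: float) -> str:
        session_id = self._owner.get(token)
        if session_id is None:
            raise RefreshRejected("unknown token")
        chain = self._chains[session_id]
        if chain.revoked:
            raise RefreshRejected("session chain revoked")
        if token == chain.current:
            # Normal path: rotate on every use.
            new_token = secrets.token_urlsafe(32)
            chain.previous, chain.current, chain.rotated_at = token, new_token, now
            self._owner[new_token] = session_id
            return new_token
        if token == chain.previous and now - chain.rotated_at <= GRACE_SECONDS:
            # Retry of the most recent rotation (slow network): same answer.
            return chain.current
        # Any other known token is a replay: invalidate the whole chain.
        chain.revoked = True
        raise RefreshRejected("reuse detected; session chain invalidated")
```

After reuse is detected, both the attacker's and the victim's tokens stop working, which matches the "simultaneous attacker + victim lockout" note in the table.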
Key management
- QUEUE_JWT_SIGNING_KEY (RSA private key, PEM or KMS key ARN) → SSM at /raxx/queue/jwt-signing-key
- QUEUE_JWT_PUBLIC_KEY (RSA public key, PEM) → env var on Raptor, Console, and all consumers
- Rotation: new key pair generated in SSM; both old and new public keys accepted by Raptor for a 5-minute overlap window (dual-accept via QUEUE_JWT_PUBLIC_KEY_PREV)
- kid header: static identifier (queue-v1 or rotation index); Raptor matches to one of two pre-loaded keys; no dynamic key fetching, no JWKS URL resolution
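Static kid matching against two pre-loaded keys might look like the sketch below. QUEUE_JWT_PUBLIC_KEY and QUEUE_JWT_PUBLIC_KEY_PREV come from this ADR. The variables QUEUE_JWT_KID and QUEUE_JWT_KID_PREV, and the names `load_key_ring`, `select_public_key`, and `UnknownKeyId`, are assumptions introduced only for this example.

```python
from typing import Any, Dict, Mapping


class UnknownKeyId(Exception):
    """Hypothetical error type: the token's kid matches no pre-loaded key."""


def load_key_ring(env: Mapping[str, str]) -> Dict[str, str]:
    """Build the kid -> PEM mapping once at startup from environment values.

    QUEUE_JWT_KID and QUEUE_JWT_KID_PREV are hypothetical variable names.
    """
    ring = {env["QUEUE_JWT_KID"]: env["QUEUE_JWT_PUBLIC_KEY"]}
    prev_pem = env.get("QUEUE_JWT_PUBLIC_KEY_PREV")
    if prev_pem:
        # Only present during the dual-accept rotation window.
        ring[env["QUEUE_JWT_KID_PREV"]] = prev_pem
    return ring


def select_public_key(kid: Any, ring: Mapping[str, str]) -> str:
    """Return the PEM for an exact kid match.

    The kid is only ever compared against in-memory dict keys. It is never
    used as a file path, URL, or database query.
    """
    if not isinstance(kid, str) or kid not in ring:
        raise UnknownKeyId("unknown kid")
    return ring[kid]
```

Because the lookup is a dictionary membership test on at most two entries, an attacker-controlled kid cannot steer Raptor toward any key that was not loaded at startup.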
Revocation strategy
Primary: wait-for-expiry. The access token (AT) expires in ≤15 minutes. This is paired with immediate refresh-token (RT) invalidation on logout: an attacker cannot refresh, so the exposure window is bounded by the AT's remaining TTL at logout time.
Incident response: targeted Redis blocklist via FLAG_QUEUE_INSTANT_REVOKE. When enabled, Queue writes the revoked session_id to a Redis sorted set with TTL = remaining AT lifetime. Raptor checks this set on each request. Off by default. Enabled per-incident by operator or automated anomaly system. Redis must be in the same availability zone as Raptor when enabled.
This hybrid satisfies ADR-0068's resilience constraint (Redis is not on the default hot path; Raptor continues to verify offline when the flag is off) while providing a sub-second revocation path for incident response.
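The flag-gated blocklist can be modelled with a small in-memory class, shown below as a sketch rather than the production design. `InstantRevokeList` is a hypothetical name and the dictionary stands in for Redis. It assumes each entry stores the access token's expiry timestamp (in a Redis sorted set this would typically be the member's score, since individual members do not carry their own TTL) and that expired entries are pruned on read.

```python
from typing import Dict


class InstantRevokeList:
    """In-memory stand-in for the Redis-backed incident blocklist."""

    def __init__(self, enabled: bool = False) -> None:
        # Mirrors FLAG_QUEUE_INSTANT_REVOKE; off by default.
        self.enabled = enabled
        self._expires_at: Dict[str, int] = {}  # session_id -> AT expiry (Unix seconds)

    def revoke(self, session_id: str, at_exp: int, now: int) -> None:
        """Block a session until its outstanding access token would expire."""
        if not self.enabled or at_exp <= now:
            return  # flag off, or nothing left to block
        current = self._expires_at.get(session_id, 0)
        self._expires_at[session_id] = max(current, at_exp)

    def is_revoked(self, session_id: str, now: int) -> bool:
        """Per-request check; a no-op when the flag is off."""
        if not self.enabled:
            return False  # default hot path: no blocklist lookup at all
        # Drop entries whose access tokens have expired anyway.
        for sid in [s for s, exp in self._expires_at.items() if exp <= now]:
            del self._expires_at[sid]
        return session_id in self._expires_at
```

With the flag off, `is_revoked` returns immediately, so the default request path stays fully offline as the ADR requires.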
Pen-test threats addressed
| Threat | Mitigation |
|---|---|
| alg:none | algorithms=["RS256"] hardcoded; reject all others |
| RS256 → HS256 confusion | Key loaded as RSA key object; type enforcement prevents HMAC reuse |
| JWKS poisoning / JWK header injection | Pinned key in env; no URL-based key fetching |
| JWT replay within TTL | 15-min TTL; aud scoping; optional Redis blocklist for incidents |
| kid injection | Static kid matching; no DB lookup or filesystem read |
| Weak signing key | KMS-backed RSA >= 2048 bit; private key never on dyno |
| RT theft | Rotation-on-every-use + reuse detection (full chain invalidation) |
For step-up-gated routes (live trading, credential changes, GDPR erasure), Raptor calls Queue's POST /api/v1/auth/sessions/step-up to get a fresh JWT with an updated fresh_until. This is low-frequency and the latency cost is acceptable.
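The freshness decision on a step-up-gated route reduces to comparing fresh_until with the current time. The sketch below shows one way Raptor could make that decision before calling Queue. `needs_step_up` is a hypothetical helper, and treating a missing or unparseable fresh_until as "step-up required" is an assumption of this example, not something the ADR states.

```python
from datetime import datetime, timezone
from typing import Any, Mapping


def needs_step_up(claims: Mapping[str, Any], now: datetime) -> bool:
    """Return True when the session must re-authenticate for a gated route.

    `now` must be a timezone-aware datetime.
    """
    raw = claims.get("fresh_until")
    if not isinstance(raw, str):
        return True  # assumption: absent claim means not fresh
    try:
        # Older Python versions do not parse a trailing "Z"; normalise it.
        fresh_until = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return True
    if fresh_until.tzinfo is None:
        return True  # refuse timestamps without an explicit offset
    return now >= fresh_until
```

Only when this returns True would Raptor call POST /api/v1/auth/sessions/step-up, which keeps the extra network hop off ordinary requests.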
Consequences
Positive:
- Zero cross-service latency on the hot path. Verification is a local crypto operation (~0.1ms vs 10-15ms for a network call).
- Queue downtime does not block requests with valid in-flight JWTs (up to 15-minute grace window).
- RT rotation provides theft detection without a per-request blocklist check.
- alg allowlist + pinned keys eliminate the three most commonly exploited JWT attack classes.

Negative:
- Revocation window: a revoked session remains technically valid for up to 15 minutes (AT lifetime). Acceptable for v1 at stated scale; incident-response path (Redis blocklist) available when needed.
- Signing key rotation requires a transient dual-accept period. Coordinated deployment required.
- 7-day RT TTL (recommended) requires weekly re-auth for inactive users. Passkey experience is fast; UX cost is low.
Alternatives Considered
Option A (opaque token, online verification): Adds 10-15ms to every authenticated Raptor request. Creates a hard dependency on Queue availability for every request — compounding the failure surface. Opaque tokens with instant revocation are the correct choice once Queue runs as a standalone service with a Redis-backed session store; deferred to Phase 4.
Redis blocklist on every request: Adds Redis as a hard dependency to every Raptor request. Contradicts ADR-0068 (fail-closed posture): if Redis is down and Raptor fails closed, Redis downtime equals full Raptor downtime. If Raptor fails open when Redis is down, the revocation control is defeated. Neither is acceptable for v1.
30-day RT TTL (operator working value): Defensible for native mobile with keychain storage. Not recommended as default for browser-based SPA (Antlers) given the larger RT theft blast radius. Compensating control: rotation-on-every-use is mandatory if 30-day TTL is retained.