Raxx · internal docs

internal · gated

ADR-0067 — Queue: Signed JWT for Session Tokens (Offline Verification)

Date: 2026-05-09 UTC
Status: Accepted; values locked 2026-05-09 UTC by security-agent research
Deciders: software-architect (design), security-agent (TTL + rotation + revocation values)
Context: docs/architecture/queue/design.md §Session Token Strategy
Research: docs/architecture/queue/jwt-ttl-research-2026-05-09.md


Context

Every authenticated Raptor request must verify a session token issued by Queue. Two models:

Option A — Opaque token, online verification: Queue issues a random opaque token stored in queue_sessions. Raptor calls GET /api/v1/sessions/current/status on Queue for every request to validate the token. Revocation is instant.

Option B — Signed JWT, offline verification: Queue issues a short-lived RS256 JWT embedding customer_id, session_id, roles, paper_first_gate, fresh_until. Raptor verifies the signature offline using QUEUE_JWT_PUBLIC_KEY. No Queue call per request. Revocation window = JWT TTL (max 15 minutes).


Decision

Option B — Signed JWT (RS256, offline verification).

Access-token values

| Parameter | Value | Source |
|---|---|---|
| Algorithm | RS256 only; algorithms allowlist hardcoded | OWASP JWT Cheat Sheet; alg:none prevention |
| TTL | 15 minutes | Security-agent research 2026-05-09; OWASP, Schwab Trader API, industry consensus |
| aud claim | raxx-raptor-v1 (pinned) | Prevents cross-service replay |
| jti claim | UUID per token | Required for optional incident blocklist |
| Clock skew tolerance | 30 seconds (leeway) | OWASP max recommendation |

JWT claims:

{
  "sub": "<customer_id>",
  "sid": "<session_id>",
  "jti": "<uuid>",
  "aud": "raxx-raptor-v1",
  "iss": "raxx-queue-v1",
  "tier": "free|founders|pro",
  "roles": ["antlers-user"],
  "paper_first_gate": false,
  "fresh_until": "<iso8601_utc>",
  "iat": <unix_ts>,
  "exp": <unix_ts>
}
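
As a minimal sketch of how the access-token values above could be enforced on the Raptor side, the following checks an already signature-verified claim set. RS256 signature verification is deliberately not shown. The function name `validate_claims`, the required-claim list, the lifetime cap check, and the future-`iat` check are illustrative assumptions, not part of the ADR.

```python
import time

EXPECTED_AUD = "raxx-raptor-v1"   # pinned audience from the table above
EXPECTED_ISS = "raxx-queue-v1"    # issuer from the claims example
LEEWAY_SECONDS = 30               # clock skew tolerance
MAX_TTL_SECONDS = 15 * 60         # access-token TTL

# Assumption: these claims are treated as mandatory for every access token.
REQUIRED_CLAIMS = ("sub", "sid", "jti", "aud", "iss", "iat", "exp")


def validate_claims(claims, now=None):
    """Return (ok, reason) for a claim set whose signature was already verified."""
    now = time.time() if now is None else now
    for name in REQUIRED_CLAIMS:
        if name not in claims:
            return False, f"missing claim: {name}"
    if claims["aud"] != EXPECTED_AUD:
        return False, "wrong audience"
    if claims["iss"] != EXPECTED_ISS:
        return False, "wrong issuer"
    # Assumption: reject tokens minted with a lifetime longer than the 15-minute TTL.
    if claims["exp"] - claims["iat"] > MAX_TTL_SECONDS:
        return False, "lifetime exceeds 15 minutes"
    if now > claims["exp"] + LEEWAY_SECONDS:
        return False, "expired"
    # Assumption: a token issued further in the future than the leeway is suspect.
    if claims["iat"] > now + LEEWAY_SECONDS:
        return False, "issued in the future"
    return True, "ok"
```

Passing `now` explicitly keeps the check deterministic in tests; in production the default wall-clock value would be used.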

Refresh-token values

| Parameter | Value | Operator working value | Notes |
|---|---|---|---|
| TTL | 7 days | 30 days | Counter-recommended; see research doc §2 |
| Rotation | On every use | (not specified) | RFC 6819 §5.2.2.3; Okta SPA default |
| Rotation grace period | 30 seconds | (not specified) | Allows mobile/slow-network retries |
| Reuse detection | Full session-chain invalidation | (not specified) | Simultaneous attacker + victim lockout |
| Storage (browser) | HttpOnly, Secure, SameSite=Strict cookie | (not specified) | RT never readable by Antlers JS |
| Storage (iOS, future) | iOS Keychain (platform WebAuthn layer) | (not specified) | To be confirmed when iOS auth work begins |

If the operator retains the 30-day refresh-token (RT) TTL, rotation-on-every-use is mandatory (not optional), and the ADR must explicitly document the 23-day residual risk window (30 days minus the recommended 7).
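
The rotation, grace-period, and reuse-detection rows above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Queue's implementation: `RefreshStore` and its methods are hypothetical names, returning the same successor to a retry inside the grace window is one possible reading of "allows mobile/slow-network retries", and RT expiry (7 or 30 days) is left out for brevity.

```python
import secrets
import time

GRACE_SECONDS = 30  # rotation grace period from the table above


class RefreshStore:
    """Hypothetical in-memory stand-in for the refresh-token store."""

    def __init__(self):
        self._tokens = {}               # token -> record dict
        self._revoked_sessions = set()  # sessions whose whole chain is invalid

    def issue(self, session_id):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {"session": session_id, "used_at": None, "successor": None}
        return token

    def rotate(self, token, now=None):
        """Exchange a refresh token for its successor, or return None on rejection."""
        now = time.time() if now is None else now
        record = self._tokens.get(token)
        if record is None or record["session"] in self._revoked_sessions:
            return None
        if record["used_at"] is None:
            # First use: mark this token as spent and mint its successor.
            record["used_at"] = now
            record["successor"] = self.issue(record["session"])
            return record["successor"]
        if now - record["used_at"] <= GRACE_SECONDS:
            # Retry inside the grace window: hand back the same successor.
            return record["successor"]
        # Reuse after the grace window: treat as theft and invalidate the chain.
        self._revoked_sessions.add(record["session"])
        return None
```

Once a spent token is replayed outside the grace window, every token in that session chain is refused, which locks out both the attacker and the legitimate client until re-authentication.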

Key management

Revocation strategy

Primary: wait-for-expiry. The access token (AT) expires in ≤15 minutes. This is paired with immediate refresh-token (RT) invalidation on logout: the attacker cannot refresh, so the exposure window is bounded by the AT's remaining TTL at logout time.

Incident response: targeted Redis blocklist via FLAG_QUEUE_INSTANT_REVOKE. When enabled, Queue writes the revoked session_id to a Redis sorted set with TTL = remaining AT lifetime. Raptor checks this set on each request. Off by default. Enabled per-incident by operator or automated anomaly system. Redis must be in the same availability zone as Raptor when enabled.

This hybrid satisfies ADR-0068's resilience constraint (Redis is not on the default hot path; Raptor continues to verify offline when the flag is off) while providing a sub-second revocation path for incident response.
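
To make the flag-gated check concrete, here is a minimal in-memory model of the Raptor-side lookup. It does not use Redis; the class `RevocationBlocklist` and its methods are hypothetical, and the `enabled` attribute merely stands in for FLAG_QUEUE_INSTANT_REVOKE. Each entry is kept only until the access token it targets would have expired anyway.

```python
import time


class RevocationBlocklist:
    """Hypothetical in-memory model of the incident blocklist (Redis in the ADR)."""

    def __init__(self, enabled=False):
        self.enabled = enabled      # models FLAG_QUEUE_INSTANT_REVOKE (off by default)
        self._expires_at = {}       # session_id -> AT expiry (unix seconds)

    def revoke(self, session_id, at_exp):
        # The entry only needs to live for the remaining access-token lifetime.
        self._expires_at[session_id] = at_exp

    def is_revoked(self, session_id, now=None):
        if not self.enabled:
            return False            # flag off: no lookup on the hot path
        now = time.time() if now is None else now
        exp = self._expires_at.get(session_id)
        if exp is None:
            return False
        if now >= exp:
            # The token has expired on its own; the entry is no longer needed.
            del self._expires_at[session_id]
            return False
        return True
```

With the flag off, `is_revoked` returns immediately without touching the store, which mirrors the requirement that the blocklist stay off the default hot path.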

Pen-test threats addressed

| Threat | Mitigation |
|---|---|
| alg:none | algorithms=["RS256"] hardcoded; reject all others |
| RS256 → HS256 confusion | Key loaded as RSA key object; type enforcement prevents HMAC reuse |
| JWKS poisoning / JWK header injection | Pinned key in env; no URL-based key fetching |
| JWT replay within TTL | 15-min TTL; aud scoping; optional Redis blocklist for incidents |
| kid injection | Static kid matching; no DB lookup or filesystem read |
| Weak signing key | KMS-backed RSA >= 2048 bit; private key never on dyno |
| RT theft | Rotation-on-every-use + reuse detection (full chain invalidation) |
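
Several of the header-level mitigations above can be sketched as a pre-check that runs before any signature verification. This is an illustration only: `check_header` is a hypothetical helper, the key id `queue-2026-05` is invented because the ADR does not name one, and refusing the `jwk`, `jku`, and `x5u` headers outright is an assumption about how "no URL-based key fetching" might be enforced.

```python
import base64
import json

ALLOWED_ALGS = frozenset({"RS256"})          # hardcoded allowlist
PINNED_KID = "queue-2026-05"                 # hypothetical key id, not from the ADR
FORBIDDEN_HEADERS = ("jwk", "jku", "x5u")    # assumption: refuse header-supplied keys


def check_header(token):
    """Return the decoded JOSE header if acceptable; raise ValueError otherwise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWS")
    segment = parts[0]
    padded = segment + "=" * (-len(segment) % 4)   # restore stripped base64 padding
    try:
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError as exc:
        raise ValueError("malformed header") from exc
    if not isinstance(header, dict):
        raise ValueError("malformed header")
    alg = header.get("alg")
    if not isinstance(alg, str) or alg not in ALLOWED_ALGS:
        raise ValueError("algorithm not allowed")
    # Static comparison only: the kid is never used as a DB key or file path.
    if header.get("kid") != PINNED_KID:
        raise ValueError("unknown kid")
    if any(name in header for name in FORBIDDEN_HEADERS):
        raise ValueError("header-supplied key material rejected")
    return header
```

The check compares `kid` against a constant rather than looking it up anywhere, so a crafted value such as a file path or SQL fragment has nothing to act on.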

For step-up-gated routes (live trading, credential changes, GDPR erasure), Raptor calls Queue's POST /api/v1/auth/sessions/step-up to get a fresh JWT with an updated fresh_until. This is low-frequency and the latency cost is acceptable.
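
A route guard for the step-up flow might look like the sketch below. It assumes `fresh_until` marks the moment after which a step-up is required again, which the ADR implies but does not state outright; `needs_step_up` is a hypothetical helper, and treating a missing or malformed claim as stale is an assumption in the fail-closed direction.

```python
from datetime import datetime, timezone


def needs_step_up(claims, now=None):
    """True when fresh_until is missing, malformed, or not in the future."""
    now = datetime.now(timezone.utc) if now is None else now
    raw = claims.get("fresh_until")
    if not isinstance(raw, str):
        return True
    try:
        # Normalise a trailing "Z" for Python versions whose fromisoformat rejects it.
        fresh_until = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return True
    if fresh_until.tzinfo is None:
        # The claim is documented as UTC, so a naive value is read as UTC.
        fresh_until = fresh_until.replace(tzinfo=timezone.utc)
    return now >= fresh_until
```

When this returns True on a gated route, Raptor would call the step-up endpoint described above and retry with the fresh JWT.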


Consequences

Positive:
- Zero cross-service latency on the hot path. Verification is a local crypto operation (~0.1ms vs 10-15ms for a network call).
- Queue downtime does not block requests with valid in-flight JWTs (up to 15-minute grace window).
- RT rotation provides theft detection without a per-request blocklist check.
- alg allowlist + pinned keys eliminate the three most commonly exploited JWT attack classes (alg:none, RS256 → HS256 confusion, and JWKS poisoning / JWK header injection).

Negative:
- Revocation window: a revoked session remains technically valid for up to 15 minutes (AT lifetime). Acceptable for v1 at stated scale; incident-response path (Redis blocklist) available when needed.
- Signing key rotation requires a transient dual-accept period. Coordinated deployment required.
- 7-day RT TTL (recommended) requires weekly re-auth for inactive users. Passkey experience is fast; UX cost is low.


Alternatives Considered

Option A (opaque token, online verification): Adds 10-15ms to every authenticated Raptor request. Creates a hard dependency on Queue availability for every request — compounding the failure surface. Opaque tokens with instant revocation are the correct choice once Queue runs as a standalone service with a Redis-backed session store; deferred to Phase 4.

Redis blocklist on every request: Adds Redis as a hard dependency to every Raptor request, which contradicts ADR-0068 (fail-closed posture). If Redis is down and Raptor fails closed, Redis downtime equals full Raptor downtime. If Raptor fails open when Redis is down, the revocation control is defeated. Neither is acceptable for v1.

30-day RT TTL (operator working value): Defensible for native mobile with keychain storage. Not recommended as default for browser-based SPA (Antlers) given the larger RT theft blast radius. Compensating control: rotation-on-every-use is mandatory if 30-day TTL is retained.