ADR-0057: Break-Glass Grant — Time-Limited, Justification-Required, Alert-First
Date: 2026-05-09 UTC
Status: Accepted
Deciders: software-architect
Refs: docs/architecture/rbac-v2/design.md §6, ADR-0020, docs/architecture/rbac-design.md §12
Context
The break-glass group provides emergency access to all roles. ADR-0020 established that break-glass is a group with one member, not a special user type. Two questions remain:
- How is break-glass access time-bounded?
- What accountability mechanism exists for its use?
The prior design (rbac-design.md §12) proposed a 2-hour session cap. This ADR finalizes the design.
Decision
Break-glass grants are time-limited (1h default, 4h max), require a written justification (minimum 20 characters), require passkey re-authentication (WebAuthn step-up), and trigger a Slack + Postmark alert to ops@raxx.app before the session proceeds.
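The limits above can be sketched as a request validator. This is a minimal illustration, not the actual implementation: the names validate_grant_request and GrantRejected are hypothetical, and two details are assumptions this ADR does not specify: whitespace is stripped before the justification is counted, and the result of the WebAuthn step-up arrives as a boolean.

```python
from datetime import timedelta
from typing import Optional

DEFAULT_DURATION = timedelta(hours=1)   # 1h default (from this ADR)
MAX_DURATION = timedelta(hours=4)       # 4h maximum (from this ADR)
MIN_JUSTIFICATION_CHARS = 20            # minimum justification length (from this ADR)


class GrantRejected(ValueError):
    """Hypothetical error type for a refused break-glass request."""


def validate_grant_request(
    justification: str,
    requested: Optional[timedelta] = None,
    step_up_verified: bool = False,
) -> timedelta:
    """Return the duration to grant, or raise GrantRejected."""
    duration = DEFAULT_DURATION if requested is None else requested
    if duration <= timedelta(0) or duration > MAX_DURATION:
        raise GrantRejected("duration must be positive and at most 4 hours")
    # Assumption: surrounding whitespace does not count toward the minimum.
    if len(justification.strip()) < MIN_JUSTIFICATION_CHARS:
        raise GrantRejected("justification must be at least 20 characters")
    # Assumption: the WebAuthn step-up result is passed in as a boolean.
    if not step_up_verified:
        raise GrantRejected("passkey re-authentication is required")
    return duration
```

A request that omits the duration gets the 1-hour default. A request for more than 4 hours is refused outright, not silently clamped, so the operator sees the limit explicitly.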
The grant is stored in rbac_break_glass_sessions. Expiry is enforced server-side: any request made under a break-glass session that has expired receives a 403 and a prompt to re-justify and re-authenticate.
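A minimal sketch of the server-side expiry check follows. The function name break_glass_status and the expires_at parameter are hypothetical, since this ADR does not list the columns of rbac_break_glass_sessions. Treating the exact expiry instant as already expired is also an assumption.

```python
from datetime import datetime, timezone
from typing import Optional

HTTP_FORBIDDEN = 403


def break_glass_status(expires_at: datetime, now: Optional[datetime] = None) -> Optional[int]:
    """Return 403 if the break-glass grant has expired, else None.

    Both timestamps are expected to be timezone-aware (UTC).
    """
    current = now if now is not None else datetime.now(timezone.utc)
    # Assumption: the grant is no longer valid at the expiry instant itself.
    if current >= expires_at:
        return HTTP_FORBIDDEN
    return None
```

The check reads the clock on the server for every request, so a client cannot extend a grant by holding on to an old session token.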
The alert fires before the session is granted, not after. If the alert channel (Slack or Postmark) is unreachable, the grant is held until the alert succeeds or the 30-second timeout elapses. This may delay break-glass access when infrastructure is degraded. The tradeoff is accepted: accountability is not optional.
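The hold-until-alerted step could be sketched as below. This is an illustration under stated assumptions: the function name send_alerts_before_grant and the senders mapping are hypothetical, and no real Slack or Postmark client call is shown. The sketch only reports which channels delivered within the timeout. Whether one channel or both must succeed, and what happens to the grant once the timeout elapses, is left to the caller.

```python
import concurrent.futures as cf
from typing import Callable, Dict

ALERT_TIMEOUT_SECONDS = 30.0  # 30s timeout (from this ADR)


def send_alerts_before_grant(
    senders: Dict[str, Callable[[], None]],
    timeout: float = ALERT_TIMEOUT_SECONDS,
) -> Dict[str, bool]:
    """Run all alert senders in parallel and wait at most `timeout` seconds.

    Returns channel name -> True if that sender finished without raising
    before the timeout, False otherwise.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(senders)))
    futures = {pool.submit(send): name for name, send in senders.items()}
    done, not_done = cf.wait(futures, timeout=timeout)
    results = {futures[f]: f.exception() is None for f in done}
    results.update({futures[f]: False for f in not_done})
    # Do not block the caller on senders that are still running.
    pool.shutdown(wait=False, cancel_futures=True)
    return results
```

Running the senders in parallel bounds the total hold at one timeout. Sending to the channels one after another could hold the grant for up to a full timeout per channel.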
Self-grant prohibition applies. The operator requesting break-glass access must already be in the raxx-platform-admins group. An operator who is not in raxx-platform-admins cannot escalate to break-glass; they must contact a raxx-platform-admins member to grant it.
Consequences
Positive:
- Time-limiting prevents a forgotten break-glass session from becoming a persistent privilege.
- The step-up WebAuthn challenge creates a strong accountability signal: the session is tied to a physical authenticator interaction, timestamped.
- Alert-before-grant means the operator is always visibly accountable before accessing sensitive data.
- Justification creates a searchable record for post-incident review.
Negative:
- The 30-second alert timeout could delay break-glass access during a real outage where both the production environment and the alert channel are degraded simultaneously. Mitigation: the alert is sent to a separate Slack workspace (not the same infra hosting the alert) and via Postmark (separate delivery path). Two independent channels reduce the probability of simultaneous unavailability.
- The 4-hour maximum may be insufficient for extended incident response. The operator can re-request break-glass access after expiry; each re-request requires re-authentication and re-justification, creating an auditable renewal record.
Alternatives Considered
Permanent break-glass group membership
The operator is always in the break-glass group; no time limit.
Rejected: ADR-0020 established that superadmin is break-glass only and is never a working role. Permanent membership contradicts this invariant and leaves the operator with persistent elevated access that accumulates risk over time.
Session cap without DB table
Time-bound the session itself (shorter session TTL for break-glass logins) rather than storing a separate record.
Rejected: The session TTL cap is less granular and does not provide a searchable audit trail separate from the session log. The rbac_break_glass_sessions table is queryable by incident responders independently of the session store, which is valuable during an active incident.
See docs/architecture/rbac-v2/design.md §6 for the break-glass design.