ADR-0104: Customer-Facing Error Code Format
Date: 2026-05-20 UTC
Status: Accepted
Deciders: software-architect; pending operator confirmation
Related: customer-facing-error-codes-2026-05-20.md
Context
At v1 launch (2026-05-23 UTC), customers who hit errors in the app have no
quotable reference they can give to support@raxx.app. The system already
mints a request_id UUID on every request (via logging.py) and returns it
in error response JSON. It is never surfaced to the customer. Three options
were evaluated:
- Option A: surface the full
wfl_*workflow UUID (36 chars, flag-dependent) - Option B: define a
RAX-NNNinteger registry (requires a sprint to bootstrap 100+ error sites) - Option C: hybrid short code —
RAX-<DOMAIN>-<8-hex>derived from the last 8 hex characters of the already-mintedrequest_idUUID
Decision
Adopt Option C: RAX-<DOMAIN>-<8-hex> format.
The DOMAIN is a 3–5 letter category (AUTH, TRD, BCK, HST, MKT, ONB, BIL,
SYS). The 8-hex suffix is the last 8 characters of the existing request_id
UUID, providing ~1-in-4-billion collision resistance at current request volume.
The error_code field is added to all error responses alongside the existing
request_id field. Frontend error renderers surface error_code to the
customer with a support email hint.
Consequences
Positive:
- Human-quotable: a customer can read and type RAX-TRD-a3f91b2c in an email.
- No new database writes, no new flags, no migration.
- Deployable in one day per surface (two focused sub-cards).
- Support agent looks up the full request_id in the log drain using the
8-hex suffix as a suffix filter.
- Composable: when FLAG_TRACE_MIDDLEWARE is stable, the full workflow_id
can be included alongside without replacing the short code.
Negative:
- Less descriptive than an integer registry (e.g. RAX-042 = broker rejected).
This is accepted; the domain prefix provides enough triage signal for v1.
- If log drain retention is less than 72 hours (the Heroku default), the
correlation window is short. Mitigation: the trace_events row (when trace
middleware is on) carries the same request lineage for 7 years.
Alternatives Considered
Option A — surface wfl_* UUID directly
Rejected for v1 because: (a) 32–36 characters is not human-quotable over
email or a support chat; (b) requires FLAG_TRACE_MIDDLEWARE to be on,
adding a flag dependency to every error path; (c) UUIDs are not meaningful to
customers as error identifiers.
Post-launch, workflow_id can accompany the short code in error responses
without replacing it.
Option B — RAX-NNN integer registry
Rejected for v1 because: bootstrapping a registry for ~100 raw-exception passthrough sites is a sprint-sized task that cannot be completed before 2026-05-23 UTC. The registry also requires ongoing maintenance for every new error type. The hybrid format provides the human-quotability benefit with a fraction of the bootstrapping cost.
Option B remains available as a post-launch enhancement if the domain-prefix
format proves insufficient for support triage. An integer registry can be
layered on top of Option C (the error_code field can carry either format)
without a breaking change.