Customer-Facing Error Code Traceability — Audit and Recommendation
Status: Decision pending
Owner: software-architect
Date: 2026-05-20 UTC
Related ADR: 0104
Related designs: workflow-uuid-tracing.md, support-raxx-app.md
Parent card: #2619 (SC-D12 troubleshooting.md)
1. Context
T-3 days to v1 launch (2026-05-23 UTC). When a customer hits an error — in the
app, on the marketing site, or in the demo flow — the current system gives them
either a raw HTTP status code, a generic "Something went wrong" string, or
occasionally a raw Python exception message that may include internal details
(including vendor names). None of these are quotable to support@raxx.app.
This document audits the current state across all customer-facing surfaces, documents the gaps, and recommends a traceability scheme that is ready for v1 and survives into GA.
2. Invariants
The following non-negotiable constraints apply to this design:
- No stored credentials. Error codes and trace IDs must never encode or expose credential material, session tokens, or API keys.
- No vendor names in customer-facing copy. Error messages shown to users must not name Alpaca, SnapTrade, Alpha Vantage, or any other third-party service. This applies to error toasts, error boundaries, JSON `message` fields in API responses that bubble to the UI, and any support-quoted code.
- Audit trail for every state change that affects money, permissions, or data access. Error events that terminate a trade workflow are state changes; they must be traceable server-side.
- GDPR by default. Error context logged for support lookups must not contain raw PII beyond what is already retained per ADR-0003. Error codes themselves are not PII; trace IDs reference records that are.
- Error codes are rotatable artifacts. A code-to-description mapping lives in a config file, not hardcoded in templates. Support tooling reads the same mapping. The mapping must survive a support agent turnover without losing context.
3. Audit Findings by Surface
3.1 raxx.app (Antlers — authenticated app)
Error Boundary (ErrorBoundary.js)
- Renders: `<h2>Something went wrong.</h2>` plus a `<details>` block with `error.toString()` and the React component stack.
- The component stack and raw error string are rendered inline in production HTML. A crash in a component that handles broker API state could expose vendor names via `error.toString()`.
- No stable code or trace ID is shown. No Sentry event ID is surfaced.
- Gap: customer sees no quotable identifier. Support cannot correlate without the customer's browser console output.
PageStateCard (PageStateCard.js)
- Shows: `message || 'Something went wrong.'`
- The `message` prop originates from API error responses. When the backend returns `{"error": str(e)}`, the raw exception string is the message.
- No trace ID, no error code, no support contact hint.
- Gap: no correlation anchor for support.
TradeForm (TradeForm.js)
- On order failure: `Failed to place order: ${err.message || 'Unknown error'}`
- `err.message` is the raw network error or the API response `error` field.
- No Workflow ID surfaced to the customer.
- Gap: the most critical customer flow (placing an order) shows an untraced, potentially internal message.
TradingModeModal / TradingModeToggle
- On failure: `error.response?.data?.message || 'An error occurred while changing trading mode'`
- Passes through backend `message` field directly; the backend `set_trading_mode` error handler returns `{"error": str(e)}`, which can contain internal exception text.
- Gap: raw exception passthrough to the UI.
Header / Dashboard — broker connection status
- `alpaca_api_status` and `alpaca_api_message` are JSON keys returned by `/api/system/status`. The frontend renders `alpaca_api_message` directly in the header popover and the Dashboard status card.
- The key name `alpaca_api_status` is never shown directly to the customer, but the `alpaca_api_message` value (e.g. "Connected — paper account") is rendered verbatim. When an error occurs, the message comes from `trading_readiness.get("error")`, which can include broker exception text.
- Gap: vendor name leak vector via message passthrough. The `alpaca_api_message` label is internal-only, but the message value is rendered.
Settings page (Settings.js, line 550)
- Renders: `{systemStatus.alpaca_api_message || 'Broker API gateway'}`
- Same passthrough risk as above.
Backtesting page (Backtesting.js)
- Has a `friendlyError` function that sanitizes some raw error text, but falls through to `'Something went wrong. Check your inputs and try again.'`
- No trace ID in the displayed message.
- Gap: error is user-friendlier than most, but still unquotable to support.
HistoricalData API client (historicalDataAPI.js)
- Constructs: `` `Server error (${error.response.status}): ${error.response.data?.error || 'Unknown error'}` ``
- Returns raw backend error field including status code; this is displayed as the error message.
- Gap: status code is developer-facing, not user-friendly or quotable.
3.2 getraxx.com (marketing — CF-Access-gated)
WaitlistSection (WaitlistSection.js)
- On error: `'Something went wrong. Please try again in a moment.'`
- Hard-coded string, no code, no trace anchor.
- Impact is low (pre-launch gated surface), but the waitlist is customer-facing at v1.
- Gap: generic, no correlation possible.
3.3 demo.raxx.app (Demo flow)
- `DemoFlow.js` renders a `data-testid="demo-session-error"` block on session errors.
- Error state from API calls is surfaced via `proposalsError` / `fillError` props with no code or trace ID.
- Gap: no quotable anchor; demo is customer-facing before signup.
3.4 Backend API — systematic raw-exception passthrough
The pattern return jsonify({"error": str(e)}), 500 appears approximately 100
times across route files. This means:
- Any Python exception whose string representation includes a vendor name (e.g. `alpaca_trade_api.rest.APIError: order rejected`) is returned verbatim in the JSON `error` field.
- That `error` field is rendered directly by most frontend components.
- No error code, no obfuscation, no stable ID.
Files with highest raw-exception count: historical_data.py (13 instances),
backtest.py (13 instances), trading.py (7 instances), market_data.py
(6 instances).
One confirmed vendor name leak: trading.py line 136 returns the literal
string "Live trading mode requires valid trading credentials (ALPACA_API_KEY
and ALPACA_API_SECRET)." with specific env var names. This is rendered in
the mode-switch modal.
3.5 Sentry integration — current state
ErrorBoundary.js calls Sentry.captureException(error) when the Sentry SDK
is initialized. The Sentry event ID is available (Sentry.lastEventId()) but
is not surfaced to the user anywhere. The backend error handlers do not tag
request_id on Sentry events.
3.6 X-Request-ID — current state
The logging middleware (logging.py) mints a UUID request_id for every
request and sets it on g.request_id. The error handlers include
"request_id": _request_id() in the JSON response. The response header
X-Request-ID is set. The trace middleware also propagates X-Workflow-ID
when flags are enabled.
However:
- No frontend component reads X-Request-ID or request_id from error
responses to show the user.
- The request_id in the error JSON is invisible because frontend error
renderers show only error.message or data.message, not data.request_id.
- X-Workflow-ID is propagated via response header but never displayed.
4. Gap Summary
| Gap | Surface | Severity | Pre-launch-blocking |
|---|---|---|---|
| G1: ErrorBoundary leaks raw exception + component stack in production HTML | raxx.app | High | Yes |
| G2: str(e) passthrough; ~100 backend error returns expose internal exception text including vendor names | Raptor API | High | Yes |
| G3: No error code or trace ID surfaced to customer on any error | All surfaces | High | Yes |
| G4: alpaca_api_message / alpaca_api_status key name can carry broker exception text to rendered UI | raxx.app (Header, Dashboard, Settings) | Medium | Yes |
| G5: No error code shown in TradeForm (money-affecting flow) | raxx.app | High | Yes |
| G6: Sentry event ID available but not shown to customer | raxx.app | Medium | No (post-launch) |
| G7: WaitlistSection generic error — no code, no email hint | getraxx.com | Low | No |
| G8: Demo flow errors have no quotable anchor | demo.raxx.app | Low | No |
| G9: request_id in error JSON is never rendered by any frontend component | All surfaces | Medium | Yes (infra exists, just not wired) |
Pre-launch-blocking gaps: G1, G2, G3, G4, G5, G9 (6 gaps).
5. Option Analysis
Option A — Surface existing workflow UUID directly
Surface g.trace_workflow_id (format: wfl_<32-hex>) in error responses and
show it in the error UI. Support looks it up in trace_workflows.
Pros: no new infrastructure; existing trace_middleware already mints and stores workflow IDs.
Cons: wfl_a3f91b2c4d5e6f70... is 36 characters — not human-quotable over a
support email. Requires trace middleware to be flag-enabled and database-backed,
which is controlled by FLAG_WORKFLOW_TRACE_SCHEMA and FLAG_TRACE_MIDDLEWARE.
At v1 launch, these flags may or may not be on.
Option B — New RAX-NNN scheme
Define a registry of RAX-001 through RAX-NNN codes mapped to error classes.
Store the code on the Sentry tag and in the log line. Show it in the UI.
Pros: extremely human-quotable. Familiar pattern (Stripe, Twilio, GitHub).
Cons: maintenance burden — every new error type needs a registry entry. The registry becomes a source of truth that drifts. With 100+ raw exception passthrough sites, bootstrapping the registry is a sprint-sized task, not a day's work. Cannot ship in 2 days.
Option C — Hybrid: short domain prefix + truncated request ID
Format: RAX-<DOMAIN>-<8-hex> where DOMAIN is a 3–4 letter surface/category
code and the 8-hex is the last 8 characters of the request_id UUID already
minted by logging middleware.
Examples:
- RAX-TRD-a3f91b2c — trade domain error
- RAX-BCK-7e4d9f01 — backtest domain error
- RAX-AUTH-c8b21e44 — auth domain error
- RAX-SYS-000000ff — generic system error
Support looks up request_id ending in a3f91b2c in the log drain (Heroku
log search or Sentry). The full request_id UUID is already in every log line
via logging.py. The error JSON already contains request_id.
Pros:
- Human-quotable (16 to 17 characters, depending on the domain prefix, vs 36).
- Leverages existing request_id infrastructure — no new DB writes.
- Domain prefix tells support which subsystem to look in immediately.
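- The 8-hex suffix has 16^8 (about 4.3 billion) possible values, so any two given requests share a suffix with probability of roughly 1 in 4.3 billion. The chance of some collision within a log window grows with request volume; the domain prefix and the customer-reported time narrow any ambiguous match.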
- Rollout is 1–2 days: wire the code into the error JSON and frontend display.
- No flag dependency — request_id is always minted.
Cons: slightly more complex display than a pure integer code. Domain mapping adds a small maintenance surface.
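The collision figure in the Pros list can be sanity-checked with the birthday approximation. The sketch below assumes suffixes are uniformly distributed (which holds if request_id is a random version-4 UUID; that is an assumption about logging.py) and uses hypothetical request volumes rather than measured production traffic.

```python
import math

SUFFIX_SPACE = 16 ** 8  # 8 hex characters -> 4,294,967,296 possible suffixes


def collision_probability(n_requests: int, space: int = SUFFIX_SPACE) -> float:
    """Approximate chance that at least two of n_requests share a suffix.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    if n_requests < 2:
        return 0.0
    return -math.expm1(-n_requests * (n_requests - 1) / (2 * space))


if __name__ == "__main__":
    # Hypothetical request counts within one log-retention window.
    for n in (2, 10_000, 100_000):
        print(f"{n:>7} requests -> {collision_probability(n):.6f}")
```

Under these assumptions a single pair of requests collides about once in 4.3 billion, while a window of 100,000 requests has better than even odds of containing at least one shared suffix. A support lookup is therefore safer when it also filters by domain prefix and approximate time.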
Recommendation: Option C.
It is the only option implementable before the 2026-05-23 UTC launch. Options A
and B require either flag-dependent infrastructure (A) or a new registry sprint
(B). Option C ships with two focused sub-cards: one backend (compute and include
the error_code field in error responses) and one frontend (surface it in the
UI). It is also composable with Option A post-launch: once trace middleware is
confirmed on, the full wfl_* ID can be appended alongside the short code.
6. Design: Option C Implementation
6.1 Domain codes
| Domain | Prefix | Covers |
|---|---|---|
| auth | AUTH | WebAuthn, session, RBAC |
| trading | TRD | order placement, mode switches, positions |
| backtest | BCK | backtest run, comparison, export |
| historical data | HST | data fetch, source queries |
| market data | MKT | quote, price feed |
| onboarding | ONB | wizard, account setup |
| billing / DSR | BIL | subscriptions, erasure requests |
| system | SYS | all other / generic |
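Section 2 requires the code mapping to live in a config file that support tooling can also read. One way the domain table above might be stored and validated is sketched here; the JSON layout, the constant name, and both function names are assumptions for illustration, not an existing file or module in the repository.

```python
import json
import re

# Hypothetical config contents mirroring the table above; in practice this
# text would be read from a shared file rather than embedded in code.
ERROR_DOMAINS_JSON = """
{
  "AUTH": "WebAuthn, session, RBAC",
  "TRD": "order placement, mode switches, positions",
  "BCK": "backtest run, comparison, export",
  "HST": "data fetch, source queries",
  "MKT": "quote, price feed",
  "ONB": "wizard, account setup",
  "BIL": "subscriptions, erasure requests",
  "SYS": "all other / generic"
}
"""

_PREFIX_RE = re.compile(r"^[A-Z]{3,4}$")


def load_domains(text: str) -> dict:
    """Parse the mapping and reject prefixes that break the 3-4 letter rule."""
    domains = json.loads(text)
    for prefix in domains:
        if not _PREFIX_RE.match(prefix):
            raise ValueError(f"invalid domain prefix: {prefix!r}")
    if "SYS" not in domains:
        raise ValueError("the generic SYS domain is required as a fallback")
    return domains


def describe_domain(prefix: str, domains: dict) -> str:
    """Return the human description, falling back to the SYS entry."""
    return domains.get(prefix, domains["SYS"])
```

Keeping the descriptions in data rather than in templates means the backend and the support tooling can load the same text.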
6.2 Backend: error_code field in all error responses
The _request_id() helper in error_handlers.py already has the full UUID.
Add a _error_code(domain) helper that returns RAX-<DOMAIN>-<last-8-hex>.
The domain is derived from the request path prefix using the same prefix map
used by trace_middleware._derive_action_type.
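A minimal sketch of the helper described above follows. It takes the request ID as an explicit argument so it can run outside a Flask request context, which differs from the `_error_code(domain)` signature named in the text. The path-prefix map is hypothetical: only `/api/trading/orders` and `/api/system/status` appear in this document, and the real map is said to be shared with `trace_middleware._derive_action_type`.

```python
# Hypothetical path-prefix map; the other prefixes are placeholders.
_DOMAIN_BY_PATH_PREFIX = {
    "/api/auth": "AUTH",
    "/api/trading": "TRD",
    "/api/backtest": "BCK",
}


def derive_domain(path: str) -> str:
    """Map a request path to a domain prefix, defaulting to SYS."""
    for prefix, domain in _DOMAIN_BY_PATH_PREFIX.items():
        # Match the prefix itself or a deeper path, but not "/api/tradingx".
        if path == prefix or path.startswith(prefix + "/"):
            return domain
    return "SYS"


def error_code(domain: str, request_id: str) -> str:
    """Return RAX-<DOMAIN>-<last 8 hex digits of the request ID>."""
    hex_digits = request_id.replace("-", "").lower()
    if len(hex_digits) < 8:
        raise ValueError("request_id is too short to derive a suffix")
    return f"RAX-{domain}-{hex_digits[-8:]}"


if __name__ == "__main__":
    rid = "3f2b8c1e-9d4a-4e6b-8f1a-0c2da3f91b2c"  # made-up example UUID
    print(error_code(derive_domain("/api/trading/orders"), rid))
```

Because the suffix is derived from the request ID alone, the same code can be reconstructed later from any log line that carries the full ID.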
The error handlers, route-level return jsonify({"error": str(e)}), 500, and
the Exception catch-all must all include error_code.
For the str(e) passthrough sites: the immediate fix is to replace
{"error": str(e)} with {"error": "An unexpected error occurred.", "error_code": <code>}.
This fixes two problems at once: raw exception text no longer leaks, and a
quotable code appears.
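The replacement for a `str(e)` passthrough site could look roughly like the sketch below. The function name, logger name, and log message format are assumptions; the point it illustrates is that the exception text goes to the server log only, while the response body carries the generic message and the code.

```python
import logging

logger = logging.getLogger("raptor.errors")  # hypothetical logger name

GENERIC_MESSAGE = "An unexpected error occurred."


def safe_error_body(exc: BaseException, request_id: str, domain: str = "SYS") -> dict:
    """Build a customer-safe error payload; the exception text stays server-side."""
    code = f"RAX-{domain}-{request_id.replace('-', '').lower()[-8:]}"
    # The raw message and traceback are logged, keyed by the same identifiers
    # the customer will quote, and are never copied into the response.
    logger.error(
        "unhandled error error_code=%s request_id=%s", code, request_id, exc_info=exc
    )
    return {"error": GENERIC_MESSAGE, "error_code": code, "request_id": request_id}
```

In a route handler this would presumably replace the existing return with something like `jsonify(safe_error_body(e, g.request_id, "TRD")), 500`, assuming `g.request_id` is populated as section 3.6 describes.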
For the vendor name leak at trading.py:136: replace with
"Your trading account could not be connected. Check your credentials in Settings.".
For alpaca_api_status / alpaca_api_message response keys: rename to
broker_connection_status / broker_connection_message and ensure the
message value is sanitized through the same redaction list used by
logging.py (_SENSITIVE_PATTERNS). A separate broker-name redaction pass
(strip known vendor names) should run on this field before it is serialized.
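The key rename and the broker-name redaction pass might be combined as in this sketch. The pattern covers only the three vendors named in section 2, and the placeholder word, function names, and input shape are assumptions; it does not reproduce the actual `_SENSITIVE_PATTERNS` list, whose contents are not shown in this document.

```python
import re

# Vendor names listed in section 2; a production list would likely be longer.
# The trailing \w* also swallows derived tokens such as module or env var names.
_VENDOR_PATTERN = re.compile(
    r"(?:alpaca|snap[\s_-]?trade|alpha[\s_-]?vantage)\w*",
    re.IGNORECASE,
)


def redact_vendor_names(message: str, placeholder: str = "provider") -> str:
    """Replace any vendor-derived token with a neutral word."""
    return _VENDOR_PATTERN.sub(placeholder, message)


def to_public_status(raw: dict) -> dict:
    """Rename the vendor-specific keys and sanitize the rendered message."""
    return {
        "broker_connection_status": raw.get("alpaca_api_status"),
        "broker_connection_message": redact_vendor_names(
            raw.get("alpaca_api_message") or ""
        ),
    }
```

The pattern deliberately over-matches: redacting an innocent word is a cosmetic problem, whereas missing a vendor name violates the invariant.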
6.3 Frontend: error code display
The ErrorBoundary must:
1. Remove the inline component stack from the production render (expose it
only when process.env.NODE_ENV === 'development').
2. Show a support-reference line: "Reference code: RAX-SYS-XXXXXXXX — quote
this when contacting support@raxx.app."
3. Read the Sentry event ID if available and append it as a secondary reference.
PageStateCard error state must accept an optional errorCode prop and render
it below the message.
TradeForm must read error.response?.data?.error_code and show it alongside
the order failure message.
All API client wrappers that propagate error.message should be updated to
propagate error.response?.data?.error_code as a secondary field so callers
can display it.
6.4 Sequence: customer reports error to support
sequenceDiagram
participant C as Customer
participant A as Antlers (UI)
participant R as Raptor (API)
participant L as Log drain
participant S as Support agent
C->>A: action triggers error
A->>R: POST /api/trading/orders
R-->>A: 500 {"error":"An unexpected error occurred.","error_code":"RAX-TRD-a3f91b2c","request_id":"...full-uuid..."}
A-->>C: "Order failed. Reference code: RAX-TRD-a3f91b2c — quote this when contacting support@raxx.app"
C->>S: email support@raxx.app "I got RAX-TRD-a3f91b2c"
S->>L: search logs for request_id ending a3f91b2c
L-->>S: full request log line with user_id, path, traceback, duration_ms
S->>C: "Found your request — the order was rejected because..."
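The support-side lookup in the sequence above can be sketched as a small filter over exported log lines. This assumes the log drain can be exported as JSON lines with a `request_id` field, which this document does not confirm; the function names are illustrative only.

```python
import json
import re

_CODE_RE = re.compile(r"^RAX-([A-Z]{3,4})-([0-9A-F]{8})$", re.IGNORECASE)


def parse_error_code(code: str) -> tuple:
    """Split a quoted code into (domain, suffix), tolerating case and whitespace."""
    match = _CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid error code: {code!r}")
    return match.group(1).upper(), match.group(2).lower()


def find_matching_records(code: str, log_lines) -> list:
    """Return parsed log records whose request_id ends with the code's suffix."""
    _, suffix = parse_error_code(code)
    matches = []
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as router or platform output
        if not isinstance(record, dict):
            continue
        request_id = str(record.get("request_id", "")).replace("-", "").lower()
        if request_id.endswith(suffix):
            matches.append(record)
    return matches
```

If more than one record matches, the domain returned by `parse_error_code` and the time the customer reports give the agent two further filters.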
6.5 Relationship to workflow tracing
When FLAG_TRACE_MIDDLEWARE is on, g.trace_workflow_id is also available.
The error response can include workflow_id alongside error_code. The
frontend can then offer a two-level reference: the short human-readable code
for phone/email, and the full workflow ID for paste-into-ticket scenarios.
This composition does not require any changes to the trace schema.
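The composition described above amounts to one optional key in the response body. A minimal sketch, with a hypothetical function name and a made-up workflow ID:

```python
from typing import Optional


def build_error_body(
    error_code: str, request_id: str, workflow_id: Optional[str] = None
) -> dict:
    """Assemble the error payload, adding workflow_id only when one exists."""
    body = {
        "error": "An unexpected error occurred.",
        "error_code": error_code,
        "request_id": request_id,
    }
    # When trace middleware is off there is no workflow ID, so the key is
    # omitted rather than sent as null.
    if workflow_id:
        body["workflow_id"] = workflow_id
    return body


if __name__ == "__main__":
    print(build_error_body("RAX-TRD-a3f91b2c", "3f2b8c1e-9d4a-4e6b-8f1a-0c2da3f91b2c"))
    print(
        build_error_body(
            "RAX-TRD-a3f91b2c",
            "3f2b8c1e-9d4a-4e6b-8f1a-0c2da3f91b2c",
            workflow_id="wfl_" + "0" * 32,  # placeholder in the wfl_<32-hex> format
        )
    )
```

Inside a request, the caller would presumably pass something like `getattr(g, "trace_workflow_id", None)`, so the field appears only when the flag-gated middleware has set it.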
7. Migrations
No database migrations required for Option C. The request_id UUID is already
minted in-memory per request by logging.py. The error_code field is derived
from it at response time and is not stored (it is reconstructible from any log
line containing request_id).
Post-launch (ADR-0104 follow-up): when FLAG_TRACE_MIDDLEWARE is confirmed on
in production, wire workflow_id into the error response so the longer trace
lookup is also possible. This is additive, no migration needed.
8. Rollout Plan
| Phase | Gate | Description |
|---|---|---|
| Pre-launch (v1) | Deploy before 2026-05-23 UTC | Backend: error_code in all error responses; vendor-name sanitization on str(e) passthrough sites; trading.py credential message fixed. Frontend: ErrorBoundary hides stack in prod; shows error code + support email. TradeForm shows error code. |
| Post-launch (v1.1) | After soak | Add workflow_id to error responses when trace middleware is confirmed on. Update support tooling to accept both the short code and the full workflow ID as lookup keys. |
| Post-launch (v1.2) | After SC-D12 ships | Troubleshooting docs (SC-D12) reference the RAX-<DOMAIN>-<8-hex> format with examples. Support runbook documents log search procedure. |
9. Security Considerations
- PII: The `request_id` suffix (8 hex chars) is not PII. It identifies a request, not a person. The full `request_id` UUID in the log drain is internal-only and not shown to the customer.
- Retention: `request_id` is in the log drain. Heroku log retention applies (72 hours default; extended via a log drain to a retained store). For audit purposes, the `trace_events` row for the same request carries the same `request_id` via `g.trace_action_id`, queryable for 7 years.
- Credential exposure: the `error_code` field encodes no credential material. The domain prefix is a category label, not a route or user identifier.
- Vendor name sanitization: the `logging.py` `_SENSITIVE_PATTERNS` list should be extended to strip vendor names from any value that passes through the `broker_connection_message` field. Vendor names in exception strings from broker API calls (the `str(e)` passthrough sites) must be replaced with the generic message before the response is serialized.
- Audit trail: error events on money-affecting flows (`TRD` domain) that have a trace workflow ID will appear in `trace_events` as `act_*` events with `action_type: trade.submit`. The error code is not stored separately; the `request_id` suffix is sufficient for log correlation.
- Breach: if `error_code` values leak in a data dump, they reveal request volume and rough error distribution by domain. This is operational metadata and is not personal data.
10. Open Questions
None blocking the recommended pre-launch sub-cards. Post-launch questions are noted below for the operator's awareness but do not block v1.
OQ-1: Should workflow_id appear in error responses at v1 launch, or only
post-launch when FLAG_TRACE_MIDDLEWARE is confirmed stable? Recommendation:
include it conditionally (if g.trace_workflow_id: response["workflow_id"] = ...)
so it appears when the flag is on and is absent otherwise. No customer-facing
copy should reference it until the troubleshooting docs are updated.
OQ-2: The alpaca_api_status / alpaca_api_message key rename affects the
Header.js, Dashboard.js, and Settings.js frontend consumers. This is a
breaking change to the /api/system/status response shape. Should this be
shipped as part of the error-code sub-cards or deferred to a separate
cleanup card? Recommendation: ship with the error-code sub-cards since it
resolves a pre-launch-blocking vendor name leak.
11. Sub-Cards
| # | Title | Size | Blocking |
|---|---|---|---|
| SC-ERR-1 | Raptor: add error_code field to all error responses; sanitize str(e) passthrough; fix trading credential message | S | Yes (pre-launch) |
| SC-ERR-2 | Antlers: ErrorBoundary hides stack in prod, shows error code + support@ hint; TradeForm + PageStateCard surface error_code | S | Yes (pre-launch) |
| SC-ERR-3 | Raptor + Antlers: rename alpaca_api_* to broker_connection_* in /api/system/status + all frontend consumers | XS | Yes (pre-launch) |