ADR 0062 — Deny-List + Per-Action Allowlist for State-Diff PII in Audit Rows
Status: Accepted
Date: 2026-05-09 UTC
Deciders: operator (Kristerpher), software-architect
Refs: customer-audit-unified/design.md §4, ADR-0002, ADR-0058, docs/security/customer-audit-unified-threat-model.md §T-PEN-4
Context
The before_state and after_state JSONB columns in customer_audit_events record what changed when an action was taken. These fields can carry PII: a trade row contains a symbol and quantity (fine to store); a user row contains an email address and possibly tax-identifiable fields (must not be stored).
The v1 design applied a deny-list (strip known-sensitive keys) as the primary control, with a recommendation to consider a per-action allowlist. The security agent and operator have now locked the stricter model: both gates are required, and the per-action allowlist is mandatory and CI-enforced.
The risk of deny-list-only: a developer adds a new column to the users table (e.g., date_of_birth) without updating the deny-list. The new field silently flows into audit rows. The deny-list depends on developer discipline to stay current.
The risk of allowlist-only without a deny-list fallback: a developer forgets to register an allowlist for a new action namespace. The write is either rejected (if the allowlist is mandatory) or passes through without filtering (if the allowlist is advisory). A global deny-list as a second layer ensures that the most dangerous fields never appear in audit rows even when the allowlist is missing.
Decision
Two-gate model: global deny-list applied first; per-action allowlist applied second. Both mandatory.
Gate 1 — Global Deny-List
Applied unconditionally to all before_state and after_state JSONB payloads before INSERT. Any matching key is replaced with "<REDACTED>" and a Sentry WARNING is fired (key name only, not value). The deny-list is defined in backend_v2/api/services/audit_writer.py as a Python frozenset and is the single authoritative source.
AUDIT_GLOBAL_DENY_LIST = frozenset({
    "email", "password", "password_hash", "token", "secret", "api_key",
    "api_secret", "credential", "passkey", "passkey_id", "webauthn_credential_id",
    "seed", "otp", "mfa_secret", "totp_secret", "nonce", "private_key",
    "bank_account", "bank_routing", "account_number", "ssn", "tax_id",
    "dob", "date_of_birth", "card_number", "cvv", "event_hash", "prev_event_hash",
})
Any key that contains a deny-list entry as a substring, compared case-insensitively, is also redacted (e.g., "api_key_prefix" contains "api_key" and is therefore redacted).
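A minimal sketch of Gate 1 following the substring rule stated above. It assumes a flat payload (top-level keys only); how nested objects are handled is not specified in this ADR. The function name `apply_global_deny_list`, the returned hit list, and the shortened `DENY_LIST_SUBSET` are hypothetical and exist only for illustration. The Sentry call is left to the caller.

```python
from typing import Any

REDACTED = "<REDACTED>"

# Shortened subset of AUDIT_GLOBAL_DENY_LIST, for illustration only.
DENY_LIST_SUBSET = frozenset({"email", "api_key", "token", "ssn"})


def apply_global_deny_list(
    state: dict[str, Any],
    deny_list: frozenset[str] = DENY_LIST_SUBSET,
) -> tuple[dict[str, Any], list[str]]:
    """Return a redacted copy of `state` and the names of the redacted keys.

    A key is redacted when its lowercased form contains any deny-list entry,
    which covers exact matches as well as keys such as "api_key_prefix".
    Only top-level keys are inspected (an assumption of this sketch).
    """
    redacted: dict[str, Any] = {}
    hits: list[str] = []
    for key, value in state.items():
        lowered = key.lower()
        if any(entry in lowered for entry in deny_list):
            redacted[key] = REDACTED
            hits.append(key)  # key name only, never the value
        else:
            redacted[key] = value
    return redacted, hits


clean, hits = apply_global_deny_list(
    {"symbol": "AAPL", "Email": "a@example.com", "api_key_prefix": "pk_12"}
)
```

The `hits` list carries key names only, so it can be attached to the Sentry WARNING without leaking values.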
Gate 2 — Per-Action Allowlist
Each action namespace has a registered frozenset of permitted field names in AUDIT_ACTION_ALLOWLISTS in the same module. Fields not on the allowlist (and not caught by Gate 1) are replaced with "<REDACTED>". If an action namespace has no registered allowlist, the writer returns 422.
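Gate 2 could look roughly like the sketch below. The registry contents, the "trade.create" namespace, the exception class, and the function name are all hypothetical; the real AUDIT_ACTION_ALLOWLISTS entries are not listed in this ADR. The sketch assumes the HTTP layer translates the exception into the 422 response.

```python
from typing import Any

REDACTED = "<REDACTED>"

# Hypothetical registry entry, for illustration only.
EXAMPLE_ACTION_ALLOWLISTS: dict[str, frozenset[str]] = {
    "trade.create": frozenset({"symbol", "quantity", "side"}),
}


class UnregisteredActionError(Exception):
    """No allowlist is registered for the action; assumed to map to HTTP 422."""


def apply_action_allowlist(
    action: str,
    state: dict[str, Any],
    allowlists: dict[str, frozenset[str]] = EXAMPLE_ACTION_ALLOWLISTS,
) -> dict[str, Any]:
    """Redact every top-level field that is not on the action's allowlist."""
    try:
        allowed = allowlists[action]
    except KeyError:
        # Missing registration is a hard failure, never a pass-through.
        raise UnregisteredActionError(action) from None
    # Values already redacted by Gate 1 stay redacted either way.
    return {k: (v if k in allowed else REDACTED) for k, v in state.items()}


filtered = apply_action_allowlist(
    "trade.create", {"symbol": "AAPL", "quantity": 10, "notes": "call client"}
)
```

Failing closed on an unregistered namespace is what keeps a forgotten registration from turning into an unfiltered write.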
CI Enforcement
A lint job (scripts/ci/audit_action_lint.py) greps application source files for action= and "action": patterns in audit writer call sites. For each found action namespace, it checks that a corresponding entry exists in AUDIT_ACTION_ALLOWLISTS. Missing registration → pipeline failure. This prevents registration drift as new features ship.
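The core of such a lint check might be sketched as follows. This is not the contents of scripts/ci/audit_action_lint.py; the regular expression, the function names, and the assumption that action namespaces appear as string literals are all illustrative. A real script would also read files from disk and exit non-zero when anything is missing.

```python
import re

# Matches action="ns.name" keyword arguments and "action": "ns.name" dict entries.
# Assumes the namespace is a plain string literal at the call site.
ACTION_PATTERN = re.compile(
    r"""(?:\baction\s*=|["']action["']\s*:)\s*["']([A-Za-z0-9_.:-]+)["']"""
)


def find_actions(source: str) -> set[str]:
    """Return every action namespace literal found in one source file."""
    return set(ACTION_PATTERN.findall(source))


def missing_registrations(
    sources: dict[str, str], registered: set[str]
) -> dict[str, set[str]]:
    """Map each file path to the action namespaces it uses but never registers."""
    missing: dict[str, set[str]] = {}
    for path, text in sources.items():
        unregistered = find_actions(text) - registered
        if unregistered:
            missing[path] = unregistered
    return missing


example_sources = {
    "api/trades.py": 'write_audit(action="trade.create", after_state=row)',
    "api/users.py": 'event = {"action": "user.update"}',
}
report = missing_registrations(example_sources, registered={"trade.create"})
```

A text-based scan of this kind will miss namespaces built dynamically (for example via string formatting), which is a known limit of grep-style linting.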
Sentry escalation for repeated deny-list hits
If the same code path triggers a deny-list hit more than 3 times in a 24-hour window, the Sentry event escalates from WARNING to ERROR. This surfaces systematic registration failures (e.g., a new field added to the users table that keeps flowing into audit rows without registration).
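One way to sketch the escalation rule is a per-code-path sliding window. The class below is hypothetical and keeps its counters in process memory, which is an assumption; a multi-worker deployment would likely need a shared store. It only decides the severity and leaves the actual Sentry call to the caller.

```python
import time
from collections import defaultdict, deque
from typing import Optional

WINDOW_SECONDS = 24 * 60 * 60
ESCALATION_THRESHOLD = 3  # more than 3 hits inside the window escalates


class DenyListHitTracker:
    """Tracks deny-list hits per code path over a sliding 24-hour window."""

    def __init__(self) -> None:
        self._hits: dict[str, deque] = defaultdict(deque)

    def record(self, code_path: str, now: Optional[float] = None) -> str:
        """Record one hit and return the severity to report: "warning" or "error"."""
        now = time.time() if now is None else now
        hits = self._hits[code_path]
        hits.append(now)
        # Drop hits that have aged out of the window.
        while hits and hits[0] <= now - WINDOW_SECONDS:
            hits.popleft()
        return "error" if len(hits) > ESCALATION_THRESHOLD else "warning"


tracker = DenyListHitTracker()
levels = [tracker.record("users.update", now=float(t)) for t in (0, 1, 2, 3)]
```

In this sketch the fourth hit within the window is the first to report at the higher severity, matching "more than 3 times".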
Consequences
Positive
- Defense in depth. A developer who forgets the allowlist still has the deny-list as a backstop. A developer who forgets the deny-list entry still gets blocked by the allowlist (the field is not registered → redacted).
- CI enforcement prevents registration drift. The allowlist stays current with the codebase.
- Sentry escalation surfaces systematic failures before they become audit integrity issues.
- The deny-list is explicit and auditable. A security reviewer can check it directly rather than scanning all JSONB write paths.
Negative
- Per-action allowlist registration is developer overhead. Every new action namespace requires a corresponding allowlist entry. CI enforcement ensures this is not forgotten, but it adds a step to the development workflow.
- Over-redaction risk: if a developer registers an allowlist that is too narrow, useful audit data is redacted. Mitigation: the CI lint can warn when an allowlist has fewer than 3 entries, since a very narrow allowlist may be incomplete.
- Deny-list matching is substring-based, so an overly broad entry could redact legitimate fields (e.g., "token_count" matching "token"). Mitigation: the deny-list uses exact key matching by default; substring matching is a secondary opt-in for known wildcard patterns (e.g., *_secret, *_token).
Alternatives Considered
Deny-list only (v1 recommendation)
Single deny-list, advisory per-action allowlist. Simpler developer experience; lower CI complexity.
Rejected by operator: deny-list depends on developer discipline to stay current. A new PII field added to a primary table will silently flow into audit rows until someone notices. The operator wants explicit registration with CI enforcement for all action namespaces.
Allowlist only (no global deny-list)
Each action namespace registers the complete set of permitted fields. No global deny-list.
Rejected: the allowlist alone provides no protection when the action namespace is not registered (new code, forgotten registration). The global deny-list is the last-resort backstop. Removing it creates a gap where a missing allowlist registration means all fields flow through unfiltered.
Schema-level JSONB constraints
Postgres CHECK constraints on before_state and after_state to prevent certain keys.
Rejected: Postgres JSONB CHECK constraints are not expressive enough to do key-by-key validation across variable schemas. Application-layer enforcement is the right layer for this.
Separate PII-scrubber service
A dedicated sidecar that scrubs PII from audit payloads before they reach the writer endpoint.
Rejected for v1: adds infrastructure complexity (deployment, availability dependency, latency). The writer service can perform the scrubbing in-process with negligible overhead at v1 scale. Revisit at 10K customers if the scrubber logic becomes complex enough to warrant isolation.