ADR 0062 — Deny-List + Per-Action Allowlist for State-Diff PII in Audit Rows

Status: Accepted
Date: 2026-05-09 UTC
Deciders: operator (Kristerpher), software-architect
Refs: customer-audit-unified/design.md §4, ADR-0002, ADR-0058, docs/security/customer-audit-unified-threat-model.md §T-PEN-4

Context

The before_state and after_state JSONB columns in customer_audit_events record what changed when an action was taken. These fields can carry PII: a trade row contains a symbol and quantity (fine to store); a user row contains an email address and possibly tax-identifiable fields (must not be stored).

The v1 design applied a deny-list (strip known-sensitive keys) as the primary control and only recommended considering a per-action allowlist. The security agent and operator have now locked in the stricter model: both gates are required, and the per-action allowlist is mandatory and CI-enforced.

The risk of deny-list-only: a developer adds a new column to the users table (e.g., date_of_birth) without updating the deny-list. The new field silently flows into audit rows. The deny-list depends on developer discipline to stay current.

The risk of allowlist-only without a deny-list fallback: a developer forgets to register an allowlist for a new action namespace. The write is either rejected (if the allowlist is mandatory) or passes through without filtering (if the allowlist is advisory). A global deny-list as a second layer ensures that the most dangerous fields never appear in audit rows even when the allowlist is missing.

Decision

Two-gate model: global deny-list applied first; per-action allowlist applied second. Both mandatory.

Gate 1 — Global Deny-List

Applied unconditionally to all before_state and after_state JSONB payloads before INSERT. Any matching key has its value replaced with "<REDACTED>", and a Sentry WARNING is emitted that carries the key name only, never the value. The deny-list is defined in backend_v2/api/services/audit_writer.py as a Python frozenset and is the single authoritative source.

AUDIT_GLOBAL_DENY_LIST = frozenset({
    "email", "password", "password_hash", "token", "secret", "api_key",
    "api_secret", "credential", "passkey", "passkey_id", "webauthn_credential_id",
    "seed", "otp", "mfa_secret", "totp_secret", "nonce", "private_key",
    "bank_account", "bank_routing", "account_number", "ssn", "tax_id",
    "dob", "date_of_birth", "card_number", "cvv", "event_hash", "prev_event_hash",
})

Any key that contains a deny-list entry as a substring, compared case-insensitively, is also redacted (e.g., "api_key_prefix" contains "api_key").
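
The matching rule above can be sketched as follows. This is a minimal illustration, not the contents of audit_writer.py: the helper names is_denied and apply_deny_list are hypothetical, the deny-list is abbreviated to a few entries from the full set, and the sketch assumes only top-level keys are inspected (the ADR does not say how nested objects are handled).

```python
from __future__ import annotations

from typing import Any

# Abbreviated copy for illustration; the authoritative set is AUDIT_GLOBAL_DENY_LIST.
DENY_LIST = frozenset({"email", "password", "token", "api_key", "ssn", "date_of_birth"})
REDACTED = "<REDACTED>"


def is_denied(key: str, deny_list: frozenset[str] = DENY_LIST) -> bool:
    """True if the key equals or contains any deny-list entry, ignoring case."""
    lowered = key.lower()
    return any(entry in lowered for entry in deny_list)


def apply_deny_list(
    state: dict[str, Any], deny_list: frozenset[str] = DENY_LIST
) -> tuple[dict[str, Any], list[str]]:
    """Return a redacted copy of state plus the key names that were hit.

    Only key names are returned, so a caller can report them without
    ever touching the sensitive values. Assumption: top-level keys only.
    """
    cleaned: dict[str, Any] = {}
    hits: list[str] = []
    for key, value in state.items():
        if is_denied(key, deny_list):
            cleaned[key] = REDACTED
            hits.append(key)
        else:
            cleaned[key] = value
    return cleaned, hits
```

Returning the hit list separately keeps the reporting step (the Sentry WARNING) outside the pure redaction function, which makes the gate easy to unit test.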

Gate 2 — Per-Action Allowlist

Each action namespace has a registered frozenset of permitted field names in AUDIT_ACTION_ALLOWLISTS in the same module. Fields not on the allowlist (and not already caught by Gate 1) are replaced with "<REDACTED>". If an action namespace has no registered allowlist, the writer rejects the write with HTTP 422.
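
A rough sketch of the second gate, under stated assumptions: the registry entry for "trade.create" and its field names are invented for illustration, UnregisteredActionError and apply_allowlist are hypothetical names, and the mapping of the exception to a 422 response is assumed to happen in the calling endpoint rather than here.

```python
from __future__ import annotations

from typing import Any

REDACTED = "<REDACTED>"

# Hypothetical registry contents; the real entries live in AUDIT_ACTION_ALLOWLISTS.
AUDIT_ACTION_ALLOWLISTS: dict[str, frozenset[str]] = {
    "trade.create": frozenset({"symbol", "quantity", "side"}),
}


class UnregisteredActionError(Exception):
    """Raised when an action has no allowlist; the endpoint would map this to 422."""


def apply_allowlist(action: str, state: dict[str, Any]) -> dict[str, Any]:
    """Keep values only for registered fields; redact everything else."""
    try:
        allowed = AUDIT_ACTION_ALLOWLISTS[action]
    except KeyError:
        # Fail closed: a missing registration must never mean "no filtering".
        raise UnregisteredActionError(action) from None
    return {
        key: (value if key in allowed else REDACTED)
        for key, value in state.items()
    }
```

Failing closed on a missing registration is what makes the allowlist mandatory rather than advisory: an unregistered action cannot write unfiltered state.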

CI Enforcement

A lint job (scripts/ci/audit_action_lint.py) greps application source files for action= and "action": patterns in audit writer call sites. For each action namespace it finds, it checks that a corresponding entry exists in AUDIT_ACTION_ALLOWLISTS. A missing registration fails the pipeline. This prevents registration drift as new features ship.
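
The core of such a lint could look roughly like the sketch below. It is not the contents of audit_action_lint.py: the regular expression, the function names, and the assumption that actions appear as string literals made of letters, digits, underscores, and dots are all illustrative. It also treats the full action string as the registry key, which the ADR does not specify.

```python
from __future__ import annotations

import re
from typing import Iterable

# Matches action="trade.create" (keyword argument) and "action": "trade.create"
# (dict literal). Assumes actions are written as plain string literals.
ACTION_PATTERN = re.compile(
    r"""(?:\baction\s*=\s*|["']action["']\s*:\s*)["']([A-Za-z0-9_.]+)["']"""
)


def find_actions(source: str) -> set[str]:
    """Collect every action literal found in one source file's text."""
    return set(ACTION_PATTERN.findall(source))


def missing_registrations(sources: Iterable[str], registered: Iterable[str]) -> set[str]:
    """Return actions used in the sources that have no allowlist entry."""
    used: set[str] = set()
    for source in sources:
        used |= find_actions(source)
    return used - set(registered)
```

A CI entry point would read the application files, call missing_registrations with the keys of AUDIT_ACTION_ALLOWLISTS, and exit non-zero if the result is non-empty. A regex scan will miss actions built dynamically at runtime, so the 422 rejection in the writer remains the enforcement of last resort.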

Sentry escalation for repeated deny-list hits

If the same code path triggers a deny-list hit more than 3 times in a 24-hour window, the Sentry event escalates from WARNING to ERROR. This surfaces systematic registration failures (e.g., a new field added to the users table that keeps flowing into audit rows without registration).
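
One way to picture the escalation rule is a sliding-window counter per code path. The sketch below is an assumption about mechanism, not the chosen implementation: DenyHitTracker is a hypothetical class, state is held in process memory (a multi-worker deployment would need shared storage or a Sentry-side alert rule), and the returned level string would be passed to whatever Sentry call the writer uses.

```python
from __future__ import annotations

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60  # 24-hour window from the rule above
THRESHOLD = 3                  # "more than 3 times" means the 4th hit escalates


class DenyHitTracker:
    """Tracks deny-list hits per code path and picks the Sentry level."""

    def __init__(self) -> None:
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def record(self, code_path: str, now: float | None = None) -> str:
        """Record one hit and return "warning" or "error" for this event."""
        now = time.time() if now is None else now
        hits = self._hits[code_path]
        hits.append(now)
        # Drop hits that have aged out of the 24-hour window.
        while hits and hits[0] <= now - WINDOW_SECONDS:
            hits.popleft()
        return "error" if len(hits) > THRESHOLD else "warning"
```

For example, four hits from the same code path within a day yield "warning" three times and then "error"; once the earlier hits age out, the level drops back to "warning".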

Consequences

Positive

Defense in depth. A developer who forgets the allowlist still has the deny-list as a backstop. A developer who forgets the deny-list entry still gets blocked by the allowlist (the field is not registered → redacted).
CI enforcement prevents registration drift. The allowlist stays current with the codebase.
Sentry escalation surfaces systematic failures before they become audit integrity issues.
The deny-list is explicit and auditable. A security reviewer can check it directly rather than scanning all JSONB write paths.

Negative

Per-action allowlist registration is developer overhead. Every new action namespace requires a corresponding allowlist entry. CI enforcement ensures this is not forgotten, but it adds a step to the development workflow.
Over-redaction risk: if a developer registers a too-narrow allowlist, useful audit data is redacted. Mitigation: the CI lint can warn when an allowlist has fewer than 3 entries (very narrow allowlists may be incomplete).
Substring matching on the deny-list can over-redact: a broad entry may catch legitimate fields (e.g., "token_count" matching "token"). Mitigation: the deny-list uses exact key matching by default; substring matching is a secondary opt-in for known prefix patterns (e.g., *_secret, *_token).
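
The mitigation in the last item (exact matching by default, with opt-in patterns) might be sketched like this. The names EXACT_DENY, OPT_IN_PATTERNS, and is_denied are hypothetical, the sets are abbreviated, and the use of shell-style wildcards via the standard-library fnmatch module is an assumption about how the patterns would be expressed.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Abbreviated, illustrative sets; not the authoritative configuration.
EXACT_DENY = frozenset({"email", "token", "api_key", "mfa_secret"})
OPT_IN_PATTERNS = ("*_secret", "*_token")


def is_denied(key: str) -> bool:
    """Exact match first; wildcard patterns apply only where opted in."""
    lowered = key.lower()  # case-insensitive comparison
    if lowered in EXACT_DENY:
        return True
    return any(fnmatchcase(lowered, pattern) for pattern in OPT_IN_PATTERNS)
```

Under this variant "token_count" is no longer redacted, while "refresh_token" and "client_secret" still are, which narrows over-redaction at the cost of maintaining the pattern list.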

Alternatives Considered

Deny-list only (v1 recommendation)

Single deny-list, advisory per-action allowlist. Simpler developer experience; lower CI complexity.

Rejected by operator: deny-list depends on developer discipline to stay current. A new PII field added to a primary table will silently flow into audit rows until someone notices. The operator wants explicit registration with CI enforcement for all action namespaces.

Allowlist only (no global deny-list)

Each action namespace registers the complete set of permitted fields. No global deny-list.

Rejected: the allowlist alone provides no protection when the action namespace is not registered (new code, forgotten registration). The global deny-list is the last-resort backstop. Removing it creates a gap where a missing allowlist registration means all fields flow through unfiltered.

Schema-level JSONB constraints

Postgres CHECK constraints on before_state and after_state to prevent certain keys.

Rejected: Postgres JSONB CHECK constraints are not expressive enough to do key-by-key validation across variable schemas. Application-layer enforcement is the right layer for this.

Separate PII-scrubber service

A dedicated sidecar that scrubs PII from audit payloads before they reach the writer endpoint.

Rejected for v1: adds infrastructure complexity (deployment, availability dependency, latency). The writer service can perform the scrubbing in-process with negligible overhead at v1 scale. Revisit at 10K customers if the scrubber logic becomes complex enough to warrant isolation.