ADR 0003 — GDPR by Default

Status: Accepted
Date: 2026-04-21
Deciders: product owner (user), software-architect
Related: ADR 0001, ADR 0002, docs/architecture/auth.md

Context

TradeMasterAPI will process personal data of EU users (email address, display name, IP prefix for audit, paper-trade history, device/passkey metadata). GDPR applies from the first EU user. Retrofitting compliance is expensive and often impossible; we design for it from day one.

The hard asks of GDPR (simplified): lawful basis, purpose limitation, data minimization, defined retention, accuracy (rectification), right of access + portability, right of erasure, breach notification within 72 hours, and maintainable records of processing.

This ADR records the design decisions that discharge each requirement.

Decision

GDPR compliance is built into the core — not a feature flag, not a plugin, not a separate "EU mode".

Data subject rights — implementation

Right	Endpoint / Mechanism	SLA
Access	`POST /api/gdpr/export`	Asynchronous; bundle available within 30 days (GDPR Art. 12 allows up to 1 month). We target 24h.
Portability	Same bundle: JSON + CSV machine-readable	—
Erasure ("right to be forgotten")	`POST /api/gdpr/erase`; requires fresh WebAuthn step-up	Soft-delete immediately; PII purge after 30-day cooling period; audit rows retained with pseudonymized actor id for the scheduled audit retention (2 years for security events, 7 years for trade-affecting events).
Rectification	`PATCH /api/gdpr/profile` (display_name + email-change flow with re-verification)	Immediate
Restriction	Account freeze: admin-only; user can request via email	Case-by-case
Objection	Handled at the point of processing (no marketing opt-outs because we do not do marketing)	—
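
As a rough illustration of the Access and Portability rows, the export bundle could package the same records as JSON and CSV. This is a minimal sketch under stated assumptions: the function name, the `profile` and `paper_trades` keys, the file names, and the zip container are all hypothetical and are not fixed by this ADR.

```python
import csv
import io
import json
import zipfile


def build_export_bundle(profile: dict, paper_trades: list[dict]) -> bytes:
    """Return a zip archive holding the user's data as JSON and CSV.

    Hypothetical layout: export.json carries everything; paper_trades.csv
    repeats the trade rows in tabular form for spreadsheet tools.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        payload = {"profile": profile, "paper_trades": paper_trades}
        zf.writestr("export.json", json.dumps(payload, indent=2, sort_keys=True))

        csv_buf = io.StringIO()
        # Use the union of keys so rows with missing fields still serialize.
        fieldnames = sorted({key for row in paper_trades for key in row})
        writer = csv.DictWriter(csv_buf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(paper_trades)
        zf.writestr("paper_trades.csv", csv_buf.getvalue())
    return buf.getvalue()
```

In an asynchronous flow, a worker would call something like this and store the bytes for later download; how the bundle is delivered is left to the implementation sub-card.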

Retention schedule

Data	Retention	Why
`users.email` (active)	Lifetime of account	Contact channel
`webauthn_credentials.*`	Lifetime of account	Required for auth
`sessions`	30 days after `expires_at` or `revoked_at`	Audit trail for anomalous-session investigation
`email_verifications` (consumed or expired)	30 days	Anti-replay + support
`audit_log` (security events)	2 years	DPA / regulatory requirement for financial-adjacent service
`audit_log` (trade-affecting events)	7 years	Brokerage regulatory norms — we align even though we are not the broker of record
Paper-trade history	3 years (proposed; open question in auth.md §10)	Product need balanced against minimization
Server logs	90 days	Ops
Breach-notification records	6 years	Art. 33(5) accountability

A background retention job (backend_v2/jobs/retention.py, to be created in an implementation sub-card) runs nightly, scans tables against this schedule, and deletes or pseudonymizes rows that have passed their retention period. The job writes a single audit_log row per run that summarizes counts and never contains per-record PII.
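
One way to keep the schedule and the job in sync is to express the schedule as data. The sketch below is illustrative only: `RetentionRule`, `RULES`, `run_retention`, the rule names, and the `retention.run` action string are hypothetical, years are approximated as 365 days, and the actual database work is delegated to a caller-supplied `apply_rule` callback.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionRule:
    name: str  # hypothetical label for a table or a slice of a table
    days: int  # retention period taken from the schedule above


# A subset of the schedule, for illustration.
RULES = [
    RetentionRule("sessions", 30),
    RetentionRule("email_verifications", 30),
    RetentionRule("audit_log:security", 2 * 365),
    RetentionRule("server_logs", 90),
]


def cutoff(rule: RetentionRule, now: datetime) -> datetime:
    """Rows whose retention clock started before this instant are expired."""
    return now - timedelta(days=rule.days)


def run_retention(rules, now, apply_rule) -> dict:
    """Apply every rule and return one summary record for the whole run.

    apply_rule(rule, cutoff) performs the delete or pseudonymize step and
    returns the number of rows affected. Only counts are kept, never row data.
    """
    counts = {rule.name: apply_rule(rule, cutoff(rule, now)) for rule in rules}
    return {"action": "retention.run", "ran_at": now.isoformat(), "counts": counts}
```

The returned dictionary is the shape of the single per-run audit_log row: rule names and counts, nothing that identifies a person.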

Erasure semantics

On POST /api/gdpr/erase (step-up verified):

users.deleted_at = now(), users.email = null, users.display_name = null.
All sessions revoked, all passkeys deleted (cascade).
A scheduled hard-purge at deleted_at + 30 days removes paper-trade history, email_verifications, and any JSON fields referencing the user.
audit_log rows keep the action row but replace actor_user_id with sha256(actor_user_id || per-user-salt) — a one-way pseudonym that preserves the record without identifying the person. The salt is stored encrypted and destroyed at the end of the 2-year audit retention, after which the pseudonym is irreversible even to us.
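
The pseudonym construction in the last step can be sketched as follows. The function names, the 32-byte salt length, and the UTF-8 encoding of the user id are assumptions made for illustration; encrypted storage and timed destruction of the salt are out of scope here.

```python
import hashlib
import secrets


def new_salt() -> bytes:
    """Generate a fresh per-user salt (32 random bytes, an assumed length)."""
    return secrets.token_bytes(32)


def pseudonymize_actor(actor_user_id: str, salt: bytes) -> str:
    """Return sha256(actor_user_id || salt) as a hex string."""
    return hashlib.sha256(actor_user_id.encode("utf-8") + salt).hexdigest()
```

The same user id and salt always yield the same pseudonym, so a user's audit rows stay linkable to each other. A keyed construction such as HMAC-SHA256 would be a common alternative; this sketch follows the concatenation stated above.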

The 30-day cooling period exists so a user who changes their mind (or whose account was erased after credential compromise) can request reinstatement. After 30 days the hard-purge runs and the data cannot be restored.

Audit logging

Every state-changing action writes an audit_log row. Redaction rules:

Email addresses never appear in context.
IP addresses are stored as /24 (IPv4) or /48 (IPv6) prefixes; full IP is discarded at ingress.
User agent is truncated to product + major version.
Request bodies are never logged; only parameter names when relevant.
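
The IP and user-agent rules above could be applied at ingress with helpers along these lines. This is a sketch only: the function names are hypothetical, and the user-agent helper keeps just the first product token, which is a deliberate simplification rather than real browser detection.

```python
import ipaddress
import re


def ip_prefix(raw_ip: str) -> str:
    """Reduce an address to its /24 (IPv4) or /48 (IPv6) network prefix."""
    addr = ipaddress.ip_address(raw_ip)
    bits = 24 if addr.version == 4 else 48
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{bits}", strict=False))


# First "product/version" token; only the major version number is captured.
_UA_RE = re.compile(r"^([A-Za-z][\w.-]*)/(\d+)")


def truncate_user_agent(user_agent: str) -> str:
    """Keep product + major version of the leading token, else 'unknown'."""
    match = _UA_RE.match(user_agent)
    return f"{match.group(1)}/{match.group(2)}" if match else "unknown"
```

For example, `ip_prefix("203.0.113.57")` yields `203.0.113.0/24`, and `truncate_user_agent("curl/8.5.0")` yields `curl/8`.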

A nightly hash-chain summary (audit_log_digest, a small table holding (date, sha256_of_day_rows)) gives us tamper-evidence without the cost of a per-row chain. If a later recomputation over a day's rows disagrees with the stored digest for that date, the integrity alarm fires.
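
A per-day digest of this kind might be computed as in the sketch below. It assumes each audit row is a JSON-serializable dict with a unique `id` field, and it hashes rows in id order using a canonical JSON encoding. Those choices, and the function names, are illustrative assumptions. The sketch covers only the per-day hash and does not link one day's digest to the previous day's.

```python
import hashlib
import json


def day_digest(rows: list[dict]) -> str:
    """SHA-256 over the day's rows in id order, each serialized canonically."""
    hasher = hashlib.sha256()
    for row in sorted(rows, key=lambda r: r["id"]):
        encoded = json.dumps(row, sort_keys=True, separators=(",", ":"))
        hasher.update(encoded.encode("utf-8"))
        hasher.update(b"\n")  # row delimiter, so row boundaries are unambiguous
    return hasher.hexdigest()


def verify_day(rows: list[dict], stored_digest: str) -> bool:
    """Recompute the digest and compare it with the stored value."""
    return day_digest(rows) == stored_digest
```

A `False` result from `verify_day` is the condition that would raise the integrity alarm.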

Breach notification pipeline

Any detection (a failed integrity digest, suspicious admin access, a credential-storage CI failure that shipped to production, or an external report) results in an audit_log row with action breach.detected.
A GitHub Actions workflow watching for that action (via a tiny webhook in backend_v2/jobs/breach_notifier.py) files a severity:critical issue, pages on-call, and starts a 72-hour timer.
The runbook docs/agents/security_response.md is extended with the GDPR-Art.-33 notification template and the supervisory-authority contact list. We do not duplicate it here.
If the breach involves personal data of EU users, the DPO (currently: the product owner) is responsible for sending the Art. 33 notification within 72 hours and Art. 34 user notification "without undue delay" if risk is high.
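
The 72-hour timer in the pipeline above reduces to simple deadline arithmetic. The sketch assumes the clock starts at the timestamp of the breach.detected row and that timestamps are timezone-aware; the names `ART_33_WINDOW`, `notification_deadline`, and `hours_remaining` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ART_33_WINDOW = timedelta(hours=72)


def notification_deadline(detected_at: datetime) -> datetime:
    """Return the UTC instant by which the Art. 33 notification is due."""
    if detected_at.tzinfo is None:
        raise ValueError("detected_at must be timezone-aware")
    return detected_at.astimezone(timezone.utc) + ART_33_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (notification_deadline(detected_at) - now).total_seconds() / 3600
```

A notifier could include `hours_remaining` in the issue it files and in each page to on-call, so the remaining window is visible without manual calculation.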

Data Protection by Design (Art. 25)

Minimization: we collect email, display name, device metadata. We do not collect phone, address, DOB, real name.
Default: no marketing opt-in because no marketing system exists.
Pseudonymization: audit trail as above; IP prefixing.
Encryption at rest: filesystem-level; WebAuthn public keys are not secret, but the PII stored on the same disk is sensitive.
Encryption in transit: TLS-only; HSTS; WebAuthn rejects non-HTTPS origins by spec.

Records of Processing (Art. 30)

A living document at docs/architecture/gdpr-ropa.md (to be filed in a sub-card) enumerates: categories of data subjects, categories of data, purposes, retention, cross-border transfers, security measures. This ADR does not duplicate it.

Data Protection Impact Assessment (Art. 35)

Given that we will process financial-adjacent behavioral data and route real orders, a DPIA is appropriate at the live-trading unlock milestone, not at multi-user MVP. Tracked as a follow-up card.

Consequences

Positive

Launch-ready for EU users without scrambling.
Each right has a single endpoint or documented procedure that maps 1:1 to an article, which is defensible to an auditor.
Retention job is one module, not scattered cleanup scripts.

Negative

Export + erasure are real engineering work (bundle format, async job, notification email). First sub-cards will feel heavyweight relative to feature value.
Pseudonymization salt management is a moving part that must itself be audited.
The 7-year trade-affecting audit retention conflicts with user intuition ("you said you'd delete everything"). The DSR endpoint must be explicit about what is retained and why.

Alternatives considered

Opt-in "EU mode" (GDPR as a feature flag)

Rejected. Impossible to reliably detect jurisdiction, and retrofitting export/erasure to "all users who flip on GDPR mode" just means we did the work twice.

Third-party compliance service

Rejected for v1. Adds a processor (more compliance surface), is overkill for our data volume, and locks us in. Revisit at scale.

No audit-log retention (delete everything on erasure)

Rejected. We have a legal obligation under financial-adjacent rules to retain certain logs. Pseudonymization is the compromise.

Compliance checklist

[x] Lawful basis documented (contract for service, legitimate interest for audit).
[x] Each GDPR right has a concrete endpoint or procedure.
[x] Retention periods are numeric, not "as needed".
[x] Breach pipeline starts the 72-hour clock automatically.
[x] Minimization baked into schema, not bolted on.
[x] Pseudonymization strategy for post-erasure audit.

Revisit when

Live-trading unlocks (DPIA due).
We add a new data source (news, sentiment) that may contain third-party PII.
Volume crosses a threshold where manual review of retention runs becomes impractical (deletion itself is already automated, so the trigger would be scale-related).
EU regulator guidance changes on "financial-adjacent" log retention.