ADR 0003 — GDPR by Default
Status: Accepted
Date: 2026-04-21
Deciders: product owner (user), software-architect
Related: ADR 0001, ADR 0002, docs/architecture/auth.md
Context
TradeMasterAPI will process personal data of EU users (email address, display name, IP prefix for audit, paper-trade history, device/passkey metadata). GDPR applies from the first EU user. Retrofitting compliance is expensive and often impossible; we design for it from day one.
The hard asks of GDPR (simplified): lawful basis, purpose limitation, data minimization, defined retention, accuracy (rectification), right of access + portability, right of erasure, breach notification within 72 hours, and maintainable records of processing.
This ADR records the design decisions that discharge each requirement.
Decision
GDPR compliance is built into the core — not a feature flag, not a plugin, not a separate "EU mode".
Data subject rights — implementation
| Right | Endpoint / Mechanism | SLA |
|---|---|---|
| Access | POST /api/gdpr/export | Asynchronous; bundle available within 30 days (GDPR Art. 12 allows up to 1 month). We target 24h. |
| Portability | Same bundle: JSON + CSV machine-readable | — |
| Erasure ("right to be forgotten") | POST /api/gdpr/erase (requires fresh WebAuthn step-up) | Soft-delete immediately; PII purge after 30-day cooling period; audit rows retained with pseudonymized actor id for 2 years. |
| Rectification | PATCH /api/gdpr/profile (display_name + email-change flow with re-verification) | Immediate |
| Restriction | Account freeze: admin-only; user can request via email | Case-by-case |
| Objection | Handled at the point of processing (no marketing opt-outs because we do not do marketing) | — |
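The access and portability rows share one bundle in JSON and CSV. A minimal sketch of how such a bundle could be assembled follows. The function name, the file names inside the bundle, and the trade fields are all hypothetical; the ADR does not fix the bundle layout.

```python
import csv
import io
import json


def build_export_bundle(profile, trades):
    """Return a mapping of hypothetical file names to text content.

    profile is a dict of the user's own fields; trades is a list of
    flat dicts that all share the same keys.
    """
    buf = io.StringIO()
    if trades:
        # Sort the column names so the CSV header is stable across runs.
        writer = csv.DictWriter(buf, fieldnames=sorted(trades[0].keys()))
        writer.writeheader()
        writer.writerows(trades)
    return {
        "profile.json": json.dumps(profile, indent=2, sort_keys=True),
        "paper_trades.csv": buf.getvalue(),
    }


bundle = build_export_bundle(
    {"email": "user@example.com", "display_name": "Sample"},
    [{"symbol": "ABC", "qty": 1}],
)
print(sorted(bundle))
```

Keeping both formats in one bundle means a single asynchronous job can serve both rights.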
Retention schedule
| Data | Retention | Why |
|---|---|---|
| users.email (active) | Lifetime of account | Contact channel |
| webauthn_credentials.* | Lifetime of account | Required for auth |
| sessions | 30 days after expires_at or revoked_at | Audit trail for anomalous-session investigation |
| email_verifications (consumed or expired) | 30 days | Anti-replay + support |
| audit_log (security events) | 2 years | DPA / regulatory requirement for financial-adjacent service |
| audit_log (trade-affecting events) | 7 years | Brokerage regulatory norms; we align even though we are not the broker of record |
| Paper-trade history | 3 years (proposed; open question in auth.md §10) | Product need balanced against minimization |
| Server logs | 90 days | Ops |
| Breach-notification records | 6 years | Art. 33(5) accountability |
A background retention job (backend_v2/jobs/retention.py, to be created in an implementation sub-card) runs nightly, scans tables against this schedule, and deletes or pseudonymizes rows that have passed their retention period. The job writes a single audit_log row per run summarizing counts, never per-record PII.
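As a rough illustration of the nightly job, the sketch below applies two rows of the schedule to in-memory rows and produces one counts-only summary. It is not the contents of retention.py, which does not exist yet. The clock column names (ended_at, closed_at) and the retention.run action name are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionRule:
    table: str  # table name as written in the schedule
    days: int   # retention period in days
    clock: str  # hypothetical column the period is measured from


# Two rows of the schedule; the clock column names are assumptions.
RULES = [
    RetentionRule("sessions", 30, "ended_at"),
    RetentionRule("email_verifications", 30, "closed_at"),
]


def run_retention(rows_by_table, now):
    """Drop expired rows and return (kept_rows, audit_summary).

    The summary carries per-table counts only, never row contents.
    """
    kept, counts = {}, {}
    for rule in RULES:
        cutoff = now - timedelta(days=rule.days)
        rows = rows_by_table.get(rule.table, [])
        survivors = [r for r in rows if r[rule.clock] > cutoff]
        kept[rule.table] = survivors
        counts[rule.table] = len(rows) - len(survivors)
    # "retention.run" is a hypothetical audit_log action name.
    return kept, {"action": "retention.run", "counts": counts}


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
rows = {
    "sessions": [
        {"ended_at": now - timedelta(days=31)},  # past retention
        {"ended_at": now - timedelta(days=5)},   # still retained
    ]
}
kept, summary = run_retention(rows, now)
print(summary)
```

Expressing the schedule as data keeps the numeric periods in one place, so a change to the table above maps to a one-line change in the job.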
Erasure semantics
On POST /api/gdpr/erase (step-up verified):
- users.deleted_at = now(), users.email = null, users.display_name = null.
- All sessions revoked, all passkeys deleted (cascade).
- A scheduled hard-purge at deleted_at + 30 days removes paper-trade history, email_verifications, and any JSON fields referencing the user.
- audit_log rows keep the action row but replace actor_user_id with sha256(actor_user_id || per-user-salt), a one-way pseudonym that preserves the record without identifying the person. The salt is stored encrypted and destroyed at the end of the 2-year audit retention, after which the pseudonym is irreversible even to us.
The 30-day cooling period exists so a user who changes their mind (or whose account was erased after credential compromise) can request reinstatement. After 30 days the hard-purge has run and reinstatement is no longer possible.
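The pseudonym formula sha256(actor_user_id || per-user-salt) can be sketched as follows. The function names and the 32-byte salt length are assumptions; encrypted storage and later destruction of the salt are outside the scope of the example.

```python
import hashlib
import secrets


def new_user_salt():
    """Generate a random per-user salt (32 bytes is an assumed length)."""
    return secrets.token_bytes(32)


def pseudonymize_actor(actor_user_id, salt):
    """Return sha256(actor_user_id || salt) as a hex string."""
    material = actor_user_id.encode("utf-8") + salt
    return hashlib.sha256(material).hexdigest()


salt = new_user_salt()
pseudonym = pseudonymize_actor("user-123", salt)
# The same id and salt always map to the same pseudonym, so audit rows
# for one erased user remain linkable to each other.
print(pseudonym == pseudonymize_actor("user-123", salt))
```

While the salt exists, anyone holding it can recompute the pseudonym for a known user id; once it is destroyed, that link can no longer be rebuilt.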
Audit logging
Every state-changing action writes an audit_log row. Redaction rules:
- Email addresses never appear in context.
- IP addresses are stored as /24 (IPv4) or /48 (IPv6) prefixes; full IP is discarded at ingress.
- User agent is truncated to product + major version.
- Request bodies are never logged; only parameter names when relevant.
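Two of the redaction rules above, IP prefixing and logging parameter names only, can be sketched with the standard library. The helper names are hypothetical, and user-agent truncation is left out because it depends on the parser the implementation chooses.

```python
import ipaddress


def ip_prefix(raw_ip):
    """Reduce an address to its /24 (IPv4) or /48 (IPv6) network prefix."""
    addr = ipaddress.ip_address(raw_ip)
    prefix_len = 24 if addr.version == 4 else 48
    # strict=False masks off the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(network)


def param_names(body):
    """Return only the parameter names of a request body, never values."""
    return sorted(body.keys())


print(ip_prefix("203.0.113.77"))         # 203.0.113.0/24
print(ip_prefix("2001:db8:abcd:12::1"))  # 2001:db8:abcd::/48
print(param_names({"email": "user@example.com", "display_name": "Sample"}))
```

Applying ip_prefix at ingress, before anything is persisted, is what makes "full IP is discarded" true in practice.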
A nightly hash-chain summary (audit_log_digest, a small table holding (date, sha256_of_day_rows)) gives us tamper-evidence without the cost of a per-row chain. If a later recomputation over a day's rows disagrees with the stored digest, the integrity alarm fires.
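One way to compute and re-check sha256_of_day_rows is sketched below. The canonical form (rows sorted by id, JSON with sorted keys) is an assumption; the ADR does not specify how rows are serialized, and the rows are assumed to be JSON-serializable dicts with an id field.

```python
import hashlib
import hmac
import json


def day_digest(rows):
    """Hash one day's audit rows in a canonical, order-independent form."""
    h = hashlib.sha256()
    for row in sorted(rows, key=lambda r: r["id"]):
        # Sorted keys and fixed separators give a stable encoding per row.
        encoded = json.dumps(row, sort_keys=True, separators=(",", ":"))
        h.update(encoded.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def verify_day(rows, stored_digest):
    """Return True when the rows still match the digest stored that night."""
    return hmac.compare_digest(day_digest(rows), stored_digest)


rows = [
    {"id": 2, "action": "session.revoke"},
    {"id": 1, "action": "login"},
]
stored = day_digest(rows)
rows[0]["action"] = "edited-later"
print(verify_day(rows, stored))  # False: the alarm condition
```

A digest per day detects modification or deletion of any row in that day, but it cannot say which row changed; that is the trade-off against a per-row chain.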
Breach notification pipeline
- Any detection (failed integrity digest, suspicious admin access, credential-storage CI failure shipping to production, external report) results in an audit_log row with action breach.detected.
- A GitHub Actions workflow watching for that action (via a tiny webhook in backend_v2/jobs/breach_notifier.py) files a severity:critical issue, pages on-call, and starts a 72-hour timer.
- The runbook docs/agents/security_response.md is extended with the GDPR-Art.-33 notification template and the supervisory-authority contact list. We do not duplicate it here.
- If the breach involves personal data of EU users, the DPO (currently: the product owner) is responsible for sending the Art. 33 notification within 72 hours and Art. 34 user notification "without undue delay" if risk is high.
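The 72-hour timer in the pipeline above reduces to simple deadline arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the clock starts at the timestamp of the breach.detected row.

```python
from datetime import datetime, timedelta, timezone

ART_33_WINDOW = timedelta(hours=72)


def notification_deadline(detected_at):
    """Return the Art. 33 deadline in UTC for a timezone-aware timestamp."""
    if detected_at.tzinfo is None:
        raise ValueError("detected_at must be timezone-aware")
    return detected_at.astimezone(timezone.utc) + ART_33_WINDOW


def hours_remaining(detected_at, now):
    """Hours left before the deadline; negative once it has passed."""
    delta = notification_deadline(detected_at) - now
    return delta.total_seconds() / 3600


detected = datetime(2026, 4, 21, 10, 0, tzinfo=timezone.utc)
print(notification_deadline(detected).isoformat())
print(hours_remaining(detected, detected + timedelta(hours=24)))
```

Rejecting naive timestamps avoids a deadline that silently shifts with the server's local timezone.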
Data Protection by Design (Art. 25)
- Minimization: we collect email, display name, device metadata. We do not collect phone, address, DOB, real name.
- Default: no marketing opt-in because no marketing system exists.
- Pseudonymization: audit trail as above; IP prefixing.
- Encryption at rest: filesystem-level; WebAuthn public keys are not secret but PII on the same disk is.
- Encryption in transit: TLS-only; HSTS; WebAuthn rejects non-HTTPS origins by spec.
Records of Processing (Art. 30)
A living document at docs/architecture/gdpr-ropa.md (to be filed in a sub-card) enumerates: categories of data subjects, categories of data, purposes, retention, cross-border transfers, security measures. This ADR does not duplicate it.
Data Protection Impact Assessment (Art. 35)
Given that we will process financial-adjacent behavioral data and route real orders, a DPIA is appropriate at the live-trading unlock milestone, not at multi-user MVP. Tracked as a follow-up card.
Consequences
Positive
- Launch-ready for EU users without scrambling.
- Each right has a single endpoint that maps 1:1 to an article — defensible to an auditor.
- Retention job is one module, not scattered cleanup scripts.
Negative
- Export + erasure are real engineering work (bundle format, async job, notification email). First sub-cards will feel heavyweight relative to feature value.
- Pseudonymization salt management is a moving part that must itself be audited.
- The 7-year trade-affecting audit retention conflicts with user intuition ("you said you'd delete everything"). The DSR endpoint must be explicit about what is retained and why.
Alternatives considered
"GDPR mode" feature flag for EU users only
Rejected. Impossible to reliably detect jurisdiction, and retrofitting export/erasure to "all users who flip on GDPR mode" just means we did the work twice.
Third-party GDPR-as-a-service tool
Rejected for v1. Adds a processor (more compliance surface), is overkill for our data volume, and locks us in. Revisit at scale.
No audit-log retention (delete everything on erasure)
Rejected. We have a legal obligation under financial-adjacent rules to retain certain logs. Pseudonymization is the compromise.
Compliance checklist
- [x] Lawful basis documented (contract for service, legitimate interest for audit).
- [x] Each GDPR right has a concrete endpoint or procedure.
- [x] Retention periods are numeric, not "as needed".
- [x] Breach pipeline starts the 72-hour clock automatically.
- [x] Minimization baked into schema, not bolted on.
- [x] Pseudonymization strategy for post-erasure audit.
Revisit when
- Live-trading unlocks (DPIA due).
- We add a new data source (news, sentiment) that may contain third-party PII.
- Volume crosses a threshold where manual retention review is impractical (already automated — trigger would be scale-related).
- EU regulator guidance changes on "financial-adjacent" log retention.