Raxx · internal docs


Passkey E2E Encryption with Opt-In Shadow Analytics

Status: Draft — pending Kristerpher's path decision (see ADR 0017)
Owner: software-architect
Date: 2026-04-24
Parent issue: #250 — Passkey-keyed end-to-end encryption for customer data
Related ADRs: 0001, 0002, 0003, 0017


1. Context

Issue #250 proposes passkey-keyed E2E encryption for all customer data. The consequential downside: Raxx can't read user trade content, which kills all server-side analytics on actual trade patterns — cohort insights, aggregate strategy recommendations, regime-pattern discovery, leaderboards (#211), and product-improvement telemetry on real behavior.

This doc evaluates a middle path: E2E encryption stays as the base invariant. Users who want to contribute to the analytics pool can opt in to a client-side shadow pipeline that anonymizes and aggregates their signals before submitting them to a separate, operator-readable analytics store. No individual trade data ever leaves the device in plaintext. The operator learns aggregate patterns; individual behavior stays private.

This is a research-and-design document. It does not authorize implementation. ADR 0017 records the architecture decision.


2. Invariants (non-negotiable)

The following invariants from the project-level constitution apply here with full force:

  1. No stored credentials. The PRF extension output (used for key derivation) must never be stored server-side, logged, or transmitted beyond the in-memory crypto operation.
  2. Passkeys / WebAuthn only. The encryption key derivation ceremony is passkey-rooted; there is no alternative key path that does not require a passkey assertion.
  3. GDPR by default. Opt-in consent must be compliant with GDPR Art. 7 (freely given, specific, informed, unambiguous, withdrawable). Withdrawal must produce a hard delete from the analytics store within 30 days per Art. 17.
  4. Audit trail. Consent grants, consent withdrawals, and analytics-delete events are written to audit_log.
  5. Paper-first gating. Unaffected by this design; live-trading paths remain gated.
  6. No PII in analytics store. The shadow store must be structurally incapable of re-identification. This is a design constraint, not a policy — the schema must make it impossible, not merely unlikely.

Any architecture that weakens these is a violation, not a tradeoff.


3. Prior Art

Signal — Sealed Sender + Private Contact Discovery

Signal's design insight: separate the "what" (message content, E2E encrypted) from the "that" (communication metadata). For metadata-resistant features, Signal runs private contact discovery inside SGX secure enclaves with oblivious memory-access patterns, so even contact discovery doesn't leak to the server. Relevant pattern: the operator learns nothing about individual interaction while still running the service. Signal does not run aggregate analytics on user behavior at all; they accept the product cost as brand differentiation. That is one legitimate choice.

Apple — Differential Privacy in iOS/macOS

Apple ("Learning with Privacy at Scale", 2017; WWDC 2017 session) collects keyboard usage, emoji frequency, Health type preferences, and Safari crash data using Local Differential Privacy (LDP). The core technique: before any data leaves the device, the client applies a randomized-response-style mechanism (count-mean-sketch) that adds calibrated noise. The server sees aggregate histograms but cannot recover any individual's input even with access to all submitted records.

Concrete limits Apple respects: k≥100,000 users before publishing any aggregate (cohort floor). Privacy budget ε ≈ 4 per day per feature domain (moderate privacy).

Raxx scale concern: Apple's privacy budget works at hundreds of millions of devices. Raxx's user base in v1 will be in the thousands. At small cohorts, DP noise must be much larger (lower ε or higher noise) to prevent inference attacks, which may make aggregate data useless. This is a quantitative decision that depends on actual cohort sizes.

Google — RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response)

Google's 2014 RAPPOR paper (Erlingsson, Pihur, Korolova) is the canonical client-side LDP implementation. It uses Bloom filter encoding + Permanent Randomized Response + Instantaneous Randomized Response to collect frequency histograms of categorical values (e.g., "what is your default browser?") with a controlled privacy budget.

RAPPOR is well-suited to categorical signals (strategy family, error type, win-rate bucket). It performs poorly on continuous numeric signals (P&L, hold period in exact minutes). For continuous signals, bucketing into ≤10 categories first, then applying RAPPOR, is the standard approach.
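The bucket-then-randomize step can be sketched in a few lines. This is a minimal illustration, using basic randomized response rather than RAPPOR's full Bloom-filter encoding; the bucket labels are the win-rate buckets from §5, and the function names are illustrative:

```python
import random

BUCKETS = ["<30%", "30-50%", "50-70%", ">70%"]  # win-rate buckets from §5

def bucket_win_rate(win_rate):
    """Map a continuous win rate (0.0-1.0) to a categorical bucket first;
    hashing or noising the raw value directly would not anonymize it."""
    if win_rate < 0.30:
        return "<30%"
    if win_rate < 0.50:
        return "30-50%"
    if win_rate < 0.70:
        return "50-70%"
    return ">70%"

def randomized_response(true_bucket, p_truth=0.75, rng=None):
    """Report the true bucket with probability p_truth, otherwise a
    uniformly random bucket. A simplification of RAPPOR's permanent
    randomized response: any single report is plausibly deniable, but
    aggregate frequencies remain estimable server-side."""
    rng = rng or random.Random()
    if rng.random() < p_truth:
        return true_bucket
    return rng.choice(BUCKETS)
```

With p_truth=0.75 and four buckets, a reported bucket is the true one with probability 0.75 + 0.25/4, which the server can invert when estimating the aggregate histogram.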

Google — Federated Learning

Google's federated learning (2017, McMahan et al.) keeps training data on-device; only model gradients are aggregated server-side. This is more relevant for a recommendation engine than for analytics, but the boundary it enforces — gradients, not data — is the right mental model for Raxx's shadow pipeline: aggregates, not records.

For Raxx v1, full federated learning is out of scope (requires a model training infrastructure). But the "client computes the aggregate locally, sends only the result" pattern is directly applicable.

Proton Mail / Proton Drive

Proton's analytics approach: they collect zero content analytics. They do collect operational metrics (delivery success rates, storage usage, uptime) at the infrastructure layer, never at the content layer. Their transparency reports disclose counts of legal requests received, not user behavior data. They accept this as brand-defining.

Proton runs a separate, privacy-isolated "product usage statistics" feature (opt-in) that sends event types (clicked feature X) but never content. Events are stripped of user ID before storage. This is functionally identical to Raxx's proposed shadow pipeline — same structure, different domain.

Standard Notes

Standard Notes uses zero-knowledge architecture for all note content. For analytics they rely entirely on opt-in crash reporting (Sentry, user-initiated). They do no behavioral analytics. They accept the product cost. Their reasoning: "We can't improve what we can't see, so we ship conservatively and ask users explicitly."

Tuta (formerly Tutanota)

Tuta encrypts all email content and metadata (subject, sender, recipient in the body) client-side. For product analytics they ship Matomo (self-hosted) with IP anonymization and have users opt in. They do not correlate analytics events to user content. Their shadow analytics pattern: session-scoped pseudonymous identifier (resets on logout) + event type + no content fields. This is directly applicable to Raxx.

DuckDuckGo

DuckDuckGo collects aggregate query counts (what was searched, not who searched it) using hash bucketing to prevent single-query identification. Their key constraint: no persistent user identifier exists; every anonymous search is disconnected from every other. This is more radical than Raxx needs — Raxx has authenticated users who consent to aggregated analytics.

Cloudflare Privacy Gateway (OHTTP)

OHTTP (Oblivious HTTP, RFC 9458) inserts a relay between client and server so the server never sees the originating IP. Cloudflare's Privacy Gateway implements this pattern, and Apple's iCloud Private Relay uses a related two-hop relay design. Relevant if Raxx wants analytics submissions that can't be correlated by IP — but at Raxx's scale, the implementation cost exceeds the marginal privacy gain. Not recommended for v1; note as a future option.


4. Anonymization Primitives — Applicability to Raxx

4.1 K-Anonymity

A dataset satisfies k-anonymity if every record is indistinguishable from at least k-1 others on the quasi-identifiers. For cohort analytics (strategy family, win-rate bucket, hold period bucket), this means: do not publish any bucket with fewer than k users.

Recommended k: 20 for v1. If a strategy family has fewer than 20 users who opted in, suppress that bucket entirely. This is a simple server-side enforcement rule.

Limitation: K-anonymity does not protect against linkage attacks or homogeneity attacks (if all k records in a bucket have the same value, the suppression is trivially defeated). Pair with data minimization (don't store anything not needed for the specific aggregate).
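The suppression rule itself is a one-line server-side filter. A sketch, with illustrative function name and input shape:

```python
K_FLOOR = 20  # proposed v1 value (§13 open question 3)

def suppress_small_buckets(bucket_counts, k=K_FLOOR):
    """Drop any aggregate bucket with fewer than k contributing users.
    Applied server-side before any aggregate is published."""
    return {bucket: n for bucket, n in bucket_counts.items() if n >= k}

# Example: the under-populated bucket disappears entirely.
# suppress_small_buckets({"credit_spread": 41, "iron_condor": 7})
#   -> {"credit_spread": 41}
```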

4.2 Differential Privacy (DP)

DP adds calibrated noise to query results so that the presence or absence of any single record changes the output only imperceptibly. The formal guarantee: for all adjacent datasets D and D' (differing by one record), the probability of any output O satisfies P[M(D)=O] / P[M(D')=O] ≤ exp(ε).

For Raxx: DP is most useful for continuous analytics (how many users opened a position in high-VIX regimes? what is the average win rate across strategy families?). With a privacy budget of ε=1 per query domain per week, the noise added would be detectable but statistically manageable at >1,000 respondents. Below 500 respondents, results may be noise-dominated and analytically useless.

Recommendation: Implement DP at the query layer in the analytics service (not client-side), using the Laplace mechanism for count queries and the Gaussian mechanism for histogram queries. At v1 scale, DP is a "ship-ready" guarantee primarily useful for future audit defense, not necessarily for meaningful noise reduction.

Privacy budget management: Each analytics query type consumes budget. A weekly reset with per-domain ceilings (strategy analytics ε=2/week, error analytics ε=1/week) is workable. This must be enforced at the analytics API layer.
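Both pieces live at the query layer. A sketch assuming the per-domain weekly ceilings above; class and function names are illustrative, and the weekly reset scheduler is omitted:

```python
import math
import random

class PrivacyBudget:
    """Tracks per-domain epsilon spend against weekly ceilings
    (e.g. strategy analytics 2.0/week, error analytics 1.0/week).
    A scheduler (not shown) would reset spend weekly."""
    def __init__(self, ceilings):
        self.ceilings = dict(ceilings)
        self.spent = {domain: 0.0 for domain in ceilings}

    def charge(self, domain, epsilon):
        """Reserve budget for one query; refuse once the ceiling is hit."""
        if self.spent[domain] + epsilon > self.ceilings[domain]:
            raise RuntimeError(f"privacy budget exhausted for {domain!r}")
        self.spent[domain] += epsilon

def laplace_count(true_count, epsilon, rng=None):
    """Laplace mechanism for a count query: sensitivity 1, scale b = 1/epsilon.
    Noise is sampled via the inverse-CDF method from a uniform draw."""
    rng = rng or random.Random()
    u = rng.random() - 0.5  # uniform on (-0.5, 0.5)
    b = 1.0 / epsilon
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

Note the scale b = 1/ε: at small cohorts the analyst must either tolerate large noise or spend more budget, which is exactly the scale concern raised in §3.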

4.3 Hashing / Tokenization

Suitable for categorical identifiers (strategy family name → hash). Useless for numeric signals (hashing a P&L value doesn't anonymize it — the distribution remains identifiable). Do not use for continuous values.

4.4 Aggregation-Only (Client Pre-Aggregation)

The simplest and most reliable technique: the client never sends individual records. It pre-aggregates over a time window (weekly), buckets continuous values, and sends only the bucket counts. The server receives "this user opened 3 trades in the credit-spread family at VIX>20 this week" rather than the three individual trades.

This is the recommended primary technique for Raxx's shadow pipeline. It requires no special crypto, is easy to audit, and produces the most useful signal at small scale.

What it protects against: The server cannot reconstruct individual trades even if the analytics store is fully breached, because individual records never existed there.

What it doesn't protect against: Timing correlation (if only one user traded in a given bucket this week, the bucket count=1 row is a de-anonymization event). Mitigated by k-anonymity floor: suppress buckets with count<k.
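The weekly pre-aggregation can be sketched as follows. Event field names ("strategy_family", "vix_at_entry") and the payload shape are illustrative, not a fixed schema:

```python
from collections import Counter

def preaggregate_week(trade_events, epoch_week):
    """Client-side weekly pre-aggregation: individual trade records stay on
    the device; only bucketed counts leave it. The server never receives
    the per-trade inputs to this function."""
    counts = Counter(
        (ev["strategy_family"],
         "VIX>20" if ev["vix_at_entry"] > 20 else "VIX<=20")
        for ev in trade_events
    )
    return [
        {"scope": "strategy_patterns",
         "event_type": "weekly_family_regime_counts",
         "epoch_week": epoch_week,
         "payload": {"family": family, "regime": regime, "count": n}}
        for (family, regime), n in sorted(counts.items())
    ]
```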

4.5 Homomorphic Encryption

Allows the server to compute on ciphertext without decrypting. Fully Homomorphic Encryption (FHE) remains impractically slow for most real-time use cases (though libraries like Microsoft SEAL and OpenFHE are maturing). Partially Homomorphic Encryption (PHE) is fast for specific operations (addition, multiplication) but not general aggregation.

For v1: explicitly out of scope. Mention in design as a v3 path if Raxx wants server-side computation over E2E-encrypted data without ever decrypting it.


5. Data to Shadow-Copy: Proposed Concrete List

Permitted — aggregate, anonymized signals

| Signal | Granularity | Anonymization | Rationale |
|---|---|---|---|
| Strategy family usage | Weekly counts per family per user | Client pre-aggregates; k-floor suppression | Strategy recommendation engine |
| Win-rate bucket | Bucket: <30%, 30-50%, 50-70%, >70% | Bucket only, no raw value | Cohort comparison |
| Hold-period distribution | Bucket: intraday / 1-3d / 1wk / 1mo+ | Bucket only | Regime analysis |
| Regime at entry | Binary: VIX>20 vs VIX≤20 at trade open | Exact VIX not shared | Regime-pattern discovery |
| Strategy parameter change patterns | Which parameter (delta, expiry, width) was adjusted; not to what value | Field name only | UX improvement |
| Error/failure event types | Error code + flow identifier | No content, no trade amounts | UX improvement, error rate telemetry |
| Paper-to-live transition event | Did user graduate to live mode (yes/no) | Boolean | Conversion funnel |
| Session feature engagement | Which top-level features used per week | Feature ID only, no content | Product usage |

Prohibited — must never appear in shadow store

| Data | Reason |
|---|---|
| Specific trade prices | Re-identifiable via public order flow |
| Specific positions (ticker, quantity) | PII + commercially sensitive |
| P&L values (raw) | Re-identifiable; sensitive |
| Ticker symbols used | Identifiable; commercially sensitive |
| Exact trade timestamps | Correlatable to public order flow |
| IP address or device fingerprint | Direct PII |
| Email or user ID | Direct link to encrypted store |
| Any content from strategy configurations | E2E encrypted; must stay that way |
| Passkey credential IDs | Authentication-related PII |

6. Consent Model

6.1 Opt-In Flow

Default: shadow analytics are OFF. Raxx does not infer consent from account creation, from Founders trial enrollment, or from product use. A user must take an affirmative action.

The opt-in appears in account settings, post-onboarding. The UI must show:

- What is collected (list from §5, plain language)
- What it is used for (product improvement, aggregate recommendations)
- What is NOT collected (trade prices, positions, P&L, tickers)
- How to withdraw and what happens when you do

6.2 Granular Toggles

Three independent consent scopes:

| Scope | Default | What it covers |
|---|---|---|
| analytics.strategy_patterns | OFF | Strategy family usage, win-rate bucket, hold-period, regime |
| analytics.product_usage | OFF | Feature engagement, session patterns |
| analytics.error_telemetry | OFF | Error codes and flow identifiers |

Each scope has an independent toggle. Turning off one scope does not reset others.

6.3 Consent Schema

analytics_consent
  id              TEXT PK
  user_id         TEXT FK -> users.id ON DELETE CASCADE
  scope           TEXT NOT NULL  -- 'strategy_patterns' | 'product_usage' | 'error_telemetry'
  granted_at      TIMESTAMP NOT NULL
  withdrawn_at    TIMESTAMP NULL
  granted_version TEXT NOT NULL  -- privacy policy version at time of grant
  source          TEXT NOT NULL  -- 'settings_ui' | 'onboarding_prompt'

Consent records are immutable and append-only. A withdrawal creates a new row with withdrawn_at set; it does not delete the grant row (needed for GDPR accountability proof under Art. 7(1)).
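A minimal sqlite3 sketch of how the append-only table yields an effective-consent check. It assumes each new row (grant or withdrawal) carries a later granted_at than the row it supersedes, so ordering by granted_at makes the latest row win; the helper name is illustrative:

```python
import sqlite3

def has_consent(db, user_id, scope):
    """Effective consent is the most recent analytics_consent row for the
    (user, scope) pair; it is active only if withdrawn_at is NULL.
    No row at all means the default: consent OFF."""
    row = db.execute(
        """SELECT withdrawn_at FROM analytics_consent
           WHERE user_id = ? AND scope = ?
           ORDER BY granted_at DESC LIMIT 1""",
        (user_id, scope),
    ).fetchone()
    return row is not None and row[0] is None
```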

6.4 Withdrawal and Shadow-Data Deletion

On scope withdrawal:

  1. Client stops submitting signals for that scope immediately.
  2. A DELETE /api/analytics/shadow request (authenticated) queues a delete job.
  3. The analytics store hard-deletes all rows associated with the shadow pseudonym within 30 days.
  4. The shadow pseudonym token is rotated so new contributions can't be linked to deleted history (if the user later re-opts in).
  5. audit_log records: analytics.consent.withdrawn, analytics.shadow.delete_queued, analytics.shadow.delete_completed.
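The hard-delete-plus-audit step might look like the following sqlite3 sketch. The audit_log columns and the job plumbing shown here are illustrative, not the real schema:

```python
import sqlite3

def run_shadow_delete_job(analytics_db, main_db, pseudonym, scope):
    """Hard-delete a withdrawn user's shadow rows from the separate
    analytics store, then record completion in the main DB's audit_log.
    Note the two connections: the stores share no DB file."""
    cur = analytics_db.execute(
        "DELETE FROM analytics_events WHERE pseudonym = ? AND scope = ?",
        (pseudonym, scope),
    )
    analytics_db.commit()
    main_db.execute(
        "INSERT INTO audit_log (event, detail) VALUES (?, ?)",
        ("analytics.shadow.delete_completed", f"{scope}: {cur.rowcount} rows"),
    )
    main_db.commit()
    return cur.rowcount
```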

6.5 GDPR Art. 7 Compliance

GDPR Art. 7 requires:

- Freely given: the service is fully functional without consent. No dark patterns. ✓
- Specific: granular scopes. ✓
- Informed: plain-language explainer required in UI. Feature-developer responsibility.
- Unambiguous: active opt-in checkbox (no pre-ticked boxes). ✓
- Withdrawable at any time without detriment: withdrawal does not degrade service features. ✓

6.6 CCPA Mapping

Shadow analytics data is used for product improvement, not sold to third parties. Under CCPA, this is "internal research" and does not trigger "do not sell." However, if Raxx ever shares aggregate analytics with third parties (investors, partners), a "do not sell" mechanism is needed. This is a product decision outside v1 scope; flag it.


7. Architecture: The Shadow Pipeline

7.1 Components

                          ┌───────────────────────────────────┐
                          │         Antlers (browser)         │
                          │                                   │
   Passkey assertion ────►│  PRF key derivation               │
                          │       │                           │
                          │       ▼                           │
                          │  Encrypted user data store        │
                          │  (Raptor can't read this)         │
                          │       │                           │
                          │       ▼                           │
                          │  Shadow Aggregator (client-side)  │
                          │  - inspect raw events             │
                          │  - apply bucketing + k-floor      │
                          │  - attach shadow pseudonym        │
                          │  - serialize as analytics payload │
                          └─────────────┬─────────────────────┘
                                        │
                          (HTTPS; no user auth header)
                                        │
                          ┌─────────────▼─────────────────────┐
                          │  Analytics API (separate service) │
                          │  - no DB foreign key to users     │
                          │  - no IP logged                   │
                          │  - pseudonym-keyed rows only      │
                          │  - DP noise at query layer        │
                          └─────────────┬─────────────────────┘
                                        │
                          ┌─────────────▼─────────────────────┐
                          │  Analytics Store (separate DB)    │
                          │  - no PII columns                 │
                          │  - no FK to users table           │
                          │  - schema enforces anonymization  │
                          └───────────────────────────────────┘

7.2 Shadow Pseudonym

The shadow pseudonym is derived client-side:

shadow_pseudonym = HKDF(
  ikm   = PRF_output,
  salt  = "raxx-shadow-v1",
  info  = "analytics-pseudonym",
  len   = 32 bytes
)

Properties:

- Derived from the same PRF root as the encryption key, but via a distinct HKDF expansion so the two keys are cryptographically independent.
- Deterministic per passkey per analytics epoch (rotate by changing the info string on withdrawal and re-grant).
- Never transmitted to Raptor's main DB. Only submitted to the Analytics API.
- The server cannot link a shadow pseudonym to a user account because the PRF output is never known to the server.

Limitation: multi-device behavior depends on the passkey type. Platform passkeys synced via iCloud Keychain / Google Password Manager are a single credential, so every device derives the same pseudonym. Roaming authenticators are distinct credentials, so each physical key derives a different pseudonym and appears as a separate contributor in the analytics store. This is a known limitation; document it; do not engineer around it in v1.
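The derivation above, written out with the Python standard library (RFC 5869 HKDF-SHA256, one 32-byte expand block). Folding an epoch counter into the info string is one way, chosen here for illustration, to implement the rotation-on-re-grant described above:

```python
import hashlib
import hmac

def derive_shadow_pseudonym(prf_output, epoch=1):
    """HKDF-SHA256 per RFC 5869: extract with the fixed salt, then a single
    expand block (32 bytes == one SHA-256 output, so only T(1) is needed).
    Changing the epoch changes the info string, rotating the pseudonym."""
    salt = b"raxx-shadow-v1"
    info = b"analytics-pseudonym-epoch-%d" % epoch
    prk = hmac.new(salt, prf_output, hashlib.sha256).digest()        # extract
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()    # expand, T(1)
```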

7.3 Analytics Payload Schema

analytics_events
  id              TEXT PK (uuid v4, server-generated at receipt)
  pseudonym       BLOB NOT NULL  (32 bytes; no FK; no index on pseudonym alone)
  scope           TEXT NOT NULL
  event_type      TEXT NOT NULL
  payload_json    TEXT NOT NULL  (bucketed, no PII; schema-validated at intake)
  epoch_week      TEXT NOT NULL  (ISO week, e.g. "2026-W17")
  received_at     TIMESTAMP NOT NULL
  -- No user_id column. No email column. No IP column.

The Analytics API validates payload_json against a strict per-event-type schema before inserting. Any field not explicitly permitted by the schema is rejected at the API layer (not silently dropped — rejected with 400 so the client knows the payload was malformed).
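Strict allowlist validation might look like the following sketch; the event type and field names are illustrative:

```python
ALLOWED_FIELDS = {
    # Per-event-type field allowlists. Anything not listed is rejected.
    "weekly_family_regime_counts": {"family", "regime", "count"},
}

def validate_payload(event_type, payload):
    """Reject, rather than silently drop, anything outside the allowlist.
    The API layer would map ValueError to an HTTP 400 response so the
    client learns its payload was malformed."""
    allowed = ALLOWED_FIELDS.get(event_type)
    if allowed is None:
        raise ValueError(f"unknown event_type: {event_type!r}")
    unknown = set(payload) - allowed
    if unknown:
        raise ValueError(f"unpermitted fields: {sorted(unknown)}")
```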

7.4 Sequence: Opt-In Shadow Submission

sequenceDiagram
    participant U as User (browser)
    participant SA as Shadow Aggregator (client JS)
    participant AR as Antlers (React)
    participant RP as Raptor (main API)
    participant AN as Analytics API (separate)

    U->>AR: Opens Settings → opts in to strategy_patterns scope
    AR->>RP: POST /api/analytics/consent {scope, granted}
    RP->>RP: Insert analytics_consent row; write audit_log
    RP-->>AR: 200 OK

    Note over SA: On next trade event (client-side)
    SA->>SA: Observe raw trade event
    SA->>SA: Apply bucketing (strategy family, win-rate bucket, regime)
    SA->>SA: Suppress if cohort size < k (client-side estimate)
    SA->>SA: Derive shadow_pseudonym from PRF output
    SA->>SA: Build analytics payload (no PII, bucketed only)
    SA->>AN: POST /api/shadow/events {pseudonym, scope, event_type, payload, epoch_week}
    Note over AN: No session cookie. No auth header. Pseudonym is the only identity.
    AN->>AN: Validate payload schema (reject unknown fields)
    AN->>AN: Insert analytics_events row
    AN-->>SA: 201 Created

7.5 Sequence: Withdrawal + Shadow Delete

sequenceDiagram
    participant U as User
    participant AR as Antlers
    participant RP as Raptor
    participant AN as Analytics API

    U->>AR: Settings → Withdraw consent for strategy_patterns
    AR->>RP: POST /api/analytics/consent {scope, withdrawn}
    RP->>RP: Insert analytics_consent row (withdrawn_at); write audit_log
    RP->>RP: Enqueue analytics.shadow.delete job (pseudonym, scope)
    RP-->>AR: 200 OK
    AR->>AR: Shadow Aggregator stops emitting for that scope immediately

    Note over RP: Within 30 days (GDPR Art. 17)
    RP->>AN: DELETE /api/shadow/pseudonym/{pseudonym}/scope/{scope}
    AN->>AN: Hard-delete all matching analytics_events rows
    AN-->>RP: 204 No Content
    RP->>RP: Write audit_log: analytics.shadow.delete_completed
    RP->>RP: Rotate shadow pseudonym epoch for this user

7.6 Threat Model

What an attacker who fully breaches the analytics store can learn:

- Pseudonym-keyed weekly bucket counts per consent scope (strategy family usage, win-rate buckets, hold-period buckets, regime flags, error codes, feature IDs), tagged by ISO week.
- Which pseudonyms contributed in which weeks (the pseudonym is stable within an epoch, so a contributor's weekly series is linkable to itself).

What they cannot learn:

- Any individual trade: ticker, price, quantity, P&L, or exact timestamp. Individual records never existed in the store (§4.4, §5).
- Which user account a pseudonym belongs to: the store has no user_id, email, or IP columns, and the pseudonym derives from a PRF output the server never sees.

Re-identification risk surface:

- A bucket with a single contributor in a given week (count=1) is a timing-correlation risk; mitigated by the k-floor suppression rule (§4.1).
- Auxiliary-data joins (e.g., against public order flow) fail because prices, tickers, and exact timestamps are prohibited fields (§5).

Raxx operator risk (insider threat):

Even an operator with full DB access cannot link analytics records to user accounts without access to the PRF output of the user's passkey — which is never stored. An operator trying to re-identify users would need: (a) the analytics store, (b) the consent table (which records pseudonyms only indirectly — the pseudonym isn't stored there either), and (c) the user's physical authenticator. This is the intended posture.


8. Migrations

If the shadow analytics path is chosen:

  1. New tables: analytics_consent in the main Raptor DB. No schema changes to existing tables.
  2. New service: Analytics API is a separate Raptor blueprint or standalone service with a separate SQLite (or Postgres, at scale) DB. The key invariant: no shared DB file with the main Raptor DB.
  3. Migration 0002_analytics_consent.sql: Creates analytics_consent table. No data migration (new capability; no existing consent records).
  4. Rollback: Drop analytics_consent table; shut down Analytics API. No user data affected (analytics data is separate; consent records are append-only and deletion is safe).

9. Rollout Plan

| Phase | Description | Gate |
|---|---|---|
| Dark | Analytics API deployed; no client-side shadow aggregator active; feature flag SHADOW_ANALYTICS=off | Internal review of schema + threat model |
| Internal beta | Shadow aggregator enabled for 5–10 internal accounts; consent UI visible; verify delete pipeline | 30 days of clean operation; no re-identification events in threat model review |
| Founders beta | Opt-in available to Founders cohort; prominent in settings | Founders consent and withdrawal tested end-to-end |
| GA | All users see opt-in prompt post-login | Analytics store has k≥20 opted-in users before publishing any aggregate |

Kill switch: SHADOW_ANALYTICS=off env flag disables the consent UI, the shadow aggregator client-side load, and the Analytics API intake endpoint simultaneously. Existing analytics data is untouched; it just stops growing.
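The flag check itself is trivial; the point is that the consent UI, the aggregator loader, and the intake endpoint all consult the same helper (sketch; the helper name is illustrative, the env var name is from this doc):

```python
import os

def shadow_analytics_enabled():
    """Single kill switch for the whole shadow pipeline. Defaults to off:
    an unset or unrecognized value disables the feature."""
    return os.environ.get("SHADOW_ANALYTICS", "off").lower() == "on"
```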


10. Security Considerations


11. Tradeoff Analysis

Features this enables (that pure E2E blocks)

| Feature | Notes |
|---|---|
| Cohort-level strategy recommendations | "Users with similar strategy patterns succeeded with credit spreads in low-IV environments" |
| UX error-rate improvement | Which flows have the highest error rates, and which user cohorts hit them |
| Regime-pattern discovery | Do high-VIX entries outperform? Aggregate answer becomes possible |
| Aggregate leaderboards (#211) | Leaderboard data can be drawn from opted-in shadow store rather than encrypted main store |
| Business analytics | Strategy-family popularity, cohort retention by engagement pattern |
| Investor/product reporting | Aggregate usage data (no PII) defensible to investors |

Features this STILL blocks (even with shadow pipeline)

| Feature | Why still blocked |
|---|---|
| Per-user support debugging | Support still cannot see actual trades; only shadow-store aggregates |
| Personalized recommendations based on individual history | Shadow store has weekly buckets, not individual history |
| Regulatory discovery of a specific user's trade content | Raxx still cannot decrypt user data |
| Backtest runs on server using user's own data | Compute still must be client-side or user-authorized per-operation |
| Recovery-time data access | If user loses passkey, their E2E data is still unrecoverable from server |

What shadow pipeline does NOT change

The pure E2E invariant remains. Raxx cannot read individual user trade content. The shadow pipeline is an additive opt-in path; it does not weaken the base encryption posture. A user who never opts in has identical privacy to a pure E2E system.


12. Alternatives Considered (see ADR 0017 for full treatment)

| Alternative | Privacy cost | Product cost | Implementation cost |
|---|---|---|---|
| Pure E2E, no analytics | None | High — blocks all aggregate insight | Low — simpler architecture |
| Mandatory shadow pipeline (all users) | Moderate — all users share data | Low — full analytics available | Medium |
| Client-side telemetry only, no server aggregation | Low | Medium — crash data only, no behavioral insight | Low |
| Server-side columnar encryption (Raxx-held keys) | High — operator can always read with key | Low — full analytics available | Medium — envelope encryption |
| Opt-in shadow pipeline (this design) | Low for non-opted-in users; moderate for opted-in users | Medium — aggregate insight only for opted-in cohort | High — two services, consent pipeline, delete pipeline |

13. Open Questions (Require Kristerpher's Decision)

  1. ADR 0017 path decision. Which of the four alternatives does Kristerpher select? (see ADR 0017 §Decision placeholder)

  2. Analytics store infrastructure. Separate SQLite DB (simple, single-host) or separate Postgres (scalable, operationally heavier)? At <10,000 opted-in users, SQLite is sufficient. At Raxx's current scale, SQLite is the right call; revisit at 50,000 MAU.

  3. k-anonymity floor value. k=20 proposed. Is this high enough? Lower k means more data published but higher re-identification risk. Higher k means less data but stronger privacy guarantee. Decision depends on expected cohort sizes at launch.

  4. Privacy budget ε for DP. ε=1–2 per domain per week is a reasonable starting point. Formal DP requires a privacy-budget accounting system. Should that system be implemented in v1, or is k-anonymity alone acceptable as the v1 anonymization guarantee?

  5. Analytics service boundary. Separate deployed service (own process, own port) vs. a Blueprint within Raptor with separate DB? Separate service is cleaner isolation; Blueprint is simpler to deploy. Recommend separate service; Kristerpher should confirm.

  6. Pseudonym rotation on re-opt-in. When a user withdraws and later re-opts-in, the pseudonym rotates (new epoch). This means the new analytics series can't be linked to the old one by the server, but it also means cohort continuity breaks. Is this the right privacy-vs-analytics tradeoff?

  7. DPIA requirement. Processing behavioral data (even aggregate, even anonymized) at scale for product improvement may require a DPIA under GDPR Art. 35, particularly if Raxx crosses the "systematic large-scale processing" threshold. At Founders scale this likely doesn't apply; revisit at GA with a formal DPIA.