Raxx · internal docs


ADR 0017 — E2E Encryption with Opt-In Shadow Analytics: Architecture Posture

Status: Accepted 2026-04-24. Option B (opt-in shadow pipeline) selected. k=20 k-anonymity; k-anonymity only at v1 (no formal DP); analytics store as a separate service (not a Blueprint); DPIA acknowledged.
Date: 2026-04-24
Deciders: product owner (Kristerpher Henderson), software-architect
Related: ADR 0001, ADR 0002, ADR 0003, docs/architecture/passkey-e2e-with-opt-in-shadow-analytics.md
Supplemented by: ADR 0018 (Shadow-analytics data goals + consent-UX consequences)
Parent issue: #250


Context

Issue #250 proposed passkey-keyed E2E encryption for all customer data. The core tension: E2E encryption with a server-incapable-of-reading posture is the strongest privacy position in retail fintech, but it structurally blocks every form of server-side analytics on actual user behavior.

The product cost is concrete:

  - No cohort strategy recommendations.
  - No UX error-rate telemetry on real user flows.
  - No regime-pattern discovery from aggregate trade behavior.
  - No aggregate leaderboard data based on actual performance (#211).
  - No business analytics on what strategies users actually use.

The proposed middle path — opt-in shadow analytics — keeps E2E as the base invariant and adds an optional client-side anonymization pipeline so operators can learn aggregate patterns without ever reading individual data.

This ADR records the architecture decision: which of the four postures does Raxx adopt?

The full research, threat model, consent design, pipeline architecture, and tradeoff analysis are in docs/architecture/passkey-e2e-with-opt-in-shadow-analytics.md. This ADR refers to that document rather than duplicating it.


The Four Alternatives

Option A — Pure E2E, No Analytics

E2E encryption ships. No shadow pipeline. Raxx accepts that it can never run server-side analytics on user trade content.

Privacy cost: None. Strongest posture possible.

Product cost: High. All server-side analytics on behavioral data are structurally impossible. Strategy recommendations must be hand-curated or based solely on market data (not user behavior). UX improvement relies on qualitative feedback, not behavioral telemetry. Regime-pattern discovery requires a separately curated dataset.

Implementation cost: Low. No second service; no consent pipeline; no delete pipeline. E2E encryption is a single architecture decision that simplifies everything downstream.

This is Signal's choice. They accept the product cost as brand-defining. It is a legitimate choice, not a failure mode.


Option B — Opt-In Shadow Pipeline

E2E encryption is the base. Users who opt in trigger a client-side shadow aggregator that buckets, anonymizes, and submits aggregate signals (never individual records) to a separate analytics store that Raxx can read. Operators learn aggregate patterns; individual behavior stays private.
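To make the bucketing step concrete, here is a minimal sketch of what a shadow aggregator might compute before anything leaves the client. It is written in Python for readability; the real aggregator is described as client-side JS. The `Trade` fields, the four win-rate bands, and the function names are all assumptions made for illustration, not the design in the architecture document.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Trade:
    # Hypothetical shape of a decrypted, client-side trade record.
    closed_on: date
    strategy_family: str
    profit: float


def iso_week(d: date) -> str:
    """Coarsen a date to its ISO week label, e.g. '2026-W17'."""
    year, week, _ = d.isocalendar()
    return f"{year}-W{week:02d}"


def win_rate_band(wins: int, total: int) -> str:
    """Coarsen an exact win rate into one of four bands (assumed widths)."""
    rate = wins / total
    if rate < 0.25:
        return "0-25%"
    if rate < 0.50:
        return "25-50%"
    if rate < 0.75:
        return "50-75%"
    return "75-100%"


def build_weekly_signals(trades: List[Trade]) -> List[Dict[str, str]]:
    """Reduce individual trades to one coarse signal per (week, family)."""
    totals: Counter = Counter()
    wins: Counter = Counter()
    for trade in trades:
        key = (iso_week(trade.closed_on), trade.strategy_family)
        totals[key] += 1
        if trade.profit > 0:
            wins[key] += 1

    signals = []
    for key in sorted(totals):
        week, family = key
        signals.append({
            "week": week,
            "strategy_family": family,
            "win_rate_band": win_rate_band(wins[key], totals[key]),
        })
    return signals
```

Only the week label, the strategy family, and the band are emitted; exact dates, trade counts, and profit figures stay on the client.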

Consent is:

  - Off by default.
  - Granular (three independent scopes: strategy patterns, product usage, error telemetry).
  - Withdrawable with hard delete of shadow data within 30 days.
  - GDPR Art. 7-compliant.
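The consent properties above could be modeled roughly as follows. This is a sketch under stated assumptions: the scope identifiers, the `ConsentRecord` class, and its methods are hypothetical names, and the actual consent schema is defined elsewhere.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict

# Hypothetical identifiers for the three independent scopes.
SCOPES = ("strategy_patterns", "product_usage", "error_telemetry")

# Shadow data must be hard-deleted within 30 days of withdrawal.
DELETE_WINDOW = timedelta(days=30)


@dataclass
class ConsentRecord:
    # Every scope starts as False: consent is off by default.
    granted: Dict[str, bool] = field(
        default_factory=lambda: {scope: False for scope in SCOPES}
    )
    # Per-scope hard-delete deadline, set when consent is withdrawn.
    delete_by: Dict[str, datetime] = field(default_factory=dict)

    @staticmethod
    def _check(scope: str) -> None:
        if scope not in SCOPES:
            raise ValueError(f"unknown consent scope: {scope}")

    def grant(self, scope: str) -> None:
        """Opt in to one scope without touching the others."""
        self._check(scope)
        self.granted[scope] = True

    def withdraw(self, scope: str, now: datetime) -> datetime:
        """Opt out of one scope and return its hard-delete deadline."""
        self._check(scope)
        self.granted[scope] = False
        self.delete_by[scope] = now + DELETE_WINDOW
        return self.delete_by[scope]
```

Keeping one flag per scope means a user can share error telemetry, for example, without sharing strategy patterns.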

Privacy cost: Low for non-opted-in users (identical to Option A). Moderate for opted-in users — aggregate weekly buckets of strategy family and win-rate are visible to Raxx operators. Re-identification is structurally prevented (k-anonymity floor, no PII columns, no FK to users table, pseudonym derived from PRF without server access).
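One way a pseudonym could be derived from the passkey PRF output without server involvement is sketched below. This ADR does not specify the construction, so the use of HMAC-SHA-256, the label string, and the 32-byte input length are assumptions for illustration only.

```python
import hashlib
import hmac

# Hypothetical domain-separation label. A fixed label keeps the pseudonym
# distinct from any other value derived from the same PRF output.
PSEUDONYM_LABEL = b"raxx/shadow-analytics/pseudonym/v1"


def derive_pseudonym(prf_output: bytes) -> str:
    """Derive a stable pseudonym from a client-held PRF output.

    Assumes the PRF output is 32 bytes and never leaves the client.
    The server only ever sees the returned hex string, which it cannot
    invert or link to a row in the users table.
    """
    if len(prf_output) != 32:
        raise ValueError("expected a 32-byte PRF output")
    return hmac.new(prf_output, PSEUDONYM_LABEL, hashlib.sha256).hexdigest()
```

Because the derivation is deterministic, the same user always submits under the same pseudonym, which is what lets the analytics store count distinct contributors per bucket without a foreign key to the users table.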

Product cost: Medium. Analytics is available only for opted-in users, and only at aggregate-bucket granularity. If opt-in rates are low (common for privacy-sensitive products), analytics may be sparse at launch. No per-user support debugging. No individual-history recommendations. Raxx still cannot decrypt E2E data, even for opted-in users.

Implementation cost: High. Two services (Raptor + Analytics API), consent schema, shadow aggregator (client-side JS), pseudonym derivation, delete pipeline, k-anonymity enforcement, GDPR consent records. This is real engineering — estimate 6–8 implementation sub-cards before first signal is collected.
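Of the components listed, k-anonymity enforcement is the easiest to sketch. The version below suppresses any bucket with fewer than k distinct pseudonyms, using the k=20 floor recorded in the status line. The row layout and the function name are hypothetical, and where the check runs (query layer versus ingestion) is left open here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

K_FLOOR = 20  # k-anonymity floor from the ADR status line

# Assumed row layout: (pseudonym, week, strategy_family, win_rate_band).
Row = Tuple[str, str, str, str]
Bucket = Tuple[str, str, str]


def releasable_buckets(rows: Iterable[Row], k: int = K_FLOOR) -> Dict[Bucket, int]:
    """Return contributor counts only for buckets that meet the k floor."""
    contributors: Dict[Bucket, Set[str]] = defaultdict(set)
    for pseudonym, week, family, band in rows:
        # A set de-duplicates repeat submissions from the same pseudonym.
        contributors[(week, family, band)].add(pseudonym)

    return {
        bucket: len(pseudonyms)
        for bucket, pseudonyms in contributors.items()
        if len(pseudonyms) >= k
    }
```

Counting distinct pseudonyms rather than rows matters: one user submitting twenty times should not make a bucket releasable.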

This is Proton/Tuta's choice for the analytics they do run. They accept that opt-in rates will be lower than mandatory data collection, and they compensate by being explicit about what is shared.


Option C — Mandatory Shadow Pipeline (All Users)

E2E encryption is the base. All users, without opt-in, submit the same bucketed aggregate signals. Raxx can run analytics on the full user base from day one.

Privacy cost: Moderate-to-high. GDPR Art. 6 requires a lawful basis for processing. "Legitimate interest" is possible for some aggregate signals but is not the cleanest basis for behavioral analytics; "contract" doesn't cover analytics; "consent" is not freely given if it's mandatory. This option has GDPR exposure if challenged.

Product cost: Low. Full analytics available for all users.

Implementation cost: Similar to Option B but without the consent pipeline.

Rejected on GDPR grounds: mandatory behavioral telemetry without consent is defensible only under "legitimate interest" with a formal LIA (Legitimate Interest Assessment) and a clear right-to-object mechanism. At Raxx's scale and market position, this is a regulatory risk Raxx should not take. The opt-in approach (Option B) achieves similar analytics without the legal exposure.


Option D — Server-Side Columnar Encryption (Raxx-Held Keys)

This is the industry default: Raxx holds the encryption keys; data is encrypted at rest at the column or table level; only Raxx's application can decrypt it.

Privacy cost: High. This is not E2E. Raxx can always read all user data with its own keys. A breach of Raxx's key store is a full breach of all user content. Raxx can comply with subpoenas that compel it to produce decrypted content. The "Raxx can't read your trades" marketing claim is false.

Product cost: Low. Full analytics available.

Implementation cost: Low. Envelope encryption on sensitive columns (see ADR 0009's schema sketch for how this is already designed for OAuth tokens).

Rejected as the primary posture. This forfeits the privacy differentiation that makes Raxx distinctive in a crowded market. It is the default in the industry precisely because it is easier — not because it is better for users. ADR 0001 and ADR 0002 already establish the invariants that make this unacceptable as the primary posture.

Note: Column-level server-managed encryption may still be appropriate for specific operational metadata that legitimately needs to be readable by Raxx (e.g., subscription status, support ticket metadata). This ADR is about trade content and strategy data.


Decision

RECOMMENDATION: Option B — Opt-In Shadow Pipeline.

The recommendation rests on three observations:

  1. The privacy invariant must hold. ADR 0002 says Raxx must be incapable of replaying a credential; the spirit of that invariant extends to being incapable of reading user trade content without user authorization. E2E is not negotiable as the base posture if Raxx wants to make the "can't read your trades" claim.

  2. Pure E2E (Option A) forecloses a product surface that is real and valuable. Aggregate strategy recommendations, regime-pattern insights, and UX telemetry are not vanity metrics — they are the feedback loop that makes the product better. Accepting Option A is a bet that Raxx can grow without that feedback. That may be true for Signal (messaging is messaging), but trading strategy is a domain where aggregate insight produces direct user value.

  3. Opt-in is the right default for a privacy-first product. Raxx's brand is "precision tool, user-controlled." Opt-in shadow analytics is consistent with that brand. Mandatory analytics (Option C) contradicts it. The expected opt-in rate will be lower than 100% — that is a product cost Raxx should accept as the price of the brand position.

What Kristerpher is saying yes to if he accepts this recommendation:

What Kristerpher is saying no to:


Decision placeholder

[ ] Kristerpher to select:


Consequences if Option B is chosen

Positive

Negative

Neutral


Open Questions Before Implementation Sub-Cards Can Be Filed

  1. Kristerpher's path selection (above). Blocks all sub-cards.
  2. k-anonymity floor value. k=20 proposed; Kristerpher to confirm.
  3. Privacy budget ε for DP. v1 k-anonymity-only acceptable? Or implement basic Laplace noise at query layer?
  4. Analytics service boundary. Separate process vs. Raptor Blueprint with separate DB?
  5. Analytics store infrastructure. SQLite vs. Postgres.

These open questions are enumerated in docs/architecture/passkey-e2e-with-opt-in-shadow-analytics.md §13.
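For question 3, the status line records that v1 uses k-anonymity only, with no formal DP. For reference, "basic Laplace noise at query layer" would look roughly like the sketch below. The function names, the default sensitivity of 1 (each pseudonym contributes at most once to a bucket count), and the sampling method are assumptions, not part of the accepted design.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variates with mean
    `scale` follows a Laplace distribution centered on zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Return a bucket count with Laplace noise of scale sensitivity/epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller ε means a larger noise scale and stronger privacy, at the cost of less accurate counts in buckets that sit close to the k floor.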


Alternatives Considered

See the four options above. Rejected alternatives:


Revisit When