Account Merge D1 — Detection Design Memo
Date: 2026-06-05 UTC
Status: Design-phase memo — awaiting operator D1 decision
Author: detection-engineer agent
Scope: Behavioral-detection design pass on the D1 (who picks primary) decision for account-merge.
Cross-reference: docs/architecture/account-merge-2026-06-05.md | security-agent threat-model memo: TBD (filing in parallel).
Epic: #3245 | PR: #3256
This memo surfaces what each D1 candidate flow opens up for an attacker and what detection signals must exist to observe those behaviors. It does not pick a winner. The operator and security-agent own the final posture call.
1. Flow Summaries (Detection Lens)
Flow 1 — Hybrid (CS nominates, customer can request one swap before first code is consumed)
The flow has two distinct windows:
- Pre-verification swap window. Between POST /internal/merges and the moment the first code is verified, the customer can call POST /merges/{id}/swap-primary. This window has no cryptographic constraint other than session auth — the caller just needs to be authenticated to either account.
- Post-first-verification lock. Once primary_verified_at or secondary_verified_at is set, the primary designation is frozen.
Attack surface the swap window opens:
- An adversary who has compromised only one inbox (say the secondary account's email) can observe the merge-initiation email arriving in that inbox and call /swap-primary before verifying anything. If they can authenticate as the secondary account, they can elect themselves the survivor. This is the structural risk that security-agent's threat model will quantify; detection's job is to make the call observable.
- The swap operation currently has no separate audit footprint beyond the primary_swap_requested boolean on account_merges. Detection needs a dedicated event so timing and actor context survive.
CS-side risk:
- CS nominates the wrong pair of accounts (autocomplete error, ticket misread). This produces a merge that one or both customers reject. Detectable via early cancellations and rapid reversals attributed to specific CS users.
Flow 2 — CS-only (CS picks, customer has no mechanism to dispute before verification)
Tightest control surface:
- No swap-window attack surface. The only path to changing the primary is creating a new merge record, which requires CS action and produces an audit event.
- Social-engineering risk is fully concentrated in the CS staff: an adversary tricks a CS agent into initiating a merge with the adversary's account as primary. One phone call or spoofed support email is sufficient.
- Highest reversal rate in practice (customers complain after the fact rather than having an up-front voice). Reversal clustering by CS user is the primary anomaly signal.
Flow 3 — Customer-choice during verification
Customer designates primary by which account they verify first (or by an explicit selection UI).
- Highest adversary flexibility: if an adversary controls one inbox, they verify first and claim primary. The cross-verification requirement still forces the adversary to also control the other inbox or social-engineer a real user into verifying, but the primary election is still decided at the earliest possible moment.
- Merge-initiation patterns remain CS-gated, which preserves that seam. But the primary election seam is entirely in the verification sequence, which is customer-facing and has a much wider attack surface than a CS UI.
- Detection is harder: distinguishing a legitimate customer choosing their preferred primary from an adversary racing the swap requires IP-distance and session-fingerprint correlation that isn't needed in the other flows.
2. Attacker Behavior Catalog by Stage
Stage A — Merge Initiation (POST /internal/merges)
This stage is CS-only in all three flows. It is the earliest choke point.
A1 — Social-engineered CS initiation. Adversary contacts support claiming to own both accounts (or claiming account-recovery). CS initiates a merge that maps a legitimate account to one the adversary controls.
- Observable: the merge record has at least one user_id that has no prior ticket history with the initiating CS agent. Cross-referencing freescout_ticket_cache for both primary_user_id and secondary_user_id against the freescout_ticket_id on the merge record surfaces this.
- Observable: CS user's merge initiation rate deviates from their personal baseline.
A2 — Autocomplete or copy-paste error by CS. No adversarial intent; CS selects the wrong account. Behavioral signature is indistinguishable from A1 until reversal.
- Observable: merge is cancelled before any verification, or reversed within 24h. These "early turnaround" merges cluster on specific CS users.
A3 — Bulk scripted initiation (insider or API key compromise). An adversary with customers:merge:write role (or a compromised service token) fires many initiations in a short window.
- Observable: merge-initiation count per 5-minute window deviates from Poisson baseline. With zero live merges today the baseline is zero; first signal fires immediately and by design.
Stage B — Code Generation (Postmark sends codes to both addresses)
B1 — Inbox compromise (code intercept). Adversary has read access to one inbox and is waiting for the code. They verify immediately after delivery.
- Observable: time-to-verify is anomalously short (sub-60-second verify after code delivery). In a legitimate scenario the customer reads, processes, and then logs in to verify, which takes minutes to hours. A bot intercept happens in seconds.
- Observable: verification from an IP address geographically distant from the account's historical session IP cluster.
B2 — Postmark delivery failure exploited. Adversary triggers repeated resend (up to 5 allowed per account) to keep the merge in initiated state indefinitely, preventing legitimate resolution.
- Observable: resend count approaches maximum for a merge record. High-resend merges with no verification after 12h are anomalous.
B3 — Postmark signature failures on merge emails. If Postmark webhook signature validation failures spike on the merge.initiated template, someone may be spoofing delivery events to confuse state.
- Observable: signature failure rate on merge-class emails via postmark_delivery_events.
Stage C — Code Entry (POST /merges/{id}/verify)
C1 — Cross-session verification (two browsers, two IPs in the same window). Both verifications arrive within a short time window from geographically separated IPs. This can be legitimate (a customer on mobile + desktop), but it is also the canonical cross-account-compromise pattern: the adversary verifies one account from their own network while the real customer verifies the other.
- Observable: IP-to-IP geolocation distance between primary_verified_at session and secondary_verified_at session exceeds the distribution of same-user session distances for this account.
C2 — Brute-force of 8-char code. The rate limiter (10 attempts per merge record) prevents this structurally, but a detection rule watching for repeated 400/409 responses from /merges/{id}/verify confirms the structural control is firing correctly.
- Observable: 4+ failed verify attempts on any merge record within 60 minutes.
C3 — Swap-window race (Hybrid only). Adversary with session on secondary account calls /swap-primary then immediately calls /verify before the legitimate primary user does. The swap + fast verify sequence in under 60 seconds is the signature.
- Observable: primary_swap_requested = TRUE on a merge record where secondary_verified_at precedes primary_verified_at AND both verifications occurred within 5 minutes of merge initiation.
C4 — Re-use of consumed code (replay). The architecture uses used_at to block this structurally. Detection confirms the structural control: any 409 on /verify for an already-consumed code is an instrumentation point, not a vulnerability.
- Observable: 409 responses on the verify route per merge_id.
Stage D — Completion
D1 — Post-merge session anomaly. The surviving account (primary) suddenly exhibits authentication patterns inconsistent with its own history: new device, new IP region, different time-of-day pattern.
- Observable: first post-merge session for primary user_id arrives from an IP class or geolocation not present in the prior 30d session history.
- Observable: passkey that was never used before merge (came from secondary) is the authenticating credential on the first post-merge login.
D2 — High-value data exfiltration after merge. Adversary who merged to gain access to the primary's strategies and backtest history starts exporting or reading data at abnormally high rates.
- Observable: strategy read + backtest read event rate for primary_user_id in the 24h post-merge vs. the 30d pre-merge baseline for that user.
Stage E — Reversal (POST /internal/merges/{id}/reverse)
E1 — CS uses reversal to cycle account ownership. Adversary convinces CS to merge the accounts, extracts data, then has the merge reversed to restore the victim. This is a reconnaissance pattern.
- Observable: a merge record is both completed and reversed within 2h. Lifecycles this short, well inside the reversal window, are suspicious.
E2 — CS user with disproportionate reversal rate. A compromised or coerced CS agent completes merge requests and then reverses them after adversary extraction. Reversal rate per CS user over a 14-day window (the window used in Section 3.4 and Detection 6) is the signal.
- Observable: reversal_count / completion_count per initiated_by_cs hash over 14d exceeds 0.3 (30% reversal rate against a baseline expected to be <5%).
3. Cardinality and Drift Detectors
3.1 Per-CS-User Merge-Rate Poisson Anomaly
Method: Per-initiated_by_cs hash, count merge initiations per 24-hour window. Model as Poisson with lambda estimated from trailing 30-day history. Flag when observed count exceeds the 99.9th percentile of the Poisson distribution.
Pre-launch state: No baseline exists. Seed lambda = 0.1 initiations/day (expectation is occasional organic requests). First merge by any CS user fires at HIGH; re-baseline after 30 days of observed activity.
Minimum sample: 10 observed events per CS user before the statistical threshold governs; below that, every merge initiation fires at MEDIUM.
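The threshold logic in 3.1 can be sketched as follows. This is a minimal illustration, not the deployed rule: the function names are hypothetical, and it follows the minimum-sample rule (MEDIUM below 10 observed events) rather than modeling severity escalation.

```python
import math

def poisson_quantile(lam: float, q: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Poisson(lam)."""
    if lam <= 0:
        return 0
    k = 0
    pmf = math.exp(-lam)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k

def classify_initiation_count(count, lam, observed_events, min_sample=10, q=0.999):
    """Severity for one CS user's 24-hour initiation count, or None.

    observed_events is the number of initiations seen so far for this
    CS user (hypothetical input; the memo does not name the field).
    """
    if count == 0:
        return None
    if observed_events < min_sample:
        return "MEDIUM"  # statistical threshold does not govern yet
    return "HIGH" if count > poisson_quantile(lam, q) else None
```

With the seed lambda of 0.1 initiations/day, the 99.9th percentile is 2, so a third initiation in one day is the first count that exceeds it.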
3.2 Per-Customer-Pair Geolocation Distance Distribution
Method: For each merge record, compute the great-circle distance between: - The primary account's median session IP geolocation over the trailing 30 days. - The secondary account's median session IP geolocation over the trailing 30 days.
Build a running distribution of this distance across all completed merges. Flag records where the distance exceeds the 95th percentile of the distribution AND both accounts' primary session IPs have no overlap in their IP-to-ASN mapping (i.e., they don't share a residential ISP, CDN, or corporate prefix).
Note on operator VPN: Per user_uses_vpn, IPs from Datacamp/CDN77 and similar prefixes do not constitute a true geolocation anchor. The IP-distance check must exclude CDN/VPN ASNs from the median computation.
Pre-launch state: Baseline window is empty. Seed with a 1000-km flag threshold as an absolute until the distribution has 20+ samples.
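A minimal sketch of the distance check in 3.2, assuming the median geolocations are already available as latitude/longitude pairs with VPN/CDN ASNs excluded upstream. The function names, the haversine formula as the great-circle method, and the nearest-rank percentile are choices made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_flag(distance_km, completed_distances, shared_asn,
                  seed_km=1000.0, min_samples=20):
    """True when a merge's account-pair distance should be flagged.

    completed_distances: distances from earlier completed merges.
    shared_asn: True if the two accounts overlap in any ASN.
    """
    if shared_asn:
        return False
    if len(completed_distances) < min_samples:
        return distance_km > seed_km  # absolute seed threshold pre-baseline
    ranked = sorted(completed_distances)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    idx = (95 * len(ranked) + 99) // 100 - 1
    return distance_km > ranked[idx]
```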
3.3 Merge Initiation Time-Bunching
Method: Compute the inter-arrival time between successive merge initiations (all CS users combined). Model as an exponential distribution (Poisson process). Flag when 3+ initiations occur within a 5-minute window. A legitimate CS operation is a per-ticket manual action; 3 in 5 minutes is anomalous for manual work.
Baseline: Initiation rate expected to be 1–3 per day at launch. Burst detector is absolute (3 in 5 minutes) until the empirical inter-arrival rate has been observed for 30 days, then switch to the 99.9th percentile of the exponential distribution.
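The absolute burst rule (3 initiations in 5 minutes) can be sketched as a sliding-window counter. This illustrates only the pre-baseline rule; the class name is hypothetical and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from datetime import datetime, timedelta

class BurstDetector:
    """Flags when max_events initiations fall inside one sliding window."""

    def __init__(self, max_events=3, window=timedelta(minutes=5)):
        self.max_events = max_events
        self.window = window
        self._recent = deque()

    def observe(self, ts: datetime) -> bool:
        """Record one initiation; return True if the burst rule fires."""
        self._recent.append(ts)
        # Drop initiations that are older than the window relative to ts.
        while ts - self._recent[0] > self.window:
            self._recent.popleft()
        return len(self._recent) >= self.max_events
```

Once 30 days of inter-arrival data exist, the fixed count would be replaced by the exponential-percentile rule described above; that switch is not modeled here.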
3.4 Reversal Rate as Leading Mis-Merge Indicator
Method: Weekly rolling ratio: reversed_count / (reversed_count + completed_count) across all merges. Alert at:
- MEDIUM: ratio > 0.10 (10%)
- HIGH: ratio > 0.25 (25%)
Per-CS-user dimension: Same ratio computed per initiated_by_cs hash. Flag any individual CS user at HIGH if their personal reversal ratio > 0.30 over any 14-day window with at least 3 completions.
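A minimal sketch of the ratio and thresholds in 3.4. It assumes completed_count counts merges still in completed status, and it reads "at least 3 completions" as reversed + completed >= 3, since a reversed merge was completed first; both readings are assumptions.

```python
def reversal_ratio(reversed_count, completed_count):
    """reversed / (reversed + completed); 0.0 when there is no volume."""
    total = reversed_count + completed_count
    return reversed_count / total if total else 0.0

def global_severity(reversed_count, completed_count):
    """Severity for the weekly all-merges ratio, or None."""
    ratio = reversal_ratio(reversed_count, completed_count)
    if ratio > 0.25:
        return "HIGH"
    if ratio > 0.10:
        return "MEDIUM"
    return None

def cs_user_flag(reversed_count, completed_count,
                 min_completions=3, threshold=0.30):
    """True when one CS user's 14-day ratio warrants a HIGH flag."""
    if reversed_count + completed_count < min_completions:
        return False  # too little volume for the ratio to mean anything
    return reversal_ratio(reversed_count, completed_count) > threshold
```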
3.5 Time-to-Verify Distribution
Method: For each merge record, compute min(primary_verified_at, secondary_verified_at) - initiated_at. This is the time from code dispatch to first verification. Build a distribution across completed and reversed merges. Flag records where time-to-first-verify < 90 seconds (absolute; no statistical baseline needed — legitimate humans cannot open an email client, read a code, navigate to the app, and verify in under 90 seconds).
K-S test complement: Monthly, run a Kolmogorov-Smirnov test comparing the current month's time-to-verify distribution against the prior 3-month baseline. A shift toward shorter times (D-statistic > 0.2 at alpha=0.05) is a signal that verification behavior is changing — possibly automated.
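The monthly comparison can be sketched with a standard-library two-sample K-S statistic. The critical value uses the usual large-sample approximation for alpha = 0.05 (coefficient 1.358); the median comparison used to confirm the direction of the shift is an assumption of this sketch, as are the function names. Both samples must be non-empty.

```python
import math
import statistics

def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def verify_times_shifted_shorter(current, baseline, d_min=0.2):
    """True if the current month differs significantly and is shorter."""
    d = ks_statistic(current, baseline)
    n, m = len(current), len(baseline)
    critical = 1.358 * math.sqrt((n + m) / (n * m))  # alpha = 0.05
    return (d > d_min and d > critical
            and statistics.median(current) < statistics.median(baseline))
```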
4. Required Instrumentation — customer_audit_events Additions
The architecture doc specifies that every account_merges status transition produces an entry in customer_audit_events. That is necessary but not sufficient for detection. The following additional events are required.
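Events use dimension = "operator_interaction" for CS-actor events, dimension = "customer_self" for customer-actor events, and dimension = "system_automated" for events written by the system itself (merge.completed, merge.failed, merge.post_merge_session_new_ip). The ticket_id field on each event must be populated from account_merges.freescout_ticket_id.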
New action namespaces required (additions to audit_action_allowlist.yaml)
merge.initiated
allowed_fields:
- merge_id # account_merges.id (bigint, not user PII)
- primary_user_id # bigint
- secondary_user_id # bigint
- cs_actor_hash # SHA-256 of initiating CS email — same convention as initiated_by_cs column
- freescout_ticket_id # string or null
- dsr_block_checked # boolean — was DSR check performed
dimension: operator_interaction
merge.cancelled
allowed_fields:
- merge_id
- cancelled_by # "cs" | "system" (expiry)
- cs_actor_hash # null if system-cancelled
- reason # free text, operator-supplied
dimension: operator_interaction
merge.code_verified
allowed_fields:
- merge_id
- verifying_account_role # "primary" | "secondary" — which account completed THIS verification
- verification_sequence # "first" | "second" — was this the 1st or 2nd code consumed?
- request_ip_class # /24 prefix only — not full IP (PII floor)
- request_asn # autonomous system number of the verifying IP
- seconds_since_initiation # integer — time from merge.initiated to this event
dimension: customer_self
merge.code_verify_failed
allowed_fields:
- merge_id
- verifying_account_role # "primary" | "secondary"
- failure_reason # "wrong_code" | "expired" | "already_consumed" | "rate_limited"
- attempt_number # 1..10 (the structural limit)
dimension: customer_self
merge.resend_requested
allowed_fields:
- merge_id
- account_role # "primary" | "secondary"
- resend_sequence # 1..5 (the structural limit)
dimension: customer_self
merge.swap_primary_requested
allowed_fields:
- merge_id
- requesting_account_role # "primary" | "secondary" — which session called /swap-primary
- request_ip_class # /24 prefix only
- request_asn
- seconds_since_initiation
dimension: customer_self
# Hybrid flow only. This event is the primary detection anchor for the swap-window attack.
merge.completed
allowed_fields:
- merge_id
- primary_user_id
- secondary_user_id
- tables_migrated_count # integer — rows re-FK'd
- billing_action # "none" | "refund_issued" | "subscription_extended"
- duration_seconds # time from verified to completed
dimension: system_automated
merge.failed
allowed_fields:
- merge_id
- failure_stage # "pre_flight" | "mid_transaction" | "post_transaction"
- error_category # "dsr_block" | "db_error" | "passkey_rebind_error" | "billing_error" | "other"
dimension: system_automated
merge.reversed
allowed_fields:
- merge_id
- cs_actor_hash
- days_since_completion # integer — how deep into the 14-day window
- reason # operator-supplied text
- rows_restored_count # integer — rows re-FK'd back to secondary
dimension: operator_interaction
merge.post_merge_session_new_ip
allowed_fields:
- merge_id
- user_id # the primary_user_id post-merge
- session_ip_class # /24 prefix
- session_asn
- is_new_asn # boolean — ASN absent from 30d pre-merge history
- hours_since_completion
dimension: system_automated
# Written by session middleware when the first post-merge session is detected
# from an IP class not in the primary's 30d history.
Note on IP storage: Full IPs are not stored in after_state (PII floor). The /24 prefix and ASN are sufficient for geolocation-distance detection and carry no single-host PII. This is consistent with how waf_events handles IP cardinality.
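Reducing a request IP to its /24 prefix can be sketched with Python's standard ipaddress module. The function name is hypothetical, and because the memo defines the IP class only as a /24, this sketch rejects IPv6 addresses rather than guessing a prefix length for them.

```python
import ipaddress

def request_ip_class(ip: str) -> str:
    """Return the /24 network containing an IPv4 address, e.g. '203.0.113.0/24'."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        # The memo specifies only a /24 class; IPv6 handling is undefined.
        raise ValueError("only IPv4 addresses have a /24 IP class here")
    # strict=False masks the host bits instead of raising on them.
    return str(ipaddress.ip_network(f"{addr}/24", strict=False))
```

For example, request_ip_class("203.0.113.77") yields "203.0.113.0/24", so two hosts in the same /24 produce the same stored value.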
console_audit_events — CS-side parallel write
Every merge.* event with dimension = "operator_interaction" must also produce a parallel write to console_audit_events so the CS-user anomaly detectors can query on the console side without touching the customer PII tables. The console_audit_events schema already handles this pattern; the merge events need entries in the console-side action allowlist as well.
5. Per-Flow Detection-Surface Comparison
Ranking scale: 1 = cleanest detection surface (most observable, fewest blind spots); 3 = hardest to monitor (most ambiguous signals, more required infrastructure).
| Criterion | Flow 1: Hybrid | Flow 2: CS-only | Flow 3: Customer-choice |
|---|---|---|---|
| Swap-window attack surface | Present (requires merge.swap_primary_requested event) | Absent | Absent for swap; primary election embedded in verify sequence |
| CS-side observability | Same as CS-only; CS nominates | All actor attribution in CS | CS nominates only; customer election at verify time |
| Verification sequence distinguishability | Medium — swap + verify ordering carries signal | High — no swap; verify sequence is the only customer signal | Low — which-verify-first IS the decision; hard to distinguish legitimate choice from adversarial race |
| Time-to-verify signal | Same power in all flows | Same | Same |
| IP-distance cross-session signal | Same power in all flows | Same | Same; but also needed to classify the election itself |
| Reversal rate per CS user | Same power in all flows | Highest signal (all complaints surface as reversals) | Distributed between reversals and swap recriminations |
| New audit events required | 10 (all Section 4 events, including swap event) | 9 (no swap event) | 10+ (need election event, not just swap) |
| Overall detection rank | 2 | 1 | 3 |
Flow 2 (CS-only) has the cleanest detection surface because the control surface is entirely on one side (CS), reversal rate is the natural leading indicator of mis-merges, and there is no swap-window event requiring timing analysis. Every anomaly channels through the CS-side actor signals.
Flow 1 (Hybrid) offers a richer detection surface than Flow 3, but it adds a swap-window event that requires fast detection (the attack window is open only until the first code is verified). This window is measurable but narrow.
Flow 3 (Customer-choice) is the hardest to monitor because the primary election seam — the highest-value action in the merge — is indistinguishable at detection time from a legitimate customer preference. The only disambiguation is after-the-fact (was the elected primary the one with the adversarial session?), which means detection fires on completion or reversal rather than at the seam. This is a material detection lag.
This ranking is offered as a tiebreaker input. It does not override security-agent's threat-model output or the operator's policy judgment.
6. Ranked High-Signal Detections
Ranked by: (attack severity if undetected) * (signal quality) / (false-positive rate). Pre-launch baselines are sparse; rankings favor detections that fire immediately on clear signatures rather than requiring statistical maturity.
Detection 1 — Sub-90-Second Verification (CRITICAL trigger)
What it catches: Automated inbox compromise (B1). Code intercepted by a mail-scanning bot or by an adversary with active inbox access. Human verification cannot occur in under 90 seconds.
Signal: merge.code_verified.seconds_since_initiation < 90 for either the first or second verification event on a merge record.
Telemetry: customer_audit_events with action = "merge.code_verified".
False positive rate: Near zero. No legitimate user verifies in under 90 seconds. The 90-second floor is conservative; if empirical data shows that even the fastest legitimate users (5th-percentile human verify time) take 3 minutes, raise the threshold to 180 seconds after 30 days of data.
Alert route: HIGH (no statistical baseline required; single-event signature). File GH issue with type:security. Escalate to security-agent for session analysis.
Detection 2 — Swap + Immediate Verify (Hybrid flow, HIGH)
What it catches: Swap-window race attack (C3). Adversary calls /swap-primary and then verifies before the legitimate primary user does, electing themselves the surviving account.
Signal: merge.swap_primary_requested.seconds_since_initiation < 300 AND merge.code_verified (verifying_account_role = role that requested swap) arrives within 60 seconds of the swap event on the same merge_id.
Telemetry: customer_audit_events correlated on merge_id across merge.swap_primary_requested and merge.code_verified.
False positive rate: Low. A legitimate customer who wants to swap would: read the email, decide they prefer the other account, call CS or navigate to the swap endpoint — this takes minutes. A sub-5-minute initiation-to-swap-to-verify sequence is adversarial in shape. Estimated FP rate: <2% of legitimate hybrid merges based on expected human task-switching time.
Alert route: HIGH. Flag merge record for manual CS hold before completion is allowed to proceed.
Note: This detection only exists in Flow 1. If D1 resolves to Flow 2 or Flow 3, this detection is unnecessary.
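The correlation for Detection 2 can be sketched over already-fetched audit events. This is an illustration only: events are assumed to be dicts carrying the action name plus the allowed fields from Section 4, and the swap-to-verify gap is derived from the seconds_since_initiation field present on both events.

```python
SWAP = "merge.swap_primary_requested"
VERIFY = "merge.code_verified"

def swap_race_merge_ids(events, swap_max_s=300, verify_gap_s=60):
    """Return merge_ids where a swap within swap_max_s of initiation was
    followed within verify_gap_s by a verify from the same account role."""
    swaps = {}  # merge_id -> (requesting role, seconds since initiation)
    for ev in events:
        if ev["action"] == SWAP and ev["seconds_since_initiation"] < swap_max_s:
            swaps[ev["merge_id"]] = (
                ev["requesting_account_role"],
                ev["seconds_since_initiation"],
            )
    flagged = set()
    for ev in events:
        if ev["action"] != VERIFY or ev["merge_id"] not in swaps:
            continue
        role, swap_t = swaps[ev["merge_id"]]
        gap = ev["seconds_since_initiation"] - swap_t
        if ev["verifying_account_role"] == role and 0 <= gap <= verify_gap_s:
            flagged.add(ev["merge_id"])
    return flagged
```

The events argument must be a list (it is scanned twice). Because the Hybrid flow allows a single swap per merge, one swap entry per merge_id is sufficient.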
Detection 3 — Per-CS-User Merge Initiation Anomaly (HIGH)
What it catches: Compromised CS credentials (A3), insider-threat bulk initiation, or social-engineering campaigns targeting CS (A1).
Signal: Count of merge.initiated events per cs_actor_hash in any 24-hour window exceeds the 99.9th percentile of that user's Poisson-modeled initiation rate. Pre-baseline (first 30 days or fewer than 10 observations per user): flag at MEDIUM if count > 3 in 24h, HIGH if count > 5 in 24h.
Telemetry: console_audit_events (CS actor writes), keyed on the cs_actor_hash payload field.
False positive rate: Moderate before baseline establishes. A CS user handling a backlog of merge tickets could hit 3 in a day legitimately. The 30-day baseline maturation period is necessary.
Alert route: HIGH. Investigate CS session for credential compromise. Escalate to security-agent if more than one CS user spikes simultaneously.
Detection 4 — Cross-Session IP-Distance Anomaly (HIGH)
What it catches: Two different actors verifying on behalf of two accounts (C1). One inbox compromised; the other is the legitimate user.
Signal: the request_asn values on the merge.code_verified events with verification_sequence = "first" and verification_sequence = "second" belong to ASNs registered in different countries AND neither is a known VPN/CDN ASN. Threshold: the ASN registration country codes differ.
Telemetry: customer_audit_events correlated on merge_id, comparing ASN fields between the two verification events.
False positive rate: Moderate. Legitimate multi-device users (mobile + laptop, work + home) will trigger this. Requires manual review, not auto-block. Estimated FP rate: 15–25% of legitimate merges where both parties verify on different networks.
Alert route: MEDIUM. Include in ops@ daily digest. Manual CS review of the merge record.
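A minimal sketch of the Detection 4 comparison, assuming an ASN-to-country mapping and a VPN/CDN ASN list are supplied by the caller. Both inputs and the function name are hypothetical; the memo does not say where this data comes from, and treating an unresolvable ASN as "do not flag" is a choice made here.

```python
def cross_session_asn_flag(first_asn, second_asn, asn_country, vpn_asns):
    """True when the two verifications come from ASNs in different countries.

    asn_country: dict mapping ASN -> registration country code.
    vpn_asns: set of ASNs known to be VPN/CDN exits.
    """
    if first_asn in vpn_asns or second_asn in vpn_asns:
        return False  # VPN/CDN exits are not a geolocation anchor
    c1 = asn_country.get(first_asn)
    c2 = asn_country.get(second_asn)
    if c1 is None or c2 is None:
        return False  # unknown ASN: no basis for a country comparison
    return c1 != c2
```

Usage with documentation-range ASNs: cross_session_asn_flag(64496, 64497, {64496: "DE", 64497: "US"}, set()) returns True.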
Detection 5 — Post-Merge New-ASN Session on Primary Account (HIGH)
What it catches: Successful account-takeover that completed the merge. The adversary authenticated as the new surviving account from an ASN absent from the primary's session history.
Signal: merge.post_merge_session_new_ip.is_new_asn = true within 2 hours of merge.completed.
Telemetry: customer_audit_events with action = "merge.post_merge_session_new_ip", written by session middleware. Requires session middleware instrumentation (new infra — see Section 7).
False positive rate: Moderate. Customers do sometimes log in from new locations post-merge (they were notified by email; they may open it from a mobile device on a different network). Context: if the new ASN is also the same ASN that verified one of the codes, the confidence is higher. Estimate 20–30% FP rate on new-ASN alone; drops to ~5% when combined with detection 1 or 2 on the same merge_id.
Alert route: HIGH when correlated with detection 1 or 2 on same merge_id. MEDIUM standalone.
Detection 6 — Per-CS-User Reversal Rate Spike (MEDIUM)
What it catches: CS-facilitated extract-and-revert (E2), or systematic mis-merge pattern by a specific CS agent.
Signal: (reversed_count / total_completions) per cs_actor_hash over any 14-day rolling window > 0.30 with at least 3 completions.
Telemetry: customer_audit_events aggregated on merge.reversed.cs_actor_hash and merge.completed events.
False positive rate: Low once volumes are non-trivial. Pre-launch: this signal is informational only (any single reversal is 100% of that user's ratio). Promote to HIGH-severity after 30-day baseline.
Alert route: MEDIUM in ops@ digest. Escalate to HIGH if the reversing CS user is the same one who flagged in detection 3.
Detection 7 — High-Resend with No Verify (MEDIUM)
What it catches: Deliberate merge stall (B2) — adversary or confused customer keeps requesting resends to hold the merge open, possibly waiting for a window.
Signal: primary_resend_count + secondary_resend_count >= 4 on a merge record that has been in initiated status for more than 12 hours.
Telemetry: account_merges table direct query (no audit event needed — the resend counts are on the row). Can be run as a nightly SQL query against Postgres.
False positive rate: Low. A merge open for 12 hours with 4+ resends but no verification indicates either an unreachable customer or an actor deliberately cycling codes.
Alert route: MEDIUM. CS notification to call the customer directly.
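The nightly query for Detection 7 can be sketched as below. To stay self-contained it runs against an in-memory SQLite connection and assumes initiated_at is stored as a UTC ISO-8601 string; against Postgres the same WHERE clause would apply with the driver's own placeholder style. The id and initiated_at column names on account_merges are inferred from the memo, not confirmed against the schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

STALLED_SQL = """
SELECT id
FROM account_merges
WHERE status = 'initiated'
  AND primary_resend_count + secondary_resend_count >= 4
  AND initiated_at < ?
ORDER BY id
"""

def stalled_merge_ids(conn, now):
    """Merges still in 'initiated' after 12h with 4 or more resends."""
    cutoff = (now - timedelta(hours=12)).isoformat()
    return [row[0] for row in conn.execute(STALLED_SQL, (cutoff,))]
```

Passing the cutoff as a parameter keeps the time arithmetic in application code, so the SQL itself carries no database-specific interval syntax.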
Detection 8 — Short-Lifecycle Merge (Complete-then-Reverse within 2h, HIGH)
What it catches: Reconnaissance pattern (E1) — adversary triggers merge to access primary's data, then reverses to restore the victim and avoid detection.
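Signal: merge.reversed.days_since_completion < 1 AND merge.completed.duration_seconds < 3600. Per Section 4, duration_seconds measures the time from verified to completed, so the second condition means the merge completed within 1 hour of verification. Because days_since_completion is an integer number of days, this is a proxy for "completed and reversed on the same day before anyone noticed"; enforcing the 2-hour bound exactly requires comparing the timestamps of the merge.completed and merge.reversed events.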
Telemetry: customer_audit_events correlated on merge_id between merge.completed and merge.reversed.
False positive rate: Low. Legitimate reversals happen when customers call in to complain (takes hours to days). Sub-2-hour complete-and-reverse has no legitimate scenario.
Alert route: HIGH. File GH issue type:security. Escalate to security-agent for full session replay on the primary account during the merge window.
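A timestamp-based version of the Detection 8 correlation can be sketched as follows. It assumes each audit event exposes an occurred_at datetime alongside its action and merge_id; that field name is hypothetical and should be replaced with whatever timestamp column customer_audit_events actually carries.

```python
from datetime import datetime, timedelta

def short_lifecycle_merge_ids(events, max_gap=timedelta(hours=2)):
    """Return merge_ids reversed less than max_gap after completion."""
    completed_at = {}
    for ev in events:
        if ev["action"] == "merge.completed":
            completed_at[ev["merge_id"]] = ev["occurred_at"]
    flagged = set()
    for ev in events:
        if ev["action"] != "merge.reversed":
            continue
        done = completed_at.get(ev["merge_id"])
        # Only count reversals that follow a recorded completion.
        if done is not None and timedelta(0) <= ev["occurred_at"] - done < max_gap:
            flagged.add(ev["merge_id"])
    return flagged
```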
7. New Monitoring Infrastructure Required
The following detection signals cannot be computed from the existing customer_audit_events schema and require either schema additions (covered in Section 4) or new instrumentation that does not currently exist.
7.1 Session Middleware Post-Merge Hook (New Infra)
Detection 5 (merge.post_merge_session_new_ip) requires session middleware to:
1. Know when a user_id corresponds to a recently-merged primary account (check account_merges where primary_user_id = session.user_id and merge_completed_at > now() - 48h).
2. Compare the incoming session IP's ASN against a 30-day rolling cache of ASNs for that user_id.
3. Write a merge.post_merge_session_new_ip audit event if the ASN is new.
This is instrumentation that does not exist today. It requires:
- A 30-day ASN history cache per user_id (can be a lightweight Postgres query on customer_sessions if session IP or ASN is stored there — verify with sre-agent whether customer_sessions carries IP).
- An IP-to-ASN lookup at session validation time. The architecture doc does not reference an ASN lookup library in Raptor today.
Recommendation to sre-agent: before this detection can produce signal, confirm whether customer_sessions stores the requesting IP and add ASN resolution if not.
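The three middleware steps above can be sketched as a pure function that returns the audit payload to write, or None. This is a sketch under stated assumptions: the function name and its inputs are hypothetical, the merge lookup and the 30-day ASN history are assumed to be resolved by the caller, and the actual audit write is left out.

```python
from datetime import datetime, timedelta

def post_merge_new_asn_event(merge_id, user_id, merge_completed_at,
                             session_at, session_ip_class, session_asn,
                             asn_history_30d, lookback=timedelta(hours=48)):
    """Build a merge.post_merge_session_new_ip payload, or return None.

    asn_history_30d: set of ASNs seen for user_id in the 30 days pre-merge.
    """
    elapsed = session_at - merge_completed_at
    if not (timedelta(0) <= elapsed <= lookback):
        return None  # step 1: not a recently merged primary
    if session_asn in asn_history_30d:
        return None  # step 2: ASN already known for this user
    return {          # step 3: payload using the Section 4 allowed fields
        "action": "merge.post_merge_session_new_ip",
        "merge_id": merge_id,
        "user_id": user_id,
        "session_ip_class": session_ip_class,
        "session_asn": session_asn,
        "is_new_asn": True,
        "hours_since_completion": int(elapsed.total_seconds() // 3600),
    }
```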
7.2 Postmark Delivery Event Merge-Classification (Existing Table, New Query)
Detection of Postmark bounce/delivery anomalies on merge emails (B3) requires that postmark_delivery_events carries a template_id or tag field that identifies the email as merge-related. The architecture doc specifies three merge email templates (merge-initiated, merge-completed, merge-failed). The detection can be built today if those template names are consistently applied as Postmark message stream tags when the emails are sent.
No new table needed; just consistent tagging. This is a feature-developer instrumentation requirement, not a new detection table.
7.3 IP-to-ASN Resolution (Possible New Dependency)
Detection 4 requires ASN comparison between two verification events. If Raptor does not currently resolve ASN at request time, a lightweight resolution call is needed (e.g., a local MaxMind GeoLite2-ASN lookup). This is in-scope for sre-agent to evaluate.
7.4 console_audit_events Merge Action Allowlist
All CS-side merge events need entries in the console-side action allowlist (the console app has its own equivalent of audit_action_allowlist.yaml). This is a same-PR requirement for the merge feature implementation, not a separate infrastructure project.
8. What Fits Existing Infrastructure vs. What Requires New Instrumentation
| Detection | Existing infra sufficient? | What's missing |
|---|---|---|
| Sub-90s verify (D1) | Yes, once audit events land | merge.code_verified event with seconds_since_initiation field |
| Swap + immediate verify (D2) | Yes, once audit events land | merge.swap_primary_requested + merge.code_verified events |
| Per-CS-user rate anomaly (D3) | Yes, via console_audit_events | merge.initiated event with cs_actor_hash in console audit log |
| Cross-session IP-distance (D4) | Partially — need ASN resolution | ASN field in merge.code_verified; IP-to-ASN resolver in Raptor |
| Post-merge new-ASN session (D5) | No | Session middleware post-merge hook; ASN history cache |
| Per-CS reversal rate (D6) | Yes, once audit events land | merge.reversed event with cs_actor_hash |
| High-resend no-verify (D7) | Yes, via direct account_merges query | None (query existing resend_count columns) |
| Short-lifecycle merge (D8) | Yes, once audit events land | merge.completed + merge.reversed events with timestamps |
9. Operator-Action Checklist
The following items must be in place before any merge detection can produce signal. These are instrumentation requirements, not optional improvements.
- All merge.* action namespaces in Section 4 must be registered in backend_v2/api/audit_action_allowlist.yaml before the merge feature ships. Without them, the audit writer service will reject every merge event with 422.
- The same action namespaces must be added to the console-side audit allowlist for operator_interaction-dimensioned events.
- Postmark merge email sends must apply consistent message stream tags (merge-initiated, merge-completed, merge-failed) so that postmark_delivery_events is queryable by merge context.
- The feature-developer implementing the merge engine must confirm that every status transition in the state machine produces a call to write_customer_audit_event with the correct action namespace before closing the implementation PR.
- sre-agent must confirm whether customer_sessions stores the requesting IP (or ASN) before Detection 5 can be operationalized. This should be a pre-launch checkpoint, not a post-launch retrofit.
- The IP-to-ASN resolver dependency (Detection 4 and 5) needs an operator decision: add MaxMind GeoLite2-ASN as a Raptor dependency (free, offline, ~10MB) or resolve at event-query time against a public ASN API. The offline approach is strongly preferred for a financial-data SaaS (no log exfiltration via lookup calls).
10. Signals That Require Security-Agent Input Before Operationalizing
The following proposed detections depend on the security-agent's threat-model output and should not be operationalized without their companion analysis:
- DSR-block bypass attempt. If an actor attempts to initiate a merge on a customer with a pending DSR and the block fires correctly, detection can verify the structural control worked. But the question of whether the DSR block is robust to race conditions is a security-agent question, not a detection question.
- Passkey post-merge rebinding anomaly. Detecting whether a newly-rebound passkey is being used suspiciously requires understanding the attacker's full control path, which is security-agent scope.
- Billing-credit abuse. The D4 decision on subscription extension/refund creates a potential abuse vector (merge, extract the subscription extension, reverse the merge). This is in scope for the fraud-detection catalog but requires security-agent to model the value of the credit vs. the effort of the attack before a detection threshold makes sense.
This memo is detection-design analysis. It does not constitute a code change, a rule deployment, or a D1 recommendation. Per feedback_deterministic_execution_ai_augments, all detections described here are statistical signals for human review — no automated reversal or account action follows from any of these detections firing.