Raxx · internal docs

internal · gated

Account Merge D1 — Detection Design Memo

Date: 2026-06-05 UTC
Status: Design-phase memo, awaiting operator D1 decision
Author: detection-engineer agent
Scope: Behavioral-detection design pass on the D1 (who picks primary) decision for account-merge.
Cross-reference: docs/architecture/account-merge-2026-06-05.md | security-agent threat-model memo: TBD (filing in parallel).
Epic: #3245 | PR: #3256

This memo surfaces what each D1 candidate flow opens up for an attacker and what detection signals must exist to observe those behaviors. It does not pick a winner. The operator and security-agent own the final posture call.


1. Flow Summaries (Detection Lens)

Flow 1 — Hybrid (CS nominates, customer can request one swap before first code is consumed)

The flow has two distinct windows:
- Pre-verification swap window. Between POST /internal/merges and the moment the first code is verified, the customer can call POST /merges/{id}/swap-primary. This window has no cryptographic constraint other than session auth: the caller only needs to be authenticated to either account.
- Post-first-verification lock. Once primary_verified_at or secondary_verified_at is set, the primary designation is frozen.

Attack surface the swap window opens:
- An adversary who has compromised only one inbox (say, the secondary account's email) can observe the merge-initiation email arriving in that inbox and call /swap-primary before verifying anything. If they can authenticate as the secondary account, they can elect themselves the survivor. This is the structural risk that security-agent's threat model will quantify; detection's job is to make the call observable.
- The swap operation currently has no separate audit footprint beyond the primary_swap_requested boolean on account_merges. Detection needs a dedicated event so timing and actor context survive.

CS-side risk:
- CS nominates the wrong pair of accounts (autocomplete error, ticket misread). This produces a merge that one or both customers reject. Detectable via early cancellations and rapid reversals attributed to specific CS users.

Flow 2 — CS-only (CS picks, customer has no mechanism to dispute before verification)

Tightest control surface:
- No swap-window attack surface. The only path to changing the primary is creating a new merge record, which requires CS action and produces an audit event.
- Social-engineering risk is fully concentrated in the CS staff: an adversary tricks a CS agent into initiating a merge with the adversary's account as primary. One phone call or spoofed support email is sufficient.
- Highest expected reversal rate (customers complain after the fact rather than having an up-front voice). Reversal clustering by CS user is the primary anomaly signal.

Flow 3 — Customer-choice during verification

Customer designates primary by which account they verify first (or by an explicit selection UI).
- Highest adversary flexibility: if an adversary controls one inbox, they verify first and claim primary. The cross-verification requirement still forces the adversary to also control the other inbox or social-engineer a real user into verifying, but the primary election is still decided at the earliest possible moment.
- Merge-initiation patterns remain CS-gated, which preserves that seam. But the primary election seam is entirely in the verification sequence, which is customer-facing and has a much wider attack surface than a CS UI.
- Detection is harder: distinguishing a legitimate customer choosing their preferred primary from an adversary racing to verify first requires IP-distance and session-fingerprint correlation that isn't needed in the other flows.


2. Attacker Behavior Catalog by Stage

Stage A — Merge Initiation (POST /internal/merges)

This stage is CS-only in all three flows. It is the earliest choke point.

A1: Social-engineered CS initiation. Adversary contacts support claiming to own both accounts (or claiming account recovery). CS initiates a merge that maps a legitimate account to one the adversary controls.
- Observable: the merge record has at least one user_id with no prior ticket history with the initiating CS agent. Cross-referencing freescout_ticket_cache for both primary_user_id and secondary_user_id against the freescout_ticket_id on the merge record surfaces this.
- Observable: the CS user's merge-initiation rate deviates from their personal baseline.

A2: Autocomplete or copy-paste error by CS. No adversarial intent; CS selects the wrong account. Behavioral signature is indistinguishable from A1 until reversal.
- Observable: merge is cancelled before any verification, or reversed within 24h. These "early turnaround" merges cluster on specific CS users.

A3: Bulk scripted initiation (insider or API key compromise). An adversary with the customers:merge:write role (or a compromised service token) fires many initiations in a short window.
- Observable: merge-initiation count per 5-minute window deviates from the Poisson baseline. With zero live merges today the baseline is zero; the first signal fires immediately and by design.

Stage B — Code Generation (Postmark sends codes to both addresses)

B1: Inbox compromise (code intercept). Adversary has read access to one inbox and is waiting for the code. They verify immediately after delivery.
- Observable: time-to-verify is anomalously short (verification under 90 seconds after code delivery, the threshold used in Section 3.5). In a legitimate scenario the customer reads, processes, and then logs in to verify, which takes minutes to hours. A bot intercept happens in seconds.
- Observable: verification from an IP address geographically distant from the account's historical session IP cluster.

B2: Postmark delivery failure exploited. Adversary triggers repeated resends (up to 5 allowed per account) to keep the merge in initiated state indefinitely, preventing legitimate resolution.
- Observable: resend count approaches the maximum for a merge record. High-resend merges with no verification after 12h are anomalous.

B3: Postmark signature failures on merge emails. If Postmark webhook signature validation failures spike on the merge.initiated template, someone may be spoofing delivery events to confuse state.
- Observable: signature failure rate on merge-class emails via postmark_delivery_events.

Stage C — Code Entry (POST /merges/{id}/verify)

C1: Cross-session verification (two browsers, two IPs in the same window). Both verifications arrive within a short time window from geographically separated IPs. This can be legitimate (a customer on mobile + desktop), but it is also the canonical cross-account-compromise pattern: the adversary verifies from their own network while the real customer verifies from theirs.
- Observable: IP-to-IP geolocation distance between the primary_verified_at session and the secondary_verified_at session exceeds the distribution of same-user session distances for this account.

C2: Brute-force of 8-char code. The rate limiter (10 attempts per merge record) prevents this structurally, but a detection rule watching for repeated 400/409 responses from /merges/{id}/verify confirms the structural control is firing correctly.
- Observable: 4+ failed verify attempts on any merge record within 60 minutes.

C3: Swap-window race (Hybrid only). Adversary with a session on the secondary account calls /swap-primary, then immediately calls /verify before the legitimate primary user does. The swap + fast verify sequence in under 60 seconds is the signature.
- Observable: primary_swap_requested = TRUE on a merge record where secondary_verified_at precedes primary_verified_at AND both verifications occurred within 5 minutes of merge initiation.

C4: Re-use of consumed code (replay). The architecture uses used_at to block this structurally. Detection confirms the structural control: any 409 on /verify for an already-consumed code is an instrumentation point, not a vulnerability.
- Observable: 409 responses on the verify route per merge_id.

Stage D — Completion

D1: Post-merge session anomaly. The surviving account (primary) suddenly exhibits authentication patterns inconsistent with its own history: new device, new IP region, different time-of-day pattern.
- Observable: first post-merge session for the primary user_id arrives from an IP class or geolocation not present in the prior 30d session history.
- Observable: a passkey that was never used before the merge (came from secondary) is the authenticating credential on the first post-merge login.

D2: High-value data exfiltration after merge. An adversary who merged to gain access to the primary's strategies and backtest history starts exporting or reading data at abnormally high rates.
- Observable: strategy read + backtest read event rate for primary_user_id in the 24h post-merge vs. the 30d pre-merge baseline for that user.

Stage E — Reversal (POST /internal/merges/{id}/reverse)

E1: CS uses reversal to cycle account ownership. Adversary convinces CS to merge, extracts data, then has CS reverse the merge to restore the victim. This is a reconnaissance pattern.
- Observable: a merge record is completed and then reversed within 2h. Such short merge lifecycles, well inside the reversal window, are suspicious.

E2: CS user with disproportionate reversal rate. A compromised or coerced CS agent completes merge requests and then reverses them after adversary extraction. Reversal rate per CS user over a 14-day window (the window used in Section 3.4 and Detection 6) is the signal.
- Observable: reversal_count / completion_count per initiated_by_cs hash over 14d exceeds 0.3 (30% reversal rate against a baseline expected to be <5%).


3. Cardinality and Drift Detectors

3.1 Per-CS-User Merge-Rate Poisson Anomaly

Method: Per-initiated_by_cs hash, count merge initiations per 24-hour window. Model as Poisson with lambda estimated from trailing 30-day history. Flag when observed count exceeds the 99.9th percentile of the Poisson distribution.

Pre-launch state: No baseline exists. Seed lambda = 0.1 initiations/day (expectation is occasional organic requests). First merge by any CS user fires at HIGH; re-baseline after 30 days of observed activity.

Minimum sample: 10 observed events per CS user before the statistical threshold governs; below that, every merge initiation fires at MEDIUM.
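
The threshold rule in this section can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the production detector: the function names are hypothetical, and lambda is assumed to be a per-day rate already estimated from the trailing 30-day history.

```python
import math


def poisson_quantile(lam: float, q: float = 0.999) -> int:
    """Return the smallest k with P(X <= k) >= q for X ~ Poisson(lam)."""
    # lam is capped because exp(-lam) underflows to 0.0 for very large rates,
    # which would make the loop below never terminate.
    if not 0 <= lam <= 700 or not 0 < q < 1:
        raise ValueError("lam must be in [0, 700] and q must be in (0, 1)")
    k = 0
    pmf = math.exp(-lam)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= lam / k  # P(X = k) derived from P(X = k - 1)
        cdf += pmf
    return k


def exceeds_poisson_threshold(observed: int, lam: float, q: float = 0.999) -> bool:
    """True when the observed 24-hour count is above the q-quantile."""
    return observed > poisson_quantile(lam, q)
```

With the seed lambda of 0.1, poisson_quantile(0.1) evaluates to 2, so the statistical rule on its own would only fire on a third initiation in a day. That is why the fixed pre-baseline rule has to carry the signal until enough events have been observed.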

3.2 Per-Customer-Pair Geolocation Distance Distribution

Method: For each merge record, compute the great-circle distance between:
- The primary account's median session IP geolocation over the trailing 30 days.
- The secondary account's median session IP geolocation over the trailing 30 days.

Build a running distribution of this distance across all completed merges. Flag records where the distance exceeds the 95th percentile of the distribution AND both accounts' primary session IPs have no overlap in their IP-to-ASN mapping (i.e., they don't share a residential ISP, CDN, or corporate prefix).

Note on operator VPN: Per user_uses_vpn, IPs from Datacamp/CDN77 and similar prefixes do not constitute a true geolocation anchor. The IP-distance check must exclude CDN/VPN ASNs from the median computation.

Pre-launch state: Baseline window is empty. Seed with a 1000-km flag threshold as an absolute until the distribution has 20+ samples.
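
A minimal sketch of the distance check, assuming each account's median geolocation has already been reduced to a (latitude, longitude) pair in degrees with VPN/CDN ASNs excluded upstream. The function names and the nearest-rank percentile choice are assumptions for illustration; the ASN-overlap condition from the method is deliberately left out to keep the example short.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
SEED_THRESHOLD_KM = 1000.0   # absolute pre-launch threshold from this section
MIN_SAMPLES = 20             # the distribution governs once 20+ samples exist


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # max() guards against a tiny negative value from float rounding.
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(max(0.0, 1 - a)))


def distance_exceeds_threshold(distance_km, completed_distances_km):
    """Apply the seed threshold until enough completed merges exist."""
    if len(completed_distances_km) < MIN_SAMPLES:
        return distance_km > SEED_THRESHOLD_KM
    ordered = sorted(completed_distances_km)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n), nearest-rank method
    return distance_km > ordered[rank - 1]
```

Integer arithmetic is used for the percentile rank so that the cutoff does not depend on floating-point rounding of 0.95 * n.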

3.3 Merge Initiation Time-Bunching

Method: Compute the inter-arrival time between successive merge initiations (all CS users combined). Model as an exponential distribution (Poisson process). Flag when 3+ initiations occur within a 5-minute window. A legitimate CS operation is a per-ticket manual action; 3 in 5 minutes is anomalous for manual work.

Baseline: Initiation rate expected to be 1–3 per day at launch. Burst detector is absolute (3 in 5 minutes) until the empirical inter-arrival rate has been observed for 30 days, then switch to the 99.9th percentile of the exponential distribution.
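
The absolute burst rule (3 or more initiations within 5 minutes) can be sketched as a sliding window over initiation timestamps. The function name is hypothetical, and the input is assumed to be the timestamps of merge.initiated events across all CS users.

```python
from collections import deque
from datetime import datetime, timedelta


def find_bursts(timestamps, window=timedelta(minutes=5), min_count=3):
    """Return each timestamp at which min_count events fall inside the window."""
    recent = deque()
    hits = []
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that are older than the window relative to this one.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= min_count:
            hits.append(ts)
    return hits
```

Events exactly 5 minutes apart are treated as inside the window here; whether the boundary should be inclusive is a choice the memo does not specify.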

3.4 Reversal Rate as Leading Mis-Merge Indicator

Method: Weekly rolling ratio: reversed_count / (reversed_count + completed_count) across all merges. Alert at:
- MEDIUM: ratio > 0.10 (10%)
- HIGH: ratio > 0.25 (25%)

Per-CS-user dimension: Same ratio computed per initiated_by_cs hash. Flag any individual CS user at HIGH if their personal reversal ratio > 0.30 over any 14-day window with at least 3 completions.
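
A minimal sketch of the ratio and its thresholds. The function names are hypothetical. It assumes completed_count counts merges that are still completed (not reversed), and it reads "at least 3 completions" as reversed plus completed, since every reversed merge was completed first; both readings are assumptions.

```python
def reversal_ratio(reversed_count, completed_count):
    """reversed / (reversed + completed); 0.0 when there is no data."""
    total = reversed_count + completed_count
    return reversed_count / total if total else 0.0


def global_severity(ratio):
    """Map the weekly ratio across all merges to an alert level, or None."""
    if ratio > 0.25:
        return "HIGH"
    if ratio > 0.10:
        return "MEDIUM"
    return None


def cs_user_flagged(reversed_count, completed_count, min_completions=3):
    """Per-CS-user rule: ratio above 0.30 with enough completions."""
    if reversed_count + completed_count < min_completions:
        return False
    return reversal_ratio(reversed_count, completed_count) > 0.30
```

The thresholds are strict inequalities, matching the "ratio > 0.10" wording, so a ratio of exactly 10% does not alert.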

3.5 Time-to-Verify Distribution

Method: For each merge record, compute min(primary_verified_at, secondary_verified_at) - initiated_at. This is the time from code dispatch to first verification. Build a distribution across completed and reversed merges. Flag records where time-to-first-verify < 90 seconds (absolute; no statistical baseline needed — legitimate humans cannot open an email client, read a code, navigate to the app, and verify in under 90 seconds).
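
One detail the formula hides is that either verified_at column may still be NULL when the check runs. A minimal sketch, assuming the three columns arrive as datetime values or None; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FAST_VERIFY_SECONDS = 90  # absolute threshold from this section


def seconds_to_first_verify(
    initiated_at: datetime,
    primary_verified_at: Optional[datetime],
    secondary_verified_at: Optional[datetime],
) -> Optional[float]:
    """Seconds from initiation to the earliest verification, or None."""
    verified = [t for t in (primary_verified_at, secondary_verified_at) if t is not None]
    if not verified:
        return None  # nothing verified yet, so there is no signal
    return (min(verified) - initiated_at).total_seconds()


def is_fast_verify(initiated_at, primary_verified_at, secondary_verified_at) -> bool:
    """True when the first verification arrived in under 90 seconds."""
    elapsed = seconds_to_first_verify(initiated_at, primary_verified_at, secondary_verified_at)
    return elapsed is not None and elapsed < FAST_VERIFY_SECONDS
```

Filtering out None before calling min() matters because comparing a datetime with None raises a TypeError in Python.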

K-S test complement: Monthly, run a Kolmogorov-Smirnov test comparing the current month's time-to-verify distribution against the prior 3-month baseline. A shift toward shorter times (D-statistic > 0.2 at alpha=0.05) is a signal that verification behavior is changing — possibly automated.
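
A dependency-free sketch of the monthly comparison. It computes the two-sample D-statistic directly and compares it with the usual large-sample critical value c(alpha) * sqrt((n + m) / (n * m)), using c(0.05) of about 1.358. The function names are hypothetical, both samples are assumed non-empty, and using the medians to decide the direction of the shift is an assumption added here rather than something the memo specifies.

```python
import math
from bisect import bisect_right
from statistics import median


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def shifted_toward_shorter(current, baseline, d_min=0.2, c_alpha=1.358):
    """True when D clears both cutoffs and the current month is faster."""
    n, m = len(current), len(baseline)
    d = ks_statistic(current, baseline)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return d > max(d_min, critical) and median(current) < median(baseline)
```

With small monthly samples the critical value exceeds 0.2, so the fixed D > 0.2 cutoff only governs once both samples are reasonably large.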


4. Required Instrumentation — customer_audit_events Additions

The architecture doc specifies that every account_merges status transition produces an entry in customer_audit_events. That is necessary but not sufficient for detection. The following additional events are required.

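Events use dimension = "operator_interaction" for CS-actor events, dimension = "customer_self" for customer-actor events, and dimension = "system_automated" for events written by the merge engine or session middleware (merge.completed, merge.failed, merge.post_merge_session_new_ip). The ticket_id field on each event must be populated from account_merges.freescout_ticket_id.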

New action namespaces required (additions to audit_action_allowlist.yaml)

merge.initiated
  allowed_fields:
    - merge_id                   # account_merges.id (bigint, not user PII)
    - primary_user_id            # bigint
    - secondary_user_id          # bigint
    - cs_actor_hash              # SHA-256 of initiating CS email — same convention as initiated_by_cs column
    - freescout_ticket_id        # string or null
    - dsr_block_checked          # boolean — was DSR check performed
  dimension: operator_interaction

merge.cancelled
  allowed_fields:
    - merge_id
    - cancelled_by               # "cs" | "system" (expiry)
    - cs_actor_hash              # null if system-cancelled
    - reason                     # free text, operator-supplied
  dimension: operator_interaction

merge.code_verified
  allowed_fields:
    - merge_id
    - verifying_account_role     # "primary" | "secondary" — which account completed THIS verification
    - verification_sequence      # "first" | "second" — was this the 1st or 2nd code consumed?
    - request_ip_class           # /24 prefix only — not full IP (PII floor)
    - request_asn                # autonomous system number of the verifying IP
    - seconds_since_initiation   # integer — time from merge.initiated to this event
  dimension: customer_self

merge.code_verify_failed
  allowed_fields:
    - merge_id
    - verifying_account_role     # "primary" | "secondary"
    - failure_reason             # "wrong_code" | "expired" | "already_consumed" | "rate_limited"
    - attempt_number             # 1..10 (the structural limit)
  dimension: customer_self

merge.resend_requested
  allowed_fields:
    - merge_id
    - account_role               # "primary" | "secondary"
    - resend_sequence            # 1..5 (the structural limit)
  dimension: customer_self

merge.swap_primary_requested
  allowed_fields:
    - merge_id
    - requesting_account_role    # "primary" | "secondary" — which session called /swap-primary
    - request_ip_class           # /24 prefix only
    - request_asn
    - seconds_since_initiation
  dimension: customer_self
  # Hybrid flow only. This event is the primary detection anchor for the swap-window attack.

merge.completed
  allowed_fields:
    - merge_id
    - primary_user_id
    - secondary_user_id
    - tables_migrated_count      # integer — rows re-FK'd
    - billing_action             # "none" | "refund_issued" | "subscription_extended"
    - duration_seconds           # time from verified to completed
  dimension: system_automated

merge.failed
  allowed_fields:
    - merge_id
    - failure_stage              # "pre_flight" | "mid_transaction" | "post_transaction"
    - error_category             # "dsr_block" | "db_error" | "passkey_rebind_error" | "billing_error" | "other"
  dimension: system_automated

merge.reversed
  allowed_fields:
    - merge_id
    - cs_actor_hash
    - days_since_completion      # integer — how deep into the 14-day window
    - reason                     # operator-supplied text
    - rows_restored_count        # integer — rows re-FK'd back to secondary
  dimension: operator_interaction

merge.post_merge_session_new_ip
  allowed_fields:
    - merge_id
    - user_id                    # the primary_user_id post-merge
    - session_ip_class           # /24 prefix
    - session_asn
    - is_new_asn                 # boolean — ASN absent from 30d pre-merge history
    - hours_since_completion
  dimension: system_automated
  # Written by session middleware when the first post-merge session is detected
  # from an IP class not in the primary's 30d history.

Note on IP storage: Full IPs are not stored in after_state (PII floor). The /24 prefix and ASN are sufficient for geolocation-distance detection and carry no single-host PII. This is consistent with how waf_events handles IP cardinality.
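
A minimal sketch of the truncation, using Python's standard ipaddress module. The function name is hypothetical, and the /48 prefix for IPv6 is an assumption, since this memo only specifies /24 for IPv4.

```python
import ipaddress


def ip_class(ip: str) -> str:
    """Reduce a full IP address to its network prefix for audit storage."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48  # /48 for IPv6 is an assumption
    # strict=False lets ip_network mask off the host bits for us.
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))
```

For example, ip_class("203.0.113.77") returns "203.0.113.0/24", so only the prefix ever reaches the request_ip_class or session_ip_class field.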

console_audit_events — CS-side parallel write

Every merge.* event with dimension = "operator_interaction" must also produce a parallel write to console_audit_events so the CS-user anomaly detectors can query on the console side without touching the customer PII tables. The console_audit_events schema already handles this pattern; the merge events need entries in the console-side action allowlist as well.


5. Per-Flow Detection-Surface Comparison

1 = cleanest detection surface (most observable, fewest blind spots)
3 = hardest to monitor (most ambiguous signals, more required infrastructure)

| Criterion | Flow 1: Hybrid | Flow 2: CS-only | Flow 3: Customer-choice |
|---|---|---|---|
| Swap-window attack surface | Present (requires merge.swap_primary_requested event) | Absent | Absent for swap; primary election embedded in verify sequence |
| CS-side observability | Same as CS-only; CS nominates | All actor attribution in CS | CS nominates only; customer election at verify time |
| Verification sequence distinguishability | Medium: swap + verify ordering carries signal | High: no swap; verify sequence is the only customer signal | Low: which-verify-first IS the decision; hard to distinguish legitimate choice from adversarial race |
| Time-to-verify signal | Same power in all flows | Same | Same |
| IP-distance cross-session signal | Same power in all flows | Same | Same; but also needed to classify the election itself |
| Reversal rate per CS user | Same power in all flows | Highest signal (all complaints surface as reversals) | Distributed between reversals and swap recriminations |
| New audit events required | 9 (including swap event) | 8 (no swap event) | 9+ (need election event, not just swap) |
| Overall detection rank | 2 | 1 | 3 |

Flow 2 (CS-only) has the cleanest detection surface because the control surface is entirely on one side (CS), reversal rate is the natural leading indicator of mis-merges, and there is no swap-window event requiring timing analysis. Every anomaly channels through the CS-side actor signals.

Flow 1 (Hybrid) has a richer detection surface than Flow 3, but it adds a swap-window event that requires fast detection (the attack window is open only as long as neither code has been verified). This window is measurable but narrow.

Flow 3 (Customer-choice) is the hardest to monitor because the primary election seam — the highest-value action in the merge — is indistinguishable at detection time from a legitimate customer preference. The only disambiguation is after-the-fact (was the elected primary the one with the adversarial session?), which means detection fires on completion or reversal rather than at the seam. This is a material detection lag.

This ranking is offered as a tiebreaker input. It does not override security-agent's threat-model output or the operator's policy judgment.


6. Ranked High-Signal Detections

Ranked by: (attack severity if undetected) * (signal quality) / (false-positive rate). Pre-launch baselines are sparse; rankings favor detections that fire immediately on clear signatures rather than requiring statistical maturity.

Detection 1 — Sub-90-Second Verification (CRITICAL trigger)

What it catches: Automated inbox compromise (B1). Code intercepted by a mail-scanning bot or by an adversary with active inbox access. Human verification cannot occur in under 90 seconds.

Signal: merge.code_verified.seconds_since_initiation < 90 for either the first or second verification event on a merge record.

Telemetry: customer_audit_events with action = "merge.code_verified".

False positive rate: Expected to be near zero, on the assumption that legitimate users do not verify in under 90 seconds. The 90-second floor is conservative; if empirical data shows that even the fastest 5% of legitimate verifications take 3 minutes or longer, raise the threshold to 180 seconds after 30 days of data.

Alert route: HIGH (no statistical baseline required; single-event signature). File GH issue with type:security. Escalate to security-agent for session analysis.

Detection 2 — Swap + Immediate Verify (Hybrid flow, HIGH)

What it catches: Swap-window race attack (C3). Adversary calls /swap-primary and then verifies before the legitimate primary user does, electing themselves the surviving account.

Signal: merge.swap_primary_requested.seconds_since_initiation < 300 AND merge.code_verified (verifying_account_role = role that requested swap) arrives within 60 seconds of the swap event on the same merge_id.

Telemetry: customer_audit_events correlated on merge_id across merge.swap_primary_requested and merge.code_verified.
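
The correlation can be sketched as a single pass over the two event types. This is an illustration under stated assumptions: events are represented as flat dicts carrying an action key plus the allowlisted fields from Section 4, and the gap between swap and verify is derived from seconds_since_initiation on both events. The function name is hypothetical.

```python
SWAP = "merge.swap_primary_requested"
VERIFIED = "merge.code_verified"


def swap_race_merge_ids(events, swap_window_s=300, verify_gap_s=60):
    """Return merge_ids where an early swap is followed quickly by a same-role verify."""
    relevant = [e for e in events if e["action"] in (SWAP, VERIFIED)]
    # Process in time order; on a tie, handle the swap before the verify.
    ordered = sorted(
        relevant,
        key=lambda e: (e["seconds_since_initiation"], e["action"] != SWAP),
    )
    swaps = {}
    flagged = set()
    for event in ordered:
        merge_id = event["merge_id"]
        elapsed = event["seconds_since_initiation"]
        if event["action"] == SWAP:
            if elapsed < swap_window_s:
                swaps[merge_id] = event
        elif merge_id in swaps:
            swap = swaps[merge_id]
            same_role = event["verifying_account_role"] == swap["requesting_account_role"]
            gap = elapsed - swap["seconds_since_initiation"]
            if same_role and 0 <= gap <= verify_gap_s:
                flagged.add(merge_id)
    return flagged
```

Returning a set of merge_ids keeps the output aligned with the alert route, which places a hold on the merge record rather than on an individual event.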

False positive rate: Low. A legitimate customer who wants to swap would: read the email, decide they prefer the other account, call CS or navigate to the swap endpoint — this takes minutes. A sub-5-minute initiation-to-swap-to-verify sequence is adversarial in shape. Estimated FP rate: <2% of legitimate hybrid merges based on expected human task-switching time.

Alert route: HIGH. Flag merge record for manual CS hold before completion is allowed to proceed.

Note: This detection only exists in Flow 1. If D1 resolves to Flow 2 or Flow 3, this detection is unnecessary.

Detection 3 — Per-CS-User Merge Initiation Anomaly (HIGH)

What it catches: Compromised CS credentials (A3), insider-threat bulk initiation, or social-engineering campaigns targeting CS (A1).

Signal: Count of merge.initiated events per cs_actor_hash in any 24-hour window exceeds the 99.9th percentile of that user's Poisson-modeled initiation rate. Pre-baseline (first 30 days or fewer than 10 observations per user): flag at MEDIUM if count > 3 in 24h, HIGH if count > 5 in 24h.

Telemetry: console_audit_events (CS actor writes), keyed on the cs_actor_hash payload field.

False positive rate: Moderate before baseline establishes. A CS user handling a backlog of merge tickets could hit 3 in a day legitimately. The 30-day baseline maturation period is necessary.

Alert route: HIGH. Investigate CS session for credential compromise. Escalate to security-agent if more than one CS user spikes simultaneously.

Detection 4 — Cross-Session IP-Distance Anomaly (HIGH)

What it catches: Two different actors verifying on behalf of two accounts (C1). One inbox compromised; the other is the legitimate user.

Signal: The request_asn values on the two merge.code_verified events for a merge record (verification_sequence = "first" and verification_sequence = "second") belong to ASNs registered in different countries AND neither is a known VPN/CDN ASN. Threshold: ASN country codes differ.

Telemetry: customer_audit_events correlated on merge_id, comparing ASN fields between the two verification events.
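
A minimal sketch of the comparison, assuming an ASN-to-country mapping and a VPN/CDN exclusion set are available from whichever resolver is chosen. Both are passed in as plain Python collections, and the function name is hypothetical. Treating an unknown country as "no signal" is also an assumption.

```python
def cross_session_asn_mismatch(first_asn, second_asn, asn_country, vpn_cdn_asns):
    """True when the two verifying ASNs map to different countries."""
    # VPN/CDN ASNs are not a reliable location anchor, so they never alert.
    if first_asn in vpn_cdn_asns or second_asn in vpn_cdn_asns:
        return False
    first_country = asn_country.get(first_asn)
    second_country = asn_country.get(second_asn)
    if first_country is None or second_country is None:
        return False  # unknown country: emit no signal rather than guess
    return first_country != second_country
```

A usage example with documentation-range ASNs (64496 to 64511) standing in for real ones:

```python
countries = {64496: "US", 64497: "DE", 64498: "US"}
vpn_asns = frozenset({64499})
cross_session_asn_mismatch(64496, 64497, countries, vpn_asns)  # different countries
cross_session_asn_mismatch(64496, 64498, countries, vpn_asns)  # same country
```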

False positive rate: Moderate. Legitimate multi-device users (mobile + laptop, work + home) will trigger this. Requires manual review, not auto-block. Estimated FP rate: 15–25% of legitimate merges where both parties verify on different networks.

Alert route: MEDIUM. Include in ops@ daily digest. Manual CS review of the merge record.

Detection 5 — Post-Merge New-ASN Session on Primary Account (HIGH)

What it catches: Successful account-takeover that completed the merge. The adversary authenticated as the new surviving account from an ASN absent from the primary's session history.

Signal: merge.post_merge_session_new_ip.is_new_asn = true within 2 hours of merge.completed.

Telemetry: customer_audit_events with action = "merge.post_merge_session_new_ip", written by session middleware. Requires session middleware instrumentation (new infra — see Section 7).

False positive rate: Moderate. Customers do sometimes log in from new locations post-merge (they were notified by email; they may open it from a mobile device on a different network). Context: if the new ASN is also the same ASN that verified one of the codes, the confidence is higher. Estimate 20–30% FP rate on new-ASN alone; drops to ~5% when combined with detection 1 or 2 on the same merge_id.

Alert route: HIGH when correlated with detection 1 or 2 on same merge_id. MEDIUM standalone.

Detection 6 — Per-CS-User Reversal Rate Spike (MEDIUM)

What it catches: CS-facilitated extract-and-revert (E2), or systematic mis-merge pattern by a specific CS agent.

Signal: (reversed_count / total_completions) per cs_actor_hash over any 14-day rolling window > 0.30 with at least 3 completions.

Telemetry: customer_audit_events aggregated on merge.reversed.cs_actor_hash and merge.completed events.

False positive rate: Low once volumes are non-trivial. Pre-launch: this signal is informational only (any single reversal is 100% of that user's ratio). Promote to HIGH-severity after 30-day baseline.

Alert route: MEDIUM in ops@ digest. Escalate to HIGH if the reversing CS user is the same one who flagged in detection 3.

Detection 7 — High-Resend with No Verify (MEDIUM)

What it catches: Deliberate merge stall (B2) — adversary or confused customer keeps requesting resends to hold the merge open, possibly waiting for a window.

Signal: primary_resend_count + secondary_resend_count >= 4 on a merge record that has been in initiated status for more than 12 hours.

Telemetry: account_merges table direct query (no audit event needed — the resend counts are on the row). Can be run as a nightly SQL query against Postgres.
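
The same predicate expressed over already-fetched rows, as a sketch rather than the nightly query itself. Rows are assumed to be dicts keyed by the account_merges column names used in this memo (id, status, initiated_at, primary_resend_count, secondary_resend_count); the function name is hypothetical.

```python
from datetime import datetime, timedelta


def stalled_merge_ids(rows, now, min_resends=4, min_age=timedelta(hours=12)):
    """Return ids of merges stuck in initiated status with many resends."""
    stalled = []
    for row in rows:
        resends = row["primary_resend_count"] + row["secondary_resend_count"]
        old_enough = now - row["initiated_at"] > min_age
        if row["status"] == "initiated" and resends >= min_resends and old_enough:
            stalled.append(row["id"])
    return stalled
```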

False positive rate: Low. A merge open for 12 hours with 4+ resends but no verification indicates either an unreachable customer or an actor deliberately cycling codes.

Alert route: MEDIUM. CS notification to call the customer directly.

Detection 8 — Short-Lifecycle Merge (Complete-then-Reverse within 2h, HIGH)

What it catches: Reconnaissance pattern (E1) — adversary triggers merge to access primary's data, then reverses to restore the victim and avoid detection.

Signal: merge.reversed.days_since_completion < 1 AND merge.completed.duration_seconds < 3600 (per the field definition in Section 4, duration_seconds runs from verified to completed, so this means the merge completed within 1 hour of verification). Proxy for "completed and reversed on the same day before anyone noticed."

Telemetry: customer_audit_events correlated on merge_id between merge.completed and merge.reversed.

False positive rate: Low. Legitimate reversals happen when customers call in to complain (takes hours to days). Sub-2-hour complete-and-reverse has no legitimate scenario.

Alert route: HIGH. File GH issue type:security. Escalate to security-agent for full session replay on the primary account during the merge window.


7. New Monitoring Infrastructure Required

The following detection signals cannot be computed from the existing customer_audit_events schema and require either schema additions (covered in Section 4) or new instrumentation that does not currently exist.

7.1 Session Middleware Post-Merge Hook (New Infra)

Detection 5 (merge.post_merge_session_new_ip) requires session middleware to:
1. Know when a user_id corresponds to a recently-merged primary account (check account_merges where primary_user_id = session.user_id and merge_completed_at > now() - 48h).
2. Compare the incoming session IP's ASN against a 30-day rolling cache of ASNs for that user_id.
3. Write a merge.post_merge_session_new_ip audit event if the ASN is new.

This is instrumentation that does not exist today. It requires:
- A 30-day ASN history cache per user_id (can be a lightweight Postgres query on customer_sessions if session IP or ASN is stored there; verify with sre-agent whether customer_sessions carries IP).
- An IP-to-ASN lookup at session validation time. The architecture doc does not reference an ASN lookup library in Raptor today.
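
The three middleware steps can be sketched as a pure function that returns the audit payload, leaving the actual write and the ASN lookup to the caller. This is an illustration only: the function name is hypothetical, the merge row is assumed to be a dict with id, primary_user_id, and merge_completed_at, and the 30-day ASN history is assumed to arrive as a set.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

POST_MERGE_WINDOW = timedelta(hours=48)


def post_merge_new_asn_event(
    session_user_id: int,
    session_asn: int,
    session_ip_class: str,
    now: datetime,
    merge: dict,
    asn_history_30d: set,
) -> Optional[dict]:
    """Build the audit payload, or return None when no event is warranted."""
    # Step 1: only sessions for a recently merged primary account qualify.
    if merge["primary_user_id"] != session_user_id:
        return None
    since = now - merge["merge_completed_at"]
    if since < timedelta(0) or since >= POST_MERGE_WINDOW:
        return None
    # Step 2: compare the session ASN against the 30-day history.
    if session_asn in asn_history_30d:
        return None
    # Step 3: emit only fields listed in the Section 4 allowlist.
    return {
        "action": "merge.post_merge_session_new_ip",
        "merge_id": merge["id"],
        "user_id": session_user_id,
        "session_ip_class": session_ip_class,
        "session_asn": session_asn,
        "is_new_asn": True,
        "hours_since_completion": int(since.total_seconds() // 3600),
    }
```

Keeping the decision logic free of I/O makes it testable before the questions about customer_sessions and ASN resolution are settled.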

Recommendation to sre-agent: before this detection can produce signal, confirm whether customer_sessions stores the requesting IP and add ASN resolution if not.

7.2 Postmark Delivery Event Merge-Classification (Existing Table, New Query)

Detection of Postmark bounce/delivery anomalies on merge emails (B3) requires that postmark_delivery_events carries a template_id or tag field that identifies the email as merge-related. The architecture doc specifies three merge email templates (merge-initiated, merge-completed, merge-failed). The detection can be built today if those template names are consistently applied as Postmark message stream tags when the emails are sent.

No new table needed; just consistent tagging. This is a feature-developer instrumentation requirement, not a new detection table.

7.3 IP-to-ASN Resolution (Possible New Dependency)

Detection 4 requires ASN comparison between two verification events. If Raptor does not currently resolve ASN at request time, a lightweight resolution call is needed (e.g., a local MaxMind GeoLite2-ASN lookup). This is in-scope for sre-agent to evaluate.

7.4 console_audit_events Merge Action Allowlist

All CS-side merge events need entries in the console-side action allowlist (the console app has its own equivalent of audit_action_allowlist.yaml). This is a same-PR requirement for the merge feature implementation, not a separate infrastructure project.


8. What Fits Existing Infrastructure vs. What Requires New Instrumentation

| Detection | Existing infra sufficient? | What's missing |
|---|---|---|
| Sub-90s verify (D1) | Yes, once audit events land | merge.code_verified event with seconds_since_initiation field |
| Swap + immediate verify (D2) | Yes, once audit events land | merge.swap_primary_requested + merge.code_verified events |
| Per-CS-user rate anomaly (D3) | Yes, via console_audit_events | merge.initiated event with cs_actor_hash in console audit log |
| Cross-session IP-distance (D4) | Partially: need ASN resolution | ASN field in merge.code_verified; IP-to-ASN resolver in Raptor |
| Post-merge new-ASN session (D5) | No | Session middleware post-merge hook; ASN history cache |
| Per-CS reversal rate (D6) | Yes, once audit events land | merge.reversed event with cs_actor_hash |
| High-resend no-verify (D7) | Yes: direct account_merges query | None (query existing resend_count columns) |
| Short-lifecycle merge (D8) | Yes, once audit events land | merge.completed + merge.reversed events with timestamps |

9. Operator-Action Checklist

The following items must be in place before any merge detection can produce signal. These are instrumentation requirements, not optional improvements.

  1. All merge.* action namespaces in Section 4 must be registered in backend_v2/api/audit_action_allowlist.yaml before the merge feature ships. Without them, the audit writer service will reject every merge event with 422.

  2. The same action namespaces must be added to the console-side audit allowlist for operator_interaction-dimensioned events.

  3. Postmark merge email sends must apply consistent message stream tags (merge-initiated, merge-completed, merge-failed) so that postmark_delivery_events is queryable by merge context.

  4. The feature-developer implementing the merge engine must confirm that every status transition in the state machine produces a call to write_customer_audit_event with the correct action namespace before closing the implementation PR.

  5. sre-agent must confirm whether customer_sessions stores the requesting IP (or ASN) before Detection 5 can be operationalized. This should be a pre-launch checkpoint, not a post-launch retrofit.

  6. The IP-to-ASN resolver dependency (Detection 4 and 5) needs an operator decision: add MaxMind GeoLite2-ASN as a Raptor dependency (free, offline, ~10MB) or resolve at event-query time against a public ASN API. The offline approach is strongly preferred for a financial-data SaaS (no log exfiltration via lookup calls).


10. Signals That Require Security-Agent Input Before Operationalizing

The following proposed detections depend on the security-agent's threat-model output and should not be operationalized without their companion analysis:


This memo is detection-design analysis. It does not constitute a code change, a rule deployment, or a D1 recommendation. Per feedback_deterministic_execution_ai_augments, all detections described here are statistical signals for human review — no automated reversal or account action follows from any of these detections firing.