ADR 0022 — Event Log: Append-Only + Hash Chain for Tamper Evidence

Status: Accepted
Date: 2026-04-29
Deciders: software-architect
Refs: workflow-uuid-tracing.md, ADR-0003, ADR-0002


Context

The workflow trace must support non-repudiation: when a user disputes "I never placed that trade," Raxx's answer must be backed by cryptographic evidence, not a mutable log. The event log must also support insider-threat defense: if a Raxx operator attempts to modify or delete trace rows to cover up activity, the modification must be detectable.

Two distinct tamper-evidence mechanisms are in scope:

  1. Append-only enforcement (database-level): ensure events cannot be updated or deleted by the application.
  2. Hash chain (application-level): ensure a missing or modified event is detectable even if the database-level control is bypassed.

An additional mechanism, Ed25519 signatures on system-action events, authenticates the origin of events (was this event really emitted by the MQ-A scheduler, or was it fabricated?).

This ADR records the decisions on each mechanism and the tradeoffs considered.


Decision

1. Append-only: database-level grant restriction

The application database user (raptor_app) is granted INSERT and SELECT on trace_events and trace_workflows only. No UPDATE, no DELETE.

DDL and schema changes are executed by a separate privileged user (raptor_migrations) during migration runs only — never at runtime.

A CI lint step in the migration pipeline checks every new SQL migration file for GRANT UPDATE, GRANT DELETE, DROP TABLE, or TRUNCATE statements targeting trace_events or trace_workflows, and fails the pipeline if any is found.
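
A minimal sketch of what such a lint check could look like, assuming migrations are plain SQL text and that regex matching per statement is good enough for a first pass. The function name find_violations and the patterns are hypothetical, not the pipeline's actual implementation; a production lint would likely use a real SQL parser.

```python
import re

PROTECTED_TABLES = ("trace_events", "trace_workflows")
_TABLES = "|".join(PROTECTED_TABLES)

# Hypothetical patterns. [^;]* keeps each match inside a single statement,
# so a GRANT on one table cannot pair with a table name in the next statement.
FORBIDDEN = [
    re.compile(
        rf"\bGRANT\b[^;]*\b(UPDATE|DELETE)\b[^;]*\bON\b[^;]*\b({_TABLES})\b",
        re.I | re.S,
    ),
    re.compile(rf"\bDROP\s+TABLE\b[^;]*\b({_TABLES})\b", re.I | re.S),
    re.compile(rf"\bTRUNCATE\b[^;]*\b({_TABLES})\b", re.I | re.S),
]


def find_violations(sql: str) -> list[str]:
    """Return the offending statement fragments found in one migration file."""
    return [
        match.group(0).strip()
        for pattern in FORBIDDEN
        for match in pattern.finditer(sql)
    ]
```

A CI wrapper would call find_violations on each new migration file and exit non-zero if the result is non-empty. This sketch does not strip SQL comments and does not catch GRANT ALL, so it is a starting point rather than a complete guard.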

This does not prevent a database superuser or a compromised raptor_migrations credential from modifying data, but it raises the bar: modifying the trace requires either a migration file (which goes through code review) or direct database access (which is logged by Timescale/Postgres and is itself a break-glass event).

2. Hash chain: per-workflow, SHA-256

Each event row stores hash_prev: the SHA-256 hash of the canonical JSON serialization of the previous event in the same workflow's chain.

Canonical serialization: all columns sorted by name alphabetically, values as UTF-8 strings, no whitespace. The first event in a workflow uses hash_prev = SHA-256("genesis:" + workflow_id).
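
The serialization and genesis rules above can be sketched as follows. This is an illustration under stated assumptions, not the shipped code: the function names are hypothetical, an event row is modelled as a dict of column name to value, every value is stringified with str() (so NULL handling is an open choice), and hash_prev is itself one of the serialized columns because the rule says "all columns".

```python
import hashlib
import json
from typing import Optional


def canonical_json(event: dict) -> bytes:
    """Serialize a row with keys sorted by name, string values, no whitespace."""
    # Assumption: str() is the value-to-string rule; the ADR does not specify it.
    as_strings = {name: str(value) for name, value in event.items()}
    text = json.dumps(
        as_strings, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def genesis_hash(workflow_id: str) -> str:
    """hash_prev for the first event in a workflow."""
    return hashlib.sha256(("genesis:" + workflow_id).encode("utf-8")).hexdigest()


def next_hash_prev(previous_event: Optional[dict], workflow_id: str) -> str:
    """Compute the hash_prev value to store on the event being appended."""
    if previous_event is None:
        return genesis_hash(workflow_id)
    return hashlib.sha256(canonical_json(previous_event)).hexdigest()
```

Because keys are sorted before hashing, two writers that build the same row with columns in a different order derive the same hash_prev.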

A nightly integrity checker (jobs/trace_integrity_check.py) re-derives the hash chain for all workflows touched in the last 24 hours and compares against stored hash_prev values. Any mismatch is logged as a severity:critical audit event and triggers the breach-notification pipeline from ADR-0003.

Pre-launch posture: the hash chain is recommended but not required for the 4-week MVP. The nightly checker runs a simpler row-count audit in the MVP phase. Full chain verification ships by GA.

3. Ed25519 signatures on system-action events

System-action events (sys_*) emitted by deterministic subsystems (MQ-A scheduler, Raptor paper-gate, Raptor order-router) carry an Ed25519 signature over their canonical JSON payload.

Each subsystem has its own signing key stored in Infisical. The sig_key_version field on the event row identifies which key signed it, enabling key rotation without invalidating old signatures.

The integrity checker verifies signatures on sys_* events using the public key for the recorded sig_key_version. A sys_* event with an invalid or missing signature is flagged as a tamper event.
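
The lookup-and-flag logic could be sketched as below. The Ed25519 primitive itself is deliberately left out and passed in as a callable, which in practice would wrap whichever Ed25519 library the project uses. Everything here is an assumption for illustration: the function name check_sys_event, the event fields event_type, subsystem, signature, and canonical_payload, the key map shape, and the choice to treat an unknown sig_key_version as a tamper event.

```python
from typing import Callable, Mapping, Tuple

# (public_key, signature, message) -> True if the signature is valid.
# Hypothetical interface; a real implementation would wrap an Ed25519 library.
VerifyFn = Callable[[bytes, bytes, bytes], bool]


def check_sys_event(
    event: dict,
    public_keys: Mapping[Tuple[str, int], bytes],
    verify: VerifyFn,
) -> str:
    """Classify one event as 'skipped', 'ok', or a 'tamper: ...' reason."""
    if not event["event_type"].startswith("sys_"):
        return "skipped"  # only system-action events are signed

    signature = event.get("signature")
    version = event.get("sig_key_version")
    if signature is None or version is None:
        return "tamper: missing signature"

    # Keys are per subsystem and per version, so old signatures stay
    # verifiable after a rotation.
    public_key = public_keys.get((event["subsystem"], version))
    if public_key is None:
        return "tamper: unknown key version"  # assumption: treated as tamper

    if not verify(public_key, signature, event["canonical_payload"]):
        return "tamper: invalid signature"
    return "ok"
```

Keeping retired public keys in the map is what lets rotation proceed without invalidating earlier events: only the signing key changes, while verification continues to resolve by the recorded version.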

Pre-launch posture: key infrastructure (Infisical paths for subsystem keys) is provisioned before GA, even if signatures are not enforced until post-launch. This avoids a chicken-and-egg migration in which enforcement cannot be switched on because the keys do not yet exist.


Consequences

Positive

Negative


Alternatives Considered

Merkle tree over all events (not per-workflow)

A Merkle tree across all events in a time window provides a single root hash that summarizes the full event log. Verifying that a single event is included takes O(log n) hashes rather than O(n). Blockchain systems and certificate transparency logs use this structure.

Rejected for v1: the per-workflow hash chain is sufficient for Raxx's dispute-resolution use cases, and a Merkle tree adds significant implementation complexity without proportional benefit at pre-launch scale. Noted as a v3 option if Raxx needs to publish an externally verifiable audit log to regulators.

External audit log service (e.g., Datadog Audit Trail, Splunk)

Offloading the append-only log to an external service provides a second copy outside Raxx's infrastructure. However, it adds a third-party data processor for potentially sensitive behavioral data (GDPR complication), a new billing line, and a dependency on a third-party availability SLA. Rejected for v1.

Postgres WAL archiving as tamper evidence

Postgres Write-Ahead Log (WAL) is append-only by nature and can be archived to S3. A full WAL archive is technically tamper-evident for the database state. However, WAL archives are not queryable, are very large, and are not designed for selective verification. Useful as a last-resort recovery path, not as the primary tamper-evidence mechanism. Not rejected — WAL archiving should be enabled in production for general durability, but it does not replace the hash chain.

No tamper evidence, rely on DB access controls alone

Access controls reduce the probability of tampering but do not make it detectable. In a regulatory or dispute context, "we have access controls" is weaker than "we have cryptographic evidence that no event was modified or deleted." The hash chain adds the detection capability that access controls alone lack. Rejected.