Raxx · internal docs

Operator Action Queue — 2026-05-12

Incident ID: 2026-05-12-operator-action-queue
Date: 2026-05-12 UTC
Type: Planned operator-action dispatch (not an incident)
Operator authorization: 2026-05-12 ~01:30 UTC, "SRE Agent can do the terraform apply and verify the STRIPE Key permission."
Author: sre-agent


Task 1 — Terraform apply email-delivery-stack

Status: COMPLETED

Authorization

Explicit operator authorization received 2026-05-12 ~01:30 UTC. Prior SRE (aa6307a23e06c5d8f) was blocked by the auto-mode classifier on this action; this session received the explicit consent required to proceed.

Pre-apply state verification

Plan result

Plan: 40 to add, 0 to change, 0 to destroy

Matches prior SRE's verified plan. No deviations.

Apply result — first pass

38 of 40 resources created successfully. Two resources failed:

aws_sns_topic_policy.inbound   — FAILED
aws_sns_topic_policy.outbound  — FAILED

Error: InvalidParameter: Policy statement action out of service scope!

Root cause: Both topic policies contained "Action": "sns:*" for the AllowAccountRoot statement. AWS FIFO SNS topics reject the sns:* wildcard because some standard SNS actions are not supported on FIFO topics. The 38 other resources (including the actual SNS topics, SQS queues, subscriptions, IAM roles, DynamoDB table, SSM parameters, and CloudWatch alarms) were all created successfully.

Functional impact of initial failure: None. The sns_publisher IAM role already had sns:Publish granted via its inline role policy (raxx-email-sns-publish-only). The failed topic policies would only have added redundant access, so message flow was not blocked.

Fix applied (sns.tf): Replaced "sns:*" with an explicit list of FIFO-supported actions matching the AWS default topic policy:

sns:GetTopicAttributes, sns:SetTopicAttributes, sns:AddPermission,
sns:RemovePermission, sns:DeleteTopic, sns:Subscribe,
sns:ListSubscriptionsByTopic, sns:Publish

Source: actual default policy retrieved from aws sns get-topic-attributes on the created FIFO topic.
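
A minimal sketch of a pre-apply check that would have caught this, assuming the topic policy is available as a JSON string. The function name disallowed_actions and the simplified policy documents are hypothetical (Principal and Resource are omitted for brevity); only the action list is taken from the fix above.

```python
import json

# Explicit action list from the sns.tf fix above.
ALLOWED_TOPIC_ACTIONS = {
    "sns:GetTopicAttributes", "sns:SetTopicAttributes", "sns:AddPermission",
    "sns:RemovePermission", "sns:DeleteTopic", "sns:Subscribe",
    "sns:ListSubscriptionsByTopic", "sns:Publish",
}


def disallowed_actions(policy_json: str) -> list:
    """Return (Sid, action) pairs whose action is not in the explicit list."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    allowed = {a.lower() for a in ALLOWED_TOPIC_ACTIONS}  # action names are case-insensitive
    findings = []
    for stmt in statements:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        for action in actions:
            if action.lower() not in allowed:
                findings.append((stmt.get("Sid", "<no Sid>"), action))
    return findings


# Hypothetical, simplified policies for illustration only.
broken = json.dumps({"Version": "2012-10-17", "Statement": [
    {"Sid": "AllowAccountRoot", "Effect": "Allow", "Action": "sns:*"}]})
fixed = json.dumps({"Version": "2012-10-17", "Statement": [
    {"Sid": "AllowAccountRoot", "Effect": "Allow",
     "Action": sorted(ALLOWED_TOPIC_ACTIONS)}]})

print(disallowed_actions(broken))
print(disallowed_actions(fixed))
```

A check like this could run against the rendered policy before terraform apply; it flags the wildcard without needing AWS access.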

Apply result — second pass

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

All 40 resources are now in Terraform state.
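
The counts across the two passes can be reconciled mechanically. The sketch below parses the plan and apply summary lines quoted in this section and checks that 38 (first pass) plus 2 (second pass) equals the 40 planned. The helper parse_counts and its regular expression are illustrative assumptions, not part of the stack's tooling.

```python
import re

# Matches both "N to add, N to change, N to destroy" (plan)
# and "N added, N changed, N destroyed" (apply).
SUMMARY = re.compile(
    r"(\d+) (?:to add|added), (\d+) (?:to change|changed), (\d+) (?:to destroy|destroyed)"
)


def parse_counts(line: str) -> tuple:
    """Return (add, change, destroy) counts from a Terraform summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a Terraform summary line: {line!r}")
    return tuple(int(group) for group in match.groups())


planned = parse_counts("Plan: 40 to add, 0 to change, 0 to destroy")
second_pass = parse_counts("Apply complete! Resources: 2 added, 0 changed, 0 destroyed.")
first_pass_created = 38  # from the first-pass result above

# The two passes together should account for every planned resource.
assert first_pass_created + second_pass[0] == planned[0]
print(planned, second_pass)
```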

AWS resource verification

| Resource type | Names | Status |
| --- | --- | --- |
| SNS topics | raxx-inbound-email.fifo, raxx-outbound-email.fifo | Confirmed |
| SQS queues | raxx-email-inbound-sns-dlq.fifo, raxx-email-outbound-sns-dlq.fifo, inbound-email-freescout-bridge.fifo, inbound-email-freescout-bridge-dlq.fifo, outbound-email-postmark-sender.fifo, outbound-email-postmark-sender-dlq.fifo | Confirmed |
| DynamoDB table | email-dedup-idempotency | Confirmed |
| SSM parameters | /raxx/email/freescout_api_key, /raxx/email/postmark_server_token, /raxx/email/postmark_inbound_webhook_token, /raxx/email/freescout_mailbox_routing_map | Confirmed |

SSM parameter population

Parameters were initially created by Terraform with PLACEHOLDER values (per ssm.tf design). They were then populated from Infisical using a CF-Access-authenticated universal auth token.

Vault path correction: POSTMARK_SERVER_TOKEN does not exist in Infisical. The canonical key name at /MooseQuest/postmark/ is POSTMARK_SERVER_API_KEY. Used that key to populate /raxx/email/postmark_server_token.

| SSM parameter | Infisical source | Type | Value length | Status |
| --- | --- | --- | --- | --- |
| /raxx/email/freescout_api_key | /MooseQuest/freescout/FREESCOUT_API_KEY | SecureString | 32 chars | SET |
| /raxx/email/postmark_server_token | /MooseQuest/postmark/POSTMARK_SERVER_API_KEY | SecureString | 36 chars | SET |
| /raxx/email/postmark_inbound_webhook_token | /Raxx/Email/POSTMARK_INBOUND_WEBHOOK_TOKEN | SecureString | 43 chars | SET |
| /raxx/email/freescout_mailbox_routing_map | /Raxx/Email/FREESCOUT_MAILBOX_ROUTING_MAP | String | 76 chars | SET |

Constraints honored:
- aws ssm put-parameter stdout and stderr silenced (>/dev/null 2>&1) per feedback_heroku_config_set_echoes_secrets.md
- No secret values printed, logged, or committed
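
One way to produce an audit row without exposing a secret is to record only its length, as the table above does. The following is a sketch under that assumption; ParamSummary, summarize, and format_row are hypothetical names, and the value used here is a dummy string, not a real credential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamSummary:
    """Audit-safe view of a parameter: the value itself is never stored."""
    name: str
    param_type: str
    length: int


def summarize(name: str, param_type: str, value: str) -> ParamSummary:
    # Only len(value) is kept; the secret is not retained on the summary object.
    return ParamSummary(name=name, param_type=param_type, length=len(value))


def format_row(summary: ParamSummary) -> str:
    return f"{summary.name} {summary.param_type} {summary.length} chars SET"


dummy_secret = "x" * 32  # placeholder standing in for a real value
row = format_row(summarize("/raxx/email/freescout_api_key", "SecureString", dummy_secret))
print(row)
```

Because the summary never holds the value, printing or logging it cannot leak the secret, which is the same property the silenced put-parameter call is meant to guarantee.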

Note on key name mismatch: The task brief asked for POSTMARK_SERVER_TOKEN, but the vault has POSTMARK_SERVER_API_KEY. The SSM parameter /raxx/email/postmark_server_token now holds the POSTMARK_SERVER_API_KEY value. The Lambda code reads this parameter by SSM path, not by vault key name, so the value resolves correctly. The naming discrepancy is tracked as action item 1 below.

SNS→SQS smoke test

Published probe message to raxx-inbound-email.fifo:
- MessageDeduplicationId: probe-<unix-timestamp>
- MessageGroupId: test
- Body: {"MessageID":"test-001","From":"kris@moosequest.net","To":"support@raxx.app","Subject":"PROBE-test-001"}

Result: 1 message received from inbound-email-freescout-bridge.fifo within 3 seconds. Message deleted after receipt to keep queue clean.

Smoke test: PASS

The SNS→SQS subscription with raw_message_delivery = true is functioning correctly. Messages publish to the FIFO topic and land in the bridge queue.
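
For repeatability, the probe can be assembled programmatically. This sketch only builds the publish arguments described above; the function build_probe is hypothetical, and passing the result to boto3's sns.publish(**kwargs) is an assumption about how the probe would be sent, not a record of how it was sent in this session.

```python
import json
import time
from typing import Optional

INBOUND_TOPIC_ARN = "arn:aws:sns:us-east-1:521228113048:raxx-inbound-email.fifo"


def build_probe(topic_arn: str, now: Optional[float] = None) -> dict:
    """Build publish arguments for the FIFO smoke-test message."""
    timestamp = int(time.time() if now is None else now)
    body = {
        "MessageID": "test-001",
        "From": "kris@moosequest.net",
        "To": "support@raxx.app",
        "Subject": "PROBE-test-001",
    }
    return {
        "TopicArn": topic_arn,
        "Message": json.dumps(body),
        "MessageGroupId": "test",
        # A timestamped ID keeps repeated probes from being deduplicated.
        "MessageDeduplicationId": f"probe-{timestamp}",
    }


probe = build_probe(INBOUND_TOPIC_ARN, now=1778549400)
print(probe["MessageDeduplicationId"])
```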

Terraform output key values (non-secret)

inbound_topic_arn         = arn:aws:sns:us-east-1:521228113048:raxx-inbound-email.fifo
outbound_topic_arn        = arn:aws:sns:us-east-1:521228113048:raxx-outbound-email.fifo
inbound_bridge_queue_url  = https://sqs.us-east-1.amazonaws.com/521228113048/inbound-email-freescout-bridge.fifo
outbound_sender_queue_url = https://sqs.us-east-1.amazonaws.com/521228113048/outbound-email-postmark-sender.fifo
lambda_inbound_role_arn   = arn:aws:iam::521228113048:role/raxx-email-lambda-inbound
lambda_outbound_role_arn  = arn:aws:iam::521228113048:role/raxx-email-lambda-outbound
sns_publisher_role_arn    = arn:aws:iam::521228113048:role/raxx-email-sns-publisher
dedup_table_name          = email-dedup-idempotency

Full output in /tmp/email-stack-outputs.json (local only, not committed).

Infrastructure drift note

The dynamodb_table parameter in the S3 backend config is deprecated; Terraform warns to use use_lockfile instead. This is not a blocker, since state locking still works. Tracked here for the next infrastructure maintenance window.


Task 2 — Verify Stripe key permission scopes

Status: COMPLETED

Key location

STRIPE_RESTRICTED_KEY confirmed at Infisical path /Raxx/Queue/Billing/Stripe/ (prod environment). Key length: 107 characters (consistent with Stripe restricted key format rk_live_... or rk_test_...).

Scope verification methodology

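Per the task brief, each capability was probed by attempting a representative API call. A 200, or a 400 with parameter_missing or invalid_request_error, means the key has the scope, because the request passed the permission check before failing validation. A 403 insufficient_permissions means the scope is missing.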

| Scope | ADR-0076 requirement | Method | HTTP response | Result |
| --- | --- | --- | --- | --- |
| Customers | Write | POST /v1/customers with valid email | 200 (customer created) | PASS |
| Subscriptions | Write | POST /v1/subscriptions (missing required fields) | 400 parameter_missing | PASS |
| Invoices | Write | POST /v1/invoices (missing required fields) | 400 parameter_missing | PASS |
| Charges | Read | GET /v1/charges | 200 | PASS |
| Webhooks | Read | GET /v1/webhook_endpoints | 200 | PASS |

All 5 scopes: PASS. No operator action required.
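
The pass/fail rule used above can be written as a small classifier. This is a sketch of the stated methodology only; classify_probe is a hypothetical helper, and the INCONCLUSIVE outcome for any response the methodology does not cover is an assumption added here.

```python
from typing import Optional

# 400-level error codes that imply the request passed the permission check.
HAS_SCOPE_400_CODES = {"parameter_missing", "invalid_request_error"}


def classify_probe(status: int, error_code: Optional[str] = None) -> str:
    """Map a probe's HTTP status and error code to PASS, FAIL, or INCONCLUSIVE."""
    if status == 200:
        return "PASS"
    if status == 400 and error_code in HAS_SCOPE_400_CODES:
        return "PASS"  # rejected on validation, so the scope is present
    if status == 403 and error_code == "insufficient_permissions":
        return "FAIL"  # scope missing
    return "INCONCLUSIVE"  # assumption: anything else needs manual review


# Responses recorded in the table above.
observed = {
    "Customers": (200, None),
    "Subscriptions": (400, "parameter_missing"),
    "Invoices": (400, "parameter_missing"),
    "Charges": (200, None),
    "Webhooks": (200, None),
}
results = {scope: classify_probe(*resp) for scope, resp in observed.items()}
print(results)
```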

Test artifact cleanup

A live customer object was created to verify the Customers-Write scope (HTTP 200 = actual object created, not just permission check). Customer cus_UV5WvMsPe4MPE3 was deleted immediately after confirmation (DELETE /v1/customers/cus_UV5WvMsPe4MPE3 → HTTP 200 deleted: true). Stripe account left clean.

ADR-0076 scope alignment

All five scopes specified in ADR-0076 are present and verified. The key is ready for use by SC-QP-#406 (Queue Stripe service layer).


Infrastructure note — sns.tf fix

The sns:* wildcard bug in sns.tf is a latent defect that would have affected any future Terraform apply of this module. The fix (explicit action list) has been applied to the working tree and should be committed with the audit doc PR.

File changed: terraform/modules/email-delivery-stack/sns.tf

Change: "Action": "sns:*" in both AllowAccountRoot statements replaced with explicit list of 8 FIFO-supported SNS actions.


Operator-action gates hit

None. Both tasks completed fully.


Action items

| # | Action | Owner | Due |
| --- | --- | --- | --- |
| 1 | Update ssm.tf comment to reference POSTMARK_SERVER_API_KEY (not POSTMARK_SERVER_TOKEN) as the canonical Infisical key name | sre-agent or feature-dev | 2026-05-19 |
| 2 | Address dynamodb_table → use_lockfile deprecation in S3 backend config | sre-agent | 2026-05-19 |
| 3 | SC-E3 (#1666) and SC-E4 (#1667), the Lambda functions, are now unblocked (IAM roles + queue ARNs available) | feature-developer | per sprint |