Raxx · internal docs

SRE Provisioning Batch — 2026-05-06

Date: 2026-05-06
Author: sre-agent
Session start: 10:10 UTC
Session end: 10:45 UTC


Session environment check

| Credential | Status | Notes |
|---|---|---|
| heroku auth | VALID: kris@moosequest.net | CLI authenticated |
| aws sts | VALID: arn:aws:iam::521228113048:user/aws-developer-alpha | (AdministratorAccess) |
| gh auth | VALID: MooseQuest | scopes: admin:public_key, repo, read:org |
| INFISICAL_TOKEN / vault | STALE: CF Access error 1010 | vault.raxx.app CF Access has no service token for agent access; unblocks when #680 lands |
| CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN | STALE: HTTP 401 | Token revoked or expired |
| CLOUDFLAREROLLED | PARTIAL: zone reads work, /user/tokens/verify returns 401 | Has Zone:Read; DNS write scope unconfirmed |

Impact of stale credentials: Infisical vault writes are blocked (item 1 vault write, item 2 vault writes). Cloudflare token minting is blocked (item 2 CF billing token). CF DNS writes are blocked (item 5; item 6 partially).
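
A minimal sketch of how a single credential probe could be mapped to the status labels in the table above, assuming the probe is reduced to an HTTP status code. The function names and the UNKNOWN label are illustrative and not part of the session tooling; the Cloudflare endpoint is the /user/tokens/verify path referenced in the table. A single probe cannot produce PARTIAL, which in this session needed a second check (zone reads).

```shell
# Map an HTTP status code from a credential probe to a status label.
# Function name and the UNKNOWN label are illustrative.
classify_http_status() {
  case "$1" in
    200) echo "VALID" ;;
    401) echo "STALE" ;;
    *)   echo "UNKNOWN" ;;
  esac
}

# Probe the Cloudflare token-verify endpoint and print only the HTTP code.
# The token is passed as $1 and is never echoed.
probe_cloudflare_token() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $1" \
    https://api.cloudflare.com/client/v4/user/tokens/verify
}

# Example (not executed here):
#   classify_http_status "$(probe_cloudflare_token "$CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN")"
```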


Item 1 — Rotate Heroku Platform API tokens (issue #454)

Status: PARTIAL (3/5 steps automated; vault write + old auth revoke require operator)

What was done

  1. Minted new Heroku OAuth authorization via POST /oauth/authorizations with scope=global:
     - New auth_id: 9d5ec3c4-...-408b9cb3720f
     - Description: raxx-platform-token-2026-05-06
     - Token validated against GET /account: PASS

  2. Distributed new token as HEROKU_API_KEY to all 4 target apps:
     - raxx-console-prod: OK (HTTP 200)
     - raxx-console-staging: OK (HTTP 200)
     - raxx-api-prod: OK (HTTP 200)
     - raxx-api-staging: OK (HTTP 200)

  3. Updated GitHub Actions repo secret HEROKU_API_KEY — updated_at: 2026-05-06T10:27:39Z
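
Step 2 above could be scripted roughly as follows. This is a sketch, not the command sequence the session actually ran: the app names come from the list above, while DRY_RUN and the function name are hypothetical. Output from heroku config:set is discarded because the CLI echoes the value it sets.

```shell
# Target apps from step 2; DRY_RUN and the function name are illustrative.
TARGET_APPS="raxx-console-prod raxx-console-staging raxx-api-prod raxx-api-staging"

distribute_heroku_key() {
  local new_token="$1" app
  for app in $TARGET_APPS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the intended action only, never the token itself.
      echo "would set HEROKU_API_KEY on $app"
    else
      # Discard output so the value does not land in the terminal or logs.
      heroku config:set "HEROKU_API_KEY=$new_token" --app "$app" >/dev/null
    fi
  done
}
```

The default is a dry run; a real run would be DRY_RUN=0 distribute_heroku_key "$NEW_TOKEN". Passing a secret as a command-line argument exposes it in the local process list, so this only suits a trusted operator shell.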

Pending (operator required)

Vault write is blocked (CF Access service token not provisioned for vault.raxx.app). Once fixed:

# Step A — Write new token to vault
# (run from a shell with vault access, e.g., heroku run --app raxx-console-prod)
# vault path: /MooseQuest/heroku/HEROKU_PLATFORM_API_TOKEN
# New token is live in Heroku config vars and GH secret — retrieve from Heroku to write back:
# heroku config:get HEROKU_API_KEY -a raxx-console-prod
# Then store in Infisical at /MooseQuest/heroku/HEROKU_PLATFORM_API_TOKEN

# Step B — Revoke old auth (once vault is updated)
heroku authorizations:revoke ba6a2961-...-7866b505a3a6
# ba6a2961 = "GitHub Actions deploy (raxx-console + raxx-api)" — the prior automation key
# Confirm: curl -H "Authorization: Bearer <old_token>" https://api.heroku.com/account
# Expect: HTTP 401

Do not revoke ba6a2961 until vault is updated. Old and new tokens coexist on Heroku; the old one is only needed if a rollback is required.
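
The ordering constraint above (vault write first, revoke second) can be enforced with a small guard. This is a sketch under assumptions: VAULT_UPDATED and both function names are hypothetical, and the operator sets the flag by hand after completing Step A.

```shell
# Guard for Step B: refuse to revoke unless the operator confirms Step A is done.
# VAULT_UPDATED and both function names are illustrative, not existing tooling.
revoke_old_auth() {
  local auth_id="$1"
  if [ "${VAULT_UPDATED:-no}" != "yes" ]; then
    echo "refusing to revoke $auth_id: vault write not confirmed" >&2
    return 1
  fi
  heroku authorizations:revoke "$auth_id"
}

# After revocation the old token should be rejected with HTTP 401.
old_token_is_dead() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $1" \
    -H "Accept: application/vnd.heroku+json; version=3" \
    https://api.heroku.com/account)"
  [ "$code" = "401" ]
}
```

Usage would be VAULT_UPDATED=yes revoke_old_auth <old auth_id>, followed by old_token_is_dead "<old_token>".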

Secret locations

| Secret | Location | Notes |
|---|---|---|
| HEROKU_API_KEY (new) | Heroku config vars on 4 apps + GH repo secret | Live as of 10:27 UTC |
| /MooseQuest/heroku/HEROKU_PLATFORM_API_TOKEN | Infisical vault | NEEDS OPERATOR WRITE |
| /MooseQuest/heroku/HEROKU_API_KEY__AUTH_ID | Infisical vault | NEEDS OPERATOR WRITE; value: 9d5ec3c4-...-408b9cb3720f |

Issue #454: leave OPEN until vault write + revoke complete.


Item 2 — Provision read-only billing API tokens (issue #759)

Status: PARTIAL (2/3 providers done; Cloudflare blocked)

Heroku billing token

Done. Dedicated Heroku authorization minted with description billing-collector-readonly-2026-05-06.

Scope tradeoff documented: the Heroku Platform API does not offer fine-grained billing-read-only permissions, so a global-scoped token is the minimum required. The operator should review whether this is acceptable before activating the billing collector.

Interim storage: Vault is blocked; credentials stored in SSM until vault access is restored:
- SSM path: /raxx/billing-readonly/heroku_billing_api_key (SecureString, us-east-1)
- SSM path: /raxx/billing-readonly/heroku_billing_auth_id (value: 99f3c0a4-...-dfc5842ef9a8)

Operator action required: Once vault CF Access is fixed, migrate both values to Infisical at /Raxx/Console/Billing/HEROKU_BILLING_API_KEY (create the /Raxx/Console/Billing/ folder via POST /api/v1/folders first per memory feedback_vault_folder_must_exist.md).

AWS billing credentials

Done. IAM user raxx-billing-readonly created (arn: arn:aws:iam::521228113048:user/raxx-billing-readonly).

Policy attached: raxx-billing-readonly-policy (arn: arn:aws:iam::521228113048:policy/raxx-billing-readonly-policy) — permissions: ce:Get*, ce:Describe*, ce:List*, billing:*ReadOnly, budgets:ViewBudget, cur:DescribeReportDefinitions.

Access key provisioned. Credentials stored in SSM (following feedback_aws_workloads_use_ssm_not_vault.md):
- SSM path: /raxx/billing-readonly/aws_access_key_id (SecureString, us-east-1)
- SSM path: /raxx/billing-readonly/aws_secret_access_key (SecureString, us-east-1)

Cloudflare billing token

BLOCKED. Minting a new Account.Billing:Read token requires an API token with the User:API Tokens:Edit scope. Both CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN and CLOUDFLAREROLLED return HTTP 401 on /user/tokens/verify, so neither can be used to mint tokens.

Operator steps (3 minutes in CF dashboard):

  1. Go to https://dash.cloudflare.com/profile/api-tokens
  2. Click "Create Token" → "Custom token"
  3. Token name: raxx-billing-readonly
  4. Permissions: Account > Billing > Read
  5. Account Resources: Include → your account
  6. Click "Continue to summary" → "Create Token"
  7. Copy the token value (shown once)
  8. Store in SSM: aws ssm put-parameter --name "/raxx/billing-readonly/cloudflare_billing_token" --type SecureString --value "<token>" --region us-east-1 --overwrite >/dev/null
  9. Also migrate to vault at /Raxx/Console/Billing/CLOUDFLARE_BILLING_TOKEN when vault access is restored

Issue #759: leave OPEN until CF billing token step is complete.


Item 3 — Secrets for deploy audit log (post-#1267)

Status: DONE

What was done

Generated CONSOLE_AUDIT_INGEST_TOKEN (32-byte random hex, 64-char string).
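
For reference, a token of this shape (32 random bytes rendered as 64 hex characters) can be produced with standard tools. This is a sketch of one way to do it, not necessarily the command the session used; the function name is illustrative.

```shell
# Generate 32 random bytes and print them as 64 lowercase hex characters.
# -v stops od from collapsing repeated lines; tr strips spaces and newlines.
generate_ingest_token() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Example (not executed here), run from inside the repository checkout:
#   token="$(generate_ingest_token)"
#   gh secret set CONSOLE_AUDIT_INGEST_TOKEN --env production --body "$token"
#   gh secret set CONSOLE_AUDIT_INGEST_TOKEN --env staging --body "$token"
```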

GitHub Environment secrets:
- production env: CONSOLE_AUDIT_INGEST_TOKEN, set 2026-05-06T10:29:26Z
- staging env: CONSOLE_AUDIT_INGEST_TOKEN, set 2026-05-06T10:29:28Z

GitHub Environment variables:
- staging env: CONSOLE_INTERNAL_URL = https://raxx-console-staging.herokuapp.com
- production env: CONSOLE_INTERNAL_URL = https://console.raxx.app

Heroku console apps:
- raxx-console-prod: CONSOLE_AUDIT_INGEST_TOKEN set, FLAG_CONSOLE_DEPLOY_AUDIT_INGEST=on
- raxx-console-staging: CONSOLE_AUDIT_INGEST_TOKEN set, FLAG_CONSOLE_DEPLOY_AUDIT_INGEST=on

Both apps verified via Heroku API: FLAG=on, AUDIT_TOKEN_PRESENT=True.

The same token value is set in the GH environment secret (used by deploy workflow to authenticate) and in the Heroku config var (used by the console endpoint to verify the incoming request). They match.


Item 4 — SSM + IAM for FreeScout backup (post-#1272)

Status: DONE — dry-run verified

What was done

SSM parameters (us-east-1):
- /raxx/freescout/db_password: already existed (version 1, 32 chars)
- /raxx/freescout/ssh_key: written from /tmp/lightsail_us_east_1.pem (RSA PEM, 1680 chars), SecureString, version 1

IAM user raxx-freescout-backup:
- ARN: arn:aws:iam::521228113048:user/raxx-freescout-backup
- Policy attached: raxx-freescout-backup-policy (ARN: arn:aws:iam::521228113048:policy/raxx-freescout-backup-policy)
- Permissions:
  - ssm:GetParameter /raxx/freescout/*
  - s3:PutObject/GetObject/HeadObject/DeleteObject/ListBucket/CreateBucket on raxx-support-attachments
  - kms:GenerateDataKey/Decrypt/DescribeKey (via S3 KMS)
  - lightsail:CreateInstanceSnapshot/GetInstanceSnapshots/DeleteInstanceSnapshot
- Access key: AKIA...DKLMQ (last 4 only; full ID stored in SSM + GH secrets)

GitHub Actions repo secrets:
- AWS_BACKUP_ACCESS_KEY_ID: set 2026-05-06T10:33:33Z
- AWS_BACKUP_SECRET_ACCESS_KEY: set 2026-05-06T10:33:34Z

Verification:
- IAM credentials validated: aws sts get-caller-identity as raxx-freescout-backup → PASS
- SSM read /raxx/freescout/db_password: accessible (version 1)
- SSM read /raxx/freescout/ssh_key: accessible (version 1)
- Dry-run workflow dispatch gh workflow run freescout-backup.yml --field dry_run=true → completed success at 2026-05-06T10:34:07Z (run ID 25430172194)

Note on verified restore: The runbook requires a verified restore against a scratch instance before marking #714 fully closed. That test requires SSH access to a restored Lightsail instance and is operator-run per the runbook.

Note on dry-run limitation: The workflow dry-run skips SSM reads (dry_run=true uses placeholder creds). The SSM access was validated separately via AWS CLI using the backup user's credentials. The first real cron run at 06:00 UTC will be the end-to-end proof.
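
Because the dry run skips SSM reads, the separate CLI check can be repeated at any time. A sketch, assuming the backup user's credentials are already exported in the shell (for example via AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY); the function names are illustrative. Only the parameter version is queried, so no secret value is printed.

```shell
# Print the version of an SSM parameter without revealing its value.
ssm_param_version() {
  aws ssm get-parameter --name "$1" --with-decryption \
    --region us-east-1 --query 'Parameter.Version' --output text
}

# Check both FreeScout parameters; stop at the first one that cannot be read.
verify_backup_params() {
  local name version
  for name in /raxx/freescout/db_password /raxx/freescout/ssh_key; do
    if version="$(ssm_param_version "$name")"; then
      echo "$name: accessible (version $version)"
    else
      echo "$name: NOT accessible" >&2
      return 1
    fi
  done
}
```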


Item 5 — Verify raxx.app domain + enable DKIM (issue #1210)

Status: PARTIAL (DNS state is favorable; DKIM record add blocked; Workspace UI steps required)

Current DNS state (as of 2026-05-06 10:40 UTC)

| Record | Status |
|---|---|
| TXT raxx.app google-site-verification | EXISTS: Ss0o0uHmeJNe-iafrf8DzI3ajM0SusH9JyUxEEl5DFY |
| TXT raxx.app SPF | EXISTS: v=spf1 include:_spf.google.com include:spf.mtasv.net ~all |
| TXT _dmarc.raxx.app | EXISTS: v=DMARC1; p=quarantine; rua=mailto:kris@moosequest.net; fo=1 |
| MX raxx.app | EXISTS: Google Workspace MX (aspmx.l.google.com + alt1-4) |
| TXT google._domainkey.raxx.app | MISSING: DKIM not yet configured |

The google-site-verification TXT is already present. Google Workspace may already show raxx.app as verified if the domain was added and this token was placed previously.
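
The DNS state above can be re-checked with dig. A minimal sketch; the function name is illustrative, and the EXISTS/MISSING labels mirror the table. It matches on a substring because raxx.app carries several TXT records at the apex.

```shell
# Print EXISTS if any TXT record at name $1 contains the substring $2, else MISSING.
txt_record_status() {
  local name="$1" needle="$2"
  if dig +short TXT "$name" | grep -qF -- "$needle"; then
    echo "EXISTS"
  else
    echo "MISSING"
  fi
}

# Examples (not executed here):
#   txt_record_status raxx.app 'v=spf1'
#   txt_record_status _dmarc.raxx.app 'v=DMARC1'
#   txt_record_status google._domainkey.raxx.app 'v=DKIM1'
```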

Operator checklist (Google Workspace Admin — 10 minutes)

  1. Open https://admin.google.com → Domains → Manage domains
  2. Check if raxx.app shows as "Verified". If yes, skip to step 5.
  3. If not verified: in the domain entry click "Verify" → Google will check the TXT record Ss0o0uHmeJNe-iafrf8DzI3ajM0SusH9JyUxEEl5DFY that is already in Cloudflare DNS → confirm.
  4. Once verified, confirm in a comment on issue #1210.
  5. Navigate to: Apps → Google Workspace → Gmail → Authenticate email (DKIM).
  6. Select domain raxx.app from the dropdown.
  7. Click "Generate new record". Default prefix google, key length 2048-bit.
  8. Copy the TXT record value (it will look like v=DKIM1; k=rsa; p=MIIBIjAN...).
  9. Add to Cloudflare DNS:
     - In Cloudflare dashboard → raxx.app zone → DNS → Add record
     - Type: TXT, Name: google._domainkey, Content: <value from step 8>, TTL: Auto
     - (If CF DNS edit token is restored: curl -X POST "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/dns_records" -H "Authorization: Bearer $CF_DNS_EDIT_TOKEN" -H "Content-Type: application/json" -d '{"type":"TXT","name":"google._domainkey.raxx.app","content":"<value>","ttl":1}')
  10. Wait 5-10 min for DNS propagation.
  11. Back in Workspace Admin → Gmail → Authenticate email → Click "Start authentication" for raxx.app.
  12. Verify: dig TXT google._domainkey.raxx.app returns the DKIM public key.
  13. Send a test message from a @raxx.app Google address to an external address; check headers for dkim=pass and spf=pass.
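
For step 13, the header check can be done by saving the received message's raw headers and scanning them. A rough sketch; the function name is hypothetical, and it simply looks for the two result tokens anywhere in the input, so it does not distinguish between multiple Authentication-Results or ARC headers.

```shell
# Succeeds only if the raw headers on stdin report both dkim=pass and spf=pass.
auth_results_pass() {
  local headers
  headers="$(cat)"
  printf '%s\n' "$headers" | grep -qi 'dkim=pass' &&
    printf '%s\n' "$headers" | grep -qi 'spf=pass'
}

# Example (not executed here):
#   auth_results_pass < test-message-headers.txt && echo "DKIM and SPF pass"
```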

Issue #1210: leave OPEN until operator completes steps 2-13 above.


Item 6 — Provision ops@/billing@/no-reply@ on raxx.app (issue #1212)

Status: BLOCKED on item 5

Issue #1210 (domain verification + DKIM) must close before #1212 can proceed. The Google Workspace address provisioning requires the domain to be fully verified and active. Additionally, converting raxx.app to a secondary domain (the Group vs alias decision) is a one-way operation, so the operator should confirm it after reviewing the checklist output from item 5.

When item 5 is done, return here and follow this checklist:

Operator checklist (Google Workspace Admin — 15 minutes)

Step 1: Convert raxx.app from alias to secondary domain (one-way)
  1. Open https://admin.google.com → Domains → Manage domains
  2. Find raxx.app → if it shows as "Domain alias", click the domain name → "Make this a secondary domain"
  3. Confirm the warning. This is one-way. The domain becomes independently addressable.

Step 2: Provision ops@raxx.app as a Google Group
  1. Open https://admin.google.com → Directory → Groups → Create group
  2. Group name: Raxx Ops, Group email: ops@raxx.app
  3. Access type: Team (members can post)
  4. Add Kristerpher as owner
  5. Save. Verify: send a test email to ops@raxx.app and confirm it arrives in Kristerpher's inbox.

Step 3: Provision billing@raxx.app as a Google Group (choice: Group, per operator default in task spec)
  1. Same flow as ops@: Group name: Raxx Billing, Group email: billing@raxx.app
  2. Add Kristerpher as owner and initial member
  3. Save.

Step 4: Configure no-reply@raxx.app as send-only
  1. Open https://admin.google.com → Directory → Users → Add user
  2. First name: No Reply, Last name: Raxx, Primary email: no-reply@raxx.app
  3. Set a strong random password (this account will never be logged into interactively)
  4. Navigate to the account → Add recovery email (optional, but recommended)
  5. To make it send-only / bounce inbound: in Gmail settings for this account, set up a vacation responder with message "This address does not accept replies. For support, email support@raxx.app." Alternatively, use a Group with external member receipt turned off.
  6. The Postmark relay (already configured per #708) handles outbound; no-reply@ is a sender identity only.

Note: provisioning no-reply@ via Google Workspace API is possible but requires a service account with Domain-Wide Delegation — not available in this session. Steps 1-4 above are all achievable in the Admin UI.

Issue #1212: leave OPEN pending item 5 completion.


Summary table

| # | Item | Status | Issues |
|---|---|---|---|
| 1 | Heroku Platform API token rotation | PARTIAL: token live on 4 apps + GH, vault write + old auth revoke pending | #454 open |
| 2 | Billing read-only tokens | PARTIAL: Heroku + AWS done; CF blocked | #759 open |
| 3 | Secrets for deploy audit log (#1267) | DONE | |
| 4 | SSM + IAM for FreeScout backup (#1272) | DONE: dry-run success | |
| 5 | raxx.app domain verify + DKIM | PARTIAL: DNS state OK; DKIM add UI-locked | #1210 open |
| 6 | ops@/billing@/no-reply@ provisioning | BLOCKED on #1210 | #1212 open |

Blocked by: stale Cloudflare tokens

Both CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN and CLOUDFLAREROLLED return HTTP 401 on /user/tokens/verify. This blocks:
- Minting new API tokens (needs API Tokens:Edit scope)
- CF billing token for item 2
- DNS writes for items 5/6 (though CLOUDFLAREROLLED can read zone data)

To unblock: Rotate CF API tokens via Cloudflare dashboard:
  1. Go to https://dash.cloudflare.com/profile/api-tokens
  2. For each stale token, click "Roll" to generate a new value
  3. Run source scripts/ops/session-bootstrap.sh to refresh the local env
  4. Re-run items 2 (CF billing token) and 5 (DKIM record add) from this batch


Blocked by: vault CF Access service token missing

Infisical vault at https://vault.raxx.app is protected by Cloudflare Access. The current session CF Access service token (CF_ACCESS_CLIENT_ID) is not authorized for the vault.raxx.app Access app (error 1010). This blocks:
- Writing new Heroku token to vault (item 1, step A)
- Writing billing tokens to vault (item 2)

To unblock: Provision a CF Access service token for vault.raxx.app per docs/ops/runbooks/cf-access-service-token-provisioning.md, then add it to the agent's environment. Issue #680 tracks this.
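
Once a service token is provisioned, access can be smoke-tested before retrying the vault writes. A sketch under assumptions: CF_ACCESS_CLIENT_SECRET is an assumed variable name paired with CF_ACCESS_CLIENT_ID, the /api/status path is assumed to be an unauthenticated Infisical health endpoint, and the function name is hypothetical. The CF-Access-Client-Id and CF-Access-Client-Secret headers are the standard Cloudflare Access service-token headers.

```shell
# Print the HTTP code returned through Cloudflare Access and succeed only on 200.
# $1 optionally overrides the URL; the default path is an assumption.
vault_reachable() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "CF-Access-Client-Id: ${CF_ACCESS_CLIENT_ID}" \
    -H "CF-Access-Client-Secret: ${CF_ACCESS_CLIENT_SECRET}" \
    "${1:-https://vault.raxx.app/api/status}")"
  echo "$code"
  [ "$code" = "200" ]
}
```

A 403 here would suggest the token is still not attached to the vault.raxx.app Access app policy.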


Secret inventory (this session)

No secret values are recorded here. This table documents WHERE secrets were written only.

| Secret | Location | Owner |
|---|---|---|
| Heroku Platform API token (new) | Heroku config vars (4 apps) + GH repo secret HEROKU_API_KEY | Item 1 |
| Heroku billing token | SSM /raxx/billing-readonly/heroku_billing_api_key + /heroku_billing_auth_id | Item 2 |
| AWS billing access key | SSM /raxx/billing-readonly/aws_access_key_id + /aws_secret_access_key | Item 2 |
| CONSOLE_AUDIT_INGEST_TOKEN | GH env secrets (production + staging) + Heroku config vars (raxx-console-prod, raxx-console-staging) | Item 3 |
| FreeScout SSH key | SSM /raxx/freescout/ssh_key | Item 4 |
| FreeScout DB password | SSM /raxx/freescout/db_password (pre-existing, verified) | Item 4 |
| FreeScout backup IAM access key | GH repo secrets AWS_BACKUP_ACCESS_KEY_ID + AWS_BACKUP_SECRET_ACCESS_KEY | Item 4 |