FreeScout S3 IAM runbook

System: IAM user raxx-freescout-backup (S3 + Lightsail backup credentials)
Owner: operator
Related issues: #744 (provision), #714 (backup workflow), #668 (S3 bucket)
Related runbook: docs/ops/runbooks/freescout-backup-restore.md
Last incident: none (initial creation 2026-05-17)
Last reviewed: 2026-05-17


What this is

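raxx-freescout-backup is a dedicated IAM user whose access key pair is stored as GH Actions repo secrets (AWS_BACKUP_ACCESS_KEY_ID / AWS_BACKUP_SECRET_ACCESS_KEY). The GH Actions workflow .github/workflows/freescout-backup.yml uses these credentials to read the FreeScout DB password from SSM, write database backups to s3://raxx-support-attachments/db-backups/freescout/, and manage Lightsail instance snapshots (see "IAM policy reference" below for the exact permissions).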

Why an IAM user, not an IAM role? Lightsail instances do not support EC2 IAM instance profiles. The backup is driven from GH Actions, not from the instance. Migrating to GH Actions OIDC (aws-actions/configure-aws-credentials with role-to-assume) is the long-term path and is tracked as a toil-reduction action item; see #744.

Credential storage: GH Actions repo secrets only. The key material is not stored in Infisical or SSM — the backup workflow is an AWS-side workload that uses AWS-native credential injection.


Terraform resource

Managed in terraform/freescout/iam_backup.tf.

After terraform apply, the IAM user exists but has no access keys — Terraform does not create keys. Key generation is a deliberate manual step so the key material is never in Terraform state.


Key generation (one-time setup)

Run these commands once after terraform apply creates the user.

# Generate the key pair — output is printed once and never stored by AWS
aws iam create-access-key \
  --user-name raxx-freescout-backup \
  --region us-east-1 \
  --output json

The output contains AccessKeyId and SecretAccessKey. Store them immediately as GH Actions repo secrets. The secret key is only shown once.

# Store in GH Actions repo secrets (replace <AccessKeyId> and <SecretAccessKey> with the actual values)
gh secret set AWS_BACKUP_ACCESS_KEY_ID \
  --body "<AccessKeyId>" \
  --repo raxx-app/TradeMasterAPI

gh secret set AWS_BACKUP_SECRET_ACCESS_KEY \
  --body "<SecretAccessKey>" \
  --repo raxx-app/TradeMasterAPI

Verify the secrets are present:

gh secret list --repo raxx-app/TradeMasterAPI | grep AWS_BACKUP

Expected output:

AWS_BACKUP_ACCESS_KEY_ID    Updated <date>
AWS_BACKUP_SECRET_ACCESS_KEY Updated <date>
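
The create and store steps above can also be combined so the secret key is never printed to the terminal or left in shell history. The following is a minimal sketch under stated assumptions, not the procedure of record: the function name provision_backup_key is hypothetical, and aws and gh are assumed to be installed and authenticated. It relies on gh secret set reading the value from standard input when --body is omitted.

```shell
#!/bin/sh
# Sketch only: provision_backup_key is a hypothetical helper name.
REPO="raxx-app/TradeMasterAPI"
IAM_USER="raxx-freescout-backup"

provision_backup_key() {
  local pair key_id secret
  # With --output text the two queried fields arrive on one tab-separated line.
  pair=$(aws iam create-access-key \
    --user-name "$IAM_USER" \
    --query 'AccessKey.[AccessKeyId,SecretAccessKey]' \
    --output text) || return 1
  key_id=$(printf '%s\n' "$pair" | cut -f1)
  secret=$(printf '%s\n' "$pair" | cut -f2)
  # Pipe each value to gh so it never appears as a command-line argument.
  printf '%s' "$key_id" | gh secret set AWS_BACKUP_ACCESS_KEY_ID --repo "$REPO" || return 1
  printf '%s' "$secret" | gh secret set AWS_BACKUP_SECRET_ACCESS_KEY --repo "$REPO" || return 1
  # Only the non-secret key ID is echoed.
  echo "Stored key $key_id for $IAM_USER"
}
```

Calling provision_backup_key once after terraform apply would replace the three manual commands; the gh secret list check above still applies afterwards.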

Verify the IAM user is correctly provisioned

# Confirm user exists
aws iam get-user --user-name raxx-freescout-backup --region us-east-1

# List inline policies (list-user-policies does not show attached managed policies)
aws iam list-user-policies --user-name raxx-freescout-backup --region us-east-1

# Confirm access keys exist
aws iam list-access-keys --user-name raxx-freescout-backup --region us-east-1

Expected: one active access key, status Active, created date matches setup date.
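
The "one active access key" expectation can be checked mechanically rather than by eye. This is a hedged sketch: the function names active_key_count and check_single_active_key are hypothetical, and the JMESPath filter assumes the standard list-access-keys output shape (AccessKeyMetadata entries with a Status field).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

active_key_count() {
  # Count the keys whose Status is Active for the given IAM user.
  aws iam list-access-keys \
    --user-name "$1" \
    --query "length(AccessKeyMetadata[?Status=='Active'])" \
    --output text
}

check_single_active_key() {
  local count
  count=$(active_key_count "$1") || return 2
  if [ "$count" -eq 1 ]; then
    echo "OK: 1 active key"
  else
    echo "UNEXPECTED: $count active keys"
    return 1
  fi
}
```

Usage: check_single_active_key raxx-freescout-backup. A non-zero exit outside of a rotation window suggests a leftover key that should be reviewed.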


Verify the backup workflow fires correctly

After key generation and secret upload:

# Trigger a dry run — prints all planned steps without performing S3 upload or snapshot
gh workflow run freescout-backup.yml \
  --field dry_run=true \
  --repo raxx-app/TradeMasterAPI

# Watch the run
gh run list --workflow freescout-backup.yml \
  --repo raxx-app/TradeMasterAPI \
  --limit 1

A dry run that completes Configure AWS credentials and Read DB password from SSM without error confirms the IAM key is valid and the SSM policy is working.
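
gh run list only shows the run; it does not wait for it or report failure through its exit code. A possible way to block until the dry run finishes is sketched below. The function name run_dry_run_and_wait and the DRY_RUN_SETTLE_SECONDS variable are hypothetical, and the sketch assumes the most recent run of the workflow is the one just triggered, which may not hold if a scheduled run starts at the same moment.

```shell
#!/bin/sh
# Sketch only: run_dry_run_and_wait and DRY_RUN_SETTLE_SECONDS are hypothetical names.

run_dry_run_and_wait() {
  local repo workflow run_id
  repo="raxx-app/TradeMasterAPI"
  workflow="freescout-backup.yml"
  gh workflow run "$workflow" --field dry_run=true --repo "$repo" || return 1
  # The new run takes a moment to register; the default delay is an arbitrary guess.
  sleep "${DRY_RUN_SETTLE_SECONDS:-10}"
  # Assumption: the newest run of this workflow is the one triggered above.
  run_id=$(gh run list --workflow "$workflow" --repo "$repo" --limit 1 \
    --json databaseId --jq '.[0].databaseId') || return 1
  # --exit-status makes gh exit non-zero when the run fails.
  gh run watch "$run_id" --repo "$repo" --exit-status
}
```

A zero exit status from run_dry_run_and_wait then stands in for the manual check that both workflow steps completed.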


Key rotation

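Rotate every 90 days or immediately after any suspected key exposure. IAM allows at most two access keys per user, so step 1 fails if a leftover second key already exists; delete the unused key first.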

# 1. Create a new key BEFORE deleting the old one (zero-downtime rotation)
NEW_KEY=$(aws iam create-access-key \
  --user-name raxx-freescout-backup \
  --region us-east-1 \
  --output json)

NEW_KEY_ID=$(echo "$NEW_KEY" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['AccessKey']['AccessKeyId'])")
NEW_SECRET=$(echo "$NEW_KEY" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['AccessKey']['SecretAccessKey'])")

# 2. Update GH Actions secrets with new values
gh secret set AWS_BACKUP_ACCESS_KEY_ID \
  --body "$NEW_KEY_ID" \
  --repo raxx-app/TradeMasterAPI

gh secret set AWS_BACKUP_SECRET_ACCESS_KEY \
  --body "$NEW_SECRET" \
  --repo raxx-app/TradeMasterAPI

# 3. Trigger a dry run and wait for it to succeed before deleting the old key
gh workflow run freescout-backup.yml \
  --field dry_run=true \
  --repo raxx-app/TradeMasterAPI

# 4. List existing keys and delete the OLD one (get OLD_KEY_ID from list)
aws iam list-access-keys --user-name raxx-freescout-backup --region us-east-1

# Replace <OLD_KEY_ID> with the key that is NOT $NEW_KEY_ID
aws iam delete-access-key \
  --user-name raxx-freescout-backup \
  --access-key-id <OLD_KEY_ID> \
  --region us-east-1

echo "Rotation complete — only new key active"

Verification: Run another dry run after deletion; confirm workflow succeeds.
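
Step 4 asks the operator to pick the old key out of the list by hand. As an illustration only, the lookup could be scripted as below; list_key_ids and old_key_ids are hypothetical helper names, and the sketch assumes $NEW_KEY_ID is still set from step 1.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

list_key_ids() {
  # One access key ID per line for the given IAM user.
  aws iam list-access-keys \
    --user-name "$1" \
    --query 'AccessKeyMetadata[].AccessKeyId' \
    --output text | tr '\t' '\n' | sed '/^$/d'
}

old_key_ids() {
  # Every key ID on user $1 except the newly created key $2.
  list_key_ids "$1" | grep -vxF -- "$2" || true
}
```

For example, old_key_ids raxx-freescout-backup "$NEW_KEY_ID" would print the ID to pass to delete-access-key. Review the output before deleting anything.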


Emergency: revoke all keys immediately

Use this if the key is suspected compromised. Backup will be disrupted until a new key is provisioned.

# List all keys for the user
aws iam list-access-keys --user-name raxx-freescout-backup --region us-east-1

# Delete each key
aws iam delete-access-key \
  --user-name raxx-freescout-backup \
  --access-key-id <KEY_ID> \
  --region us-east-1

echo "All keys revoked — backup workflow will fail until new key provisioned"

After revocation, follow the "Key generation" section above to issue a replacement.
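
During an incident, the "delete each key" step is easy to leave half done. A minimal sketch of a loop that removes every key on the user follows; revoke_all_keys is a hypothetical name, and the sketch assumes the caller has iam:ListAccessKeys and iam:DeleteAccessKey.

```shell
#!/bin/sh
# Sketch only: revoke_all_keys is a hypothetical helper name.

revoke_all_keys() {
  local user ids key_id
  user="$1"
  # Fetch the IDs first so a failed list call aborts instead of silently doing nothing.
  ids=$(aws iam list-access-keys \
    --user-name "$user" \
    --query 'AccessKeyMetadata[].AccessKeyId' \
    --output text) || return 1
  # Unquoted on purpose: key IDs are tab-separated and contain no whitespace.
  for key_id in $ids; do
    aws iam delete-access-key --user-name "$user" --access-key-id "$key_id" || return 1
    echo "deleted $key_id"
  done
}
```

Usage: revoke_all_keys raxx-freescout-backup. The backup workflow fails from that point until a replacement key is issued.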


Known failure modes

Failure mode: InvalidClientTokenId in GH Actions backup workflow

Symptom: Step "Configure AWS credentials" fails with: Error: The security token included in the request is invalid.

Cause: The GH Actions secret AWS_BACKUP_ACCESS_KEY_ID or AWS_BACKUP_SECRET_ACCESS_KEY is stale (the key was deleted or deactivated in IAM) or was never set. IAM user access keys do not expire on their own.

Fix:
1. Verify secrets exist: gh secret list --repo raxx-app/TradeMasterAPI | grep AWS_BACKUP
2. If missing or stale: rotate per the "Key rotation" section above.
3. Re-run: gh workflow run freescout-backup.yml --field dry_run=true --repo raxx-app/TradeMasterAPI
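
The first fix step can be turned into a check that names whichever secret is absent. This is a sketch, assuming the default gh secret list output with the secret name in the first column; missing_backup_secrets is a hypothetical name. It only detects missing secrets: GitHub never returns secret values, so a stale value cannot be detected this way.

```shell
#!/bin/sh
# Sketch only: missing_backup_secrets is a hypothetical helper name.

missing_backup_secrets() {
  local names name
  # Assumption: the first whitespace-separated column of the listing is the secret name.
  names=$(gh secret list --repo "$1" | awk '{print $1}') || return 2
  for name in AWS_BACKUP_ACCESS_KEY_ID AWS_BACKUP_SECRET_ACCESS_KEY; do
    # Print the name only when no line matches it exactly.
    printf '%s\n' "$names" | grep -qxF "$name" || echo "$name"
  done
}
```

Usage: missing_backup_secrets raxx-app/TradeMasterAPI prints nothing when both secrets are present.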

Failure mode: AccessDenied on SSM GetParameter

Symptom: Step "Read DB password from SSM" fails with AccessDenied.

Cause: The SSMReadFreescout statement in the inline policy raxx-freescout-backup-policy does not cover the parameter path, or the account ID in the ARN is wrong.

Fix:

# Check the actual inline policy
aws iam get-user-policy \
  --user-name raxx-freescout-backup \
  --policy-name raxx-freescout-backup-policy \
  --region us-east-1 \
  --query 'PolicyDocument'

# Verify the SSM parameter exists at the expected path
aws ssm get-parameter \
  --name "/raxx/freescout/db_password" \
  --region us-east-1 \
  --query 'Parameter.Name'

If the policy ARN has the wrong account ID, re-run terraform apply in terraform/freescout/ and regenerate the key.
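
Before re-applying Terraform, the IAM policy simulator can show whether the user's policy would allow the read at all. The sketch below is an assumption-laden illustration: simulate_ssm_read is a hypothetical name, the parameter is assumed to live in the same account as the IAM user, and the caller needs iam:SimulatePrincipalPolicy. The simulation covers only the identity policy, so a KMS restriction on a SecureString parameter would not show up here.

```shell
#!/bin/sh
# Sketch only: simulate_ssm_read is a hypothetical helper name.

simulate_ssm_read() {
  local user_arn account param_arn
  user_arn=$(aws iam get-user \
    --user-name raxx-freescout-backup \
    --query 'User.Arn' --output text) || return 1
  # The account ID is the fifth colon-separated field of the user ARN.
  account=$(printf '%s\n' "$user_arn" | cut -d: -f5)
  # Assumption: the parameter is in the same account and in us-east-1.
  param_arn="arn:aws:ssm:us-east-1:${account}:parameter/raxx/freescout/db_password"
  aws iam simulate-principal-policy \
    --policy-source-arn "$user_arn" \
    --action-names ssm:GetParameter \
    --resource-arns "$param_arn" \
    --query 'EvaluationResults[0].EvalDecision' --output text
}
```

A result of allowed points away from the IAM policy and toward the parameter path or account; implicitDeny or explicitDeny points back at the policy in terraform/freescout/iam_backup.tf.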

Failure mode: AccessDenied on S3 PutObject

Symptom: Tier 2 backup step fails with AccessDenied writing to s3://raxx-support-attachments/db-backups/freescout/.

Cause: The KMS key policy for alias/raxx-support-attachments does not include raxx-freescout-backup as an allowed principal, or the bucket policy blocks non-role principals.

Fix: Check the bucket policy and KMS key policy:

# Bucket policy
aws s3api get-bucket-policy \
  --bucket raxx-support-attachments \
  --region us-east-1 \
  --query Policy --output text | python3 -m json.tool

# KMS key policy
aws kms get-key-policy \
  --key-id alias/raxx-support-attachments \
  --policy-name default \
  --region us-east-1 \
  --query Policy --output text | python3 -m json.tool

The IAM policy uses kms:ViaService condition rather than requiring explicit KMS key principal entry — the backup workflow uses SSE-KMS via S3, so the S3 service calls KMS on behalf of the IAM user. If the bucket policy has an explicit Deny for non-role principals, add an Allow statement for arn:aws:iam::<account>:user/raxx/freescout/raxx-freescout-backup on the db-backups/freescout/* resource.


IAM policy reference

Full policy is defined in terraform/freescout/iam_backup.tf.

| Sid | Scope | Actions |
|---|---|---|
| SSMReadFreescout | /raxx/freescout/* parameters | ssm:GetParameter |
| S3BackupWrite | raxx-support-attachments/db-backups/freescout/* | PutObject, GetObject, HeadObject, DeleteObject |
| S3BackupBucket | raxx-support-attachments (with prefix condition) | ListBucket, GetBucketLocation, CreateBucket |
| S3HeadBucket | * (IAM-level, not bucket-level) | ListAllMyBuckets |
| KMSBackup | * via kms:ViaService condition | GenerateDataKey, Decrypt, DescribeKey |
| LightsailSnapshotManage | * | CreateInstanceSnapshot, GetInstanceSnapshots, DeleteInstanceSnapshot |

Escalation

Escalate to the operator when:
- Key rotation fails and the backup has not run for more than 24 hours
- AccessDenied on Lightsail snapshot operations and terraform apply has not resolved it
- Suspected key compromise (revoke immediately, then escalate)