Rotation SOP — AWS IAM Access Key
Mode: programmatic
Last validated: 2026-04-24 UTC
Validation method: sandbox-rotation (safe: create a test IAM user, rotate, delete)
Average duration: 4m (programmatic; longer if dyno restarts are required)
Required role: ops (superadmin for IAM users with elevated privileges)
Applies to: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY pairs for IAM users used by Raxx infrastructure (e.g., the Lightsail vault host operator user, S3 backup writer, etc.). Each IAM user has its own pair. This SOP is per-user.
Confirmed: AWS IAM supports true programmatic self-rotation. This is the cleanest of all our rotation paths — pure CLI/API, no portal interaction, supports overlap (max 2 keys per user) for atomic swaps.
When to run
- Scheduled rotation (cadence: every 30 days — automation makes this cheap)
- Operator-initiated (suspected compromise, off-cycle)
- After incident (employee offboarding, leaked-key recovery)
Prerequisites
- [ ] AWS CLI installed and authenticated as a different IAM identity (NOT the user being rotated) that has iam:CreateAccessKey, iam:UpdateAccessKey, iam:DeleteAccessKey, and iam:ListAccessKeys on the target user
- [ ] OR: rotation is being run by the user being rotated, in which case ensure that user has iam:*AccessKey* on themselves (less common; less recommended)
- [ ] Existing key pair in Infisical with history
- [ ] Downstream consumer list (Heroku apps, GitHub Actions, vault host services)
- [ ] Confirm the IAM user currently has fewer than 2 access keys. The limit of 2 counts Inactive keys as well as Active ones; if the user is already at 2, delete one before step 2 (see Known gotchas)
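The key-slot check in the last prerequisite can be scripted. The following is a minimal sketch, assuming the AWS CLI is on PATH and already authenticated; `key_slot_free` is a hypothetical helper name, not existing Raxx tooling. It uses the CLI's built-in `--query` option so it does not depend on jq.

```shell
# Hypothetical helper: succeeds only if the IAM user has a free access-key slot.
# IAM allows at most 2 access keys per user, and Inactive keys count.
key_slot_free() {
  user="$1"
  count=$(aws iam list-access-keys --user-name "$user" \
    --query 'length(AccessKeyMetadata)' --output text) || return 2
  [ "$count" -lt 2 ]
}

# Usage:
#   key_slot_free "raxx-vault-operator" || echo "No free slot: delete a key first" >&2
```

A return code of 2 means the list call itself failed (for example, missing iam:ListAccessKeys), which is worth distinguishing from "no free slot".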
Steps
1. Pre-rotation checks
# Confirm we are operating on the right user
TARGET_USER="raxx-vault-operator" # example
aws iam list-access-keys --user-name "$TARGET_USER" \
| jq '.AccessKeyMetadata[] | {AccessKeyId, Status, CreateDate}'
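# Expect: 1 Active key (the OLD one). If 2 keys are listed, delete the unused one first (see step 7b).
# Confirm the OLD key still works (sanity check).
# OLD_KEY_ID and OLD_SECRET must already hold the current pair (the one stored in Infisical).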
AWS_ACCESS_KEY_ID="$OLD_KEY_ID" AWS_SECRET_ACCESS_KEY="$OLD_SECRET" \
aws sts get-caller-identity | jq '.Arn'
# Expect: ARN matching the target user
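The "ARN matching the target user" expectation can be checked mechanically instead of by eye. This is a sketch only: `arn_is_user` is a hypothetical helper, and it assumes the standard `aws` partition (not `aws-cn` or `aws-us-gov`).

```shell
# Hypothetical helper: succeeds if a caller ARN belongs to the given IAM user.
# IAM user ARNs look like arn:aws:iam::<account>:user/<optional/path/><name>.
arn_is_user() {
  arn="$1"
  user="$2"
  case "$arn" in
    arn:aws:iam::*:user/"$user" | arn:aws:iam::*:user/*/"$user") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   arn=$(AWS_ACCESS_KEY_ID="$OLD_KEY_ID" AWS_SECRET_ACCESS_KEY="$OLD_SECRET" \
#     aws sts get-caller-identity --query Arn --output text)
#   arn_is_user "$arn" "$TARGET_USER" || { echo "Wrong identity: $arn" >&2; exit 1; }
```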
2. Generate the new credential
NEW_KEY_JSON=$(aws iam create-access-key --user-name "$TARGET_USER")
NEW_KEY_ID=$(echo "$NEW_KEY_JSON" | jq -r '.AccessKey.AccessKeyId')
NEW_SECRET=$(echo "$NEW_KEY_JSON" | jq -r '.AccessKey.SecretAccessKey')
# AWS now has 2 active keys for this user. Both work.
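Because the secret access key is only returned once, it may be worth failing fast if the capture above did not work. A minimal sketch, with `check_new_key` as a hypothetical helper name; it treats the literal string `null` as missing because that is what `jq -r` prints for an absent field.

```shell
# Hypothetical guard: abort if create-access-key output was not captured.
check_new_key() {
  key_id="$1"
  secret="$2"
  [ -n "$key_id" ] && [ "$key_id" != "null" ] || { echo "missing AccessKeyId" >&2; return 1; }
  [ -n "$secret" ] && [ "$secret" != "null" ] || { echo "missing SecretAccessKey" >&2; return 1; }
}

# Usage:
#   check_new_key "$NEW_KEY_ID" "$NEW_SECRET" || exit 1
```

If the guard fails after the key was created, the new key still occupies a slot; delete it by its ID (visible in list-access-keys) before retrying.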
3. Validate the new credential
AWS_ACCESS_KEY_ID="$NEW_KEY_ID" AWS_SECRET_ACCESS_KEY="$NEW_SECRET" \
aws sts get-caller-identity | jq '.Arn'
# Expect: same ARN as step 1
Validate against a representative consumer call (e.g., aws s3 ls if the user is an S3 user):
AWS_ACCESS_KEY_ID="$NEW_KEY_ID" AWS_SECRET_ACCESS_KEY="$NEW_SECRET" \
aws s3 ls s3://raxx-backups/ | head
4. Store in Infisical
infisical secrets set AWS_ACCESS_KEY_ID="$NEW_KEY_ID" \
--projectId="$INFISICAL_PROJECT_ID" --env=prod
infisical secrets set AWS_SECRET_ACCESS_KEY="$NEW_SECRET" \
--projectId="$INFISICAL_PROJECT_ID" --env=prod
5. Propagate to downstream consumers
| Consumer | How |
|---|---|
| Heroku apps | heroku config:set AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=... -a <app> |
| GitHub Actions | gh secret set AWS_ACCESS_KEY_ID -b ...; gh secret set AWS_SECRET_ACCESS_KEY -b ... |
| AWS Lightsail vault host | SSH to host, update systemd EnvironmentFile or /etc/raxx/credentials.env, restart service |
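The Heroku and GitHub Actions rows of the table can be wrapped so the values come from the step 2 variables rather than being pasted by hand. This is a sketch under assumptions: the function names are hypothetical, the app and repository names are supplied by the caller, and `NEW_KEY_ID` / `NEW_SECRET` are still set in the current shell.

```shell
# Hypothetical wrappers; both expect NEW_KEY_ID and NEW_SECRET from step 2.
propagate_heroku() {  # $1 = Heroku app name
  # Output is discarded because config:set prints the values it sets.
  heroku config:set \
    AWS_ACCESS_KEY_ID="$NEW_KEY_ID" \
    AWS_SECRET_ACCESS_KEY="$NEW_SECRET" \
    -a "$1" >/dev/null
}

propagate_github() {  # $1 = owner/repo
  gh secret set AWS_ACCESS_KEY_ID --repo "$1" --body "$NEW_KEY_ID" &&
    gh secret set AWS_SECRET_ACCESS_KEY --repo "$1" --body "$NEW_SECRET"
}

# Usage:
#   propagate_heroku raxx-api-prod
#   propagate_github "<owner>/<repo>"
```

Setting both Heroku vars in a single config:set call means one release and one dyno restart instead of two.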
6. Verify downstream
Wait for AWS IAM eventual consistency (typically <60s for a new key to be globally usable):
sleep 60
# Trigger the consumer's AWS-using workload
heroku run --app raxx-api-prod python -m scripts.s3_backup_dry_run
# Expect: no AccessDenied / InvalidClientTokenId
For each downstream service, verify a representative call.
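As an alternative to a fixed sleep, the wait for eventual consistency can be expressed as a bounded retry. A minimal sketch: `wait_for_key` and `POLL_INTERVAL` are hypothetical names, and the defaults (12 attempts, 5 seconds apart) are illustrative values chosen to cover roughly the 60s window mentioned above.

```shell
# Hypothetical helper: retry sts get-caller-identity with the given key pair
# until it succeeds or the attempts run out.
wait_for_key() {  # $1 key id, $2 secret, $3 max attempts (default 12)
  key_id="$1"
  secret="$2"
  attempts="${3:-12}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if AWS_ACCESS_KEY_ID="$key_id" AWS_SECRET_ACCESS_KEY="$secret" \
         aws sts get-caller-identity >/dev/null 2>&1; then
      return 0
    fi
    sleep "${POLL_INTERVAL:-5}"
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_key "$NEW_KEY_ID" "$NEW_SECRET" || { echo "new key never became valid" >&2; exit 1; }
```

A successful STS call only shows that STS accepts the key; the per-consumer checks in this step are still needed.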
7. Deactivate the old credential (NOT delete yet)
Per AWS best practice, deactivate first, observe (at least 24h; see step 7b), then delete:
aws iam update-access-key \
--access-key-id "$OLD_KEY_ID" \
--status Inactive \
--user-name "$TARGET_USER"
Confirm consumers still work (some may have cached credentials briefly):
sleep 60
heroku run --app raxx-api-prod python -m scripts.s3_backup_dry_run
# Expect: still passing — new key is in use.
7b. Delete the old credential (after observation period)
After at least 24h of observation with the old key Inactive (no surprise dependency surfaces), delete:
aws iam delete-access-key \
--access-key-id "$OLD_KEY_ID" \
--user-name "$TARGET_USER"
For automated/programmatic rotations on tight cadences (e.g., 30-day rotation), deactivate-then-delete can collapse to a single rotation cycle by deleting the previous-Inactive key at the start of the next rotation:
# At the start of each rotation, delete every Inactive key (note: this does not filter by age)
aws iam list-access-keys --user-name "$TARGET_USER" \
| jq -r '.AccessKeyMetadata[] | select(.Status=="Inactive") | .AccessKeyId' \
| xargs -I {} aws iam delete-access-key --access-key-id {} --user-name "$TARGET_USER"
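Before deleting (or deactivating) the old key, its last-used record can be inspected to see whether any consumer is still relying on it. A minimal sketch using `get-access-key-last-used`; `show_last_used` is a hypothetical helper name.

```shell
# Hypothetical helper: print last-used date, service, and region for a key.
# The date prints as "None" if the key has never been used.
show_last_used() {  # $1 = access key id
  aws iam get-access-key-last-used --access-key-id "$1" \
    --query 'AccessKeyLastUsed.[LastUsedDate,ServiceName,Region]' --output text
}

# Usage:
#   show_last_used "$OLD_KEY_ID"
```

If the reported date is later than the time consumers were switched in step 5, some consumer is probably still using the old key; the service and region fields help narrow down which one.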
8. Audit log entry
action: secret.rotate.completed
actor: <admin_id>
context: {
"secret_name": "AWS_ACCESS_KEY_ID",
"iam_user": "<user>",
"old_key_id": "<...>",
"new_key_id": "<...>",
"method": "programmatic"
}
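The entry above can be filled in from the rotation variables. This is a sketch only: `audit_entry` is a hypothetical helper that prints the entry to stdout, and how it reaches the audit log is outside this SOP. It records key IDs only, never the secret access key.

```shell
# Hypothetical helper: print the audit entry with values substituted.
audit_entry() {  # $1 actor, $2 IAM user, $3 old key id, $4 new key id
  cat <<EOF
action: secret.rotate.completed
actor: $1
context: {
  "secret_name": "AWS_ACCESS_KEY_ID",
  "iam_user": "$2",
  "old_key_id": "$3",
  "new_key_id": "$4",
  "method": "programmatic"
}
EOF
}

# Usage:
#   audit_entry "<admin_id>" "$TARGET_USER" "$OLD_KEY_ID" "$NEW_KEY_ID"
```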
Rollback
The old key is Inactive, not deleted, until step 7b. To roll back:
aws iam update-access-key \
--access-key-id "$OLD_KEY_ID" \
--status Active \
--user-name "$TARGET_USER"
Then revert Heroku/CI config vars to the old pair from Infisical history. The user now has 2 active keys; deactivate the new one to force the consumer back to the old:
aws iam update-access-key \
--access-key-id "$NEW_KEY_ID" \
--status Inactive \
--user-name "$TARGET_USER"
Investigate, then redo from step 2.
After step 7b (delete), the old key is unrecoverable.
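After a rollback, it can help to confirm that the key states are what the procedure expects: old key Active, new key Inactive. A minimal sketch; `key_status` and `rollback_complete` are hypothetical helper names.

```shell
# Hypothetical helper: print the status (Active or Inactive) of one key.
key_status() {  # $1 IAM user, $2 access key id
  aws iam list-access-keys --user-name "$1" \
    --query "AccessKeyMetadata[?AccessKeyId=='$2'].Status" --output text
}

# Hypothetical check: succeeds only if the rollback end state is in place.
rollback_complete() {  # $1 IAM user, $2 old key id, $3 new key id
  [ "$(key_status "$1" "$2")" = "Active" ] &&
    [ "$(key_status "$1" "$3")" = "Inactive" ]
}

# Usage:
#   rollback_complete "$TARGET_USER" "$OLD_KEY_ID" "$NEW_KEY_ID" || echo "rollback incomplete" >&2
```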
Vendor doc references
- AWS IAM access keys overview: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html
- Update/rotate access keys: https://docs.aws.amazon.com/IAM/latest/UserGuide/id-credentials-access-keys-update.html
- CLI reference:
  - https://docs.aws.amazon.com/cli/latest/reference/iam/create-access-key.html
  - https://docs.aws.amazon.com/cli/latest/reference/iam/update-access-key.html
  - https://docs.aws.amazon.com/cli/latest/reference/iam/delete-access-key.html
  - https://docs.aws.amazon.com/cli/latest/reference/iam/get-access-key-last-used.html
Known gotchas
- Max 2 keys per user, counting both Active and Inactive. If a user already has 2 keys when rotation starts, you must delete one before creating a new one. Build the rotation flow around this limit.
- Eventual consistency. A newly-created key may take up to 60s to be globally valid. Step 6 waits explicitly.
- Inactive ≠ Deleted. An Inactive key cannot authenticate but still occupies one of the 2 slots. Always delete before next rotation cycle.
- get-access-key-last-used is the right diagnostic before delete: it confirms no consumer is still hitting the old key.
- Secret access key is shown once at creation. Capture in step 2 immediately.
- Console (web) rotation is also possible at https://console.aws.amazon.com/iam/ but offers no advantage over CLI; prefer CLI for auditability.
- CloudTrail logs every IAM rotation. Useful for post-rotation audit verification.
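For the CloudTrail point above, recent rotation events can be pulled from the CLI. A sketch under assumptions: `rotation_events` is a hypothetical helper, the caller has cloudtrail:LookupEvents, and IAM events are looked up in us-east-1 (where IAM, as a global service, is assumed to record them). Event history only covers roughly the last 90 days.

```shell
# Hypothetical helper: list recent CloudTrail events for one IAM event name.
rotation_events() {  # $1 = event name: CreateAccessKey, UpdateAccessKey, or DeleteAccessKey
  aws cloudtrail lookup-events \
    --region us-east-1 \
    --lookup-attributes AttributeKey=EventName,AttributeValue="$1" \
    --max-results 10 \
    --query 'Events[].[EventTime,Username,EventName]' --output text
}

# Usage:
#   for ev in CreateAccessKey UpdateAccessKey DeleteAccessKey; do rotation_events "$ev"; done
```

The lookup is run once per event name because lookup-events accepts a single lookup attribute per call.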