Add retry logic for GetCallerIdentity validation after role assumption #1681
Description
Describe the feature
Add retry logic around the STS GetCallerIdentity call. This could use the same logic as the assumeRole and getIDToken calls.
Use Case
When assuming a newly created IAM role via OIDC, the action can fail during the internal GetCallerIdentity step (used for exporting the account ID) with:
Error: The security token included in the request is invalid.
This appears to be caused by IAM eventual consistency when roles are created immediately before they are assumed. The role assumption succeeds, but the validation call made immediately afterward sometimes fails because the credentials are not yet usable across all AWS endpoints.
Adding retry logic around this validation call would make the action more resilient in these scenarios.
Proposed Solution
I can open a PR, but wanted to get design feedback first. A general solution should allow retries to be applied to the GetCallerIdentity call, either through the retry mechanics already used within this action or via STS client configuration.
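To make the first option concrete, here is a minimal, self-contained sketch of a retry-with-backoff wrapper. It is not the action's actual retryAndBackoff implementation: the names retryWithBackoff, makeFlakyGetCallerIdentity, and baseDelayMs are hypothetical, the placeholder account ID is made up, and maxRetries is assumed to mean the total number of attempts.

```typescript
// Hypothetical helper for illustration; NOT the action's real retryAndBackoff.
// Assumption: maxRetries is the total number of attempts allowed.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  isRetryEnabled: boolean,
  maxRetries: number,
  baseDelayMs = 50,
): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      // Awaiting here is what lets a rejected promise reach the catch block.
      return await fn();
    } catch (err) {
      attempt += 1;
      if (!isRetryEnabled || attempt >= maxRetries) {
        throw err;
      }
      // Exponential backoff with random jitter before the next attempt.
      const delayMs = Math.random() * baseDelayMs * 2 ** attempt;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Stand-in for the GetCallerIdentity call: fails a fixed number of times
// (simulating credentials that are not yet usable), then succeeds.
function makeFlakyGetCallerIdentity(
  failuresBeforeSuccess: number,
): () => Promise<string> {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls <= failuresBeforeSuccess) {
      throw new Error('The security token included in the request is invalid.');
    }
    return '123456789012'; // placeholder account ID, not a real account
  };
}
```

With this sketch, a call that fails twice and then succeeds resolves when maxRetries is 3, while the same call with retries disabled rejects on the first failure.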
Example
if (outputEnvCredentials) {
  await retryAndBackoff(
    async () => {
      // Await the call so a failed GetCallerIdentity rejects inside the retry wrapper
      await exportAccountId(credentialsClient, maskAccountId);
    },
    !disableRetry,
    maxRetries,
  );
}
Other Information
Action version
v5.1.1
Observed Behavior
Workflow logs show the role assumption succeeding:
Assuming role with OIDC
Assuming role with OIDC
Assuming role with OIDC
Authenticated as assumedRoleId AROAXXXXXXXX:session-name
Error: The security token included in the request is invalid.
The Authenticated as assumedRoleId log indicates the AssumeRoleWithWebIdentity call succeeded and credentials were returned. However, the action fails immediately afterward when calling GetCallerIdentity (via exportAccountId).
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change