If you’ve worked with AWS Cognito in your application, you’ve probably come across refresh tokens. They’re essential for keeping your users authenticated without forcing them to log in repeatedly. But once you dive into refresh token rotation, things can get tricky, especially when race conditions sneak in.
In this blog, we’ll break down:
- What refresh token rotation is in Cognito
- Why race conditions happen in real-world apps
- Practical strategies to solve them
Let’s dive in.
What is Refresh Token Rotation?
- A refresh token is a long-lived token issued by Cognito. It is used to obtain a new access token when the current one expires, so the user stays logged in without re-entering credentials.
- Normally, a refresh token stays valid until it expires or is revoked. But with refresh token rotation, AWS Cognito takes it one step further:
- Each time you use a refresh token, Cognito issues a new pair of access and refresh tokens.
- The old refresh token becomes invalid, either immediately or after a short grace period if one is configured on the app client.
- This prevents replay attacks if an attacker steals a refresh token.
So, by implementing this approach, our security does get tighter. Everything looks good on paper. But here’s the catch: in real-world distributed systems, this introduces new challenges.
Why do race conditions happen in real-world apps?
Let's take an example:
- A user opens your application in two browser tabs.
- The access token expires in both tabs at the same time.
- Each tab tries to refresh using the same refresh token.
- Cognito accepts the first request and issues a new token pair.
- The second request fails because the original refresh token is now invalid.
- Now one of the tabs gets logged out unexpectedly.
This is the refresh token race condition.
[Figure: example of the race condition in refresh token rotation]
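The two-tab scenario can also be reproduced with a small simulation. This is only a sketch: `RotatingTokenIssuer` is a hypothetical in-memory stand-in for a provider with strict rotation and no grace period, not Cognito's actual API, and the two "tabs" run one after the other rather than truly in parallel.

```python
import secrets


class InvalidRefreshToken(Exception):
    """Raised when a refresh token has already been rotated out."""


class RotatingTokenIssuer:
    """Toy identity provider with strict rotation (hypothetical, not Cognito)."""

    def __init__(self) -> None:
        self._valid_refresh_token = secrets.token_hex(8)

    @property
    def current_refresh_token(self) -> str:
        return self._valid_refresh_token

    def refresh(self, refresh_token: str) -> dict:
        if refresh_token != self._valid_refresh_token:
            raise InvalidRefreshToken("refresh token was already used")
        # Rotate: the presented token stops working once a new one is issued.
        self._valid_refresh_token = secrets.token_hex(8)
        return {
            "access_token": secrets.token_hex(8),
            "refresh_token": self._valid_refresh_token,
        }


def simulate_two_tabs() -> list:
    issuer = RotatingTokenIssuer()
    shared = issuer.current_refresh_token  # both tabs hold the same token
    outcomes = []
    for tab in ("tab A", "tab B"):
        try:
            issuer.refresh(shared)
            outcomes.append((tab, "refreshed"))
        except InvalidRefreshToken:
            outcomes.append((tab, "logged out"))
    return outcomes
```

Whichever tab reaches the issuer first wins; the other is rejected even though the user did nothing wrong.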
Note:
Always remember to keep access tokens short-lived (5-15 min).
Shorten Access Token Lifetime
By keeping access tokens short-lived (e.g., 5-15 minutes instead of 1 hour):
- The impact of a compromised token is reduced.
- Refreshes happen more frequently, so each access token is in use for less time. This also means more refresh calls, so the race handling described below matters more, not less.
- Security is stronger while user experience stays smooth (since refreshes are silent).
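As a rough sketch, the lifetime is a setting on the Cognito app client. The helper below (its name is made up for this example) only builds the relevant parameters. The assumption is that they would be passed, together with `UserPoolId` and `ClientId`, to `update_user_pool_client` on boto3's `cognito-idp` client; that call is not made here.

```python
def access_token_validity_settings(minutes: int) -> dict:
    """Build the token-validity part of a Cognito app client update."""
    # Cognito accepts access token lifetimes between 5 minutes and 1 day.
    if not 5 <= minutes <= 24 * 60:
        raise ValueError("access token validity must be between 5 minutes and 1 day")
    return {
        "AccessTokenValidity": minutes,
        # Without an explicit unit, Cognito interprets the value as hours.
        "TokenValidityUnits": {"AccessToken": "minutes"},
    }


settings = access_token_validity_settings(15)
```

One thing to watch for: `update_user_pool_client` resets any app client setting you leave out to its default, so in practice these values should be merged into the client's full existing configuration.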
How to Solve Refresh Token Race Conditions
Here are a few approaches to handle this gracefully.
Accept Both Current and Previous iat
Each access token has an iat (issued at) timestamp. You can store the latest and previous iat values in your database/session. When a request arrives with an access token:
- Compare its iat against the stored values.
- Accept if it matches either the latest or previous one.
This ensures only one refresh chain is valid, while allowing a small overlap so multiple requests don’t immediately fail.
This pattern prevents users from being logged out if two refreshes happen back-to-back.
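Here is a minimal sketch of that check, assuming the token signature has already been verified and the `iat` claim extracted as an integer. `SessionRecord` and its method names are made up for this example; in a real app the two values would live in your database or session store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    """Tracks the two most recent access token iat values for one session."""

    latest_iat: Optional[int] = None
    previous_iat: Optional[int] = None

    def record_refresh(self, new_iat: int) -> None:
        """Shift the window when a refresh issues a new access token."""
        if new_iat == self.latest_iat:
            return  # same token seen again; nothing to shift
        self.previous_iat = self.latest_iat
        self.latest_iat = new_iat

    def accepts(self, token_iat: int) -> bool:
        """Accept tokens from the latest refresh or the one just before it."""
        return token_iat in (self.latest_iat, self.previous_iat)
```

Since iat has one-second resolution, two tokens issued within the same second look identical to this check, so it is best treated as a tolerance window rather than a unique token identifier.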
Add a Retry Mechanism
If a refresh fails because of invalid token errors:
- Have the client retry using the latest stored tokens, since another tab or request may already have saved a fresh pair.
- Optionally, fall back to a full login only if retries fail.
This makes your app more resilient instead of abruptly logging the user out.
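A retry loop along these lines might look like the sketch below. Everything here is illustrative: `load_refresh_token` stands for however your client reads its stored token, `refresh_call` for whatever function performs the refresh, and `InvalidTokenError` for the error it raises when the token is rejected.

```python
import time
from typing import Callable


class InvalidTokenError(Exception):
    """Hypothetical error raised by refresh_call when the token is rejected."""


class LoginRequired(Exception):
    """Raised when every retry failed and the user must sign in again."""


def refresh_with_retry(
    load_refresh_token: Callable[[], str],
    refresh_call: Callable[[str], dict],
    attempts: int = 3,
    delay_seconds: float = 0.2,
) -> dict:
    for attempt in range(attempts):
        # Re-read the stored token each time: another tab may have rotated it.
        token = load_refresh_token()
        try:
            return refresh_call(token)
        except InvalidTokenError:
            if attempt < attempts - 1:
                time.sleep(delay_seconds)  # give the other refresh time to land
    raise LoginRequired("refresh failed after retries")
```

The key detail is re-reading the token inside the loop. Retrying with the same stale token would fail every time.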
Use a Database Lock or Transaction
If multiple refresh requests can hit your backend at the same time:
- Use a row lock in PostgreSQL (or your DB of choice).
- Ensure only one request updates the token record at a time.
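The sketch below shows the idea using SQLite from Python's standard library so that it runs anywhere; it is not PostgreSQL code. In PostgreSQL the equivalent would be a `SELECT ... FOR UPDATE` on the user's row inside a transaction. The `refresh_tokens` table, `refresh_once`, and `do_refresh` (the function that actually calls your identity provider) are all assumptions made for this example.

```python
import sqlite3
from typing import Callable


def open_demo_db() -> sqlite3.Connection:
    # isolation_level=None means autocommit, so the explicit BEGIN/COMMIT
    # statements in refresh_once control the transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE refresh_tokens (user_id TEXT PRIMARY KEY, token TEXT)")
    conn.execute("INSERT INTO refresh_tokens VALUES ('user-1', 'rt-old')")
    return conn


def refresh_once(conn: sqlite3.Connection, user_id: str, presented: str,
                 do_refresh: Callable[[str], str]) -> str:
    """Refresh only if `presented` is still current; otherwise reuse the stored token."""
    # BEGIN IMMEDIATE takes the write lock up front, so a concurrent caller
    # waits here until this transaction finishes.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT token FROM refresh_tokens WHERE user_id = ?", (user_id,)
        ).fetchone()
        if row is None:
            raise KeyError(user_id)
        stored = row[0]
        if stored == presented:
            # This caller won the race: perform the real refresh and save it.
            stored = do_refresh(presented)
            conn.execute(
                "UPDATE refresh_tokens SET token = ? WHERE user_id = ?",
                (stored, user_id),
            )
        conn.execute("COMMIT")
        return stored
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

The second caller gets the token the first caller already stored instead of an error. Two simplifications are worth noting: the lock is held while `do_refresh` runs, and any older token is answered with the stored one, whereas a stricter version would only do that for the immediately preceding token.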
Use Redis for Token State Tracking (Advanced)
For apps running at scale across multiple servers, Redis can be a game-changer:
- Store the latest valid access token (or iat) in Redis.
- Use atomic operations to ensure only one refresh succeeds at a time.
- Redis automatically handles expiry if you set token TTLs.
- Since Redis is centralized and fast, all your backend nodes share the same token state.
This approach minimizes race conditions in distributed environments and is often considered a best-practice strategy for production-grade systems.
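One way to get "only one refresh at a time" is a per-user lock built on Redis `SET` with the `NX` and `EX` options. The sketch below assumes a client exposing `set(name, value, ex=..., nx=...)` and `delete(name)`, which matches redis-py. To keep the example runnable without a server it ships with `InMemoryRedis`, a made-up stand-in covering only those two calls. `try_refresh` and the key naming are also assumptions.

```python
import time
from typing import Callable, Optional


class InMemoryRedis:
    """Tiny stand-in for the subset of redis-py used below (local runs only)."""

    def __init__(self) -> None:
        self._data = {}

    def set(self, name, value, ex=None, nx=False):
        expires_at = self._data.get(name, (None, None))[1]
        expired = expires_at is not None and expires_at <= time.monotonic()
        if nx and name in self._data and not expired:
            return None  # redis-py also returns None when NX blocks the write
        deadline = time.monotonic() + ex if ex is not None else None
        self._data[name] = (value, deadline)
        return True

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def try_refresh(client, user_id: str, do_refresh: Callable[[], dict],
                lock_ttl_seconds: int = 10) -> Optional[dict]:
    """Run do_refresh only if this caller wins the per-user lock."""
    lock_key = f"refresh-lock:{user_id}"
    # SET with NX is atomic: exactly one concurrent caller gets a truthy result.
    # EX makes the lock expire on its own if this process dies mid-refresh.
    if not client.set(lock_key, "1", nx=True, ex=lock_ttl_seconds):
        return None  # someone else is refreshing; wait, then re-read the tokens
    try:
        return do_refresh()
    finally:
        client.delete(lock_key)
```

A caller that gets `None` back would typically wait briefly and then read the freshly stored tokens. Note that the unconditional `delete` is a simplification: if the refresh outlives the TTL, it could release a lock now held by another caller, which production locks avoid by storing a unique value and deleting only on a match.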
Conclusion
Refresh token rotation in AWS Cognito is awesome for security, but it comes with a catch: if you don’t handle it carefully, users can end up getting logged out for no good reason. That’s where the strategies we talked about come in.
For smaller apps, something simple like keeping track of the current and previous iat values (and maybe adding a retry) usually does the trick. But if your app is running across multiple servers or you’re working at scale, bringing in Redis or another shared store makes life a lot easier—and keeps your users happy.
Always remember to keep access tokens short-lived (5-15 min) and plan for token rotation from the start. Implemented correctly, this gives you strong security and a smooth, frustration-free experience for your users.