"Just refresh the token" looks like a five-line problem. For a lot of agents it quietly isn't — and the failure only shows up in production, under load, after a human took a few seconds to click approve.
Two behaviors of real OAuth providers break the naive approaches:
- Short-lived access tokens. They expire in minutes.
- Rotating refresh tokens. Each refresh invalidates the refresh token you just used and issues a new one. GitHub Apps, Google's one-time-use tokens, Okta, and Auth0 with rotation enabled all do this.
The repro
Here's a deliberately honest test: a mock OAuth server with a 2-second access-token TTL, rotating refresh tokens, and real latency on the token endpoint. An agent fires eight tool calls at once (a normal thing for an agent to do), each asking for a token. The "obvious" fix, where every call reads the stored refresh token and refreshes, is row D in this run:
$ node run.mjs
A) naive (hold access token across pause): resource → 401 token_expired
B) nominee (refresh at call time): before → 200 OK | after pause → 200 OK
C) nominee + 8 concurrent calls: network refreshes = 1 (single-flight) | resource 200s = 8/8
D) refresh WITHOUT single-flight (8 concurrent): network refreshes = 8 | invalid_grant failures = 7/8
Row D is the bug. Eight calls each read the same stored refresh token and refresh independently. The first one to land rotates it — and that invalidates the token the other seven are still holding. Seven come back invalid_grant. Worse, depending on who writes last, your stored refresh token can end up pointing at a dead value, and now the connection is broken until the user re-consents.
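The Row D race can be reproduced without a network. What follows is a minimal sketch, not the code in run.mjs: `createRotatingServer`, `naiveFanOut`, and the `rt-N` / `at-N` token strings are hypothetical names, and the server is an in-memory stand-in that models only rotation and latency.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Hypothetical in-memory stand-in for a token endpoint with rotating refresh tokens.
function createRotatingServer(latencyMs = 10) {
  let current = 'rt-0' // the only refresh token the server will accept
  let issued = 0
  return {
    async refresh(refreshToken) {
      await sleep(latencyMs) // simulated network latency
      if (refreshToken !== current) {
        throw Object.assign(new Error('invalid_grant'), { code: 'invalid_grant' })
      }
      issued += 1
      current = `rt-${issued}` // rotation: the presented token is dead from here on
      return { accessToken: `at-${issued}`, refreshToken: current }
    },
  }
}

// Naive fan-out: every call reads the stored refresh token and refreshes on its own.
async function naiveFanOut(callCount = 8) {
  const server = createRotatingServer()
  const store = { refreshToken: 'rt-0' }

  const refreshOnce = async () => {
    const presented = store.refreshToken // all calls read 'rt-0' before any response lands
    const tokens = await server.refresh(presented)
    store.refreshToken = tokens.refreshToken
    return tokens.accessToken
  }

  const results = await Promise.allSettled(
    Array.from({ length: callCount }, () => refreshOnce()),
  )
  return {
    ok: results.filter((r) => r.status === 'fulfilled').length,
    invalidGrant: results.filter(
      (r) => r.status === 'rejected' && r.reason.code === 'invalid_grant',
    ).length,
  }
}

naiveFanOut().then((result) => console.log(result))
```

With eight calls this logs `{ ok: 1, invalidGrant: 7 }`, the same 7/8 failure count as row D. The sketch does not model the stale-write case where the stored refresh token ends up dead.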
Row A is the simpler cousin: grab the access token up front, pause for human approval, act after the pause — by then it's expired.
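Row A fits in a few lines too. This is a sketch under assumptions: `createTokenSource`, `callResource`, and `runApprovalFlow` are hypothetical names, and the clock is a plain counter so the pause needs no real waiting.

```javascript
// Hypothetical token source with a short TTL; `now` is injected so time can be faked.
function createTokenSource({ ttlMs, now }) {
  let issued = 0
  return {
    issue() {
      issued += 1
      return { value: `at-${issued}`, expiresAt: now() + ttlMs }
    },
  }
}

// Stand-in for a protected resource: 200 while the token is live, 401 once it has expired.
function callResource(token, now) {
  return now() < token.expiresAt ? 200 : 401
}

function runApprovalFlow({ ttlMs = 2000, pauseMs = 5000 } = {}) {
  let clock = 0
  const now = () => clock
  const source = createTokenSource({ ttlMs, now })

  const held = source.issue() // naive: token resolved before the pause
  clock += pauseMs // the human takes a few seconds to click approve
  const heldStatus = callResource(held, now)

  const fresh = source.issue() // resolved at call time, after the pause
  const freshStatus = callResource(fresh, now)

  return { heldStatus, freshStatus }
}

console.log(runApprovalFlow())
```

With a 2-second TTL and a 5-second pause, the held token gets a 401 and the token resolved at call time gets a 200.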
Why it's not a 5-line happy path
Getting this right means handling, all at once:
- Resolve at call time, not at startup — so a token never goes stale across a pause or a durable-execution hibernation.
- Single-flight — concurrent cache-misses for the same key must share one network refresh. Without it, rotation makes concurrency actively corrupt state.
- Persist the rotated refresh token atomically, before returning — or the next refresh uses a dead token.
- Surface re-consent on a genuinely dead refresh token instead of retrying forever.
None of these is hard alone. Together, written under deadline, they're where the incident comes from.
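To make the four requirements concrete, here is one way they could fit together. This is a sketch under stated assumptions, not nominee's implementation or API: `createTokenResolver`, `ReconsentRequiredError`, the `store.get` / `store.set` shape, `refreshFn`, and `skewMs` are all hypothetical names. It persists the rotated token before returning; whether that write is truly atomic depends on the store you plug in.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Hypothetical error type: tells the caller to send the user back through consent.
class ReconsentRequiredError extends Error {
  constructor(key) {
    super(`refresh token for ${key} is no longer valid; user must re-consent`)
    this.name = 'ReconsentRequiredError'
  }
}

// refreshFn(refreshToken) resolves to { accessToken, refreshToken, expiresInMs }.
function createTokenResolver({ store, refreshFn, now = Date.now, skewMs = 0 }) {
  const cache = new Map() // key -> { accessToken, expiresAt }
  const inFlight = new Map() // key -> pending refresh promise

  async function refresh(key) {
    try {
      const current = await store.get(key)
      const next = await refreshFn(current)
      await store.set(key, next.refreshToken) // persist the rotated token before returning
      cache.set(key, { accessToken: next.accessToken, expiresAt: now() + next.expiresInMs })
      return next.accessToken
    } catch (err) {
      // A dead refresh token cannot be retried into working; surface it instead.
      if (err.code === 'invalid_grant') throw new ReconsentRequiredError(key)
      throw err
    }
  }

  // Call this at the moment of the tool call, never ahead of a pause.
  return async function getToken(key) {
    const hit = cache.get(key)
    if (hit && now() < hit.expiresAt - skewMs) return hit.accessToken

    // Single-flight: concurrent misses for the same key share one refresh.
    let pending = inFlight.get(key)
    if (!pending) {
      pending = refresh(key).finally(() => inFlight.delete(key))
      inFlight.set(key, pending)
    }
    return pending
  }
}

// Demo against a hypothetical in-memory rotating server.
async function demo() {
  let valid = 'rt-0'
  let issued = 0
  let networkRefreshes = 0
  const refreshFn = async (refreshToken) => {
    networkRefreshes += 1
    await sleep(10) // simulated token-endpoint latency
    if (refreshToken !== valid) {
      throw Object.assign(new Error('invalid_grant'), { code: 'invalid_grant' })
    }
    issued += 1
    valid = `rt-${issued}`
    return { accessToken: `at-${issued}`, refreshToken: valid, expiresInMs: 2000 }
  }
  const data = new Map([['alice:github', 'rt-0']])
  const store = {
    get: async (key) => data.get(key),
    set: async (key, value) => { data.set(key, value) },
  }
  const getToken = createTokenResolver({ store, refreshFn })

  const tokens = await Promise.all(
    Array.from({ length: 8 }, () => getToken('alice:github')),
  )
  return {
    networkRefreshes,
    distinctTokens: new Set(tokens).size,
    storedRefreshToken: data.get('alice:github'),
  }
}

demo().then((result) => console.log(result))
```

Eight concurrent calls produce one network refresh, one shared access token, and a stored refresh token of `rt-1`. The cache check and the in-flight lookup run with no `await` between them, which is what lets the first caller register its promise before any other caller can start a second refresh.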
The fix, runnable
Everything above is a real, runnable example. The mock doesn't cheat: the server really rotates refresh tokens and really expires access tokens:
git clone https://github.com/bharath31/nominee
cd nominee/examples/token-refresh-correctness
pnpm install && node run.mjs
Rows B and C are nominee. The agent code doesn't change — you just ask nominee for the token instead of holding one, and give it one line to persist the rotated refresh token:
import { Nominee, OAuth2 } from 'nominee'
const nominee = new Nominee({
  strategy: OAuth2({
    connections: {
      github: {
        tokenEndpoint, clientId,
        refreshToken: () => store.get('alice').refreshToken, // read from your store
        onRefreshToken: (_p, rt) => // write the rotated one back
          store.set('alice', { ...store.get('alice'), refreshToken: rt }),
      },
    },
  }),
})
// at the moment of the tool call — never held across the pause:
const token = await nominee.token({ user: 'alice', connection: 'github' })
nominee does the proactive refresh, the single-flight coalescing, and the atomic rotation persistence for you. That's the entire correctness kernel — the core is zero-dependency, and there's no service to sign up for.
When you don't need it
I want to be explicit, because it matters for credibility:
- If your provider issues long-lived, non-rotating tokens and your agent never pauses or fans out, the naive path is genuinely fine.
- If you're on Eve or a framework that already brokers fresh third-party access, use its native auth.
- If you use the Vercel AI SDK with Vercel Connect, Connect already manages the tokens.
- If you want a single fully-managed vendor, use Auth0 Token Vault or Nango directly.
nominee is for the case where you want this done correctly in a framework-neutral way, with no SaaS and a bring-your-own store, and want to swap the vault underneath without rewriting your agent. I pulled the correct version out of a few too many hand-rolled refresh layers and made it a tiny library. If the repro above looks familiar, it'll save you the incident.
See it for yourself
Run the proof in 30 seconds and watch 7/8 invalid_grant failures become 8/8 successes.