Why idempotency keys exist
The network can drop your response after the work is done. Now you have to retry — and you have no idea whether you'd be doing it for the first time or the second. Idempotency keys are the small protocol the client and server agree on so the retry is safe.
Why it exists
You call POST /charges for $42. The request goes out. The connection times out. Did the charge happen?
You don’t know. The packet that would have told you got lost — but the packet that asked for the work might have arrived just fine. From the client’s seat, “request succeeded but reply was dropped” is indistinguishable from “request never made it.” Both look like a timeout.
So you have to choose, and both choices are bad:
- Retry. You might charge the customer twice.
- Don’t retry. You might never charge them, and your code will either silently assume the operation succeeded or fail loudly without knowing the truth.
This is the core problem. The network can hide the answer without hiding the work. Idempotency is the way out: design the operation so retrying it is harmless. For a GET that’s free — reading the same row twice is the same as reading it once. For a write, you usually need help. The help is an idempotency key: a token the client invents and sends along, that the server uses to recognize “I’ve already done this exact request, here’s the same answer I gave last time.” The key turns a dangerous retry into a safe lookup.
Why it matters now
Anywhere the cost of a duplicate is real, idempotency keys appear:
- Payments. Stripe’s v1 API takes an Idempotency-Key header on every POST; resending the same key returns the originally saved response instead of creating a second charge or refund. (Stripe’s newer v2 API extends keys to DELETE and gives them a much longer replay window — read the docs of the version you’re calling.) The header is the canonical example most engineers meet first.
- LLM and other expensive AI calls. A retried request that re-runs the model costs you the dollars and the latency a second time, even if the first one actually completed. Some providers honor an idempotency key; many don’t yet, and home-grown agent retry loops routinely double-charge users when the upstream timed out at the wrong moment. Worth checking the docs of the specific API you’re calling before assuming.
- Webhook receivers. Senders that retry — Stripe automatically redelivers failed webhooks for up to three days; many others have similar policies (GitHub, by contrast, does not automatically redeliver, leaving it to the receiver to refetch) — can deliver the same event more than once. A webhook handler that isn’t idempotent will, eventually, ship a duplicate side effect to production. The remedy is the same shape: dedupe by the event ID the sender provides.
- Message queues. Standard SQS queues and Kafka’s default consumer semantics are at-least-once, not exactly-once. (RabbitMQ’s guarantees depend on whether you use manual acknowledgements; auto-ack mode is fast but unsafe.) Consumers have to dedupe themselves; an idempotency key on the message is how.
- Background job systems. A worker that crashes after doing the work but before acknowledging the job will see the same job again on restart.
The pattern shows up wherever a network or process boundary can swallow an acknowledgement. Which is, on a long enough timeline, everywhere.
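Every bullet above reduces to the same move: remember an ID, refuse to redo the work. A minimal sketch of a webhook receiver deduping by the sender’s event ID (the in-memory `seen` set and the `handle_event` name are illustrative; a real receiver would persist seen IDs in its database, with a TTL):

```python
# Sketch: dedupe webhook deliveries by the sender-provided event ID.
# In production, `seen` would be a database table, not a process-local set.
seen: set[str] = set()
side_effects: list[str] = []

def handle_event(event: dict) -> str:
    event_id = event["id"]           # stable across the sender's redeliveries
    if event_id in seen:
        return "duplicate-ignored"   # redelivery: acknowledge, do nothing
    seen.add(event_id)
    side_effects.append(event["type"])  # the real work happens exactly once
    return "processed"

# The same event delivered twice produces one side effect.
first = handle_event({"id": "evt_1", "type": "charge.succeeded"})
second = handle_event({"id": "evt_1", "type": "charge.succeeded"})
```

Acknowledging the duplicate (rather than erroring) matters: the sender only stops retrying once it gets a success back.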
The short answer
idempotency key = client-generated unique ID + server-side dedupe table
The client picks a unique value (often a UUID) before it sends the request, and reuses that same value for every retry of that logical operation. The server, before doing the work, checks a dedupe store: “have I seen this key?” If yes, return the recorded response. If no, do the work, record the response under the key, then return it. Two halves: the client’s promise that the same key means the same intent, and the server’s commitment to remember.
How it works
The basic dance
For a single POST /charges with key k:
- Client generates k once, before the first attempt. Sends Idempotency-Key: k plus the request body.
- Server looks up k in its dedupe store.
  - Hit, with a stored response → return it. The work has already been done; the client just didn’t hear about it.
  - Miss → mark k as in-progress, do the work, record the response under k, return the response.
  - Hit, but still in-progress → either wait for the first attempt to finish, or return a “duplicate-in-flight” error so the client backs off and retries later.
- Client retries on timeout/5xx, with the same k, and gets either the original outcome or a quick deterministic error.
That’s the whole mechanism. The subtleties are all in the details of each step.
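The dance above, as a minimal in-memory sketch of the server side (the `store` dict stands in for a real dedupe table, and `charge` for the business logic; both names are illustrative):

```python
# Sketch of the server's three-way branch: completed hit, in-progress hit, miss.
from typing import Any, Callable

store: dict[str, tuple[str, Any]] = {}   # key -> (status, recorded response)

def handle(key: str, body: dict, do_work: Callable[[dict], Any]) -> tuple[int, Any]:
    if key in store:
        status, response = store[key]
        if status == "completed":
            return 200, response            # replay the recorded outcome
        return 409, "duplicate-in-flight"   # first attempt still running
    store[key] = ("in_progress", None)      # claim the key before working
    response = do_work(body)                # the actual side effect, done once
    store[key] = ("completed", response)
    return 201, response

calls = []
def charge(body: dict) -> dict:
    calls.append(body)                      # count how often the work runs
    return {"charged": body["amount"]}

first = handle("k1", {"amount": 42}, charge)
retry = handle("k1", {"amount": 42}, charge)
```

The retry returns the original outcome without invoking `charge` a second time; a real implementation would make the claim-then-work step atomic, as the next section shows.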
Why the client picks the key, not the server
It has to be the client. The whole point is to survive the case where the server’s response never arrives — so the client must be able to identify the operation before it has any reply to anchor on. A server-generated key is fine for referencing a created resource afterwards; it can’t be used to deduplicate the request that created it.
What the server actually stores
A dedupe store is at minimum a map from key to “I’m working on it” or “here’s the recorded result.” A common implementation puts the key in a row of the same database the work itself writes to, and updates both inside one transaction — so the key insertion and the side effect commit together or not at all. That atomicity is the part most home-grown implementations get wrong.
A common shape:
```sql
INSERT INTO idempotency_keys (key, request_fingerprint, status)
VALUES ($1, $2, 'in_progress')
ON CONFLICT (key) DO NOTHING;
```
If the insert wins, this attempt does the work and updates the row to completed with the response, in the same transaction as the business write. If it loses, the row already exists and the server hands back whatever’s already recorded.
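The win-or-lose race is runnable against SQLite, which supports the same ON CONFLICT clause; the cursor’s `rowcount` tells each attempt whether its insert won (schema and names are illustrative):

```python
# Sketch: the atomic "claim the key" insert, against SQLite (>= 3.24 for
# ON CONFLICT). rowcount is 1 if this attempt inserted the row, 0 if it lost.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE idempotency_keys (
    key TEXT PRIMARY KEY,
    request_fingerprint TEXT,
    status TEXT)""")

def claim(key: str, fingerprint: str) -> bool:
    cur = db.execute(
        "INSERT INTO idempotency_keys (key, request_fingerprint, status) "
        "VALUES (?, ?, 'in_progress') ON CONFLICT (key) DO NOTHING",
        (key, fingerprint))
    return cur.rowcount == 1   # won the race -> this attempt does the work

won_first = claim("k1", "fp-abc")
won_second = claim("k1", "fp-abc")   # loses: the row already exists
```

The loser then reads the existing row and returns whatever status or response it finds there.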
Same key, same request
Idempotency keys are a promise about intent, not just identity. If the client sends key k once with {amount: 42} and again with {amount: 4200}, the server should not silently treat the second as a duplicate of the first — that would let a bug or a race quietly overwrite a charge with a different one.
The defensive move is for the server to also store a fingerprint of the request body and reject mismatches. Stripe’s API documents exactly this behavior — replaying a key with a different payload is an error, not a silent dedupe. Anything less is a footgun.
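A sketch of the fingerprint check, hashing a canonicalized request body (the `records` dict stands in for the dedupe store; names are illustrative):

```python
# Sketch: same key + same body -> duplicate; same key + different body -> error.
import hashlib
import json

records: dict[str, str] = {}   # key -> fingerprint of the first-seen request

def fingerprint(body: dict) -> str:
    # Canonical JSON so key order doesn't change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def check(key: str, body: dict) -> str:
    fp = fingerprint(body)
    if key not in records:
        records[key] = fp
        return "new"
    if records[key] != fp:
        return "error-key-reused"   # same key, different intent: reject loudly
    return "duplicate"

a = check("k1", {"amount": 42})
b = check("k1", {"amount": 42})
c = check("k1", {"amount": 4200})
```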
Keys expire
Storing every key forever is unbounded growth. Most real systems give keys a TTL — Stripe’s v1 docs say keys can be pruned once they’re at least 24 hours old; v2 extends the replay window much further. The retention has to comfortably exceed the client’s worst-case retry budget, or else the client retries with a key the server has already forgotten, the dedupe miss looks fresh, and the operation runs again.
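Pruning is then a periodic delete against the key table; a sketch with an illustrative 24-hour retention (the schema is an assumption, mirroring the earlier example):

```python
# Sketch: TTL pruning of the dedupe table. The retention must comfortably
# exceed the client's worst-case retry window; 24h here is illustrative.
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, created_at REAL)")

now = time.time()
db.execute("INSERT INTO idempotency_keys VALUES ('old', ?)", (now - 90000,))  # ~25h ago
db.execute("INSERT INTO idempotency_keys VALUES ('fresh', ?)", (now - 60,))   # 1 min ago

RETENTION_SECONDS = 24 * 3600
db.execute("DELETE FROM idempotency_keys WHERE created_at < ?",
           (now - RETENTION_SECONDS,))
remaining = [row[0] for row in db.execute("SELECT key FROM idempotency_keys")]
```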
Show the seams
- Idempotent in HTTP’s sense isn’t quite the same thing. RFC 9110 calls a method “idempotent” when N identical requests have the same effect as one. PUT and DELETE are defined that way; POST is not. But that’s a property of the method, not of any particular request. Idempotency keys layer “this specific request is a retry of that one” on top, so they make POST operationally idempotent for a given key.
- Concurrent retries are the awkward case. A client that sends request k twice in parallel (because the first attempt hasn’t obviously failed yet) puts the server in the in-progress state twice. Servers handle this by either serializing on the row lock or by returning a 409-ish “operation in progress, retry later.” Either is fine; silently doing the work twice is not.
- Exactly-once is still a fairy tale at the network layer. What idempotency keys give you is effectively-once application semantics on top of at-least-once delivery. The duplicates still arrive; the server just refuses to act on them twice. If anyone promises you exactly-once delivery, ask where the dedupe lives — it’s always somewhere.
- The key has to be generated before the first send. If a client generates the key inside its retry loop, every retry gets a fresh key and the dedupe never fires. This sounds obvious and is one of the most common bugs.
- Side effects outside the database don’t roll back. If the work sends an email and then fails to record the idempotency row, a retry will send a second email. The general pattern — record the side effect’s intent transactionally, perform the side effect from an outbox — is the outbox pattern, and idempotency keys live happily next to it.
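A sketch of that pairing: the idempotency row and the email’s intent commit in one SQLite transaction, and a separate drain step performs each send at most once (table names and the payload are illustrative):

```python
# Sketch: outbox pattern next to an idempotency key. The key row and the
# side-effect intent commit together; a drain loop does the actual send.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, status TEXT);
CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0);
""")

def do_work(key: str) -> None:
    with db:  # one transaction: key + intent commit together or not at all
        db.execute("INSERT INTO idempotency_keys VALUES (?, 'completed')", (key,))
        db.execute("INSERT INTO outbox (payload) VALUES ('receipt-email')")

sent: list[str] = []

def drain() -> None:
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0").fetchall()
    for row_id, payload in rows:
        sent.append(payload)   # stand-in for actually sending the email
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    db.commit()

do_work("k1")
drain()
drain()   # a second drain finds nothing unsent and sends nothing new
```

If the process dies between commit and drain, restart re-runs the drain and the email still goes out; the retry never re-runs the business transaction.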
Famous related terms
- UUID — UUID = 128 random or time-based bits + a format that's collision-resistant without a coordinator — the usual choice for the key itself, because the client can mint one alone.
- At-least-once delivery — at-least-once = retry on uncertainty + accept the cost of duplicates — the delivery mode idempotency keys exist to compensate for.
- Outbox pattern — outbox = side-effect intent table + drain worker — pairs with idempotency keys when the work has external side effects.
- Exponential backoff — retry policy = backoff + jitter + stopping rule — the when of retrying; idempotency keys are the how to make it safe.
- Two-phase commit — 2PC = prepare phase + commit phase across participants — the heavyweight alternative when you can’t tolerate even a transient duplicate. Almost always the wrong tool for an HTTP API.
- CAS — CAS = read expected value + atomic conditional write — the in-process cousin; same instinct (“only do this if the world hasn’t moved”), different scale.
Going deeper
- Stripe’s API reference on the Idempotency-Key header. The most widely-copied production implementation; reading their docs is the fastest way to internalize the contract.
- “Designing robust and predictable APIs with idempotency” on the Stripe engineering blog (Brandur Leach), which walks through the same-key-same-request rule and the retry semantics in production.
- RFC 9110 §9.2.2, on idempotent HTTP methods — the property idempotency keys generalize.
- Tyler Treat, “You Cannot Have Exactly-Once Delivery” — the standard argument for why effectively-once on top of at-least-once is the honest target.