Introduction
You're designing a notification system. You've got Kafka, workers, retry logic. The interviewer nods approvingly.
Then they ask: "A worker pulls a message, sends the email, but crashes before acknowledging. Kafka redelivers the message. The next worker sends the email again. The user gets two password reset emails. How do you prevent that?"
And now you realize that retries without idempotency mean duplicates. Every system that retries (notifications, payments, webhooks, order processing) needs a strategy for handling the same message twice without causing double side effects.
Here's how idempotency works, how to implement it, and how to talk about it in interviews.
What Is Idempotency?
An operation is idempotent if performing it multiple times produces the same result as performing it once.
Some operations are naturally idempotent:
SET user.email = "alice@example.com" // idempotent (same result every time)
DELETE FROM orders WHERE id = 123 // idempotent (row is gone either way)
GET /api/users/123 // idempotent (read-only)
Some are not:
INSERT INTO orders (user_id, amount) VALUES (123, 50.00) // NOT idempotent (creates a new row each time)
POST /api/charge { amount: 50.00 } // NOT idempotent (charges again)
sendEmail("Your OTP is 4829") // NOT idempotent (sends another email)
The goal: make non-idempotent operations safe to retry. If your system processes the same request twice, the user shouldn't notice.
Why It Matters in Distributed Systems
In a single-server, single-process world, you don't worry much about duplicates. But distributed systems have failure modes that make duplicates inevitable.
Network Retries
Client sends payment request
-> Server processes payment
-> Server sends response
-> Response lost (network timeout)
Client retries (it never got a response)
-> Server processes payment AGAIN
-> User charged twice
The client had no way to know the first request succeeded. Retrying is the correct behavior. But without idempotency, the retry causes a double charge.
At-Least-Once Delivery
Most message queues use at-least-once delivery. This means a message might be delivered more than once.
Worker pulls message from Kafka
Worker sends email via SendGrid
Worker crashes before committing offset
Kafka redelivers the same message
Next worker sends the email again
This is by design. At-least-once is simpler and more reliable than exactly-once. The trade-off is that your consumers must handle duplicates. For more on delivery guarantees, see Message Queues.
Consumer Crashes
Even without queue redelivery, a consumer might process the same work twice if it crashes mid-operation. Any system that separates "do the work" from "record that work was done" has this window of vulnerability.
Idempotency Keys
The standard solution: attach a unique identifier to every operation. Before processing, check if that identifier has already been processed.
How It Works
1. Client generates a unique idempotency key (UUID)
2. Client sends request with key: X-Idempotency-Key: abc-123
3. Server checks: have I seen abc-123 before?
- No -> process the request, store abc-123 with result
- Yes -> return the stored result without reprocessing
The key must be generated by the caller, not the server. If the server generates the key, each retry looks like a new request.
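The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the dict stands in for a store shared across server instances (Redis or a database), and `handle_request` and its payload are hypothetical names.

```python
import uuid

# In-memory stand-in for the server's idempotency store; a real server
# needs a store shared across instances and surviving restarts.
seen: dict = {}

def handle_request(idempotency_key: str, payload: dict) -> str:
    if idempotency_key in seen:
        return seen[idempotency_key]          # duplicate: stored result
    result = f"charged {payload['amount']}"   # stand-in for the side effect
    seen[idempotency_key] = result            # store key with result
    return result

key = str(uuid.uuid4())                       # generated by the CALLER
first = handle_request(key, {"amount": 50.00})
retry = handle_request(key, {"amount": 50.00})
assert first == retry and len(seen) == 1      # retry did not reprocess
```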
Key Types
Client-generated UUID: The client creates a UUID (e.g., 550e8400-e29b-41d4-a716-446655440000) and attaches it to the request. Best for API calls where the client controls retries.
POST /api/v1/payments
X-Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
{ "amount": 50.00, "userId": "user-123" }
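The client side can be sketched as follows (helper names are hypothetical). The point is that the key is generated once, when the logical operation is created, so every retry of that operation carries the same value.

```python
import uuid

def build_payment_request(amount: float) -> dict:
    # The key is generated ONCE, when the logical operation is created,
    # not once per HTTP attempt.
    return {
        "headers": {"X-Idempotency-Key": str(uuid.uuid4())},
        "body": {"amount": amount, "userId": "user-123"},
    }

def send_with_retries(request: dict, attempts: int = 3) -> list:
    # Stand-in for an HTTP retry loop: every attempt carries the same
    # header, so the server can recognize attempts 2 and 3 as duplicates.
    return [request["headers"]["X-Idempotency-Key"] for _ in range(attempts)]

request = build_payment_request(50.00)
keys_sent = send_with_retries(request)
assert len(set(keys_sent)) == 1   # all retries share one key
```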
Server-generated dedup key: For message queue consumers, the message itself carries an ID. The consumer tracks which message IDs it has processed.
Kafka message: { messageId: "notif-789", userId: "user-123", channel: "email" }
Consumer checks: has notif-789 been processed?
- No -> send email, mark notif-789 as processed
- Yes -> skip, already handled
Natural dedup key: Sometimes the operation itself contains a natural key. An order confirmation for order #456 should only be sent once, so order-456-confirmation can be the dedup key.
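As a sketch, deriving a natural key is a pure function of the business identifiers (the function name here is illustrative):

```python
def confirmation_dedup_key(order_id: int) -> str:
    # One confirmation per order maps to one dedup key per order;
    # no client-generated UUID is needed.
    return f"order-{order_id}-confirmation"

assert confirmation_dedup_key(456) == "order-456-confirmation"
```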
Implementation Strategies
Redis SET NX with TTL
The most common approach. Use Redis to track processed keys with automatic expiration.
// Check and set atomically
result = Redis.SET("idemp:abc-123", "processed", NX, EX 86400)
if result == nil:
    // Key already exists, this is a duplicate
    return Redis.GET("idemp:abc-123:result")
else:
    // Key is new, process the request
    response = process_request()
    Redis.SET("idemp:abc-123:result", serialize(response), EX 86400)
NX means "set only if the key does not exist." This is atomic in Redis, so there is no race condition between checking and setting. The EX 86400 sets a 24-hour TTL so keys don't accumulate forever.
For more on Redis atomic operations, see Databases & Caching.
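The same flow in Python, using an in-memory `FakeRedis` stand-in so the sketch is self-contained; with the real redis-py client the equivalent call is `r.set(key, value, nx=True, ex=86400)`, which returns `None` when NX refuses the set.

```python
import time

class FakeRedis:
    """In-memory stand-in for Redis supporting SET NX EX and GET."""
    def __init__(self):
        self._data = {}   # key -> (value, expires_at)

    def set(self, key, value, nx=False, ex=None):
        now = time.time()
        entry = self._data.get(key)
        if nx and entry is not None and entry[1] > now:
            return None                       # NX: key already exists
        expires_at = now + ex if ex else float("inf")
        self._data[key] = (value, expires_at)
        return True

    def get(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.time():
            return entry[0]
        return None

r = FakeRedis()
sends = 0   # counts real side effects (the email send)

def process_once(key: str) -> str:
    global sends
    # Atomic check-and-set: only one caller wins the NX set.
    if r.set(f"idemp:{key}", "processed", nx=True, ex=86400) is None:
        return r.get(f"idemp:{key}:result")   # duplicate: stored result
    sends += 1                                # stand-in for the email send
    result = f"email-{sends}-sent"
    r.set(f"idemp:{key}:result", result, ex=86400)
    return result

first = process_once("abc-123")
second = process_once("abc-123")   # redelivery of the same message
assert first == second and sends == 1
```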
Database Unique Constraints
For operations that write to a database, use unique constraints as your idempotency check.
CREATE TABLE processed_notifications (
    notification_id VARCHAR PRIMARY KEY,
    processed_at    TIMESTAMP,
    result          TEXT
);
-- Attempt to insert
INSERT INTO processed_notifications (notification_id, processed_at, result)
VALUES ('notif-789', NOW(), 'sent')
ON CONFLICT (notification_id) DO NOTHING;
-- If rows_affected == 0, it was a duplicate
The database enforces uniqueness. No separate check needed. This works well when the idempotency record belongs in the same database as the business data, since you can wrap both in a single transaction.
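A self-contained sketch using SQLite as a stand-in for the database; the `ON CONFLICT ... DO NOTHING` syntax shown works in PostgreSQL and in SQLite 3.24+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE processed_notifications (
        notification_id TEXT PRIMARY KEY,
        processed_at    TEXT,
        result          TEXT
    )
""")

def record_if_new(notification_id: str) -> bool:
    """True if the ID is new (caller should process), False if duplicate."""
    cur = conn.execute(
        "INSERT INTO processed_notifications "
        "(notification_id, processed_at, result) "
        "VALUES (?, datetime('now'), 'sent') "
        "ON CONFLICT (notification_id) DO NOTHING",
        (notification_id,),
    )
    return cur.rowcount == 1   # 0 rows affected means it already existed

assert record_if_new("notif-789") is True    # first delivery: process
assert record_if_new("notif-789") is False   # redelivery: skip
```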
Message ID Tracking
For queue consumers, maintain a set of recently processed message IDs.
Consumer flow:
1. Pull message from Kafka
2. Check Redis: SISMEMBER "processed-messages" "msg-123"
3. If member -> skip (already processed)
4. If not member -> process, then SADD "processed-messages" "msg-123"
5. Commit Kafka offset
The set needs a cleanup strategy, and Redis cannot expire individual set members. Either store each ID as its own key with a TTL (SET NX EX instead of SADD, which also makes the check-and-set atomic), or keep the set and periodically remove entries older than your queue's retention period.
At-Least-Once + Idempotent Consumers = Effective Exactly-Once
This is the key insight for interviews. True exactly-once delivery is extremely hard (and often impossible) in distributed systems. But you don't need it.
At-least-once delivery: message might be delivered 1, 2, or 3 times
Idempotent consumer: processing a message 2 or 3 times = same result as once
Combined effect: message is effectively processed exactly once
Kafka does support exactly-once semantics with idempotent producers and transactional consumers. But it comes with complexity and performance overhead. For most systems, at-least-once with idempotent consumers is simpler, faster, and good enough.
The pattern:
1. Message arrives (possibly a duplicate)
2. Extract idempotency key (message ID, or derive from content)
3. Check dedup store (Redis or database)
4. If already processed -> skip, acknowledge message
5. If new -> process, record in dedup store, acknowledge message
This works for notifications, payments, order processing, and webhook delivery. It applies to basically any async workflow.
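The five-step pattern above can be sketched as a simple consumer loop. The message list simulates at-least-once redelivery, and the set stands in for the Redis or database dedup store.

```python
# Simulated at-least-once delivery: the same message appears twice.
incoming = [
    {"messageId": "msg-1", "body": "welcome email"},
    {"messageId": "msg-1", "body": "welcome email"},   # redelivery
    {"messageId": "msg-2", "body": "receipt email"},
]

processed_ids = set()   # stand-in for the dedup store
side_effects = []

for message in incoming:
    key = message["messageId"]            # step 2: extract idempotency key
    if key in processed_ids:              # steps 3-4: duplicate, skip + ack
        continue
    side_effects.append(message["body"])  # step 5: process (the side effect)
    processed_ids.add(key)                # record in dedup store, then ack

assert side_effects == ["welcome email", "receipt email"]  # each once
```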
Real-World Examples
Preventing Duplicate Emails
Notification worker receives: { notificationId: "notif-789", channel: "email" }
1. Check Redis: EXISTS "sent:notif-789"
2. Not found -> send email via SendGrid
3. SET "sent:notif-789" "delivered" EX 86400
4. Commit Kafka offset
If worker crashes after step 2 but before step 3:
- Kafka redelivers the message
- New worker checks Redis: key doesn't exist (step 3 never ran)
- Email sends again (duplicate)
Fix: move the Redis SET before the email send:
1. SET "sent:notif-789" "processing" NX EX 86400
2. If key already existed -> skip (another worker handled it)
3. Send email
4. Update Redis value to "delivered"
5. Commit offset
The trade-off: if the worker crashes after step 1 but before step 3, the email is never sent, but the key exists. You need a cleanup process that retries "processing" entries that are older than a threshold. This is more complex, but it prevents duplicates at the cost of occasional delayed delivery.
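A sketch of this claim-then-send flow. The dict, states, and the staleness threshold are illustrative assumptions; in production the claim would be an atomic Redis SET NX EX and the cleanup would run as a separate process.

```python
import time

# Dedup store: key -> (state, claimed_at). A real system would use Redis
# or a database; the dict and threshold here are illustrative.
store: dict = {}
STALE_AFTER = 300   # seconds before a "processing" claim counts as stuck

def claim(key: str) -> bool:
    """Try to claim a notification. False means skip: done or in flight."""
    now = time.time()
    entry = store.get(key)
    if entry is not None:
        state, claimed_at = entry
        if state == "delivered":
            return False                 # already fully handled
        if now - claimed_at < STALE_AFTER:
            return False                 # another worker holds a fresh claim
        # Stale claim: previous worker likely crashed mid-send. Re-claim.
    store[key] = ("processing", now)
    return True

sent = []

def send_notification(key: str) -> None:
    if not claim(key):
        return
    sent.append(key)                     # the actual email send
    store[key] = ("delivered", time.time())

send_notification("notif-789")
send_notification("notif-789")           # redelivery: claim refused
assert sent == ["notif-789"]
```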
For most notification systems, the simpler "send then record" approach is fine. A duplicate marketing email is annoying. A missing OTP is a blocker. Bias toward delivery.
Preventing Double Charges
Payment APIs like Stripe require an idempotency key on every charge request:
POST /v1/charges
Idempotency-Key: order-456-charge
{ "amount": 5000, "currency": "usd" }
If the network drops the response and the client retries with the same key, Stripe returns the original charge result without creating a new charge. The key is typically derived from the business operation: {orderId}-charge guarantees one charge per order.
Preventing Duplicate Webhook Deliveries
Webhook receivers often see the same event delivered multiple times:
Event arrives: { eventId: "evt-001", type: "order.completed" }
1. Check: has evt-001 been processed?
2. No -> process event, update order status, record evt-001
3. Yes -> return 200 OK, skip processing
Always return 200 OK for duplicates. If you return an error, the sender will keep retrying, creating an infinite retry loop.
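A sketch of a dedup-aware webhook handler (web framework omitted; the handler is a plain function returning an HTTP status code):

```python
from http import HTTPStatus

processed_events = set()
order_updates = []

def handle_webhook(event: dict) -> int:
    event_id = event["eventId"]
    if event_id in processed_events:
        # Duplicate delivery: acknowledge anyway, or the sender
        # keeps retrying forever.
        return HTTPStatus.OK
    order_updates.append(event["type"])   # the real processing
    processed_events.add(event_id)
    return HTTPStatus.OK

first = handle_webhook({"eventId": "evt-001", "type": "order.completed"})
dup = handle_webhook({"eventId": "evt-001", "type": "order.completed"})
assert first == dup == HTTPStatus.OK
assert order_updates == ["order.completed"]   # processed exactly once
```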
Common Interview Mistakes
Mistake 1: Assuming messages are delivered exactly once
"Kafka guarantees each message is processed once."
Problem: Default Kafka delivery is at-least-once. Consumer crashes, rebalances, and network issues all cause redelivery. Saying "exactly once" without caveats tells the interviewer you haven't operated a real system.
Better: "We use at-least-once delivery with idempotent consumers. Each worker checks a dedup key in Redis before processing."
Mistake 2: No strategy for duplicate side effects
"The worker sends the email and commits the offset."
Problem: If the worker crashes after sending but before committing, the email sends again on redelivery. You've just explained a system that sends duplicate emails.
Better: Track processed message IDs in Redis or a database. Check before processing. Acknowledge that there's a small window for duplicates and explain the trade-off.
Mistake 3: Using server-generated IDs for client idempotency
"The server assigns a unique ID to each request."
Problem: If the client retries, the server sees a new request (no ID yet) and assigns a new ID. Both requests get processed. The server-generated ID doesn't help because it's created too late.
Better: The client generates the idempotency key and sends it with the request. Retries carry the same key, so the server recognizes them.
Mistake 4: Never expiring idempotency keys
"We store every processed message ID forever."
Problem: Your dedup store grows without bound. After a year, you're storing millions of keys that will never be seen again.
Better: Use TTL on idempotency keys. 24 hours for API requests, 7 days for queue consumers (matching queue retention). Old enough that retries are impossible, short enough that storage stays bounded.
Summary: What to Remember
- An operation is idempotent if doing it multiple times produces the same result as doing it once
- Distributed systems make duplicates inevitable: network retries, at-least-once delivery, consumer crashes
- Idempotency keys are the standard solution: check before processing, skip if already seen
- Redis SET NX with TTL is the most common implementation for dedup checking
- At-least-once + idempotent consumers = effective exactly-once, and this is the production standard
- Always expire idempotency keys with TTL to prevent unbounded storage growth
- The client (or message producer) must generate the idempotency key, not the server
Key numbers:
- Idempotency key TTL: 24 hours for API requests, 7 days for queue consumers
- Redis SET NX: ~0.1ms (fast enough for the critical path)
Interview golden rule:
Don't just say "we use exactly-once delivery." Explain that
you use at-least-once with idempotent consumers, how you track
processed messages, and what happens when a consumer crashes
mid-processing.