Cache Eviction Internals

Implementing TTL, LRU, LFU, sampling, and memory-pressure behavior

System Design Sandbox · 10 min read
Learn how distributed cache nodes decide what to remove when memory is full. Covers TTL expiration, lazy and active expiry, LRU and LFU metadata structures, approximate policies, byte-aware eviction, and common implementation mistakes.

#Introduction

A distributed cache has 128 GB of memory. The product says "store keys until they expire."

Then traffic changes. Some keys never expire. Other keys are large. One tenant fills memory. The node starts evicting randomly or crashes.

Eviction is the part of a cache that decides what survives memory pressure.

This article goes deeper than the broad Databases & Caching overview. It supports the Distributed Cache solution, where TTL, LRU/LFU metadata, consistent hashing, and hot key protection all show up together.


#TTL Expiration

TTL answers a different question than eviction.

TTL says: "This key is invalid after this time."

Eviction says: "Memory is full; which valid key should be removed?"

A cache usually needs both.

Common TTL strategies:

  • lazy expiration on read
  • active background sampling
  • expiration wheels or time buckets
  • jittered TTLs to avoid mass expiry

Lazy expiration is simple but can leave expired keys consuming memory. Active expiration keeps memory cleaner but uses CPU. Real systems usually combine both. TTL also affects cache stampede behavior: if many hot keys expire together, the database gets hit at once.
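The combination of lazy and active expiry, plus TTL jitter, can be sketched in a few lines. This is an illustrative model, not any particular cache's implementation; the class and method names (`TTLCache`, `active_expire_sample`) are assumptions for the example:

```python
import random
import time

class TTLCache:
    """Minimal sketch: lazy expiration on read, sampled active expiry,
    and jittered TTLs to avoid mass expiry."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds, jitter_fraction=0.1):
        # Jitter spreads expiry times so many hot keys set together
        # do not all expire (and stampede the database) at once.
        jitter = ttl_seconds * jitter_fraction * random.random()
        self._store[key] = (value, time.monotonic() + ttl_seconds + jitter)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Lazy expiration: the expired key is removed on read.
            del self._store[key]
            return None
        return value

    def active_expire_sample(self, sample_size=20):
        # Active expiration: periodically sample keys and drop expired
        # ones, so keys that are never read still free memory.
        now = time.monotonic()
        keys = random.sample(list(self._store), min(sample_size, len(self._store)))
        removed = 0
        for k in keys:
            if self._store[k][1] <= now:
                del self._store[k]
                removed += 1
        return removed
```

A background thread would call `active_expire_sample` on an interval, trading CPU for tighter memory bounds.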


#LRU and LFU Internals

LRU is often implemented with:

  • hash map from key to entry
  • doubly linked list ordered by recency
  • move-to-front on every get/set
  • evict from the tail

LFU needs frequency tracking:

  • hash map from key to entry
  • count or logarithmic counter
  • buckets by frequency
  • decay so old popularity does not live forever
Diagram: GET/SET requests hit the entry hash map; TTL buckets and an eviction index track metadata and together select the victim key.

The interview tradeoff:

| Policy | Good for | Cost |
| --- | --- | --- |
| LRU | recent working sets | each read updates metadata |
| LFU | stable hot keys | counters, decay, more bookkeeping |
| Random/sample | very high throughput | less precise choices |
| TTL-only | freshness rules | does not handle memory pressure well |

#Memory Pressure and Sampling

Exact LRU on every access can be expensive in a distributed cache with high concurrency.

Many production caches use approximations:

  • sample a few keys and evict the worst candidate
  • use segmented LRU to protect frequently reused keys
  • use approximate counters for LFU
  • charge eviction by memory size, not just key count

Size matters. Evicting one 20 MB key may help more than evicting 10 tiny keys. In a sharded cache, this metadata is usually local to each owner shard, while routing is handled by Consistent Hashing or a slot map.
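Sampled, byte-aware eviction can be sketched as below. The key-selection heuristic and names (`pick_victim`, `evict_until_under`) are assumptions for illustration; real caches refine the scoring considerably:

```python
import random

def pick_victim(entries, sample_size=5):
    """Approximate eviction: sample a few keys and pick the one with the
    oldest last-access time, preferring larger entries on ties.
    `entries` maps key -> (last_access, size_bytes)."""
    candidates = random.sample(list(entries), min(sample_size, len(entries)))
    # Older access and larger size both make a key a better victim.
    return min(candidates, key=lambda k: (entries[k][0], -entries[k][1]))

def evict_until_under(entries, max_bytes):
    """Charge eviction by memory size, not key count: keep evicting
    sampled victims until total bytes fit the budget."""
    evicted = []
    while entries and sum(size for _, size in entries.values()) > max_bytes:
        victim = pick_victim(entries)
        evicted.append(victim)
        del entries[victim]
    return evicted
```

Sampling avoids maintaining a globally exact order under high concurrency, while the byte-based loop captures the point that one 20 MB key can matter more than many tiny ones.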


#Common Interview Mistakes

Mistake 1: Confusing TTL with eviction.

TTL handles freshness. Eviction handles memory pressure.

Mistake 2: Saying LRU without implementation.

Mention the hash map plus linked list shape, or explain an approximate policy.

Mistake 3: Ignoring large values.

Eviction should account for bytes, not only item count.

Mistake 4: Letting old LFU counts live forever.

LFU needs decay, or last month's hot key can block today's working set.


#Summary: What to Remember

Cache eviction is memory management.

Use TTL for expiration and LRU, LFU, or approximate policies for memory pressure. Track metadata efficiently, account for object size, and explain how the policy behaves under high concurrency.

Related articles: Databases & Caching, Hot Keys and Cache Stampedes, Consistent Hashing, CAP Theorem, and Design a Distributed Cache.