# Introduction
You are designing autocomplete. The user types `new`, and the service has 40,000 matching queries.
The product asks for the top 5.
That sounds simple until the interviewer asks:
"Top by what time window? What if a celebrity goes viral in the last 10 minutes? What if one query gets 10 million events per hour?"
Top-k ranking is not just sorting. It is deciding which items matter under high write volume, time windows, and freshness constraints.
# Exact Top-K
The exact version is straightforward:
| query | count |
|---|---|
| new york weather | 98,210 |
| new york times | 77,120 |
| new york flights | 52,004 |
Then sort by count and keep the largest k.
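In code, "sort and keep the largest k" is a one-liner with a heap. A minimal sketch (the `top_k` helper and the in-memory dict are illustrative, not a production counter store):

```python
import heapq

def top_k(counts, k):
    """Return the k (query, count) pairs with the highest counts."""
    return heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])

# Counts from the table above.
counts = {
    "new york weather": 98_210,
    "new york times": 77_120,
    "new york flights": 52_004,
}
print(top_k(counts, 2))
# → [('new york weather', 98210), ('new york times', 77120)]
```

`heapq.nlargest` runs in O(n log k), which beats a full sort when k is small relative to the candidate set.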
For small datasets, a database query or Redis Sorted Set is enough:
```
ZINCRBY autocomplete:new 1 "new york weather"
ZREVRANGE autocomplete:new 0 4 WITHSCORES
```
For larger systems, exact top-k on every request is too expensive. You do not want the read path to scan and sort thousands of candidates per keystroke.
Instead, compute top-k in the background and serve snapshots.
# Heavy Hitters and Approximate Counting
Heavy hitters are the items that dominate the stream.
If the system receives billions of search events, exact per-query counters may be expensive. Approximate structures can reduce memory:
- Count-Min Sketch estimates frequencies with bounded overcounting.
- Space-Saving tracks likely top items in a fixed-size structure.
- Reservoir sampling helps with representative samples, not exact top-k.
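As one concrete example, Space-Saving fits in a few lines. This is a minimal sketch of the standard algorithm: keep at most k counters, and when a new item arrives with all counters taken, evict the smallest counter and let the newcomer inherit its count plus one.

```python
def space_saving(stream, k):
    """Track likely heavy hitters using at most k counters."""
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its
            # count + 1, which bounds how much any item is overcounted.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts
```

True heavy hitters keep their counters because they are incremented too often to become the minimum, while rare items churn through the evicted slots. Counts can overestimate but never underestimate the true frequency.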
The interview point is not to name every algorithm. It is to explain the tradeoff:
| Approach | Benefit | Cost |
|---|---|---|
| Exact counters | precise | high memory for huge cardinality |
| Approximate counters | bounded memory | possible ranking errors |
| Batch aggregation | simple and stable | freshness lag |
| Streaming aggregation | fresher | more operational complexity |
Autocomplete can usually tolerate approximate counts. Billing systems usually cannot.
# Time Windows and Decay
"Most popular" needs a time window.
All-time popularity is stable but slow to react. Last-minute popularity is fresh but noisy.
Common strategies:
- all-time count for stable suggestions
- trailing 24-hour count for current demand
- trailing 5-minute count for trends
- exponential decay so old events gradually matter less
Example score:
```
score = all_time_count * 0.5
      + last_24h_count * 2.0
      + last_10m_count * 5.0
```
The exact formula is a product decision. The system design point is that rankings should be precomputed per window, region, and language if those dimensions affect results.
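The exponential-decay strategy from the list above can be sketched as a single weighting function. The half-life of one hour is an assumed tuning value, not a recommendation:

```python
def decayed_count(count, event_age_s, half_life_s=3600.0):
    """Weight a count by 0.5 ** (age / half_life): an event one
    half-life old contributes half its original weight."""
    return count * 0.5 ** (event_age_s / half_life_s)
```

A nice property of exponential decay is that a stored counter can be decayed lazily at read or update time from its last-touched timestamp, instead of being rewritten on a timer.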
# Serving Snapshots
The read path should not compute global popularity.
A typical flow:
```
Search events
  -> Kafka or event log
  -> stream/batch aggregation
  -> top-k by prefix, region, language
  -> durable suggestion store
  -> serving cache
```
The serving service reads a compact result:
```json
{
  "prefix": "new y",
  "region": "US",
  "suggestions": [
    { "text": "new york weather", "score": 98210 },
    { "text": "new york times", "score": 77120 }
  ],
  "generatedAt": "2026-04-22T12:00:00Z"
}
```
If the ranking job is delayed, serve the last good snapshot. Stale suggestions are usually better than an empty dropdown.
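The "last good snapshot" behavior can be sketched as a small in-process cache. The class name, the freshness target, and the `(suggestions, stale)` return shape are all assumptions for illustration:

```python
class SnapshotCache:
    """Serve precomputed top-k snapshots; prefer stale data
    over an empty dropdown."""

    def __init__(self, freshness_target_s=600.0):
        self._snapshots = {}  # prefix -> (suggestions, generated_at)
        self.freshness_target_s = freshness_target_s

    def put(self, prefix, suggestions, generated_at):
        self._snapshots[prefix] = (list(suggestions), generated_at)

    def get(self, prefix, now):
        entry = self._snapshots.get(prefix)
        if entry is None:
            return [], False  # never computed: nothing to serve
        suggestions, generated_at = entry
        stale = (now - generated_at) > self.freshness_target_s
        # Serve even when stale; a delayed pipeline should not blank the UI.
        return suggestions, stale
```

The `stale` flag lets the caller emit a metric or trigger a refresh without withholding results from the user.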
# Common Interview Mistakes
Mistake 1: Sorting on the request path.
The user is waiting after every keystroke. Serve precomputed top-k.
Mistake 2: Not defining the window.
All-time top results and trending results are different products.
Mistake 3: Using exact counters when approximate is acceptable.
Exactness costs memory and write throughput. Use it only where correctness demands it.
Mistake 4: Ignoring hot keys.
Popular prefixes can dominate writes and reads. Cache hot snapshots and partition aggregation carefully.
Mistake 5: Treating ranking as one global list.
Autocomplete often needs region, language, device, and safe-search variants.
# Summary: What to Remember
Top-k ranking turns event streams into small serving lists.
For autocomplete, the read path should return precomputed suggestions. The write path should aggregate completed searches by prefix and time window. Use exact counters when the cardinality is manageable, approximate heavy-hitter structures when memory matters, and stale snapshots when the ranking pipeline falls behind.
In interviews, always ask: top-k over what time window, for what region, and with what freshness target?