# Introduction
You are designing autocomplete. The user types `new`, and the service has 40,000 matching queries.
The product asks for the top 5.
That sounds simple until the interviewer asks:
"Top by what time window? What if a celebrity goes viral in the last 10 minutes? What if one query gets 10 million events per hour?"
Top-k ranking is not just sorting. It is deciding which items matter under high write volume, time windows, and freshness constraints.
# Exact Top-K
The exact version is straightforward:
| query | count |
|---|---|
| new york weather | 98,210 |
| new york times | 77,120 |
| new york flights | 52,004 |
Then sort by count and keep the largest k.
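In code, "sort and keep the largest k" is a one-liner with a heap. A minimal sketch (the `top_k` helper and the in-memory dict are illustrative, not a production counter store):

```python
import heapq

def top_k(counts, k):
    """Return the k (query, count) pairs with the highest counts."""
    return heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])

# Counts from the table above.
counts = {
    "new york weather": 98_210,
    "new york times": 77_120,
    "new york flights": 52_004,
}
print(top_k(counts, 2))
# → [('new york weather', 98210), ('new york times', 77120)]
```

`heapq.nlargest` runs in O(n log k), which beats a full sort when k is small relative to the candidate set.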
For small datasets, a database query or Redis Sorted Set is enough:
```
ZINCRBY autocomplete:new 1 "new york weather"
ZREVRANGE autocomplete:new 0 4 WITHSCORES
```
For larger systems, exact top-k on every request is too expensive. You do not want the read path to scan and sort thousands of candidates per keystroke.
Instead, compute top-k in the background and serve snapshots.
# Heavy Hitters and Approximate Counting
Heavy hitters are the items that dominate the stream.
If the system receives billions of search events, exact per-query counters may be expensive. Approximate structures can reduce memory:
- Count-Min Sketch estimates frequencies with bounded overcounting.
- Space-Saving tracks likely top items in a fixed-size structure.
- Reservoir sampling helps with representative samples, not exact top-k.
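As one concrete example, Space-Saving fits in a few lines. This is a minimal sketch of the standard algorithm: keep at most k counters, and when a new item arrives with all counters taken, evict the smallest counter and let the newcomer inherit its count plus one.

```python
def space_saving(stream, k):
    """Track likely heavy hitters using at most k counters."""
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its
            # count + 1, which bounds how much any item is overcounted.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts
```

True heavy hitters keep their counters because they are incremented too often to become the minimum, while rare items churn through the evicted slots. Counts can overestimate but never underestimate the true frequency.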
The interview point is not to name every algorithm. It is to explain the tradeoff:
| Approach | Benefit | Cost |
|---|---|---|
| Exact counters | precise | high memory for huge cardinality |
| Approximate counters | bounded memory | possible ranking errors |
| Batch aggregation | simple and stable | freshness lag |
| Streaming aggregation | fresher | more operational complexity |
Autocomplete can usually tolerate approximate counts. Billing systems usually cannot.
# Time Windows and Decay
"Most popular" needs a time window.
All-time popularity is stable but slow to react. Last-minute popularity is fresh but noisy.
Common strategies:
- all-time count for stable suggestions
- trailing 24-hour count for current demand
- trailing 5-minute count for trends
- exponential decay so old events gradually matter less
Example score:
```
score = all_time_count * 0.5
      + last_24h_count * 2.0
      + last_10m_count * 5.0
```
The exact formula is a product decision. The system design point is that rankings should be precomputed per window, region, and language if those dimensions affect results.
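The exponential-decay strategy from the list above can be sketched as a single weighting function. The half-life of one hour is an assumed tuning value, not a recommendation:

```python
def decayed_count(count, event_age_s, half_life_s=3600.0):
    """Weight a count by 0.5 ** (age / half_life): an event one
    half-life old contributes half its original weight."""
    return count * 0.5 ** (event_age_s / half_life_s)
```

A nice property of exponential decay is that a stored counter can be decayed lazily at read or update time from its last-touched timestamp, instead of being rewritten on a timer.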
# Serving Snapshots
The read path should not compute global popularity.
A typical flow:
```
Search events
  -> Kafka or event log
  -> stream/batch aggregation
  -> top-k by prefix, region, language
  -> durable suggestion store
  -> serving cache
```
The serving service reads a compact result:
```json
{
  "prefix": "new y",
  "region": "US",
  "suggestions": [
    { "text": "new york weather", "score": 98210 },
    { "text": "new york times", "score": 77120 }
  ],
  "generatedAt": "2026-04-22T12:00:00Z"
}
```
If the ranking job is delayed, serve the last good snapshot. Stale suggestions are usually better than an empty dropdown.
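The "last good snapshot" behavior can be sketched as a small in-process cache. The class name, the freshness target, and the `(suggestions, stale)` return shape are all assumptions for illustration:

```python
class SnapshotCache:
    """Serve precomputed top-k snapshots; prefer stale data
    over an empty dropdown."""

    def __init__(self, freshness_target_s=600.0):
        self._snapshots = {}  # prefix -> (suggestions, generated_at)
        self.freshness_target_s = freshness_target_s

    def put(self, prefix, suggestions, generated_at):
        self._snapshots[prefix] = (list(suggestions), generated_at)

    def get(self, prefix, now):
        entry = self._snapshots.get(prefix)
        if entry is None:
            return [], False  # never computed: nothing to serve
        suggestions, generated_at = entry
        stale = (now - generated_at) > self.freshness_target_s
        # Serve even when stale; a delayed pipeline should not blank the UI.
        return suggestions, stale
```

The `stale` flag lets the caller emit a metric or trigger a refresh without withholding results from the user.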
# Common Interview Mistakes
Mistake 1: Sorting on the request path.
The user is waiting after every keystroke. Serve precomputed top-k.
Mistake 2: Not defining the window.
All-time top results and trending results are different products.
Mistake 3: Using exact counters when approximate is acceptable.
Exactness costs memory and write throughput. Use it only where correctness demands it.
Mistake 4: Ignoring hot keys.
Popular prefixes can dominate writes and reads. Cache hot snapshots and partition aggregation carefully.
Mistake 5: Treating ranking as one global list.
Autocomplete often needs region, language, device, and safe-search variants.
# Summary: What to Remember
Top-k ranking turns event streams into small serving lists.
For autocomplete, the read path should return precomputed suggestions. The write path should aggregate completed searches by prefix and time window. Use exact counters when the cardinality is manageable, approximate heavy-hitter structures when memory matters, and stale snapshots when the ranking pipeline falls behind.
In interviews, always ask: top-k over what time window, for what region, and with what freshness target?