#Introduction
The interviewer says: "Design search autocomplete."
You suggest running a `LIKE 'prefix%'` query against the database on every keystroke. Then the follow-up arrives: "What if there are billions of queries, users type fast, and trending searches need to appear within minutes?"
Autocomplete is a low-latency top-k prefix lookup system. The read path should be precomputed and cache-friendly, while the write path aggregates completed searches into refreshed rankings.
Ready to practice? Try the Autocomplete practice problem and build this system step-by-step with AI-guided feedback.
#Functional Requirements
1. Prefix matching
- A user types a prefix
- The system returns matching suggestions
- The response should usually include the top 5 suggestions
A Trie is the classic data structure. Each edge represents a character, and each node represents a prefix. If each node stores the top suggestions for that prefix, serving a request is fast.
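A minimal sketch of this idea, in Python. The class and method names here are illustrative, not from any specific library; each node caches a precomputed top-k list so a lookup is just a walk down the prefix path.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.top = []       # precomputed [(score, text)] pairs for this prefix

class SuggestionTrie:
    def __init__(self, k=5):
        self.root = TrieNode()
        self.k = k

    def insert(self, text, score):
        """Offline build step: push a scored suggestion onto every prefix node."""
        node = self.root
        for ch in text:
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((score, text))
            node.top.sort(reverse=True)  # highest score first
            del node.top[self.k:]        # retain only the top-k entries

    def lookup(self, prefix):
        """Serving step: walk to the prefix node and return its cached top-k."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [text for _, text in node.top]
```

Because the ranked list is materialized at build time, the serving path does no scoring at all, e.g. `SuggestionTrie().lookup("new y")` returns the cached list for that node directly.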
2. Popularity ranking
- Suggestions are ranked by search frequency
- Ranking can be global, regional, language-specific, or personalized
- Completed search events are the source of popularity counts
The serving path should not count raw query events synchronously. Query events should flow into an aggregation pipeline that refreshes top-k snapshots.
#Non-Functional Requirements
Keystroke latency
Autocomplete should return in roughly 50-100ms. Use client-side debouncing, edge caching, in-memory prefix maps, and small response payloads. The read path is a specialized example of search serving architecture.
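Debouncing itself is usually client-side JavaScript, but the timing logic can be sketched language-agnostically. The sketch below, with illustrative names and an assumed 150ms quiet window, only fires a request once no new keystroke has arrived within that window:

```python
import time

class Debouncer:
    """Fire a suggestion request only after a quiet gap between keystrokes."""

    def __init__(self, quiet_ms=150):
        self.quiet = quiet_ms / 1000.0
        self.last_keystroke = None

    def on_keystroke(self, now=None):
        # Record when the user last typed; timestamps injectable for testing.
        self.last_keystroke = now if now is not None else time.monotonic()

    def should_fire(self, now=None):
        # True only if the quiet window has fully elapsed since the last keystroke.
        now = now if now is not None else time.monotonic()
        return (self.last_keystroke is not None
                and now - self.last_keystroke >= self.quiet)
```

With fast typists, most intermediate prefixes never trigger a request at all, which is a large part of how the QPS budget is met.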
Trending freshness
The system does not need exact counts per keystroke, but it should reflect major trends within minutes or hours. Batch refreshes are simpler. Streaming refreshes are fresher but more complex.
High QPS
Autocomplete receives a request for nearly every character typed. Hot prefixes such as "a", "new", or "weather" need aggressive caching.
Availability
If ranking refresh is delayed, serving should continue from the last known snapshot. Stale suggestions are better than an empty dropdown.
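One way to sketch stale-but-available serving (names here are illustrative): swap snapshots atomically, and if a refresh fails, keep answering from the last good one.

```python
class SnapshotServer:
    """Serve suggestions from the most recent successfully loaded snapshot."""

    def __init__(self):
        self.snapshot = {}  # prefix -> ranked suggestion list
        self.version = 0

    def refresh(self, load_snapshot):
        """Swap in a new snapshot; on failure, keep serving the old one."""
        try:
            new = load_snapshot()
        except Exception:
            return False  # ranking pipeline delayed: continue with stale data
        self.snapshot = new
        self.version += 1
        return True

    def suggest(self, prefix):
        return self.snapshot.get(prefix, [])
```

The key design choice is that a refresh failure changes nothing on the read path: users keep seeing yesterday's rankings instead of an empty dropdown.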
#API Design
Fetch suggestions
GET /api/v1/suggestions?prefix=new%20y&limit=5&locale=en-US&userRegion=US
Response:
{
  "suggestions": [
    { "text": "new york weather", "score": 98210, "source": "regional" },
    { "text": "new york times", "score": 77120, "source": "global" }
  ]
}
Record search event
POST /api/v1/search-events
Request:
{
  "eventId": "evt_123",
  "query": "new york weather",
  "userRegion": "US",
  "language": "en",
  "timestamp": "2026-04-22T12:00:00Z"
}
Response:
{
  "accepted": true,
  "eventId": "evt_123"
}
#High Level Design
The client debounces input and calls the closest edge/API endpoint. Hot prefixes are served from edge cache. Cache misses go to the suggestion service, which reads from a memory-resident Trie or prefix map.
Completed searches are appended to an event log. A ranking aggregator computes counts and top-k suggestions by prefix, region, language, and time window. Refreshed snapshots are written to the suggestion store and loaded into serving caches.
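The aggregator's snapshot step can be sketched as a pure function over per-window counts. The function name and the `(prefix, query)` keying are assumptions for illustration; a real pipeline would also partition by region, language, and window.

```python
import heapq
from collections import defaultdict

def build_snapshot(counts, k=5):
    """Turn (prefix, query) -> count pairs into prefix -> top-k query lists."""
    by_prefix = defaultdict(dict)
    for (prefix, query), n in counts.items():
        by_prefix[prefix][query] = n
    return {
        prefix: [q for q, _ in heapq.nlargest(k, qs.items(), key=lambda kv: kv[1])]
        for prefix, qs in by_prefix.items()
    }
```

The output is exactly what the serving layer wants to load: small, ranked, ready-to-return lists keyed by prefix, with no scoring left to do at request time.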
#Detailed Design
Trie serving
Each Trie node can store:
- prefix
- top 5 global suggestions
- top 5 regional suggestions
- last updated timestamp
This makes reads fast but increases write complexity because a completed query affects every prefix along its path.
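A direct sketch of that node layout, with illustrative field names, including a serving-time fallback from regional to global rankings:

```python
from dataclasses import dataclass, field

@dataclass
class ServingNode:
    prefix: str
    top_global: list = field(default_factory=list)    # top 5 global suggestions
    top_regional: dict = field(default_factory=dict)  # region -> top 5 suggestions
    last_updated: str = ""                            # ISO-8601 refresh timestamp

    def suggestions(self, region=None, k=5):
        """Prefer the requester's regional ranking; fall back to global."""
        ranked = self.top_regional.get(region, []) or self.top_global
        return ranked[:k]
```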
Aggregation
For the query "new york weather", increment counts for:
- n
- ne
- new
- new y
- new yo
- and so on
This can be done in batch jobs, stream processors, or a hybrid approach.
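The fan-out is simple to express in code: one completed query increments a counter for every prefix along its path. This sketch (names assumed) keys counts by `(prefix, query)` so top-k lists can later be derived per prefix; in production this runs inside the batch or stream job, never on the request path.

```python
from collections import Counter

def increment_prefixes(counts, query):
    """Credit one completed search to every prefix of the query."""
    for i in range(1, len(query) + 1):
        counts[(query[:i], query)] += 1

counts = Counter()
increment_prefixes(counts, "new york weather")
```

Note the write amplification: a 16-character query touches 16 prefix buckets, which is precisely why this work is batched rather than done synchronously.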
Caching
Cache hot prefixes at the edge and in service memory. Set minimum prefix lengths if one-character prefixes are too expensive or low quality.
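A minimal in-service cache illustrating both ideas: a TTL for hot prefixes and a minimum prefix length. The class name, the 2-character minimum, and the 60-second TTL are all illustrative choices, not requirements from the design.

```python
import time

class PrefixCache:
    """In-memory prefix cache with TTL expiry and a minimum prefix length."""

    def __init__(self, ttl_seconds=60, min_prefix_len=2):
        self.ttl = ttl_seconds
        self.min_len = min_prefix_len
        self.entries = {}  # prefix -> (expires_at, suggestions)

    def get(self, prefix, fetch):
        if len(prefix) < self.min_len:
            return []                      # refuse too-short prefixes outright
        hit = self.entries.get(prefix)
        now = time.monotonic()
        if hit and hit[0] > now:
            return hit[1]                  # fresh cache hit
        suggestions = fetch(prefix)        # miss: fall through to the service
        self.entries[prefix] = (now + self.ttl, suggestions)
        return suggestions
```

Passing the backend lookup in as `fetch` keeps the cache agnostic about whether misses hit a Trie, a prefix map, or a remote suggestion service.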
Personalization
Start with global and regional rankings. Add personalization only after the base system is correct, because personalization increases privacy, storage, and ranking complexity.
#Common Interview Mistakes
- Querying the primary database on every keystroke
- Returning all matches instead of top-k suggestions
- Updating Trie rankings synchronously on the user request path
- Ignoring client-side debouncing
- Forgetting stale-but-available serving during ranking pipeline failures