# Introduction
You are designing autocomplete. A user types:

    n
    ne
    new
    new y
Your first idea is a SQL query:

    SELECT query
    FROM searches
    WHERE query LIKE 'new y%'
    ORDER BY frequency DESC
    LIMIT 5;
That may work for a small table. It does not work when every keystroke from every user becomes a request.
Autocomplete needs a prefix data structure. The classic answer is a Trie.
# Why Prefix Search Needs a Trie

A Trie stores strings one character at a time, so new york becomes a single path of nodes from the root:

    root
     └─ n
        └─ e
           └─ w
              └─ " "
                 └─ y
                    └─ o
                       └─ r
                          └─ k
To look up new y, the service walks five characters from the root. It does not scan every search query. Once it reaches the prefix node, it reads the suggestions stored there.
The lookup cost is tied to prefix length, not the total number of indexed queries:

    lookup("new y") = walk n -> e -> w -> space -> y
That is why Tries fit typeahead. The user's input is always a prefix.
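The prefix walk can be sketched in a few lines of Python. This is a minimal illustration, not a production structure; names like `Trie.walk` are ours, not a standard API:

```python
# Minimal Trie sketch: insert() builds one node per character,
# walk() follows the prefix character by character.
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())

    def walk(self, prefix):
        # Cost is O(len(prefix)), independent of how many queries are stored.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None  # no indexed query starts with this prefix
        return node


trie = Trie()
trie.insert("new york weather")
trie.insert("new year countdown")
print(trie.walk("new y") is not None)  # True: five steps from the root
```

The walk touches exactly one node per typed character, which is why lookup cost does not grow with the corpus.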
# Storing Top-K at Each Node
A naive Trie can find all words below a prefix. Autocomplete does not need all words. It needs the best five.
Store top-k suggestions at each node:

    node("new y"):
      1. new york weather
      2. new york times
      3. new york city
      4. new york flights
      5. new year countdown
Now serving is simple:
- Walk to the prefix node.
- Return that node's top suggestions.
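A sketch of this layout, assuming the whole corpus is rebuilt offline so each node's list can be filled in frequency order. All names here are illustrative:

```python
# Each node carries a precomputed top-k list, so serving is walk + read.
class Node:
    def __init__(self):
        self.children = {}
        self.top_k = []  # best suggestions for this prefix, most popular first


def build(suggestions, k=5):
    """Build from (query, frequency) pairs; offline, not on the request path."""
    root = Node()
    # Inserting in descending frequency keeps every top_k list sorted for free.
    for query, _freq in sorted(suggestions, key=lambda s: -s[1]):
        node = root
        for ch in query:
            node = node.children.setdefault(ch, Node())
            if len(node.top_k) < k:
                node.top_k.append(query)
    return root


def suggest(root, prefix):
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.top_k  # no subtree scan: the answer is already at the node


root = build([("new york weather", 900), ("new york times", 800),
              ("new year countdown", 300)])
print(suggest(root, "new y"))
# ['new york weather', 'new york times', 'new year countdown']
```

Serving reads nothing below the prefix node; the subtree scan was paid once at build time.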
The tradeoff is write cost. When new york weather becomes more popular, the system may need to update top-k lists for every prefix along its path:

    n
    ne
    new
    "new " (note the trailing space)
    new y
    new yo
    ...
    new york weather
That is why many production systems update Trie snapshots asynchronously instead of updating every node on the user request path.
# Updates and Rebuilds
There are two common update strategies.
**Incremental updates**

Every completed search updates counters and may update top-k lists for each prefix.
This is fresher, but it can create write amplification. One query affects many prefixes.
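The amplification is easy to quantify: a query of length n touches n prefixes on every completed search.

```python
# One completed search updates a counter (and possibly a top-k list)
# for every prefix of the query -- this is the write amplification.
def prefixes(query):
    return [query[:i] for i in range(1, len(query) + 1)]


touched = prefixes("new york weather")
print(len(touched))  # 16 prefix updates for a single 16-character query
```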
**Batch or streaming rebuilds**

Completed searches are appended to an event log. A background job computes fresh top-k suggestions and publishes a new snapshot.
This is usually simpler:

    Search events -> counters -> top-k per prefix -> new Trie snapshot -> serving fleet
The serving fleet can keep the old snapshot while the new one builds, then atomically swap.
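The swap can be sketched as holding a single reference to the current snapshot, assuming a single-process server where rebinding one reference is enough (in CPython, rebinding an attribute is atomic with respect to concurrent readers):

```python
# Sketch: readers always see either the old or the new snapshot, never a
# half-built one, because the rebuild happens off to the side and only the
# final reference assignment is visible to the serving path.
import threading


class SnapshotHolder:
    def __init__(self, initial):
        self._snapshot = initial
        self._publish_lock = threading.Lock()

    def get(self):
        return self._snapshot  # read path never blocks on a rebuild

    def publish(self, new_snapshot):
        with self._publish_lock:  # serialize background publishers
            self._snapshot = new_snapshot


holder = SnapshotHolder({"new y": ["new york weather"]})
# Background job finishes building a fresh snapshot, then swaps it in:
holder.publish({"new y": ["new york weather", "new york times"]})
print(holder.get()["new y"])
```

Across a fleet, the same idea applies per node: each server loads the new snapshot beside the old one, then flips.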
For more on the ranking side of this pipeline, see Top-K Ranking and Heavy Hitters.
# Serving Architecture
The read path should be boring:

    Client debounce
      -> edge cache for hot prefixes
      -> suggestion service
      -> in-memory Trie or prefix map
      -> JSON response
Hot prefixes like a, new, and weather should sit in memory or at the edge. Cold prefixes can fall back to a durable suggestion store.
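The hot/cold split can be sketched as a two-tier lookup. The durable store here is a hypothetical stand-in; in practice it would be a key-value database read:

```python
# Sketch: hot prefixes served from an in-memory map, cold prefixes falling
# back to a slower durable store. `durable_store_lookup` is a stand-in.
HOT = {"new y": ["new york weather", "new york times"]}


def durable_store_lookup(prefix):
    # Stand-in for a KV/database read; returns [] when nothing is indexed.
    cold = {"zanzibar ho": ["zanzibar hotels"]}
    return cold.get(prefix, [])


def suggestions(prefix):
    if prefix in HOT:
        return HOT[prefix]  # memory/edge hit: no durable-store round trip
    return durable_store_lookup(prefix)


print(suggestions("new y")[0])        # new york weather
print(suggestions("zanzibar ho")[0])  # zanzibar hotels
```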
A Trie is not the only representation. Some systems use:
- sorted arrays of terms with binary search by prefix
- finite state transducers for compact memory use
- key-value maps from prefix to top-k suggestions
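The sorted-array alternative is worth sketching, since it needs only the standard library: binary search finds the first term at or after the prefix, then a short scan collects matches.

```python
# Sorted-array prefix search: bisect to the first candidate, then scan
# forward while terms still start with the prefix.
import bisect

terms = sorted(["new york weather", "new york times", "new year countdown",
                "news today", "netflix"])


def prefix_matches(terms, prefix, k=5):
    i = bisect.bisect_left(terms, prefix)
    out = []
    while i < len(terms) and terms[i].startswith(prefix) and len(out) < k:
        out.append(terms[i])
        i += 1
    return out


print(prefix_matches(terms, "new y"))
# ['new year countdown', 'new york times', 'new york weather']
```

Note this returns matches in lexicographic order, so a ranking step (or a co-located rank) is still needed to pick the top-k by popularity.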
The interview answer can start with a Trie. Then mention that the production serving representation may be optimized for memory and deployment speed.
# Common Interview Mistakes
Mistake 1: Running a database prefix scan per keystroke.
Autocomplete is too latency-sensitive and high-QPS for the primary database.
Mistake 2: Returning all matches.
The UI needs top-k suggestions, not every string below the prefix.
Mistake 3: Updating every prefix synchronously.
That creates write amplification on the user request path.
Mistake 4: Ignoring memory.
Top-k lists at every node are fast but can be large. Compress, limit prefix depth, or store only popular prefixes.
Mistake 5: Forgetting client debouncing.
Even a perfect backend gets hammered if the client sends every intermediate keystroke without delay.
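Debouncing is easy to reason about as a pure function over keystroke timestamps: a request fires only when no further keystroke arrives within the delay window. A sketch, with an assumed 150 ms window:

```python
# Sketch of trailing-edge debouncing: a keystroke triggers a request only
# if the user pauses for at least `delay` seconds afterward.
def debounced_requests(keystrokes, delay=0.15):
    """keystrokes: list of (timestamp_seconds, text_so_far)."""
    out = []
    for i, (t, text) in enumerate(keystrokes):
        next_t = keystrokes[i + 1][0] if i + 1 < len(keystrokes) else float("inf")
        if next_t - t >= delay:
            out.append(text)
    return out


events = [(0.00, "n"), (0.05, "ne"), (0.10, "new"), (0.60, "new y")]
print(debounced_requests(events))  # ['new', 'new y']
```

Four keystrokes become two requests; fast typists generate proportionally larger savings.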
# Summary: What to Remember
A Trie is a natural fit for autocomplete because the request is a prefix lookup.
The production trick is not just "use a Trie." It is storing top-k suggestions at prefix nodes, refreshing rankings asynchronously, serving hot prefixes from memory or edge cache, and keeping the read path separate from the event ingestion path.
In interviews, explain prefix traversal, top-k storage, update write amplification, and snapshot rebuilds. That is enough depth for most autocomplete designs.