Tries and Prefix Search

Serving autocomplete with prefix traversal and top-k suggestions

System Design Sandbox · 11 min read
Learn how Tries support low-latency autocomplete. Covers prefix traversal, storing top-k suggestions at nodes, write amplification, snapshot rebuilds, serving caches, and memory tradeoffs.

#Introduction

You are designing autocomplete. A user types:

n
ne
new
new y

Your first idea is a SQL query:

SELECT query
FROM searches
WHERE query LIKE 'new y%'
ORDER BY frequency DESC
LIMIT 5;

That may work for a small table. It does not work when every keystroke from every user becomes a request.

Autocomplete needs a prefix data structure. The classic answer is a Trie.


#Why Prefix Search Needs a Trie

A Trie stores strings one character at a time.

root
  n
   e
    w
     " "
       y
        o
         r
          k

To look up new y, the service walks five characters from the root. It does not scan every search query. Once it reaches the prefix node, it reads the suggestions stored there.

[Diagram: walking the Trie one character at a time -- Root -> n -> ne -> new -> new y -> Top-5 Suggestions]

The lookup cost is tied to prefix length, not the total number of indexed queries:

lookup("new y") = walk n -> e -> w -> space -> y

That is why Tries fit typeahead. The user's input is always a prefix.
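That walk can be sketched in a few lines. This is a minimal illustration with made-up class names, not a production implementation:

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query):
        """Add one character-labeled edge per character of the query."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())

    def find_prefix_node(self, prefix):
        """Walk len(prefix) edges from the root; None if the prefix is unknown."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

trie = Trie()
trie.insert("new york weather")
node = trie.find_prefix_node("new y")  # walks exactly 5 edges
```

The cost of `find_prefix_node` is O(len(prefix)) regardless of how many queries were inserted.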


#Storing Top-K at Each Node

A naive Trie can find all words below a prefix. Autocomplete does not need all words. It needs the best five.

Store top-k suggestions at each node:

node("new y"):
  1. new york weather
  2. new york times
  3. new york city
  4. new york flights
  5. new year countdown

Now serving is simple:

  1. Walk to the prefix node.
  2. Return that node's top suggestions.
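The two steps above reduce to a walk plus a list read. A sketch, assuming the top-k lists were precomputed offline (names here are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top_k = []  # precomputed suggestions, best first

def build(suggestions_by_prefix):
    """Build a trie whose prefix nodes carry precomputed top-k lists."""
    root = TrieNode()
    for prefix, suggestions in suggestions_by_prefix.items():
        node = root
        for ch in prefix:
            node = node.children.setdefault(ch, TrieNode())
        node.top_k = suggestions
    return root

def serve(root, prefix, k=5):
    """Walk to the prefix node and return its stored suggestions."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []  # unknown prefix: nothing to suggest
    return node.top_k[:k]

root = build({"new y": ["new york weather", "new york times", "new york city"]})
serve(root, "new y", k=2)  # ['new york weather', 'new york times']
```

No ranking happens on the read path; the node already holds the answer.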

The tradeoff is write cost. When new york weather becomes more popular, the system may need to update top-k lists for:

"n"
"ne"
"new"
"new "
"new y"
"new yo"
...
"new york weather"

That is why many production systems update Trie snapshots asynchronously instead of updating every node on the user request path.
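The fan-out is easy to quantify: one query touches one prefix node per character. A small illustration:

```python
def touched_prefixes(query):
    """Every prefix whose top-k list may change when `query` gains popularity."""
    return [query[:i] for i in range(1, len(query) + 1)]

prefixes = touched_prefixes("new york weather")
len(prefixes)  # 16 -- one potential top-k update per prefix node
```

A single popular 16-character query can dirty 16 top-k lists, which is why doing this synchronously per search is expensive.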


#Updates and Rebuilds

There are two common update strategies.

Incremental updates

Every completed search updates counters and may update top-k lists for each prefix.

This is fresher, but it can create write amplification. One query affects many prefixes.

Batch or streaming rebuilds

Completed searches are appended to an event log. A background job computes fresh top-k suggestions and publishes a new snapshot.

This is usually simpler:

Search events -> counters -> top-k per prefix -> new Trie snapshot -> serving fleet

The serving fleet can keep the old snapshot while the new one builds, then atomically swap.
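The offline job in that pipeline can be sketched as: aggregate counts from the event log, fan each query out to its prefixes, and keep the k most frequent per prefix. This is an illustrative sketch, not a production ranker:

```python
from collections import Counter, defaultdict
import heapq

def build_snapshot(search_events, k=5):
    """Offline job: aggregate frequencies, then compute top-k per prefix."""
    counts = Counter(search_events)
    candidates = defaultdict(list)
    for query, freq in counts.items():
        for i in range(1, len(query) + 1):
            candidates[query[:i]].append((freq, query))
    # Keep only the k most frequent suggestions per prefix, best first.
    return {prefix: [q for _, q in heapq.nlargest(k, cands)]
            for prefix, cands in candidates.items()}

snapshot = build_snapshot(
    ["new york weather"] * 3 + ["new york times"] * 2 + ["new year countdown"],
    k=2,
)
snapshot["new y"]  # ['new york weather', 'new york times']
```

The serving fleet can hold the result behind a single reference and swap it atomically once the new snapshot is fully built.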

For more on the ranking side of this pipeline, see Top-K Ranking and Heavy Hitters.


#Serving Architecture

The read path should be boring:

Client debounce
  -> edge cache for hot prefixes
  -> suggestion service
  -> in-memory Trie or prefix map
  -> JSON response

Hot prefixes like a, new, and weather should sit in memory or at the edge. Cold prefixes can fall back to a durable suggestion store.

A Trie is not the only representation. Some systems use:

  • sorted arrays of terms with binary search by prefix
  • finite state transducers for compact memory use
  • key-value maps from prefix to top-k suggestions

An interview answer can start with a Trie, then mention that the production serving representation is often optimized for memory and deployment speed.
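The simplest alternative on that list, a sorted term array with binary search by prefix, fits in a few lines (illustrative sketch; ranking would be applied separately):

```python
import bisect

TERMS = sorted([
    "new year countdown", "new york city", "new york times",
    "new york weather", "news today",
])

def prefix_matches(prefix, k=5):
    """Binary-search for the first term >= prefix, then scan forward."""
    i = bisect.bisect_left(TERMS, prefix)
    out = []
    while i < len(TERMS) and TERMS[i].startswith(prefix) and len(out) < k:
        out.append(TERMS[i])
        i += 1
    return out

prefix_matches("new y")  # all terms sharing the prefix, in lexical order
```

Because all matches for a prefix are contiguous in sorted order, one `bisect_left` plus a short forward scan replaces the tree walk, at the cost of sorting on every rebuild.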


#Common Interview Mistakes

Mistake 1: Running a database prefix scan per keystroke.

Autocomplete is too latency-sensitive and high-QPS for the primary database.

Mistake 2: Returning all matches.

The UI needs top-k suggestions, not every string below the prefix.

Mistake 3: Updating every prefix synchronously.

That creates write amplification on the user request path.

Mistake 4: Ignoring memory.

Top-k lists at every node are fast but can be large. Compress, limit prefix depth, or store only popular prefixes.

Mistake 5: Forgetting client debouncing.

Even a perfect backend gets hammered if the client sends every intermediate keystroke without delay.


#Summary: What to Remember

A Trie is a natural fit for autocomplete because the request is a prefix lookup.

The production trick is not just "use a Trie." It is storing top-k suggestions at prefix nodes, refreshing rankings asynchronously, serving hot prefixes from memory or edge cache, and keeping the read path separate from the event ingestion path.

In interviews, explain prefix traversal, top-k storage, update write amplification, and snapshot rebuilds. That is enough depth for most autocomplete designs.