# Introduction
The interviewer says: "Design an ad click aggregator."
You say the browser records a click and the advertiser dashboard reads the database. Then the follow-up arrives: "What if the user double-clicks? What if the redirect service retries? What if advertisers query 90 days of data grouped by campaign and minute? What if you overbill them?"
This is a high-throughput event pipeline with a latency-sensitive redirect path, deduplication for billing correctness, stream aggregation with Flink, and an OLAP database for reporting.
# Functional Requirements
1. Click redirect
- User clicks a tracking URL owned by the ad platform
- The platform records the click event
- The platform returns a server-side 302 redirect to the advertiser URL
- The redirect path must be fast because the user is waiting
## Click Redirect Flow
```
Browser -> /click/{impressionId} -> Redirect Service
Redirect Service -> enqueue click event
Redirect Service -> 302 Location: advertiser_url
Browser -> Advertiser
```
The ad link should not point directly to the advertiser. If it does, the platform cannot reliably count or bill the click.
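The flow above can be sketched as a minimal handler. This is an illustrative stand-in, not a production server: `CLICK_QUEUE` plays the role of a Kafka producer, and `lookup_advertiser_url` is a hypothetical helper that would resolve the landing page from the impression record.

```python
import json
import time

CLICK_QUEUE = []  # in-memory stand-in for a Kafka producer


def lookup_advertiser_url(impression_id: str) -> str:
    # In a real system this would come from the stored impression record.
    return "https://advertiser.example/landing-page"


def handle_click(impression_id: str, ad_id: str, campaign_id: str):
    """Record the click event, then send the user on with a 302."""
    event = {
        "impressionId": impression_id,
        "adId": ad_id,
        "campaignId": campaign_id,
        "timestamp": time.time(),
    }
    # Fire-and-forget enqueue: the user-facing path does minimal work.
    CLICK_QUEUE.append(json.dumps(event))
    # Return only a status and a Location header; aggregation happens downstream.
    return 302, {"Location": lookup_advertiser_url(impression_id)}
```

The key property is that the handler does two cheap operations, enqueue and redirect, and nothing else on the user path.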
2. Minute-level aggregation
- Count clicks by ad, campaign, publisher, country, device, and minute
- Serve advertiser dashboards with recent and historical data
- Preserve raw click events for audits and reprocessing
# Non-Functional Requirements
Billing correctness
Every event needs a stable deduplication key, usually impression_id or click_id. Duplicate redirects, retries, and replayed Kafka messages must not double-count billable clicks.
High throughput
The system should handle 10k+ clicks per second and scale much higher during campaigns. The redirect service writes to Kafka instead of updating dashboard counters synchronously.
Sub-second query latency
Advertiser dashboards are read-heavy and aggregation-heavy. Serve them from ClickHouse, Druid, Pinot, or another columnar OLAP store.
Fault tolerance
Kafka retains raw click events. Flink checkpoints aggregation state. OLAP writes should be idempotent or replaceable by window key.
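"Idempotent or replaceable by window key" can be made concrete with a tiny sketch. The dictionary below stands in for a `clicks_by_minute` table; the assumption is that the sink overwrites the full row for a `(minute, ad_id)` key with the recomputed total instead of incrementing it, so a replayed write cannot double-count.

```python
olap_table = {}  # stand-in for the clicks_by_minute OLAP table


def upsert_window(minute: str, ad_id: str, clicks: int):
    """Replace-by-key write: replays of the same window are harmless."""
    olap_table[(minute, ad_id)] = clicks


upsert_window("2026-04-20T12:00", "ad_9", 841)
upsert_window("2026-04-20T12:00", "ad_9", 841)  # replayed write, same result
```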
# API Design
Track click and redirect
```
GET /api/v1/click/imp_abc123?ad_id=ad_9&campaign_id=cmp_7
```
Response:
```
302 Found
Location: https://advertiser.example/landing-page
```
The redirect service emits:
```json
{
  "impressionId": "imp_abc123",
  "adId": "ad_9",
  "campaignId": "cmp_7",
  "publisherId": "pub_4",
  "timestamp": "2026-04-20T12:00:00Z",
  "ipHash": "hash",
  "userAgent": "Mozilla/5.0"
}
```
Query click reports
```
GET /api/v1/reports/clicks?campaign_id=cmp_7&from=2026-04-01&to=2026-04-20&groupBy=minute,ad_id
```
Response:
```json
{
  "rows": [
    {
      "minute": "2026-04-20T12:00:00Z",
      "adId": "ad_9",
      "clicks": 841
    }
  ]
}
```
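Serving this endpoint from pre-aggregated rows reduces to a filter over minute aggregates. The sketch below stands in for the OLAP query; it assumes ISO-8601 timestamps, which compare correctly as strings, and hypothetical row dictionaries shaped like the `clicks_by_minute` table.

```python
def query_clicks(rows, campaign_id: str, start: str, end: str):
    """Filter minute-aggregate rows for one campaign and time range.

    ISO-8601 strings sort lexicographically, so plain string comparison
    works for the range check.
    """
    return [
        r for r in rows
        if r["campaignId"] == campaign_id and start <= r["minute"] <= end
    ]
```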
# High-Level Design
Redirect service
Receives click requests, validates the impression, emits the click event, and returns a 302. This service is on the user path, so it must do minimal work.
Redis dedup
Fast check for recent impression IDs or click IDs. This prevents obvious duplicate billable events before Kafka.
Kafka click log
Durable append-only stream of accepted click events. It allows replay and decouples redirects from aggregation.
Flink aggregator
Deduplicates with state, computes 1-minute windows, handles late events, and writes aggregate rows.
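The aggregation logic can be shown in pure Python; this is a sketch of what the Flink job computes, not Flink API code. The `seen` set stands in for keyed Flink state, and the minute bucket is derived by truncating the ISO timestamp.

```python
from collections import defaultdict


def aggregate_minutes(events):
    """Tumbling 1-minute click counts with per-key dedup state."""
    seen = set()               # stands in for keyed, checkpointed Flink state
    counts = defaultdict(int)  # (minute, ad_id) -> clicks
    for ev in events:
        if ev["impressionId"] in seen:
            continue           # drop replayed or duplicate events
        seen.add(ev["impressionId"])
        # "2026-04-20T12:00:05Z" -> "2026-04-20T12:00:00Z" (minute bucket)
        minute = ev["timestamp"][:16] + ":00Z"
        counts[(minute, ev["adId"])] += 1
    return dict(counts)
```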
OLAP store
Stores raw or aggregated click facts in a columnar format for advertiser dashboards.
Reporting API
Accepts time range, filters, group-by dimensions, and granularity. Reads pre-aggregated tables when possible.
# Detailed Design
## Reporting Path
The dashboard should not query the redirect database. It should query OLAP tables shaped for analytics:
```
raw_clicks(impression_id, ad_id, campaign_id, publisher_id, timestamp, country, device)
clicks_by_minute(minute, ad_id, campaign_id, publisher_id, clicks)
clicks_by_day(day, campaign_id, clicks)
```
Recent dashboards read minute aggregates. Long-range dashboards read daily aggregates. Raw clicks are retained for audits and new aggregate definitions.
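The daily table is derivable from the minute table, which is why keeping both is cheap. A sketch of that rollup, assuming row dictionaries shaped like `clicks_by_minute` with ISO timestamps:

```python
from collections import defaultdict


def rollup_to_day(minute_rows):
    """Derive clicks_by_day rows from clicks_by_minute rows."""
    daily = defaultdict(int)
    for row in minute_rows:
        day = row["minute"][:10]  # "2026-04-20T12:00:00Z" -> "2026-04-20"
        daily[(day, row["campaignId"])] += row["clicks"]
    return dict(daily)
```

Because raw clicks are also retained, both tables can be rebuilt from scratch if an aggregate definition changes.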
## Deduplication
Use multiple layers:
- the redirect service checks Redis with `SET NX click:{impressionId}`
- Flink keeps state for recently seen click IDs
- OLAP aggregate writes are keyed by window and dimensions
This is how at-least-once delivery becomes effectively exactly-once counting.
## Late Events
Clicks can arrive late due to network retries or Kafka lag. Flink should allow a lateness window, then either update the aggregate or write a correction event. Billing systems should make this policy explicit.
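The policy can be stated as a small decision function. This is a sketch, not Flink's API: timestamps are simplified to epoch seconds, and the two-minute `allowed_lateness` is an assumed value that a billing team would set explicitly.

```python
def handle_late_click(event_ts: float, window_end: float,
                      watermark: float, allowed_lateness: float = 120):
    """Decide how a click is applied relative to its 1-minute window.

    - event belongs to a later window: route it there
    - window still open (watermark within lateness): update the aggregate
    - window finalized: emit a correction event so billing stays auditable
    """
    if event_ts >= window_end:
        return "belongs_to_later_window"
    if watermark < window_end + allowed_lateness:
        return "update_aggregate"
    return "emit_correction"
```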
# Common Interview Mistakes
- Letting the browser go directly to the advertiser. The platform cannot bill what it does not observe.
- Incrementing SQL counters per click. This creates hot rows and write amplification.
- Ignoring duplicate clicks. Duplicate billing breaks trust immediately.
- Using OLTP storage for advertiser dashboards. Dashboard queries are OLAP workloads.