Design an Online Judge

Building a LeetCode-style platform for secure code execution and contest ranking

System Design Sandbox · 15 min read
Learn how to design an online judge like LeetCode. Covers async submission execution, secure sandboxing, language worker pools, verdict persistence, WebSocket feedback, and Redis-backed contest leaderboards.

#Introduction

The interviewer says: "Design LeetCode."

A weak answer starts with problems, submissions, and a database. A stronger answer starts with the risk: users submit untrusted code, the system must execute it safely, and contest users expect feedback quickly.

An online judge is an async execution pipeline with a secure sandbox and a real-time results path.

Ready to practice? Try the Online Judge practice problem and build this system step-by-step with AI-guided feedback.

Related concepts for this design: Message Queues, Redis Sorted Sets, Scaling, Idempotency & Deduplication, and System Design Structure.


#Functional Requirements

1. Submit and execute code

  • Users submit source code for a problem and language
  • The platform compiles or interprets the code
  • The judge runs hidden and sample test cases
  • The system returns a verdict

Execution should use async job worker pools. The API accepts the submission and returns quickly. Workers compile and run the code later.
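A minimal sketch of this split, using Python's standard `queue` module as a stand-in for a real message broker (the `submit` and `worker_loop` names, and the `sub_` ID prefix, are illustrative assumptions):

```python
import queue
import uuid

execution_queue: "queue.Queue[dict]" = queue.Queue()

def submit(problem_id: str, language: str, source_code: str) -> dict:
    """API path: persist the submission (omitted here), enqueue an
    execution job, and return immediately without running any code."""
    submission_id = f"sub_{uuid.uuid4().hex[:8]}"
    execution_queue.put({
        "submissionId": submission_id,
        "problemId": problem_id,
        "language": language,
        "sourceCode": source_code,
    })
    return {"submissionId": submission_id, "status": "queued"}

def worker_loop() -> None:
    """Worker path: claim jobs and judge them asynchronously."""
    while True:
        job = execution_queue.get()
        # compile and run inside a sandbox here; record the verdict
        execution_queue.task_done()
```

The key property: `submit` never blocks on compilation or execution, so API latency stays flat even when judging is slow.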

2. Competition leaderboard

  • Contests have many simultaneous submissions
  • Users need rank updates quickly
  • Ranking may depend on solved count, penalty time, or score

Use Redis Sorted Sets or a similar ranked serving structure. Do not run full SQL aggregates for every leaderboard refresh.


#Non-Functional Requirements

Secure sandboxing

User code is hostile by default. Run it in a sandboxed execution environment with CPU, memory, process, output, filesystem, and network limits.
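As a baseline illustration (not a full sandbox), POSIX resource limits can be applied to a child process before it executes user code. This sketch uses Python's `resource` and `subprocess` modules; the specific limit values are assumptions, and production judges layer this under containers or microVMs rather than relying on rlimits alone:

```python
import resource
import subprocess
import sys

def apply_limits() -> None:
    """Runs in the child after fork, before exec (POSIX only)."""
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))                    # CPU seconds
    resource.setrlimit(resource.RLIMIT_AS, (1 << 30, 1 << 30))         # 1 GiB address space
    resource.setrlimit(resource.RLIMIT_NPROC, (16, 16))                # fork bomb guard
    resource.setrlimit(resource.RLIMIT_FSIZE, (1 << 20, 1 << 20))      # 1 MiB file writes

def run_limited(cmd: list[str], stdin_data: str, wall_timeout: float = 5.0):
    """Wall-clock timeout backstops RLIMIT_CPU (which a sleeping
    process never consumes)."""
    return subprocess.run(cmd, input=stdin_data, capture_output=True,
                          text=True, timeout=wall_timeout,
                          preexec_fn=apply_limits)
```

Note what rlimits do not cover: network access and filesystem visibility still need namespaces, seccomp, or a microVM boundary.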

Feedback latency

Most submissions should reach a verdict within a few seconds. Use warm worker pools for common languages, prioritize contest queues, and push status over WebSockets.

Reliability

Persist submissions before enqueueing execution jobs. If a worker crashes, the job should be reclaimed or marked as an infrastructure failure. This is where message visibility timeouts and idempotent consumers matter.

Fairness

Contest traffic should not be starved by bulk practice submissions. Use separate queues and quotas.
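One way to sketch that quota: a weighted dispatcher that gives contest jobs three of every four slots but never wastes a slot when one queue is empty (the 3:1 ratio is an assumed example value):

```python
from collections import deque

contest_q: deque = deque()
practice_q: deque = deque()

def next_job(tick: int):
    """Contest jobs get 3 of every 4 dispatch slots; practice gets the
    fourth. Fall back to the other queue when the preferred one is empty."""
    if tick % 4 != 3:
        primary, fallback = contest_q, practice_q
    else:
        primary, fallback = practice_q, contest_q
    if primary:
        return primary.popleft()
    if fallback:
        return fallback.popleft()
    return None
```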


#API Design

Create submission

POST /api/v1/submissions

Request:

{
  "problemId": "two-sum",
  "language": "python3",
  "sourceCode": "print('hello')",
  "contestId": "weekly-451"
}

Response:

{
  "submissionId": "sub_123",
  "status": "queued"
}

Get submission result

GET /api/v1/submissions/sub_123

Response:

{
  "submissionId": "sub_123",
  "status": "accepted",
  "runtimeMs": 42,
  "memoryKb": 10240
}

Contest leaderboard

GET /api/v1/contests/weekly-451/leaderboard?cursor=0&limit=50

#High Level Design

Contest Client → API Gateway → Submission Service → Execution Queue → Worker Pool → MicroVM Sandbox

Supporting components: Submission DB, Leaderboard Cache, Result Pub/Sub

The API gateway authenticates users and sends submissions to the submission service. The submission service stores the source and metadata, then enqueues an execution job. This follows the same decoupling principle as Async Processing.

Judge workers claim jobs from the queue, start a sandbox, load test cases, run the code, and persist verdicts. Result events update the user's live connection and update the contest leaderboard cache. The ranking side is a direct application of Redis Sorted Sets.


#Detailed Design

Submission storage

Store source code, language, problem id, user id, contest id, status, and timestamps. The source of truth is durable storage, not the queue message.
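The record above can be sketched as a simple dataclass; the field names and the `queued` default are assumptions for illustration:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Submission:
    """Durable submission record: the source of truth for a judgment.
    Queue messages carry only the submission_id, never the payload."""
    submission_id: str
    user_id: str
    problem_id: str
    language: str
    source_code: str
    contest_id: Optional[str] = None
    status: str = "queued"
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```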

Execution

Workers should run compile and test steps inside the sandbox. Each test has time and memory limits. The judge records the verdict class of the first failing test, but hidden test inputs and expected outputs must never leak to the user.
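That verdict loop can be sketched as follows, with `run_case` standing in for the sandboxed execution of one test (the function names and the convention that a time-limit breach raises `TimeoutError` are assumptions):

```python
def judge_submission(run_case, test_cases) -> str:
    """run_case(stdin) -> stdout, raising TimeoutError on a time-limit
    breach. Returns only a verdict class; hidden test data never leaks."""
    for case in test_cases:
        try:
            output = run_case(case["input"])
        except TimeoutError:
            return "time_limit_exceeded"
        except Exception:
            return "runtime_error"
        if output.strip() != case["expected"].strip():
            return "wrong_answer"   # stop at the first failure
    return "accepted"
```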

Leaderboard

On accepted submissions, update contest score in a sorted set:

ZADD contest:weekly-451:rank score userId
ZREVRANGE contest:weekly-451:rank 0 49 WITHSCORES

Persist final contest results to durable storage after the contest or periodically from the serving cache. Redis is the fast serving layer; the durable store remains the source of truth, as in Databases & Caching.
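When ranking depends on more than one field (solved count descending, penalty time ascending), both can be packed into the single numeric score a sorted set requires. A sketch, assuming an upper bound on total penalty minutes:

```python
MAX_PENALTY = 10**6  # assumed upper bound on total penalty minutes

def contest_score(solved: int, penalty_minutes: int) -> int:
    """Pack (solved desc, penalty asc) into one sorted-set score so that
    ZREVRANGE returns users already in contest rank order."""
    return solved * MAX_PENALTY + (MAX_PENALTY - 1 - penalty_minutes)
```

With this encoding, a user with more solved problems always outranks one with fewer, and penalty time only breaks ties.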

Realtime results

Use WebSockets for submission status changes:

queued -> running -> accepted
queued -> running -> wrong_answer
queued -> running -> time_limit_exceeded

Polling can remain as a fallback.
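Guarding these transitions server-side keeps out-of-order events (a late `running` arriving after a verdict) from corrupting displayed state. A minimal sketch; the `runtime_error` terminal state is an assumed addition beyond the three verdicts shown above:

```python
TRANSITIONS = {
    "queued": {"running"},
    "running": {"accepted", "wrong_answer",
                "time_limit_exceeded", "runtime_error"},
    # terminal states have no outgoing transitions
}

def advance(current: str, nxt: str) -> str:
    """Apply a status event only if the transition is legal."""
    if nxt not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {nxt}")
    return nxt
```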


#Common Interview Mistakes

  • Running submitted code on the API server
  • Saying "Docker" without CPU, memory, network, and filesystem limits
  • Storing test cases inside queue messages
  • Computing leaderboard ranks with SQL scans during contests
  • Retrying wrong answers as if they were infrastructure failures

See also: Sandboxed Code Execution, Async Job Worker Pools, Redis Sorted Sets, and the Real-time Leaderboard solution.