Design LeetCode

Introduction

Have you ever wondered how platforms like LeetCode work behind the scenes? How do they handle millions of code submissions, execute them safely, and manage live programming contests? In this comprehensive tutorial, we'll walk through designing a complete LeetCode-like system from scratch.

What you'll learn:

How to gather and analyze system requirements
API and database design fundamentals
Secure code execution strategies
Real-time leaderboard implementation
Advanced system design patterns

Don't worry if you're new to system design – we'll explain everything step by step!

What is LeetCode?

LeetCode is an online platform where programmers can:

Browse coding problems
Submit solutions in various programming languages
Get instant feedback on their code
Participate in timed programming contests
View live leaderboards during competitions

Now, let's design our own version!

Step 1: Gathering Requirements

Before writing any code or drawing diagrams, we need to understand exactly what our system should do. This is called requirements gathering, and it's the foundation of good system design.

Functional Requirements (What the system should do)

View Problems: Users can browse a list of coding problems and view individual problem details
Submit Code: Users can submit their solutions and get results (pass/fail)
Contest Support: Host programming contests with time limits
Live Leaderboard: Show real-time rankings during contests

Non-Functional Requirements (How well the system should perform)

Scale: Support 50,000 concurrent users
Latency: Leaderboard updates within 10 seconds
Security: Safely execute potentially malicious user code
Performance: Fast code execution and submission processing

> 💡 Pro Tip: Always start with requirements! They guide every decision you'll make in your design.

Step 2: API Design

The API defines how clients communicate with our system. Think of it as the "menu" of actions users can perform.

Core Endpoints

# Get a single problem
GET /problems/{problemId}

# Get list of problems (with pagination)
GET /problems?offset=0&limit=50

# Submit code for evaluation
POST /submit/{problemId}

{
  "code": "def solution(nums): ...",
  "language": "python"
}

# Get contest leaderboard
GET /contest/{contestId}/leaderboard?offset=0&limit=50

Why These Endpoints?

GET requests for reading data (problems, leaderboards)
POST requests for creating/updating data (code submissions)
Pagination prevents overwhelming responses
Path parameters for required identifiers (problemId, contestId)
Query parameters for optional filters (offset, limit)
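
Here is a minimal sketch of how a server might apply the offset and limit query parameters. It assumes an in-memory list stands in for the problems table. The function name paginate, the MAX_LIMIT cap of 100, and the response fields are illustrative assumptions, not part of the API above.

```python
from typing import Any

MAX_LIMIT = 100  # assumed cap so a client cannot request the whole table at once


def paginate(items: list[Any], offset: int = 0, limit: int = 50) -> dict[str, Any]:
    """Return one page of items plus the metadata a client needs to continue."""
    offset = max(offset, 0)                    # ignore negative offsets
    limit = min(max(limit, 1), MAX_LIMIT)      # clamp limit into [1, MAX_LIMIT]
    page = items[offset:offset + limit]
    has_more = offset + limit < len(items)
    return {
        "items": page,
        "total": len(items),
        "next_offset": offset + limit if has_more else None,
    }


problems = [f"problem-{i}" for i in range(1, 121)]   # 120 hypothetical problems
first_page = paginate(problems, offset=0, limit=50)
last_page = paginate(problems, offset=100, limit=50)
```

Clamping the limit on the server is what actually protects against oversized responses. Without it, a client could simply ask for limit=1000000.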

Step 3: Database Design

Our system needs to store several types of data. Let's identify the main entities:

Core Entities

Users: Store user information and authentication data
Problems: Coding challenges with descriptions, constraints, examples
Submissions: User code submissions with results and timestamps
Contests: Competition details, participating problems, and duration
Leaderboards: Contest rankings and scores

Key Relationships

Users can have multiple submissions (one-to-many)
Each submission belongs to one problem (many-to-one)
Contests can include multiple problems (many-to-many)
Users can participate in multiple contests (many-to-many)

> 🎯 Quick Note: We don't need to design every table field right now. Focus on understanding the relationships between entities.
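
To make the relationships concrete, here is one possible schema, sketched with Python's built-in sqlite3 module so it runs anywhere. All table and column names are assumptions for illustration. The leaderboard is left out because this design later serves it from Redis.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE problems (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE contests (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    starts_at TEXT NOT NULL,
    ends_at TEXT NOT NULL
);
-- One-to-many: a user has many submissions; each submission targets one problem.
CREATE TABLE submissions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    problem_id INTEGER NOT NULL REFERENCES problems(id),
    contest_id INTEGER REFERENCES contests(id),  -- NULL outside contests
    status TEXT NOT NULL,
    submitted_at TEXT NOT NULL
);
-- Many-to-many: join tables link contests to problems and to participants.
CREATE TABLE contest_problems (
    contest_id INTEGER NOT NULL REFERENCES contests(id),
    problem_id INTEGER NOT NULL REFERENCES problems(id),
    PRIMARY KEY (contest_id, problem_id)
);
CREATE TABLE contest_participants (
    contest_id INTEGER NOT NULL REFERENCES contests(id),
    user_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (contest_id, user_id)
);
"""

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript(SCHEMA)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

The two many-to-many relationships each get their own join table, while the one-to-many relationships are plain foreign keys on submissions.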

Step 4: High-Level System Architecture

Now comes the exciting part – designing the actual system! Let's start simple and add complexity as needed.

Basic Architecture

[Client] → [Web Server] → [Database]

This handles basic operations like viewing problems. The web server processes requests and fetches data from the database.

Adding Code Execution

We can't just run user code directly on our web servers – that would be a massive security risk! Imagine if someone submitted code that tried to:

Access our database
Mine cryptocurrency on our servers
Delete important files
Make unauthorized network requests

The Solution: Isolated Code Execution

[Client] → [Web Server] → [Message Queue] → [Code Execution Workers]
                            ↓
                       [Database]

How it works:

User submits code to the web server
Web server stores submission in the database
Code execution request goes to a message queue
Specialized workers pick up jobs from the queue
Workers execute code in isolated environments
Results are stored back in the database

Why Use a Message Queue?

A message queue acts like a buffer between your web server and code execution workers. Benefits include:

Reliability: If a worker crashes mid-job, the unacknowledged job is redelivered to another worker instead of being lost
Scalability: Add more workers during high traffic
Decoupling: Web server doesn't wait for slow code execution

Popular message queues: Apache Kafka, RabbitMQ, Amazon SQS
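
The flow above can be sketched in a single process with Python's standard queue and threading modules. This is only a stand-in: a real deployment would use one of the brokers listed above, and an in-process queue gives none of their durability. The evaluate function is a hypothetical placeholder for sandboxed execution.

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()      # stand-in for the message queue
results: dict[int, str] = {}           # stand-in for the submissions table
results_lock = threading.Lock()


def evaluate(code: str) -> str:
    """Hypothetical placeholder for sandboxed execution; only checks a keyword."""
    return "accepted" if "return" in code else "rejected"


def worker() -> None:
    """Pull jobs until a None sentinel arrives, storing each verdict."""
    while True:
        job = jobs.get()
        if job is None:                # sentinel: shut this worker down
            return
        submission_id, code = job
        verdict = evaluate(code)
        with results_lock:
            results[submission_id] = verdict


def run_batch(submissions: dict[int, str], worker_count: int = 3) -> dict[int, str]:
    """Enqueue submissions, let the workers drain the queue, return the verdicts."""
    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for submission_id, code in submissions.items():
        jobs.put((submission_id, code))   # the web server would return right here
    for _ in threads:
        jobs.put(None)                    # one sentinel per worker
    for t in threads:
        t.join()
    return dict(results)


verdicts = run_batch({
    1: "def solution(nums): return sum(nums)",
    2: "def solution(nums): pass",
})
```

Notice that the producer never calls evaluate itself. It only enqueues, which is the decoupling that lets you add workers without touching the web server.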

Step 5: Secure Code Execution Deep Dive

This is one of the most critical parts of our system. Let's explore different approaches to safely execute untrusted code.

Option 1: Virtual Machines (VMs)

Pros: Maximum isolation, very secure
Cons: Slow to start, expensive

Option 2: Docker Containers (Recommended)

Pros: Fast startup, good isolation, resource limits
Cons: Weaker isolation than VMs because containers share the host kernel (sufficient here when combined with strict limits)

Option 3: Cloud Functions

Pros: Auto-scaling, managed infrastructure
Cons: Vendor lock-in, latency issues

Option 4: WebAssembly (Creative)

Pros: Runs in the user's browser, no server costs
Cons: Limited language support, and results computed on the client can't be trusted without re-checking them on the server

Our Choice: Docker Containers

Docker gives us the best mix of isolation, speed, and cost.

Container Configuration:

No network access
Limited file system
Memory limit (e.g., 128MB)
CPU and timeout limits
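
As a rough sketch, the configuration above maps onto standard docker run flags like this. The helper name docker_run_args, the default limits, and the image and file path in the usage line are assumptions for illustration. The helper only builds the argument list and does not start a container.

```python
def docker_run_args(
    image: str,
    command: list[str],
    *,
    memory: str = "128m",
    cpus: str = "0.5",
    pids_limit: int = 64,
) -> list[str]:
    """Build a `docker run` command line with restrictive sandbox settings."""
    return [
        "docker", "run",
        "--rm",                            # remove the container when it exits
        "--network", "none",               # no network access
        "--read-only",                     # root file system mounted read-only
        "--memory", memory,                # memory limit
        "--cpus", cpus,                    # CPU limit
        "--pids-limit", str(pids_limit),   # cap process count (fork bombs)
        "--cap-drop", "ALL",               # drop all Linux capabilities
        image,
        *command,
    ]


# Hypothetical image and path, chosen only for the example.
args = docker_run_args("python:3.12-slim", ["python", "/sandbox/solution.py"])
```

The wall-clock timeout is left to the caller here, for example through the timeout argument of subprocess.run when the command is launched.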

Execution Flow

Receive code submission
Create a secure Docker container
Execute code against test cases
Compare output with expected results
Return pass/fail
Destroy container
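
The run-and-compare part of this flow can be sketched as below. Important assumption: to stay runnable anywhere, this version starts a plain Python subprocess with no sandbox at all, so it must never be used on untrusted code as-is. In the real system the command would be the hardened container invocation. The verdict names beyond pass/fail are also assumptions.

```python
import subprocess
import sys


def judge(code: str, cases: list[tuple[str, str]], timeout_s: float = 5.0) -> str:
    """Run `code` once per test case, feeding the input on stdin; return a verdict."""
    for stdin_text, expected in cases:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", code],   # NOT sandboxed: illustration only
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,              # child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return "time_limit_exceeded"
        if proc.returncode != 0:
            return "runtime_error"
        if proc.stdout.strip() != expected.strip():
            return "wrong_answer"
    return "accepted"                           # every test case matched


solution = "import json, sys\nprint(sum(json.loads(sys.stdin.read())))"
verdict = judge(solution, [("[1, 2, 3, 4, 5]", "15"), ("[10, 20, 30]", "60")])
```

Stopping at the first failing case keeps worker time low, which matters when thousands of submissions arrive during a contest.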

Step 6: Real-Time Leaderboards

How do we handle thousands of users watching leaderboards?

Naive Approach (Don’t do this!)

SELECT user_id, SUM(points) as total_score
FROM submissions
WHERE contest_id = 'current_contest' AND status = 'accepted'
GROUP BY user_id
ORDER BY total_score DESC;

Problems:

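Slow: every request re-aggregates all accepted submissions for the contest
Poor performance under load: thousands of viewers repeat the same expensive query against the database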

Smart Approach: Redis Sorted Sets

Redis offers in-memory Sorted Sets that auto-maintain order.

Benefits:

O(log n) insert
O(log n + m) retrieval of a range of m entries (e.g., the top 10)
High-speed leaderboard updates

Leaderboard Example

// Add user score
redis.zadd('contest_123_leaderboard', user_score, user_id);

// Get top 10
redis.zrevrange('contest_123_leaderboard', 0, 9, 'WITHSCORES');

Updating in Real Time

Client-side polling every 10 seconds works well:

50,000 users ÷ 10-second polling interval = 5,000 requests/second
Redis serves reads from memory, so it handles this rate easily
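
The arithmetic behind that estimate, as a tiny sketch. It assumes every user polls exactly once per interval and that requests spread evenly over time. The function name is illustrative.

```python
def polling_rate(concurrent_users: int, interval_s: float) -> float:
    """Average requests per second when every user polls once per interval."""
    return concurrent_users / interval_s


rate_10s = polling_rate(50_000, 10)   # the design above: 5,000 requests/second
rate_2s = polling_rate(50_000, 2)     # a shorter interval multiplies the load
```

Halving the interval doubles the load, so the 10-second latency requirement is what keeps simple polling affordable.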

Bonus: WebSockets give true real-time updates, but they require more infrastructure

Step 7: Advanced Considerations

Handling Traffic Spikes

Auto-scale execution workers
Pre-scale before contests
Rate limit abusive users
Load balance across servers
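
Rate limiting is often done with a token bucket. Here is a minimal single-process sketch. The class name, the parameters, and the injectable now argument (used to make the example deterministic) are assumptions. A production limiter would keep one bucket per user in a shared store.

```python
from __future__ import annotations

import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float | None = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: float | None = None) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=3, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(4)]   # three allowed, fourth rejected
after_wait = bucket.allow(now=1.0)                  # one token refilled after 1 s
```

A bucket tolerates short bursts, such as a user fixing a typo and resubmitting, while still capping the sustained submission rate.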

Ensuring Data Consistency

If the Redis update fails but the DB write succeeds, the leaderboard drifts out of sync with the database. To recover:

Use Change Data Capture (CDC) tools like Debezium
Sync DB → Redis via message queue
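
Here is a sketch of the consumer side of that sync, with a plain dict standing in for the Redis sorted set. The event shape (user_id and total_score) is a simplified assumption and not Debezium's actual message format. The key idea is that each event sets the user's total score rather than adding to it, so replaying events after a retry is harmless.

```python
def apply_change_events(leaderboard: dict[str, float], events: list[dict]) -> dict[str, float]:
    """Apply change events in order; safe to replay because scores are set, not added."""
    for event in events:
        leaderboard[event["user_id"]] = event["total_score"]
    return leaderboard


def top_n(leaderboard: dict[str, float], n: int) -> list[tuple[str, float]]:
    """Highest scores first; ties broken by user id for a stable order."""
    return sorted(leaderboard.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


events = [
    {"user_id": "alice", "total_score": 100},
    {"user_id": "bob", "total_score": 250},
    {"user_id": "alice", "total_score": 300},
]
board = apply_change_events({}, events)
board = apply_change_events(board, events)   # replay after a retry: same result
ranking = top_n(board, 2)
```

Setting a score mirrors how adding an existing member to a Redis sorted set overwrites its score, which is what makes the database the single source of truth.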

Test Case Management

Run user code against multiple test cases:

Approach:

Store test cases in files
Use driver code to run user solution
Compare actual vs. expected output
Language-agnostic format

Example Test Case File:

# Input
[1, 2, 3, 4, 5]
# Expected Output
15

# Input
[10, 20, 30]
# Expected Output
60
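
A file in this format could be parsed along these lines. This is a sketch that assumes the two header lines appear exactly as shown and that blank lines only separate cases. The function name is illustrative.

```python
from __future__ import annotations


def parse_test_cases(text: str) -> list[tuple[str, str]]:
    """Split a test-case file into (input, expected_output) pairs."""
    cases: list[tuple[str, str]] = []
    inputs: list[str] = []
    expected: list[str] = []
    target: list[str] | None = None    # list that the current section appends to

    def flush() -> None:
        """Store the case collected so far, if any, and reset the buffers."""
        if inputs or expected:
            cases.append(("\n".join(inputs), "\n".join(expected)))
            inputs.clear()
            expected.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "# Input":
            flush()                    # a new header closes the previous case
            target = inputs
        elif stripped == "# Expected Output":
            target = expected
        elif stripped and target is not None:
            target.append(stripped)
    flush()                            # store the final case
    return cases


sample = """# Input
[1, 2, 3, 4, 5]
# Expected Output
15

# Input
[10, 20, 30]
# Expected Output
60
"""
cases = parse_test_cases(sample)
```

Because the parser returns plain strings, the same file can drive driver code written in any language, which is what keeps the format language-agnostic.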

Step 8: Complete System Architecture

┌─────────────┐    ┌───────────────┐    ┌─────────────┐
│   Clients   │ →  │ Load Balancer │ →  │ Web Servers │
└─────────────┘    └───────────────┘    └──────┬──────┘
                                               │
         ┌───────────────────┬─────────────────┤
         │                   │                 │
   ┌───────────┐    ┌─────────────────┐    ┌──────────┐
   │ Database  │    │  Message Queue  │    │  Redis   │
   └───────────┘    └─────────────────┘    └──────────┘
                             │
                    ┌─────────────────┐
                    │ Code Execution  │
                    │    Workers      │
                    │   (Docker)      │
                    └─────────────────┘

Performance Characteristics

Scalability

50,000 concurrent users
Message queue smooths spikes
Redis supports 5,000+ reads/sec

Latency

Code execution: 1–10 seconds
Leaderboard updates: < 10 seconds
Problem viewing: sub-second

Security

Containerized execution
CPU/memory/time/network sandboxing

Key Takeaways for Beginners

Start with requirements – don’t rush into coding
Sandbox user code – never trust it
Choose efficient structures – like Redis Sorted Sets
Design for scale – use queues and caches early
Understand trade-offs – speed vs. safety, cost vs. performance
Test and monitor – always plan for real-world failure

What's Next?

Practice by designing systems like HackerRank or Twitter
Build a basic version of this project
Study distributed systems, Redis, Docker, Kafka
Experiment with WebSockets, CDC tools

Conclusion

We’ve gone from basic requirements to a scalable, secure architecture ready for production. The key principles:

Security first
Scale thoughtfully
Start simple
Plan for failure

System design can feel overwhelming, but step-by-step planning makes it approachable.

Happy designing! 🚀

Want to practice system design? Try designing other popular systems like Twitter, Netflix, or Uber.

The principles you learned here apply everywhere!

DevLoom