The ability of a system to handle growing amounts of work.
Scalability is a system's ability to handle increased load - more users, traffic, or data - without performance degrading. A scalable system grows smoothly as demand increases.
Think of roads: a two-lane road works for a small town but becomes gridlocked when the town grows into a city. Scalable infrastructure expands capacity as needed.
Growth: Your app has 100 users today; it might have 100,000 tomorrow. Can your system handle it?
Traffic Spikes: Product launches, viral posts, and flash sales create sudden traffic surges. Scalability prevents crashes.
Cost Efficiency: Scale up during peak hours, scale down during quiet times. Pay only for what you use.
Vertical Scaling (Scaling Up): Make your server bigger. Add more CPU, RAM, or storage to one machine.
Pros: Simple. No code changes needed.
Cons: Physical limits. You cannot add infinite RAM. Expensive at scale. Single point of failure.
Horizontal Scaling (Scaling Out): Add more servers. Distribute load across many machines.
Pros: Practically no upper limit. Add servers as needed. If one fails, the others keep working.
Cons: More complex. Requires load balancing and data synchronization, and brings distributed-systems challenges (a minimal load-balancing sketch follows below).
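To make the load-balancing requirement concrete, here is a minimal round-robin sketch in Python. The backend addresses are hypothetical; in production this job is done by nginx, HAProxy, or a cloud load balancer with health checks.

```python
# Minimal round-robin load balancer sketch (illustrative only).
# The backend addresses below are made up for the example.
from itertools import cycle

backends = cycle([
    "http://app-server-1:8080",
    "http://app-server-2:8080",
    "http://app-server-3:8080",
])

def pick_backend() -> str:
    """Return the next server in rotation; each request lands on a different machine."""
    return next(backends)

for _ in range(5):
    print(pick_backend())
```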
Instagram: Started on one server. Now runs on thousands. Horizontal scaling enabled this growth.
Netflix: Handles millions of concurrent streams by distributing traffic globally across thousands of servers.
Black Friday Sales: E-commerce sites temporarily add servers to handle traffic spikes, then scale back down.
Databases are often the bottleneck.
Read Replicas: Create copies of your database for reading. Writes go to the primary; reads are distributed across replicas.
Sharding: Split data across multiple databases. Users A-M on Database 1, N-Z on Database 2. A routing sketch combining sharding with read replicas appears below.
Caching: Use Redis or Memcached to cache frequent queries. Reduces database load dramatically.
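As a rough illustration of read replicas and sharding together, the sketch below routes by the first letter of the username and splits reads from writes. The connection strings and shard boundaries are invented for the example; real systems usually hash the shard key rather than splitting alphabetically.

```python
# Sketch of range-based sharding plus primary/replica routing.
import random

SHARDS = {
    # first letter of username -> shard configuration (hypothetical hosts)
    "A-M": {"primary": "db1-primary:5432", "replicas": ["db1-replica-1:5432", "db1-replica-2:5432"]},
    "N-Z": {"primary": "db2-primary:5432", "replicas": ["db2-replica-1:5432", "db2-replica-2:5432"]},
}

def shard_for(username: str) -> dict:
    """Users A-M live on Database 1, N-Z on Database 2."""
    return SHARDS["A-M"] if username[0].upper() <= "M" else SHARDS["N-Z"]

def route(username: str, is_write: bool) -> str:
    """Writes go to the shard's primary; reads are spread across its replicas."""
    shard = shard_for(username)
    return shard["primary"] if is_write else random.choice(shard["replicas"])

print(route("alice", is_write=True))    # db1-primary:5432
print(route("nina", is_write=False))    # one of db2's replicas
```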
Stateless Services: Do not store user sessions on individual servers. Use tokens or an external session store so any server can handle any request.
Microservices: Split the application into smaller services that scale independently. The authentication service scales separately from the payment service.
Async Processing: Use queues for heavy tasks. User requests return immediately while the work runs in the background (sketched below).
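Here is a small in-process sketch of that pattern, using a thread and a queue as a stand-in for a real broker such as RabbitMQ or SQS; the job names and timings are made up.

```python
# Sketch of async processing: the request handler enqueues work and
# returns immediately; a background worker does the slow part.
import queue
import threading
import time

tasks: "queue.Queue[str]" = queue.Queue()

def worker() -> None:
    """Background worker: pulls heavy jobs off the queue and processes them."""
    while True:
        job = tasks.get()
        time.sleep(1)            # simulate a slow task (e.g., video encoding)
        print(f"finished {job}")
        tasks.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(job: str) -> str:
    """Request handler: enqueue the job and respond right away."""
    tasks.put(job)
    return f"accepted {job}"     # the user gets a fast response

print(handle_request("encode-video-42"))
tasks.join()                     # wait for the background work (demo only)
```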
Cloud providers (AWS, Google Cloud, Azure) automatically add servers when traffic increases and remove them when traffic drops.
Example: Your API normally runs on 2 servers. Traffic spikes 10x during a sale, so auto-scaling spins up 20 servers; when the sale ends, it scales back down to 2. You pay only for the hours the extra servers ran.
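A toy decision rule shows the idea. The per-server capacity and the bounds are assumptions, and in practice this policy lives in your cloud provider's auto-scaling configuration rather than in application code.

```python
# Toy auto-scaling rule: provision enough servers for the current load,
# within fixed bounds. Capacity per server is an assumed number.
import math

def desired_servers(requests_per_second: float,
                    capacity_per_server: float = 500.0,
                    minimum: int = 2, maximum: int = 20) -> int:
    """Return how many servers to run for the observed traffic."""
    needed = math.ceil(requests_per_second / capacity_per_server)
    return max(minimum, min(maximum, needed))

print(desired_servers(800))     # normal traffic  -> 2 servers
print(desired_servers(8000))    # 10x sale spike  -> 16 servers
```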
Response Time: Does your API still respond in 200ms with 10,000 concurrent users?
Throughput: Can you handle 1,000 requests per second? 10,000? Where does it break? (A small load-test sketch follows below.)
Cost: Handling 10x traffic should not cost 100x more. Efficient scaling keeps costs proportional.
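One way to answer those questions is a simple load test: fire concurrent requests and report throughput and tail latency. The URL and request counts below are placeholders; real load testing is usually done with tools like k6, Locust, or JMeter.

```python
# Minimal load-test sketch: N concurrent requests, then throughput and p95 latency.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "http://localhost:8080/health"   # hypothetical endpoint
REQUESTS = 1000
CONCURRENCY = 100

def timed_request(_: int) -> float:
    """Issue one request and return its latency in seconds."""
    start = time.perf_counter()
    urlopen(URL).read()
    return time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = list(pool.map(timed_request, range(REQUESTS)))
elapsed = time.perf_counter() - start

print(f"throughput: {REQUESTS / elapsed:.0f} req/s")
print(f"p95 latency: {sorted(latencies)[int(0.95 * REQUESTS)] * 1000:.0f} ms")
```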
Database: Often the first bottleneck. Add read replicas and implement caching.
Single Server: One server cannot handle infinite load. Scale horizontally when a single machine is no longer enough.
Unoptimized Code: N+1 queries, missing indexes, inefficient algorithms. Fix the code before scaling the infrastructure (see the example below).
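The N+1 query problem is the classic example. The sketch below uses an in-memory SQLite database with invented tables to show the wasteful pattern and the single-query fix.

```python
# N+1 queries vs. one JOIN, using an in-memory SQLite database.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 9.99), (2, 1, 19.99), (3, 2, 5.00);
""")

# N+1: one query for the users, then one extra query per user.
users = db.execute("SELECT id, name FROM users").fetchall()
for user_id, name in users:
    orders = db.execute("SELECT total FROM orders WHERE user_id = ?", (user_id,)).fetchall()
    print(name, len(orders))

# Fix: fetch everything with a single JOIN (1 query instead of N+1).
rows = db.execute("""
    SELECT users.name, COUNT(orders.id)
    FROM users LEFT JOIN orders ON orders.user_id = users.id
    GROUP BY users.id
""").fetchall()
print(rows)
```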
Early Startups: Focus on building product first. Premature optimization wastes time.
Growing Products: When you see consistent traffic growth or are approaching server limits, plan for scaling.
Known Traffic Events: Product launches, marketing campaigns, seasonal spikes. Scale proactively.
Scalable systems are more complex: multiple servers, load balancers, distributed databases. That complexity costs development time and infrastructure money.
Scale when you need it, not before. Instagram ran on one server for its first users. Premature scaling is wasted effort.
Design for scalability from the start (stateless services, separate databases from app servers), but do not build complex distributed systems until you need them.
The best code is code that does not exist yet. Add complexity only when growth demands it.