Copying data across multiple database servers to improve availability, performance, and fault tolerance.
Database replication means maintaining multiple copies of the same data across different servers. When you write to one database, that data is automatically copied to the others.
This provides redundancy - if one server fails, others continue serving data without interruption.
High Availability: One database server crashes. Users still access data from replicas without downtime.
Read Scalability: Distribute read queries across multiple servers. Handle more traffic without overloading a single database.
Geographic Distribution: Place replicas near users globally. Faster response times for users in different regions.
Disaster Recovery: Complete data center fails. Replicas in other locations keep your application running.
Primary (Master): Handles all write operations. Single source of truth.
Replicas (Slaves): Read-only copies that receive updates from the primary.
Applications write to primary, read from replicas. This distributes load and improves performance.
Replication happens continuously as data changes on the primary.
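The write-to-primary, read-from-replicas split can be sketched as a small router. This is a minimal illustration, not any particular driver's API: the class name `ReplicatedRouter` and the server names are hypothetical, and round-robin is just one simple way to spread reads.

```python
import itertools


class ReplicatedRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over replicas; with no replicas, reads fall back to the primary.
        self._cycle = itertools.cycle(self.replicas) if self.replicas else None

    def for_write(self):
        return self.primary

    def for_read(self):
        if self._cycle is None:
            return self.primary
        return next(self._cycle)


# Server names are placeholders for real connection handles.
demo_router = ReplicatedRouter("primary-db", ["replica-1", "replica-2"])
write_target = demo_router.for_write()
read_targets = [demo_router.for_read() for _ in range(4)]
```

In a real application the strings would be connection pools, and the router would usually also skip replicas that are unhealthy or lagging.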
Synchronous Replication: Primary waits for replicas to confirm write before responding to application.
Pros: Replicas always have latest data. No data loss if primary fails. Cons: Slower writes. Network latency affects performance.
Asynchronous Replication: Primary responds immediately. Replicas catch up later.
Pros: Fast writes. Better performance. Cons: Brief delay before replicas have latest data. Small data loss risk if primary fails.
Most systems use asynchronous for performance, accepting small replication lag.
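The difference between the two modes can be shown with a small in-memory model. This is a sketch under simplifying assumptions: the `Primary` and `Replica` classes are hypothetical, there is no network, and "waiting for confirmation" is modeled as applying the change to every replica before returning.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = []  # changes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every replica applies the change before we acknowledge.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: acknowledge now, replicas catch up later.
            self.pending.append((key, value))
        return "ack"

    def ship_pending(self):
        """Deliver queued changes to replicas (models the asynchronous catch-up)."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.apply(key, value)
        self.pending.clear()


sync_replica = Replica()
Primary([sync_replica], synchronous=True).write("balance", 100)

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.write("balance", 100)
stale_read = async_replica.data.get("balance")  # replica has not caught up yet
async_primary.ship_pending()
fresh_read = async_replica.data.get("balance")  # replica now has the write
```

If the asynchronous primary failed before `ship_pending` ran, the queued write would be lost, which is the small data loss risk described above.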
Replication Lag: Time difference between the primary committing data and replicas receiving it.
Normal: Milliseconds to a few seconds. Problem: Lag that persists for many seconds or grows to minutes indicates issues.
Users might read stale data from replicas. Most applications accept this trade-off for better performance.
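One way to handle lag in application code is to measure it per replica and stop reading from replicas that fall too far behind. The sketch below assumes you can obtain a commit timestamp from the primary and an apply timestamp from each replica; the function names and the 5-second threshold are illustrative assumptions, not values from any specific database.

```python
def lag_seconds(primary_commit_time, replica_apply_time):
    """Lag is how long after the primary's commit the replica applied the change."""
    return max(0.0, replica_apply_time - primary_commit_time)


def healthy_replicas(lags, max_lag=5.0):
    """Return replica names whose lag is within the (assumed) acceptable threshold."""
    return [name for name, lag in lags.items() if lag <= max_lag]


# Hypothetical measurements, in seconds.
observed_lags = {"replica-1": 0.2, "replica-2": 90.0}
readable = healthy_replicas(observed_lags)
```

Reads would then be routed only to the names in `readable`, falling back to the primary if the list is empty.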
Multi-Primary Replication: Multiple servers accept writes simultaneously. Changes sync between them.
Benefits: Better write performance, no single point of failure for writes.
Challenges: Conflict resolution when same data modified on different primaries. Complex to implement correctly.
Rarely used due to complexity. Single primary is simpler and sufficient for most applications.
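To illustrate why conflict resolution is hard, here is a sketch of one common strategy, last-write-wins. The source does not prescribe a strategy; `Version` and `resolve_last_write_wins` are hypothetical names, and the approach assumes reasonably synchronized clocks across primaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # when the write happened, according to the writing node
    node_id: str      # which primary accepted the write


def resolve_last_write_wins(a, b):
    # The later timestamp wins. node_id breaks ties deterministically,
    # so every primary picks the same winner regardless of argument order.
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


from_us = Version("alice@old.example", timestamp=10.0, node_id="us-east")
from_eu = Version("alice@new.example", timestamp=12.5, node_id="eu-west")
winner = resolve_last_write_wins(from_us, from_eu)
```

Last-write-wins is simple, but the losing write is silently discarded, which is part of why multi-primary setups are complex to get right.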
Instagram: Replicates user data globally. Users in India read from nearby replicas for fast access.
GitHub: Replicates repository data. Handles millions of reads while primary handles writes.
Banking Systems: Replicate transaction data for compliance and disaster recovery.
Primary server fails. System must promote a replica to become new primary.
Automatic Failover: System detects failure and promotes a replica automatically. Typically seconds to a few minutes of downtime, depending on how quickly the failure is detected.
Manual Failover: Human intervention required. Can take longer but avoids false positives.
Automated failover is standard for critical systems requiring high availability.
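The core decisions in automatic failover can be sketched as two small functions: when to declare the primary dead, and which replica to promote. Everything here is an illustrative assumption: the `ReplicaStatus` fields, the heartbeat threshold of 3, and the rule of promoting the most caught-up healthy replica.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStatus:
    name: str
    healthy: bool
    applied_position: int  # how far into the primary's change log this replica has applied


def should_fail_over(missed_heartbeats, threshold=3):
    # Requiring several missed heartbeats reduces false positives from brief network blips.
    return missed_heartbeats >= threshold


def choose_new_primary(replicas):
    # Promote the healthy replica that has applied the most changes, to minimize lost writes.
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    return max(candidates, key=lambda r: r.applied_position)


fleet = [
    ReplicaStatus("replica-1", healthy=True, applied_position=120),
    ReplicaStatus("replica-2", healthy=True, applied_position=150),
    ReplicaStatus("replica-3", healthy=False, applied_position=200),
]
promoted = choose_new_primary(fleet) if should_fail_over(missed_heartbeats=3) else None
```

A lower threshold fails over faster but risks promoting a replica while the old primary is still alive, which is the false-positive concern that manual failover avoids.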
Read-after-Write: User writes data, immediately reads it. Might read from replica that has not yet updated. User does not see their own write.
Solution: Route user reads to primary temporarily, or wait for replication before confirming write.
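The first solution, routing a user's reads to the primary for a short time after they write, might look like the following sketch. The class name, the 5-second pin window, and the injectable `clock` parameter (used so the example is deterministic) are all assumptions for illustration.

```python
import time


class ReadAfterWriteRouter:
    """Hypothetical router that pins a user's reads to the primary after a write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"  # replicas may not have this user's write yet
        return "replica"


# A fake clock makes the behaviour easy to follow.
fake_now = [0.0]
raw_router = ReadAfterWriteRouter(pin_seconds=5.0, clock=lambda: fake_now[0])
raw_router.record_write("alice")
fake_now[0] = 1.0
soon_after = raw_router.target_for_read("alice")
other_user = raw_router.target_for_read("bob")
fake_now[0] = 10.0
much_later = raw_router.target_for_read("alice")
```

The pin window should be longer than the replication lag you normally observe; other users are unaffected and keep reading from replicas.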
Monotonic Reads: User reads data, then reads again later. Should not see older version on second read.
Solution: Route user to same replica consistently (sticky sessions).
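Sticky routing can be approximated by hashing the user ID to pick a replica, so the same user always lands on the same server. This is a sketch: `sticky_replica` is a hypothetical helper, and simple modulo hashing is assumed, which remaps users whenever the replica list changes.

```python
import hashlib


def sticky_replica(user_id, replicas):
    """Map a user to one replica deterministically, so repeated reads hit the same copy."""
    if not replicas:
        raise ValueError("at least one replica is required")
    # A stable hash is used because Python's built-in hash() of strings varies between runs.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replica_pool = ["replica-1", "replica-2", "replica-3"]
first_read = sticky_replica("alice", replica_pool)
second_read = sticky_replica("alice", replica_pool)
```

Because one replica only moves forward in time, a user pinned to it will not see an older version on a later read, as long as that replica stays in the pool.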
Replication is NOT backup. Corruption or a bad delete replicates to all servers almost immediately, just like any other write.
Backups are point-in-time snapshots stored separately. Both replication and backups are necessary.
High traffic sites: Read-heavy applications benefit from read replicas.
Global applications: Users worldwide need fast access.
Mission-critical systems: Cannot tolerate downtime.
Not needed: Low-traffic applications on single server. Premature optimization adds complexity.
Trusting replication as backup: Backups are separate and necessary.
Ignoring replication lag: Applications must handle stale reads gracefully.
Complex failover logic: Keep failover simple and test it regularly.
Over-replicating: Too many replicas waste resources. 2-3 replicas usually sufficient.
AWS RDS: Supports managed read replicas. Handles failover in Multi-AZ deployments.
Google Cloud SQL: Built-in replication and high availability.
MongoDB Atlas: Global clusters with automatic replication.
Managed databases handle replication complexity. Focus on application logic instead of database operations.
Replication provides availability, scalability, and fault tolerance. Essential for production systems serving significant traffic.
Start with single database. Add replication when traffic grows or availability requirements increase. Do not prematurely optimize.
Understand replication lag and design applications to handle eventual consistency. Most users accept milliseconds of staleness for better performance.