# 8 System Design: Database Replication

Welcome to the 8th episode :)

Cmd+C, Cmd+V

Is that replication?

Yes, it is! Copying data from one source to another, and repeating that process, is called replication.

Data is copied for a variety of reasons:

- To safeguard against data loss during system failures (Disaster Recovery, DR)
- To service greater traffic
- To reduce latency

Below, we'll go through some more specific use cases.

Why does it matter?

Replication is an important technique in distributed systems. Data lives on multiple servers (nodes) connected by a network, and a failure in any one of them shouldn't take the service down. Hence it's super common to store data in multiple places to prevent data loss.

You can also think of it as a way for an application to scale. Since the data now lives in more than one place, it can be served from a replica close to the user, which lowers access latency, and reads can be spread across replicas, which helps keep the experience consistent regardless of location and system load.

Replication would be super easy if the data never changed. However, that's not the reality in distributed systems.

Let's take the case of a database write to the "leader", also known as the primary database. The write needs to propagate to the followers consistently.

1. Synchronous Replication

Done synchronously, this is going to be slow, since this kind of replication requires the leader and all the followers to commit before the write is deemed successful. This keeps the data on the followers up to date, but every write pays the latency of the slowest follower.

What if one of the followers in the crew fails? The write will fail, even if the other followers are still up!
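Here is a minimal sketch of that behaviour, assuming an in-memory leader and followers. The names `Follower`, `SyncLeader` and `FollowerDown` are made up for illustration and don't come from any real database.

```python
class FollowerDown(Exception):
    """Raised by a follower that cannot commit a write (hypothetical)."""


class Follower:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def apply(self, key, value):
        if not self.healthy:
            raise FollowerDown(self.name)
        self.data[key] = value


class SyncLeader:
    def __init__(self, followers):
        self.followers = followers
        self.data = {}

    def write(self, key, value):
        # Every follower must commit before the leader reports success.
        # A real system would also undo or retry partial commits; this
        # sketch skips that on purpose.
        for follower in self.followers:
            follower.apply(key, value)  # raises if the follower is down
        self.data[key] = value
        return "ok"


leader = SyncLeader([Follower("f1"), Follower("f2", healthy=False)])
try:
    leader.write("user:1", "Ada")
except FollowerDown as err:
    print(f"write rejected because follower {err} is down")
```

One unhealthy follower is enough to reject the write, which is exactly the availability problem described above.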

2. Asynchronous Replication

This is the opposite of the previous one: transaction speed matters more than the consistency of the data. With async replication, the leader sends writes to its followers asynchronously (i.e. without waiting for them to commit or acknowledge) and returns the response to the user immediately.

But this comes with a tradeoff: followers may hold stale data when you read from them. Also, if the leader shuts down before its latest writes have been replicated, no follower has the "most up-to-date" information, and those writes can be lost!
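The stale-read window can be sketched like this. It's a toy model, assuming followers are plain dictionaries and that the hypothetical `replicate()` method stands in for whatever background process ships writes in a real system.

```python
from collections import deque


class AsyncLeader:
    def __init__(self, follower_count):
        self.data = {}
        self.followers = [{} for _ in range(follower_count)]
        self.pending = deque()  # writes not yet shipped to followers

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "ok"  # acknowledged before any follower has the write

    def replicate(self):
        """Ship queued writes to every follower (stand-in for a background job)."""
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.followers:
                follower[key] = value

    def read_from_follower(self, index, key):
        return self.followers[index].get(key)


leader = AsyncLeader(follower_count=2)
leader.write("user:1", "Ada")
stale = leader.read_from_follower(0, "user:1")  # follower has not caught up yet
leader.replicate()
fresh = leader.read_from_follower(0, "user:1")  # now the follower has the write
```

If the leader dies while `pending` is non-empty, those queued writes are gone, which is the data-loss risk mentioned above.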

Better to be cautious, than to lose...

Leader failures are inevitable. In those situations, one of the followers is promoted to be the leader and takes over.

Failover is a big problem in asynchronous replication, because the promoted follower may be missing the old leader's latest writes. But that doesn't mean synchronous is any better.

Without a leader you can't make writes!
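One common way to pick the follower to promote is to choose the one that has applied the most of the old leader's log. That selection rule is an assumption on my part, not something every database does, and the function names and offsets below are made up for illustration.

```python
def promote_new_leader(follower_offsets):
    """Pick the follower that has applied the most of the old leader's log.

    follower_offsets maps a follower name to the last log position it applied.
    """
    if not follower_offsets:
        raise ValueError("no followers available to promote")
    return max(follower_offsets, key=follower_offsets.get)


def lost_writes(leader_offset, follower_offsets):
    """Count the writes the old leader acknowledged that the new leader never saw."""
    new_leader = promote_new_leader(follower_offsets)
    return leader_offset - follower_offsets[new_leader]


offsets = {"f1": 97, "f2": 100, "f3": 92}
new_leader = promote_new_leader(offsets)               # "f2"
missing = lost_writes(leader_offset=103, follower_offsets=offsets)  # 3
```

With asynchronous replication `missing` can be greater than zero, so even the best candidate may lose acknowledged writes.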

How can we solve this leader problem now?

A simple way to mitigate leader failure is to designate more than one leader. Leader-leader or multi-leader replication simply means that more than one database is available to take writes, meaning, if one leader goes down, the other can step in. This does introduce a slight lag as data must be replicated to both (or more) leaders and engineers must contend with more complexity — mainly conflict resolution when discrepancies arise between leaders — but the added durability mostly outweighs the additional lag time in the real world.
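To make "conflict resolution" concrete, here is a sketch of last-write-wins (LWW), one of the simplest strategies. It's only one option among several, and the `Version` fields are hypothetical. Note that LWW silently throws away the losing write, so it's not always what you want.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # time of the write as seen by the accepting leader
    leader_id: str    # breaks ties when two timestamps are equal


def resolve_lww(a: Version, b: Version) -> Version:
    """Keep the version with the later timestamp; leader_id breaks ties."""
    return max(a, b, key=lambda v: (v.timestamp, v.leader_id))


def merge(local: dict, remote: dict) -> dict:
    """Merge another leader's state into ours, resolving conflicts per key."""
    merged = dict(local)
    for key, incoming in remote.items():
        if key in merged:
            merged[key] = resolve_lww(merged[key], incoming)
        else:
            merged[key] = incoming
    return merged


eu = {"cart": Version("apples", 10.0, "eu")}
us = {"cart": Version("pears", 12.0, "us")}
# Both leaders end up with the same state no matter who merges first.
converged = merge(eu, us) == merge(us, eu)
```

The tie-breaker on `leader_id` matters: without it, two leaders could each keep their own value when timestamps collide and never converge.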

Consensus algorithms can be used to "elect" a new leader if one or more leaders go down, adding another layer of protection to the system. One of the best-known consensus algorithms is Paxos. Many consider it a difficult algorithm to understand, perhaps because the leader election process is part of a larger process which aims to reach agreement through data replication.

Trivia: Google, which uses Paxos as the foundation of Spanner, its scalable-yet-synchronously-replicated distributed database, published a paper on its struggles to create a fault-tolerant system based on Paxos.

A newer alternative to Paxos called Raft breaks the agreement process into separate subproblems, chiefly leader election and log replication, thereby making leader election easier to understand.
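To give a feel for majority-based leader election, here is a heavily simplified sketch loosely inspired by Raft's voting rule. It is not Raft itself: there are no timeouts, no log comparison and no networking. The `Node` class and `run_election` function are invented for this example.

```python
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.current_term = 0
        self.voted_for = None

    def request_vote(self, term, candidate_id):
        """Grant at most one vote per term; reject candidates from old terms."""
        if term < self.current_term:
            return False
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate, peers, term):
    """Return True if the candidate collects votes from a strict majority."""
    votes = 1 if candidate.request_vote(term, candidate.node_id) else 0
    votes += sum(peer.request_vote(term, candidate.node_id) for peer in peers)
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2


a, b, c = Node("a"), Node("b"), Node("c")
a_wins = run_election(a, [b, c], term=1)        # everyone votes for "a"
b_same_term = run_election(b, [a, c], term=1)   # votes for term 1 are already spent
b_next_term = run_election(b, [a, c], term=2)   # a new term resets the votes
```

Because each node votes only once per term and a winner needs a strict majority, two candidates can never both win the same term.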

Leaderless replication

Why maintain the leader-follower hierarchy at all if leader election and conflict resolution are so painful? Amazon's Dynamo re-popularized the idea of leaderless replication, and now most cloud providers include something similar.

If you hear nothing but "anarchy" when someone mentions leaderless replication, you wouldn't be alone. However, there are some clever methods for dealing with the chaos that comes with managing a network of read-AND-write-capable replicas.

Read repair allows clients to detect errors (e.g. several nodes return a consistent value, but one node returns something else) and fix them by sending a write request to the inconsistent node.

Genius Idea !!!!
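A minimal sketch of read repair, assuming each replica stores a `(version, value)` pair per key so the client can tell which copy is newest. The version number is my assumption for the sketch; real systems use things like timestamps or vector clocks.

```python
def read_with_repair(replicas, key):
    """Read a key from every replica, return the newest value, fix stale copies.

    replicas is a list of dicts mapping key -> (version, value).
    """
    responses = [(replica, replica.get(key)) for replica in replicas]
    found = [entry for _, entry in responses if entry is not None]
    if not found:
        return None
    latest = max(found, key=lambda entry: entry[0])
    for replica, entry in responses:
        if entry != latest:
            replica[key] = latest  # write the newest copy back to the stale replica
    return latest[1]


r1 = {"user:1": (2, "Ada Lovelace")}
r2 = {"user:1": (2, "Ada Lovelace")}
r3 = {"user:1": (1, "Ada")}  # this replica missed the latest write
value = read_with_repair([r1, r2, r3], "user:1")
```

After the call, `r3` holds the same version as the other two, so the read itself healed the inconsistency.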

Background processes that fix errors exist in many cloud-based products. For example, Amazon's Dynamo uses an "anti-entropy" process.

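Quorums let reads return up-to-date information in asynchronous leaderless replication by specifying a minimum number of replicas that must accept a write (w) and a minimum number that must answer a read (r). With n replicas, choosing w + r > n guarantees that every read overlaps at least one replica holding the latest write.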
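Here is a small sketch of quorum reads and writes with n = 3, w = 2, r = 2. The `Replica` class, the `up` flag and the version numbers are assumptions made for the example, not any particular product's API.

```python
class Replica:
    def __init__(self):
        self.up = True   # False models an unreachable replica
        self.store = {}  # key -> (version, value)


def quorum_write(replicas, key, version, value, w):
    """Send the write to every reachable replica; succeed if at least w accept."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.store[key] = (version, value)
            acks += 1
    return acks >= w


def quorum_read(replicas, key, r):
    """Ask r reachable replicas and return the value with the highest version."""
    responses = [rep.store.get(key) for rep in replicas if rep.up][:r]
    if len(responses) < r:
        raise RuntimeError("not enough replicas reachable for a read quorum")
    found = [entry for entry in responses if entry is not None]
    return max(found, key=lambda entry: entry[0])[1] if found else None


nodes = [Replica(), Replica(), Replica()]
quorum_write(nodes, "user:1", 1, "Ada", w=2)

nodes[0].up = False                                   # node 0 misses the next write
quorum_write(nodes, "user:1", 2, "Ada Lovelace", w=2)

nodes[0].up, nodes[2].up = True, False                # now node 0 is back, node 2 is down
latest = quorum_read(nodes, "user:1", r=2)            # reads stale node 0 and fresh node 1
```

Since w + r = 4 is greater than n = 3, the read set always shares at least one replica with the last successful write set, so the highest version wins.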

When to implement a replication strategy

There are so many reasons to include replicas, and so many strategies to choose from, that it's generally recommended to include replicas in anything more than the most basic server-database system. The key is choosing the right strategy.

Need to service lots of reads?

Go with a simple leader-follower replication strategy. Read replicas are simple and inexpensive. This is the best option if you have a read-heavy application like an online news source, or if your read-heavy system is scaling globally and you want to provide a consistent user experience.
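In application code, a leader-follower setup often boils down to routing: writes go to the leader, reads are spread over the replicas. Below is a rough sketch, assuming SQL-style statements and round-robin selection. `ReplicaRouter` and the node names are hypothetical, and a real router would need to handle transactions and replication lag too.

```python
import itertools


class ReplicaRouter:
    def __init__(self, leader, replicas):
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.leader = leader
        self._replicas = itertools.cycle(replicas)  # simple round-robin

    def route(self, statement):
        """Send SELECTs to the next replica and everything else to the leader."""
        words = statement.split()
        if words and words[0].upper() == "SELECT":
            return next(self._replicas)
        return self.leader


router = ReplicaRouter("leader-db", ["replica-1", "replica-2"])
first_read = router.route("SELECT * FROM articles")
second_read = router.route("select title from articles")
write_target = router.route("INSERT INTO articles VALUES (1, 'hello')")
```

Keep in mind that with asynchronous replication a read routed to a replica may be slightly stale.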

Need to increase the reliability of your system?

Go with multi-leader replication so that if and when a leader goes down, you can continue to operate without data loss. You will have to include some sort of conflict resolution strategy, though. More specifically, multi-leader is most often used when you're scaling across multiple data centers, because you'd want to have one leader in each data center that can perform writes and then replicate to other data centers.

Need to service lots of writes or scale up globally?

Consider a leaderless solution. If your system runs on-premises as opposed to in the cloud, make sure you build in appropriate conflict resolution strategies.

Pursuing a multi-region strategy?

Use replicas as database backups for disaster recovery on a per-region basis. For example, you want to be able to handle major outages or natural disasters that affect particular regions, so you implement a multi-leader strategy in affected regions to handle writes in case of failover.

E.g., Amazon RDS provides a highly available Multi-AZ configuration.

Read Amazon's original Dynamo paper for an interesting discussion around the challenges the team faced in fusing techniques like consistent hashing, quorum, anti-entropy-based recovery, and more.

Try your luck with Paxos Made Simple, the slightly frustrated attempt by brilliant computer scientist Leslie Lamport to explain his widely-used but little-understood consensus algorithm.