If you need a refresher on leaderless replication, please read my previous blog post: What is Leaderless Replication? Or read Martin Kleppmann's book, ‘Designing Data-Intensive Applications’.
Scenario of a node going down
Conceptualise a situation where you have a database with 3 replicas, and 1 of the replicas is currently unavailable!
This could be caused by the node being rebooted to install an update…
- In a typical leader-based configuration, to continue processing writes you may need to perform a failover
- In a leaderless configuration, on the other hand, failover does not exist ✋
- For example…
- User 1, 2 and 3 send a write to all 3 replicas in parallel
- The 2 available replicas accept it, while the unavailable replica misses it…
- (For this example, let's say it's sufficient for 2 out of 3 replicas to acknowledge the write)
- Once the client receives two okay responses, we consider the write to be successful
- The client simply ignores the fact that one replica missed the write
- Now conceptualise that the unavailable node comes back online! And then a client starts reading from it
- Any writes that happened while the node was down are now missing from that node
- This means that if you read from that node, you may get stale/outdated values as responses
- ⚙️ Possible solution!
- When a client reads from the database, it doesn't just send its request to one replica
- Read requests are also sent to several nodes in parallel
- The client may get different responses from different nodes
- Perhaps an up-to-date value from one and a stale value from another
- Version numbers are used to determine which value is newer
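The flow above can be sketched in a few lines of Python. This is a toy in-memory illustration, not a real datastore client: the `Replica` and `Client` classes and the `w`/`r` parameters are my own names, but they mirror the idea of sending each write to all 3 replicas, succeeding once 2 acknowledge, and using version numbers on reads to pick the newest value:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Versioned:
    version: int
    value: str

class Replica:
    """One replica: a key/value store that can be taken offline."""
    def __init__(self):
        self.data: dict[str, Versioned] = {}
        self.online = True

    def write(self, key: str, item: Versioned) -> bool:
        if not self.online:
            return False          # the write is missed while the node is down
        self.data[key] = item
        return True

    def read(self, key: str) -> Optional[Versioned]:
        if not self.online:
            return None
        return self.data.get(key)

class Client:
    """Leaderless client: talks to every replica directly, no failover."""
    def __init__(self, replicas, w=2, r=2):
        self.replicas, self.w, self.r = replicas, w, r

    def write(self, key, value, version) -> bool:
        acks = sum(rep.write(key, Versioned(version, value)) for rep in self.replicas)
        return acks >= self.w     # success once w of n replicas acknowledge

    def read(self, key) -> Versioned:
        responses = [rep.read(key) for rep in self.replicas]
        found = [v for v in responses if v is not None]
        if len(found) < self.r:
            raise RuntimeError("not enough replicas responded")
        return max(found, key=lambda v: v.version)  # newest version wins

replicas = [Replica(), Replica(), Replica()]
client = Client(replicas)

replicas[2].online = False                 # one node goes down (e.g. a reboot)
assert client.write("k", "v7", version=7)  # still succeeds: 2 of 3 acked

replicas[2].online = True                  # node comes back, still stale
print(client.read("k").value)              # quorum read returns the new value
```

Note that a real client would issue these requests in parallel and over the network; the sequential loop here just keeps the sketch short.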
Read repair and anti-entropy
Replication schemes should ensure that, eventually, all the data is copied to every replica.
So if an unavailable node comes back online, how does it catch up on the writes that it missed?
Two mechanisms are often used in Dynamo-style datastores…
- Read repair and anti-entropy processes:
- Read repair
- When a client reads from several nodes in parallel, it can detect any stale responses
- For example, User 1, 2 and 3
- Read some information and get a version 6 value from replica 3
- And a version 7 value from replicas 1 and 2…
- The client sees that replica 3 has a stale value and writes the newer value back to that replica!
- This approach works well for values that are frequently read
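The read-repair step above can be sketched as follows. This is a toy illustration (the function name and data layout are mine, not a real Dynamo or Cassandra API): each replica is a plain dict mapping a key to a `(version, value)` pair, the client fans the read out to every replica, keeps the highest version, and writes that newest value back to any replica that returned a stale one:

```python
# Toy read repair: each replica is a dict mapping key -> (version, value).
def read_with_repair(replicas: list[dict], key: str):
    # Read from all replicas (in parallel in a real client)
    responses = [(rep, rep.get(key)) for rep in replicas]
    found = [item for _, item in responses if item is not None]
    newest = max(found)                 # tuples compare by version first
    # Repair: write the newest value back to any stale or missing replica
    for rep, item in responses:
        if item != newest:
            rep[key] = newest
    return newest

r1 = {"user:1": (7, "new value")}
r2 = {"user:1": (7, "new value")}
r3 = {"user:1": (6, "stale value")}    # replica 3 missed the latest write

version, value = read_with_repair([r1, r2, r3], "user:1")
print(version, value)                  # the client sees version 7
print(r3["user:1"])                    # replica 3 has been repaired to version 7
```

The repair happens as a side effect of an ordinary read, which is exactly why it only helps values that are actually read.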
- Anti-Entropy process
- Some datastores have a background process that constantly checks for differences in data between replicas
- This then copies any missing data from one replica to another
- Unlike the replication log in leader-based replication, this anti-entropy process does not copy writes in any particular order
- There may be a significant delay before data is copied
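An anti-entropy pass can be sketched like this. It is a deliberately naive illustration (real systems such as Cassandra compare Merkle trees rather than walking full key sets): the background process visits every ordered pair of replicas and copies across any `(version, value)` pair that is missing or older on the destination:

```python
from itertools import permutations

def anti_entropy_pass(replicas: list[dict]) -> None:
    """Copy missing or newer (version, value) pairs between replicas.

    Unlike a leader's replication log, this visits keys in no particular
    order, and in a real system it may run long after the original write.
    """
    for src, dst in permutations(replicas, 2):
        for key, item in src.items():
            if key not in dst or dst[key] < item:   # tuple compare: version first
                dst[key] = item

r1 = {"a": (2, "x2"), "b": (1, "y1")}
r2 = {"a": (1, "x1")}          # stale "a", missing "b"
r3 = {}                        # missed everything while it was down

anti_entropy_pass([r1, r2, r3])
print(r3)                      # all three replicas now hold the newest versions
```

Because the pass runs in the background regardless of reads, it also catches the rarely-read values that read repair alone would miss.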
Final Note
Not all systems implement both read repair and anti-entropy; for example, Voldemort does not have an anti-entropy process.
Note that without an anti-entropy process, values that are rarely read may be missing from some replicas. This reduces durability, because read repair is only performed when a value is read by the application.
📚 Further Reading & Related Topics
If you’re exploring leaderless replication and writing to a database when a node is down, these related articles will provide deeper insights:
• Distributed Data-Intensive Systems: Replication vs. Partitioning vs. Clustering vs. Sharding – Learn how leaderless replication fits into the broader context of data distribution strategies like replication, sharding, and partitioning to ensure system reliability and data consistency.
• Understanding Partitioning: Proportional to Nodes – Explore how partitioning strategies in distributed systems interact with leaderless replication to manage data across nodes when failures occur.








