What are Sloppy Quorums and Hinted Handoffs?

Why are optimised quorums useful?

Databases with appropriately configured quorums can tolerate the failure of individual nodes. This allows systems to:

  • Avoid relying on failover
  • Tolerate individual slow nodes, because requests do not have to wait for all N nodes (the nodes in the quorum group) to respond
    • They can return once W (write) or R (read) nodes have responded
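
To make the numbers concrete, here is a minimal Python sketch. The QuorumConfig class and its method names are made up for illustration and do not come from any particular database. It assumes the usual rule that read and write sets overlap when W + R > N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical holder for the three quorum parameters."""
    n: int  # replicas that hold each value
    w: int  # acknowledgements needed before a write returns
    r: int  # responses needed before a read returns

    def reads_overlap_writes(self) -> bool:
        # With W + R > N, every read set shares at least one node with every write set.
        return self.w + self.r > self.n

    def tolerated_unavailable(self) -> int:
        # Nodes that can be down or slow while reads and writes both still succeed.
        return self.n - max(self.w, self.r)


cfg = QuorumConfig(n=3, w=2, r=2)
print(cfg.reads_overlap_writes())   # True
print(cfg.tolerated_unavailable())  # 1
```

With N = 3 and W = R = 2, one node can be slow or down and requests still return.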

These characteristics make leaderless replication appealing for use cases that:

  • Require high availability
  • Require low latency
  • Can tolerate occasional stale reads

“Achilles heel” of Quorums!

Quorums can provide reliable outcomes for end users, but they are not as fault tolerant as they could be. A network interruption can easily cut off a client from a large number of database nodes!

  • Although those nodes are alive and other clients may be able to connect to them,
  • To a client that is cut off from the database nodes, they may as well be dead!
    • In this situation it is likely that fewer than W (write) or R (read) reachable nodes remain, so the client can no longer reach a quorum

Large Clusters?

When a cluster has many nodes (significantly more than N), it is likely that the client can still connect to some database nodes during a network interruption, just not to the nodes it needs to assemble a quorum for a particular value.
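
A small Python sketch of that situation, under made-up assumptions: a six-node cluster, node names "a" to "f", and a hypothetical helper can_reach_quorum. The client can still talk to most of the cluster, yet it cannot assemble a quorum for this value.

```python
from typing import Set


def can_reach_quorum(home_nodes: Set[str], reachable: Set[str], required: int) -> bool:
    """True if at least `required` of the value's home nodes are reachable."""
    return len(home_nodes & reachable) >= required


home_nodes = {"a", "b", "c"}       # the N = 3 nodes where this value usually lives
reachable = {"c", "d", "e", "f"}   # what this client can see during the interruption

print(len(reachable))                                       # 4
print(can_reach_quorum(home_nodes, reachable, required=2))  # False
```

Four nodes are reachable, but only one of them is a home node for the value, so a strict quorum with W = 2 or R = 2 fails.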

Database Designer Trade offs?

👉 Is it better to return errors to all requests for which we cannot reach a quorum of W or R nodes?

Or… (Sloppy Quorum)

👉 Should we accept writes anyway, and write them to some nodes that are reachable but are not among the N nodes where the value usually lives?

Sloppy Quorums

  • Writes and reads still require W and R successful responses
    • But those may include nodes that are not among the designated N home nodes for a value
    • Analogy example:
      1. You lock yourself out of your house…
      2. You knock on the neighbour’s door and ask if you can stay at theirs for a bit
        • Once the network interruption is fixed 👨‍🔧
        • Any writes that a node temporarily accepted on behalf of another node are sent to the appropriate home nodes
        • This is called Hinted Handoff
      3. Continuing the analogy: once you find the keys to your house, you go home
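
The flow above can be sketched in Python. This is a toy in-memory model, not how any real database implements it: the Node class and the sloppy_write and hinted_handoff functions are hypothetical names, and a real system would also compare versions before overwriting a value during handoff.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    name: str
    data: Dict[str, str] = field(default_factory=dict)
    # Hints held for unreachable home nodes: (home node name, key, value)
    hints: List[Tuple[str, str, str]] = field(default_factory=list)


def sloppy_write(key: str, value: str, home: List[Node], fallbacks: List[Node],
                 reachable: Set[str], w: int) -> bool:
    """Write to reachable home nodes, then let reachable stand-ins cover the rest."""
    acks, missed = 0, []
    for node in home:
        if node.name in reachable:
            node.data[key] = value
            acks += 1
        else:
            missed.append(node)
    stand_ins = [n for n in fallbacks if n.name in reachable]
    for home_node, stand_in in zip(missed, stand_ins):
        # The neighbour looks after the write until the home node is back.
        stand_in.hints.append((home_node.name, key, value))
        acks += 1
    return acks >= w


def hinted_handoff(stand_in: Node, nodes: Dict[str, Node], reachable: Set[str]) -> None:
    """Send held writes to their home nodes once those nodes are reachable again."""
    remaining = []
    for home_name, key, value in stand_in.hints:
        if home_name in reachable:
            nodes[home_name].data[key] = value
        else:
            remaining.append((home_name, key, value))
    stand_in.hints = remaining


a, b, c, d, e = (Node(name) for name in "abcde")
ok = sloppy_write("k", "v1", home=[a, b, c], fallbacks=[d, e],
                  reachable={"c", "d", "e"}, w=2)
print(ok, a.data, d.hints)

# The network interruption is fixed: every node is reachable again.
hinted_handoff(d, {"a": a, "b": b, "c": c}, reachable={"a", "b", "c", "d", "e"})
print(a.data, d.hints)
```

Here "d" plays the neighbour: it holds the write meant for "a" and hands it over once "a" is reachable again.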

When to use Sloppy Quorums?

  • Sloppy quorums are particularly useful for increasing write availability
    • As long as any W nodes are available the database can accept writes 🤝
  • However, this means that even if W + R > N, you cannot be sure to read the latest value for a key
    • Because the latest value could have been temporarily written to some nodes outside of the N home nodes
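
A quick way to see this in Python, using made-up node names and two hypothetical helpers. With strict quorums, any W nodes and any R nodes picked from the same N home nodes must share at least one node when W + R > N. Once a write lands on a stand-in node, that overlap can disappear.

```python
from itertools import combinations
from typing import Set


def overlaps(write_set: Set[str], read_set: Set[str]) -> bool:
    """True if at least one node that took the write is also asked by the read."""
    return bool(write_set & read_set)


def strict_quorums_always_overlap(home: Set[str], w: int, r: int) -> bool:
    """Check every possible W-sized write set against every R-sized read set."""
    nodes = sorted(home)
    return all(overlaps(set(ws), set(rs))
               for ws in combinations(nodes, w)
               for rs in combinations(nodes, r))


home_nodes = {"a", "b", "c"}   # N = 3, with W = 2 and R = 2, so W + R > N
print(strict_quorums_always_overlap(home_nodes, w=2, r=2))  # True

sloppy_write_set = {"c", "d"}  # "d" is a stand-in, outside the home nodes
read_set = {"a", "b"}          # a reader that can only reach "a" and "b"
print(overlaps(sloppy_write_set, read_set))                 # False
```

The second result is the stale read: the reader gets R = 2 responses, but none of them has seen the latest write.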

Final note

As discovered, a sloppy quorum is not really a quorum at all when we compare it to the traditional approach. It is only an assurance of durability, as the data is stored on W nodes somewhere. There is no guarantee that a read of R nodes will see it until the hinted handoff has completed… 👎

Sloppy quorums are optional in Dynamo-style databases (NoSQL). As far as my current research goes, they are enabled by default in Riak (NoSQL), whilst in Cassandra (NoSQL) and Voldemort (NoSQL) they are disabled by default.
