How do Reading and Writing Quorums work?

  • To start, lets consider the write to be successful when it is only processed by only 2 of the 3 replicas βœ…
    • What if 1 of the 3 replicas accepted the write… How far can we push this πŸ€”
    • If we know every write guaranteed out of 2 of the replicas that means 1 replica could be stale 😡
  • Therefore, if we read from at least 2 replicas at least we know that 1 replica of the two is update to date πŸ’‘
    • If the 3rd replica is down or slow to repond
      • Reads can continue returning up to date values

Illustrate the Quorum Variables

  • W = Minimum write nodes ✍️
  • R = Minimum read nodes πŸ“–
  • N = Nodes in the quorum group πŸ“š

Summary of Quorum

  • More generally if there are N replicas πŸ“š
  • Every write must be confirmed by W nodes ✍️ to be considered successful.
  • We must query R nodes πŸ“– for each query
    • Because at least R nodes πŸ“– will be at least up to date

Reads and writes that obey this R πŸ“– and W ✍️ values are called Quorum reads and writes.

  • You can think R πŸ“– and W ✍️ as the minimum number of votes for the read to be valid.

In a Dynamo style databases:

  • The parameters N πŸ“š, W ✍️ and R πŸ“– are typically configurable
  • Note, there maybe more than N πŸ“š nodes in a cluster
    • But any given value will be stored only on N πŸ“š nodes
    • This allows for data sets to be partitioned
      • Supporting data sets that are larger that can fit on 1 node

The Quorum Condition (W+R) > N

The (W+R) > N allows us to tolerate theses nodes with these types of stipulations:

  • If W < N we can still process writes if a node is unavailable
  • If R < N we can still process we can still process reads if a node is unavailable

For example:

  • πŸ“š N = 3
  • ✍️ W = 2
  • πŸ“– R = 2
  • We can tolerate one unavailable node

Similarly:

  • πŸ“š = 5
  • ✍️ W = 3
  • πŸ“– R = 3
  • We can tolerate 2 unavailable nodes

Normally reads and writes are always to all N πŸ“š replicas in parallel.

πŸ‘‰ Tip: The parameters N and R determine how many node we wait for…

  • That is how many of the N nodes needed to report success before we consider the read and write to be successful.

πŸ‘‰ Tip: If there are fewer W or R nodes are available writes will return errors

  • Nodes can be unavailable for many reasons
    • Node down
    • Crashing
    • Powering down
    • Error in an operation
      • Inability to write as the disc is full
    • Due to a network interruption between the client and the node
    • Etc..
  • We only care if whether the node returns a successful response
    • And we do not need to distinguish between different kinds of faults

One thought on “How do Reading and Writing Quorums work?

Leave a comment