How does Partitioning work when Requests are being Routed?

Let depict a situation where we have now partitioned our dataset running across multiple nodes and multiple machine…

There is a question here… When a client wants to make a request, how does it know which node to connect to? 🤔

  • As partitions are rebalanced, the assignment of partitions to nodes changes.
  • Somebody needs to stay on top of those changes, in order to answer the question…
    • If I want to read or write the key “sig1”
      • Which IP/port number do I need to connect to?!

This is an instance of a more general problem called… Service Discovery.

  • Which is not limited to just the databases
  • Any piece of software that is available over a network has this problem
    • Especially, if it is aiming for high availability
    • Running a redundant configuration on multiple machines

Service Discovery

Many companies have written there own in house Service Discovery tools… And many of these have been released as open source!

At a high level there are few different approaches to this problem:

  1. Allow clients to contact any node
    • For example, via round robin load balancer
    • If that node coincidentally owns the partition to which request applies, it can handle the request directly.
    • Otherwise it forwards the request to the appropriate node
    • Receives the reply and passes the reply along to the client
  2. Send all requests from clients to a routing tier first
    • Which determines the node that should handle each request
      • And forwards it accordingly
      • This routing tier does not itself handle any requests
        • It only acts as a partition aware load balancer
  3. Require that clients are aware of the partitioning and the assignment of partitions to nodes
    • In this case the client can connect to the appropriate node without any intermediary

In all cases the key problem is how does the component make the routing decision?

  • This maybe any of the below…
    • One of the nodes
    • The routing tier
    • Or the client

This is a challenging problem!

  • Because it is important that all participants agree
  • Otherwise requests will be sent to the wrong nodes 😔
    • And handled incorrectly 👎

There are protocols to achieving consensus in a distributed system. 💡

  • ⚠️ Although consensus protocols are hard to implement correctly 😵‍💫
  • 👉 Many distributed data systems rely on separate co-ordination service

Co-ordination Service (ZooKeeper)

Zookeeper is a type of co-ordination service that tracks cluster meta data.

  • Each node registers itself in ZooKeeper
  • And ZooKeeper maintains the authoritative mapping of partitions to nodes
  • Other actors such as the routing tier, or partition aware clients can subscribe to this information in ZooKeeper
  • Whenever a partition changes ownership, or node is added or removed
    • ZooKeeper notifies the routing tier so that it can keep its routing tier up to date

For example:

LinkenIn’s Espresso has used Helix for cluster management. Implementing a routing tier which in turn relies on ZooKeeper!

Other notable mentioned that track ZooKeeper assignment:

Alternatives to ZooKeeper

  • MongoDB – has a similar architecture but it relies on it’s own configuration server implementation and Mongo’s daemons as the routing tier
  • Cassandra and Riak take a different approach… they use a Gossip Protocol among the nodes
    • To decimate any changes in cluster state, requests can be sent to any node
      • And that node forwards them to that appropriate node for the requested partition

Approach 1 of the 3 approaches of the request routing problem highlighted earlier in this blog.

  • This model puts more complexity on the databases nodes
  • But avoids the dependency on an external coordination service such as ZooKeeper

CouchBase does not rebalance automatically which simplifies the design…

  • Normally it is configured by the routing tier
    • This leans about routing changes from the cluster nodes

It still needs to find the IP address to connect to?

When using a routing tier, or when sending requests to node clients, it still needs to find the IP address to connect to!?

  • These are not as fast changing as the assignment of partitions to nodes 👎
  • So it is often sufficient to offer DNS for this purpose 🤗

10 responses to “How does Partitioning work when Requests are being Routed?”

  1. What is Partitioning and Why is it Important? – Scalable Human Avatar

    […] was also discussed in previous blog post on routing queries to the appropriate partition, which ranges from simple partition are load balancing to […]

    Like

  2. What is Partitioning of Key-Value Data? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning affects request distribution and system scalability in distributed environments. […]

    Like

  3. How to Relieve Hotspots with Skewed Workloads – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies can help distribute workloads more evenly and prevent hotspots. […]

    Like

  4. What is Dynamic Partitioning? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies affect request distribution, scalability, and performance in distributed environments. […]

    Like

  5. Distributed databases – Is Performance, Scalability and Transactional Guarantees Achievable? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies affect distributed database performance and ensure balanced data distribution across nodes. […]

    Like

  6. Horizontal Scaling vs Vertical Scaling? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies impact scalability and workload distribution in horizontally scaled architectures. […]

    Like

  7. What is Partitioning? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies affect request distribution and overall system performance. […]

    Like

  8. What is Consistent Hashing? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Explore how different partitioning mechanisms impact load distribution and system efficiency. […]

    Like

  9. What is Partitioning by Hash of Key? – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Explore how request routing interacts with partitioning strategies like hash-based partitioning, ensuring efficient data access in distributed systems. […]

    Like

  10. Replication vs Partitioning vs Clustering vs Sharding (1 minute read) – Scalable Human Blog Avatar

    […] • How Does Partitioning Work When Requests Are Being Routed? – Explore how partitioning strategies like sharding and clustering affect data access and request routing in distributed databases. […]

    Like

Leave a reply to What is Consistent Hashing? – Scalable Human Blog Cancel reply

I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.

Let’s connect