-
Continue reading →: Databases – Multi-Object OperationsAs per the previous post on What is the meaning of ACID in Databases?, Atomicity and Isolation describe what a database should do if a client makes several writes within the same transaction… What does Atomicity mean? Atomicity means if an error occurs, half way through a sequence of writes the…
-
Continue reading →: Distributed databases – Is Performance, Scalability and Transactional Guarantees Achievable?Almost every relational databases today and a few none relational databases support transactions… Most follow the protocols that were introduced in 1975! Although some implementation details have changed, the general technical patterns here have remained the dame for over 40 years! Systems that are similar to R, that have transaction…
-
Continue reading →: What is the meaning of ACID in Databases?The safety guarantees provided by transactions are often described by the well known acronym ACID. What does ACID stand for? Where does it come from? In 1983 both Andreas Reuter and Theo Härder, outlined these properties to create clarification for what a fault tolerance mechanism is for databases. The truth of ACID…
-
Continue reading →: Do we take Transactions for granted?Transaction are often something we take for granted as developers, when we decide what applications we want to build. The mechanism of a transaction is important to understand, as we also need to know when it is appropriate to use them! In this blog post I will summarise transactions and…
-
Continue reading →: How to understand Apache KafkaThe awesome open source Apache Kafka, which currently meets the demands of todays technological landscape, enabling event stream processing, real-time data pipelines, and data integration at scale. ⭐️ Kafka GitHub Repo Where did it come from? This was originally created to process the real-time data feeds at LinkedIn all the…
-
Continue reading →: Why is Lombok a game changing library in Java development?Java has the reputation of being a verbose language, therefore it is not uncommon to end up with may lines of code, especially for those getter and setters. Lombok to the rescue? Lombok is a Java library that directly tackles this issue of code verbosity and repetition, there are many…
-
Continue reading →: Spring Boot FiltersWhat are Spring Boot filters? Spring boot uses filters to distill HTTP requests, the process here is: What can filters be used for? A filter is able to perform two operations, this is done on the response and the request. This can be used to restrict URL access to users…
-
Continue reading →: Spring Boot – Code Layout & StructureSpring Boot and it layout and code structure is something that is not predefined, it is up to the developer to follow the best practices to find the best practices available to them. You may ask what are these best practices? What typically occurs is that the project gets divided…
-
Continue reading →: Replication vs Partitioning vs Clustering vs Sharding (1 minute read)In the complex world of database management, terms like replication, partitioning, clustering, and sharding are often interchanged, leading to confusion. As someone who has delved into these topics for over a year, I believe it’s crucial to demystify these concepts. Replication: The Art of Duplication Replication involves duplicating tables or…
-
Continue reading →: What is Partitioning and Why is it Important?In my previous blogs published, I have covered the different ways of which how partitioning is managed with large datasets and thereby reduced to smaller subsets. All this content was derived from the author Martin Kleppman in “Designing Data Intensive Applications“. In this post I will summarise all the topics…
-
Continue reading →: Parallel Query Execution – What is it?To begin as Martin Kleppman from Designing Data Intensive Applications outlines, there are simple queries that read and write a single key. Plus there are scatter gather queries, in the case of document partition secondary indexes. Massive Parallel Processing (MPP) This is about the level of access supported by most…
-
Continue reading →: How does Partitioning work when Requests are being Routed?Let depict a situation where we have now partitioned our dataset running across multiple nodes and multiple machine… There is a question here… When a client wants to make a request, how does it know which node to connect to? 🤔 As partitions are rebalanced, the assignment of partitions to…
-
Continue reading →: How to decide between Automatic and Manual Rebalancing in Operations? (Partitions)There is a critical question to be raised on deciding whether we choose automatic or manual rebalancing… “There is a gradient between fully automatic rebalancing, in which the system decides automatically when to move partitions to one node to another without any administrator interaction, and fully manual rebalancing, in which…
-
Continue reading →: Understanding Partitioning Proportional to NodesWith dynamic partitioning the number of partitions is proportional to the size of the dataset. Since splitting and merging processes, this keeps the size of each partition between some fixed number of partitions But there is another option A third option is used by Cassandra… What happens when you increase…
-
Continue reading →: What is Dynamic Partitioning?For database that use key range partitioning, a fixed number of partitions with fixed boundaries would be very inconvenient! For that reason… Key range partition databases such as HBase and RethinkDB create partitions dynamically! 👍 Advantage of Dynamic Partitioning Is that the number of partitions adapts to the total data…
-
Continue reading →: What is Fixed Partitioning?There are few ways to assigning partitions to nodes, in this post we will discuss fixed partitioning. How not to rebalance partitions! When partitioning by the hash of a key… it is best to divide the possible hashes into ranges and assign each range to a partition… Why don’t we…
-
Continue reading →: Rebalancing Partitions – What is it?Overtime things change in a database… For example ↘️ All of these changes call for data and requests to be moved from one node to another… 🟢 ←→ 🟢 ←→ 🟢 ←→ 🟢 ←→ 🟢 ←→ 🟢 The process of moving load from one node in a cluster to another…
-
Continue reading →: Partitioning Secondary Indexes by Term – What is it?Rather than each partition having its own secondary index (local index). We can construct a global index that covers data in all partitions. Global Index and term partitioning? A global index must also be partitioned. But it can be partitioned differently from the primary key index. Here is how this…
-
Continue reading →: How to Partition with Secondary Indexes by DocumentConceptualise a situation where you are operating a website for used cars. 🚗 🏎 🚕 🚓 🚘 🚖 🚔 🚙 The need for the secondary index… You want to let users search for cars, allowing them to filter by colour and by make. Secondary index relationships with partitions In the…
-
Continue reading →: What to Consider with Secondary Indexes and PartitioningThe partitioning schemes that have been covered in the previous blogs, rely on a key value data model. If records are only ever accessed via a primary key… What are secondary indexes? The situation becomes more complicated when secondary indexes are involved… 🤯 “Secondary indexes are the bread and butter…







