What is Write-Ahead Logging (WAL)?

A key-value store is a fundamental component that is in rapidly growing demand across many horizontally scaled environments, including:

IoT environments
Social networks
Online retail environments
Cloud services

Example features of a key-value storage engine:

Transactions
Versioning
Replication

Write-Ahead Logging (WAL) is used in storage engines to provide transactions with:

Atomicity (guarantees that updates to the database never occur only partially)
Durability (guarantees that committed updates survive a crash)
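
As a rough illustration of how a log can give both properties, here is a minimal sketch in Python. The class `TinyWalStore`, its JSON-lines record format and the helper `demo_commit` are all hypothetical names made up for this example, not how any particular database does it. The idea is only that a whole transaction is appended as one record and forced to disc before the in-memory data is touched.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Hypothetical key-value store that logs each transaction before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._log = open(log_path, "a", encoding="utf-8")

    def commit(self, writes):
        """Apply a dict of key -> value writes as one transaction."""
        # 1. Atomicity: the whole transaction goes into ONE log record,
        #    so after a crash it is either fully in the log or not there at all.
        self._log.write(json.dumps(writes) + "\n")
        # 2. Durability: force the record to stable storage before acknowledging.
        self._log.flush()
        os.fsync(self._log.fileno())
        # 3. Only now update the actual data structure.
        self.data.update(writes)

    def close(self):
        self._log.close()


def demo_commit():
    """Commit one two-key transaction and return the state plus the raw log lines."""
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyWalStore(path)
    store.commit({"alice": 90, "bob": 110})
    store.close()
    with open(path, encoding="utf-8") as f:
        return store.data, f.read().splitlines()
```

Calling `demo_commit()` leaves a single log line holding both writes, which is what lets a recovery step treat the transaction as all-or-nothing.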

Storage engines and WAL

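In the case of a log-structured storage engine (SSTables and LSM-trees)

The log itself is the main place of storage
Log segments are appended to, compacted and garbage collected in the background

In the case of a B-Tree

A B-Tree overwrites individual disc blocks
Every modification is first written to the write-ahead log
- This is so the index can return to a consistent state after a crash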

In either case, the log is an append-only sequence of bytes containing all writes to the database.
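
To make "a sequence of bytes containing all writes" a little more concrete, the sketch below frames each write as a length, a CRC32 checksum and a payload, then rebuilds the state by replaying the bytes. The record layout and the names `encode_record` and `replay` are assumptions chosen for this example; real engines use their own formats. The point is that replay can stop cleanly at a half-written record left behind by a crash.

```python
import struct
import zlib

# Assumed record header: payload length and CRC32 of the payload, little-endian.
HEADER = struct.Struct("<II")


def encode_record(key: bytes, value: bytes) -> bytes:
    """Frame one write as header + (key length, key, value)."""
    payload = struct.pack("<I", len(key)) + key + value
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> dict:
    """Rebuild the key-value state from the log, ignoring a torn or corrupt tail."""
    state = {}
    offset = 0
    while offset + HEADER.size <= len(log):
        length, checksum = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        # A short, undersized or checksum-failing record means the append
        # never completed: stop replaying here.
        if length < 4 or len(payload) < length or zlib.crc32(payload) != checksum:
            break
        (key_len,) = struct.unpack_from("<I", payload, 0)
        key = payload[4:4 + key_len]
        state[key] = payload[4 + key_len:]  # later records overwrite earlier ones
        offset = start + length
    return state


log = encode_record(b"a", b"1") + encode_record(b"b", b"2") + encode_record(b"a", b"3")
torn = log + encode_record(b"c", b"4")[:-2]  # simulate a crash in the middle of an append
```

Replaying `log` and `torn` gives the same state, because the incomplete final record in `torn` is detected and skipped.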

💡 We can use the same log to build another replica on another node

Besides writing the log to disc

The leader also sends it across the network to its followers
When a follower processes this log, it builds a copy of the exact same data structures as found on the leader
This method of replication is used in PostgreSQL, Oracle and others…
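
The shipping step above can be sketched as a toy simulation. Everything here is hypothetical: `Node` and `Leader` are made-up classes, records are plain `(key, value)` tuples rather than bytes, and the "network" is just a direct method call. It only shows that a follower which applies the same log in the same order ends up with the same state as the leader.

```python
class Node:
    """Hypothetical replica: a log of records plus the state derived from it."""

    def __init__(self):
        self.log = []    # (key, value) records, in append order
        self.data = {}   # state built by applying the log

    def apply(self, record):
        key, value = record
        self.log.append(record)
        self.data[key] = value


class Leader(Node):
    def __init__(self, followers):
        super().__init__()
        self.followers = followers
        # Next log index to send to each follower.
        self.sent = {id(f): 0 for f in followers}

    def write(self, key, value):
        self.apply((key, value))  # append to the local log and update local state
        self.ship()

    def ship(self):
        # Send each follower the suffix of the log it has not seen yet.
        for f in self.followers:
            start = self.sent[id(f)]
            for record in self.log[start:]:
                f.apply(record)   # stands in for a network send
            self.sent[id(f)] = len(self.log)


followers = [Node(), Node()]
leader = Leader(followers)
leader.write("x", 1)
leader.write("y", 2)
leader.write("x", 3)
```

After these three writes, every follower holds the same log and the same `data` as the leader.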

Disadvantages of WAL

A WAL will contain details on which bytes were changed in which disc block.

This makes replication closely coupled with the storage engine 👎
If the database changes its storage format from one version to another, it is typically not possible to run different versions of the database software on the leader and the followers 👎
This can have a large operational impact, since upgrades may then require downtime 👎
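
A small sketch of why this coupling hurts, under loud assumptions: `PhysicalRecord`, `Follower`, `FormatMismatch` and the `format_version` field are invented for illustration and do not mirror any real database's WAL format. Because a record only says "these bytes at this offset in this block", a follower whose on-disc layout differs has no safe way to interpret it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalRecord:
    format_version: int  # on-disc layout that the byte offsets refer to (assumed field)
    block: int           # which disc block was changed
    offset: int          # where in that block
    new_bytes: bytes     # what the bytes became


class FormatMismatch(Exception):
    """Raised when a record was written for a different storage format."""


class Follower:
    def __init__(self, format_version, block_size=16, blocks=4):
        self.format_version = format_version
        self.blocks = [bytearray(block_size) for _ in range(blocks)]

    def apply(self, rec):
        # Byte offsets are only meaningful under the same on-disc layout.
        if rec.format_version != self.format_version:
            raise FormatMismatch(
                f"record uses format {rec.format_version}, "
                f"follower uses {self.format_version}"
            )
        end = rec.offset + len(rec.new_bytes)
        self.blocks[rec.block][rec.offset:end] = rec.new_bytes


rec = PhysicalRecord(format_version=1, block=2, offset=4, new_bytes=b"hi")
same_version = Follower(format_version=1)
same_version.apply(rec)  # fine: identical layout, bytes land in block 2
```

A `Follower(format_version=2)` given the same `rec` raises `FormatMismatch`, which is the toy equivalent of not being able to mix database versions.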

🤔 Alternatively…

If the replication protocol allows a follower to run a newer database version than the leader, you can upgrade all the followers first, with zero downtime…

Then you can perform a failover, so that one of the upgraded followers becomes the new leader
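
The order of operations can be sketched as a tiny function. The cluster dictionary shape and the name `rolling_upgrade` are hypothetical, and the last step (upgrading the former leader once it has stepped down) is an assumption added to finish the example. Choosing the first upgraded follower as the new leader is also just a simplification; a real system would run its own election.

```python
def rolling_upgrade(cluster, new_version):
    """Upgrade followers first, then fail over, then upgrade the old leader.

    cluster: {"leader": node_name, "versions": {node_name: version}}
    Returns the list of steps performed, in order.
    """
    steps = []
    old_leader = cluster["leader"]
    followers = [n for n in cluster["versions"] if n != old_leader]

    # 1. Upgrade every follower while the old leader keeps serving writes.
    for node in followers:
        cluster["versions"][node] = new_version
        steps.append(f"upgrade follower {node}")

    # 2. Fail over to an already-upgraded follower.
    new_leader = followers[0]
    cluster["leader"] = new_leader
    steps.append(f"failover {old_leader} -> {new_leader}")

    # 3. Assumption: the former leader is upgraded after it steps down.
    cluster["versions"][old_leader] = new_version
    steps.append(f"upgrade former leader {old_leader}")
    return steps


cluster = {"leader": "n1", "versions": {"n1": 1, "n2": 1, "n3": 1}}
steps = rolling_upgrade(cluster, new_version=2)
```

At no point in this sequence is the cluster without a leader, which is where the zero-downtime property comes from.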

📚 Further Reading & Related Topics

If you’re exploring Write-Ahead Logging (WAL) in distributed data-intensive systems, these related articles will provide deeper insights:

• Distributed Data-Intensive Systems: Logical Log Replication – Learn how logical log replication complements Write-Ahead Logging to ensure data integrity and consistency in distributed systems.

• Distributed Data-Intensive Systems: Reading and Writing Quorums – Explore how quorum-based approaches, when combined with WAL, enhance fault tolerance and consistency in distributed databases.

Scalable Human Blog