What is Write-ahead Logging (WAL)?

Key-value stores are a fundamental component, in rapidly growing demand across many horizontally scaling environments, including:

  • IoT environments
  • Social networks
  • Online retail environments
  • Cloud services

Example features of a key-value storage engine:

  • Transactions
  • Versioning
  • Replication

Write-ahead Logging (WAL) is used in storage engines to provide transactions with:

  • Atomicity (guarantees that an update to the database is never applied only partially)
  • Durability
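
To make these two guarantees concrete, here is a minimal sketch of a write-ahead log in Python. The `TinyWAL` class, its JSON-lines record format, and the file name are all assumptions made for illustration, not how any particular database does it. Each transaction is appended as one record and forced to disk before it is applied (durability), and a record that was only partly written before a crash is ignored on recovery (atomicity).

```python
import json
import os
import tempfile


class TinyWAL:
    """Hypothetical key-value store that logs each transaction before applying it."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._recover()
        self.log = open(path, "ab")

    def _recover(self):
        # Replay complete records only. A torn final record means that
        # transaction never committed, so none of its writes are applied.
        if not os.path.exists(self.path):
            return
        valid_end = 0
        with open(self.path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break
                try:
                    writes = json.loads(line)
                except ValueError:
                    break
                self.data.update(writes)
                valid_end += len(line)
        # Drop the torn tail so that later appends start on a clean boundary.
        with open(self.path, "r+b") as f:
            f.truncate(valid_end)

    def commit(self, writes):
        # 1. Append the whole transaction as one record and force it to disk.
        self.log.write(json.dumps(writes).encode("utf-8") + b"\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data.update(writes)

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyWAL(path)
store.commit({"a": 1, "b": 2})
store.close()

# Simulate a crash halfway through writing a second transaction.
with open(path, "ab") as f:
    f.write(b'{"a": 99, "b"')

recovered = TinyWAL(path)
print(recovered.data)  # {'a': 1, 'b': 2}
```

The half-written second transaction leaves no trace after recovery, while the first one survives the restart.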

Storage engines and WAL

In a log-structured storage engine, the log is the main place for storage. Log segments are appended to, then compacted and garbage-collected in the background.

In the case of a B-tree

  • The engine overwrites individual disk blocks in place
  • Every modification is first written to the write-ahead log
    • This is so the B-tree can be restored to a consistent state after a crash

In either case, the log is an append-only sequence of bytes containing all writes to the database.

  • 💡 We can use the same log to build another replica on another node
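
As a rough illustration of "an append-only sequence of bytes", the sketch below encodes each write as a length-prefixed record with a checksum and rebuilds state by replaying the bytes. The record layout (`HEADER`, `encode_record`) is an assumption chosen for this example and does not match any real engine's on-disk format. Replaying the same bytes on another node yields the same state, which is what makes the log usable for building a replica.

```python
import struct
import zlib

# Assumed layout: 4-byte payload length, 4-byte CRC32, then the payload.
HEADER = struct.Struct(">II")


def encode_record(key: bytes, value: bytes) -> bytes:
    # Payload is a 4-byte key length, the key, then the value.
    payload = struct.pack(">I", len(key)) + key + value
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> dict:
    """Rebuild key-value state by applying records in log order."""
    state = {}
    offset = 0
    while offset + HEADER.size <= len(log):
        length, checksum = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != checksum:
            break  # torn or corrupt tail: stop at the last good record
        (key_len,) = struct.unpack_from(">I", payload, 0)
        key = payload[4:4 + key_len]
        state[key] = payload[4 + key_len:]
        offset = start + length
    return state


log = b"".join([
    encode_record(b"user:1", b"alice"),
    encode_record(b"user:2", b"bob"),
    encode_record(b"user:1", b"carol"),  # a later write wins on replay
])

leader_state = replay(log)
replica_state = replay(log)  # another node replaying the same bytes
print(leader_state == replica_state)  # True
```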

Besides writing the log to disk

  • The leader also sends it across the network to its followers
  • When a follower processes this log, it builds a copy of the exact same data structures as found on the leader
  • This method of replication is used in PostgreSQL, Oracle and others…
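
The shipping step can be sketched as a small in-process simulation. The `Node` and `Leader` classes are hypothetical, there is no real network here, and shipping is synchronous, which real systems often avoid. The point is only that a follower applying the same records in the same order ends up with the same state as the leader, including a follower that joins late and has to catch up.

```python
class Node:
    """Holds a local copy of the log and the state derived from it."""

    def __init__(self):
        self.log = []   # ordered (key, value) records
        self.data = {}

    def apply(self, record):
        key, value = record
        self.log.append(record)
        self.data[key] = value


class Leader(Node):
    def __init__(self):
        super().__init__()
        self.followers = []

    def add_follower(self, follower):
        # Catch the follower up by shipping the part of the log it is missing.
        for record in self.log[len(follower.log):]:
            follower.apply(record)
        self.followers.append(follower)

    def write(self, key, value):
        record = (key, value)
        self.apply(record)              # write to the leader's own log first
        for follower in self.followers:
            follower.apply(record)      # then ship the same record onwards


leader = Leader()
early = Node()
leader.add_follower(early)
leader.write("x", 1)
leader.write("y", 2)

late = Node()                 # joins after two writes have happened
leader.add_follower(late)
leader.write("x", 3)

print(leader.data == early.data == late.data)  # True
```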

Disadvantages of WAL

A WAL contains details of which bytes were changed in which disk block.

  • This makes replication closely coupled with the storage engine 👎
  • If the database changes its storage format from one version to another, it is typically not possible to run different versions of the database software on the leader and the followers 👎
  • This can have a large operational impact, because upgrades then require downtime 👎
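
To see why this coupling hurts, consider a sketch of a physical log record. Everything here (`PhysicalRecord`, `Replica`, the tiny `BLOCK_SIZE`, the explicit `format_version` field) is invented for illustration. Because the record only says "put these bytes at this position in this block", a follower whose storage layout differs cannot safely apply it.

```python
from dataclasses import dataclass

BLOCK_SIZE = 16  # unrealistically small, for illustration only


@dataclass(frozen=True)
class PhysicalRecord:
    format_version: int  # storage format the record was written for
    block: int           # which disk block was changed
    offset: int          # byte offset inside that block
    new_bytes: bytes     # the bytes written there


class Replica:
    def __init__(self, format_version, num_blocks=4):
        self.format_version = format_version
        self.disk = bytearray(BLOCK_SIZE * num_blocks)

    def apply(self, record):
        # A byte-level change only makes sense for the exact same layout.
        if record.format_version != self.format_version:
            raise ValueError("storage format mismatch, cannot apply record")
        start = record.block * BLOCK_SIZE + record.offset
        self.disk[start:start + len(record.new_bytes)] = record.new_bytes


record = PhysicalRecord(format_version=1, block=2, offset=4, new_bytes=b"hi")

same_version = Replica(format_version=1)
same_version.apply(record)            # works: identical layout

upgraded = Replica(format_version=2)
try:
    upgraded.apply(record)
    rejected = False
except ValueError:
    rejected = True                   # the upgraded follower cannot use it

print(rejected)  # True
```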

🤔 Alternatively…

If the replication protocol allows a follower to run a newer version of the database software than the leader, you can upgrade all the followers first, with zero downtime…

  • Then you can perform a failover so that one of the upgraded followers becomes the new leader
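
The ordering of such an upgrade can be written down as a short sketch. The `rolling_upgrade` function, the node names, and the version strings are hypothetical, and the sketch assumes the replication protocol tolerates followers running a newer version than the leader. It only models the order of the steps, not the actual upgrade or election.

```python
def rolling_upgrade(versions, leader, new_version):
    """Upgrade followers first, fail over, then upgrade the former leader.

    `versions` maps node name to software version and is updated in place.
    Returns the new leader and the list of steps taken.
    """
    followers = [name for name in versions if name != leader]
    if not followers:
        raise ValueError("need at least one follower to fail over to")

    steps = []
    for name in followers:
        versions[name] = new_version   # leader keeps serving writes meanwhile
        steps.append(f"upgrade follower {name}")

    new_leader = followers[0]          # any upgraded follower could be chosen
    steps.append(f"fail over from {leader} to {new_leader}")

    versions[leader] = new_version     # the old leader is now just a follower
    steps.append(f"upgrade former leader {leader}")
    return new_leader, steps


cluster = {"node-a": "v1", "node-b": "v1", "node-c": "v1"}
new_leader, steps = rolling_upgrade(cluster, leader="node-a", new_version="v2")
for step in steps:
    print(step)
```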

I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.