Why does PostgreSQL need both WAL buffer and WAL segment file?

Question

I'm trying to understand more about the Apache AGE extension, so I'm reading about the inner workings of PostgreSQL. From what I understand, every operation that alters a table is written to the WAL buffer, and once the transaction is committed or aborted it is immediately written to the WAL segment file on storage.

Why is the first part needed? Isn't having 2 steps more time-consuming, since the WAL segment file is enough by itself to recover from a server crash?

I am unsure what exactly your question is. Why WAL is used at all? Or why a change is written to the WAL buffer first? — , Mar 01 '23 at 15:17
I understand why WAL is used; I mean, why both a WAL buffer and a WAL segment file? After commit/abort the transaction will be written to the WAL segment file, so why bother also writing it to the WAL buffer rather than directly to the WAL segment file? — Panagiotis Foliadis, Mar 01 '23 at 16:50

score 1 · Answer 1 · answered Mar 01 '23 at 14:56

For a transaction involving a single change on a system with a single user, yes, it would be better to write directly to the file. But if you have many changes in a transaction, and many concurrent sessions running transactions, writing each one directly to storage is terribly inefficient because each log file write takes a millisecond or more.
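To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a flat 1 ms per durable write (the "millisecond or more" above) and ignores everything else; the function name and the group size of 100 are made up for illustration and are not PostgreSQL settings.

```python
def total_flush_time_ms(num_records, records_per_flush, flush_cost_ms=1.0):
    """Time spent waiting on storage when records are flushed in groups.

    flush_cost_ms is an assumed flat cost per durable write.
    """
    flushes = -(-num_records // records_per_flush)  # ceiling division
    return flushes * flush_cost_ms


# One durable write per WAL record versus 100 records sharing one write.
unbuffered_ms = total_flush_time_ms(10_000, records_per_flush=1)
buffered_ms = total_flush_time_ms(10_000, records_per_flush=100)
print(unbuffered_ms, buffered_ms)
```

Under these assumptions, the buffered case spends a hundredth of the time waiting on storage, because many records share a single write.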

score 1 · Accepted Answer · answered Mar 01 '23 at 16:44

The WAL buffers are just a cache for WAL; eventually the data is written to the WAL segment files. As with all caches, the goal is to boost performance.

Laurenz Albe

score 1 · Answer 3 · answered Jul 09 '23 at 17:01

There are essentially two parts to your question:

What are the WAL buffer and the WAL segment file?

The WAL buffer is in memory and the WAL segment file is on disk. The WAL buffer temporarily holds WAL records in memory until they are written to disk. When the WAL buffer fills up (or earlier, for example when a transaction commits), its contents are written out to the WAL segment file.

Why are they required, and what are their advantages?

This 2-step process helps us with two things:

Performance improves significantly because changes are first written to memory rather than to disk, the latter being much slower.
Having the WAL segment files on disk allows for crash recovery, since transactions can be redone from them if something goes wrong.
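The crash-recovery point can be shown with a toy model. This is only a sketch under my own assumptions: `ToyWal`, `replay`, and the (key, value) record format are invented for illustration and do not reflect PostgreSQL's actual structures.

```python
class ToyWal:
    """Toy model: an in-memory buffer plus a list standing in for the segment file."""

    def __init__(self):
        self.buffer = []   # volatile, lost on crash
        self.segment = []  # stands in for the durable on-disk segment file

    def append(self, record):
        self.buffer.append(record)

    def flush(self):
        # Move everything buffered so far to "disk" in one step.
        self.segment.extend(self.buffer)
        self.buffer.clear()

    def crash(self):
        # Memory contents disappear; the segment survives.
        self.buffer.clear()


def replay(segment):
    """Rebuild table state by redoing every record found in the segment."""
    table = {}
    for key, value in segment:
        table[key] = value
    return table


wal = ToyWal()
wal.append(("a", 1))
wal.append(("b", 2))
wal.flush()             # e.g. at commit time
wal.append(("a", 99))   # buffered but never flushed
wal.crash()
recovered = replay(wal.segment)
print(recovered)
```

Only what reached the segment can be redone, which is why the buffer alone is not enough and why it must be flushed before a commit is reported as durable.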

It is important to note that in a normal use case we might not care much about these things, but imagine 100 transactions happening at once: that is exactly when these small differences start to play a significant role. Hope this helps.

score 0 · Answer 4 · answered Jun 30 '23 at 17:28

It may seem that this two-step method creates additional overhead, but it is necessary in order to achieve both performance and durability. WAL buffering is used to handle data modifications efficiently, since memory operations are much faster than disk I/O. The WAL segment files are used to ensure data consistency and recovery in case of a crash.

score 0 · Answer 5 · answered Jul 09 '23 at 17:14

The most basic difference between them is that the WAL buffer exists in memory (volatile storage), whereas the WAL segment file exists on disk (persistent storage).

Since the WAL buffer exists in memory, it allows for faster I/O operations than writing directly to disk. This allows the data to be quickly written to the WAL buffer, and then later flushed to the WAL segment file on disk in bulk, which is significantly more efficient than performing multiple small I/O operations.
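As a rough illustration of "buffer first, flush in bulk", here is a small sketch that writes to a real temporary file. `BufferedLog`, its `capacity` of 4, and the file name are all hypothetical; PostgreSQL's own implementation is far more involved.

```python
import os
import tempfile


class BufferedLog:
    """Collects records in memory and writes them to a file in batches."""

    def __init__(self, path, capacity=4):
        self.file = open(path, "ab")
        self.capacity = capacity  # assumed batch size, for illustration only
        self.pending = []
        self.flush_count = 0

    def append(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One write and one fsync cover every pending record.
        self.file.write(b"".join(self.pending))
        self.file.flush()
        os.fsync(self.file.fileno())
        self.pending.clear()
        self.flush_count += 1

    def close(self):
        self.flush()
        self.file.close()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "toy_segment.log")
        log = BufferedLog(path, capacity=4)
        for i in range(10):
            log.append(f"record {i}\n".encode())
        log.close()
        with open(path, "rb") as f:
            lines = f.read().splitlines()
        return log.flush_count, len(lines)


print(demo())
```

Ten records end up in the file after only three flushes (4 + 4 + 2), instead of ten separate small writes.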
