I was researching how databases prevent data loss when they write to the page cache through mmap. I don't understand why data written to the memory-mapped region is not lost when a crash occurs. Could anyone explain how it really works?
-
Upon *what* crashing? The database program? Some hardware component? The whole machine? – John Bollinger Dec 19 '18 at 21:09
-
@JohnBollinger I'm sorry. I mean the database program crashing (e.g. `kill -9`), not a crash of the hardware or the whole machine. – Yi Lin Liu Dec 19 '18 at 21:13
-
Possible duplicate of [How does mmap work?](https://stackoverflow.com/questions/5877797/how-does-mmap-work) – John Dec 19 '18 at 21:26
-
I'm not convinced that this is a dupe, but I remain a little uncertain about what is being asked. Why *should* a program crash cause data loss via mapped memory? – John Bollinger Dec 19 '18 at 21:34
-
@JohnBollinger Suppose the program crashes while it is writing to the page cache. Will the data that was supposed to be written to the file be lost? – Yi Lin Liu Dec 19 '18 at 21:46
-
If the _system_ crashes (e.g. kernel panic, hardware failure/freeze-up, etc.), all bets are off. But if a program is terminated (even with `kill -9`), _anything_ already written into a shared, file-backed mapped page will be flushed (eventually, by the kernel) to the backing store (e.g. the mapped file). Remember the _kernel_ isn't crashing, just the _program_. The kernel is still around to do cleanup. From the app's perspective it is a crash. From the kernel's perspective, this is just an app exit/termination. – Craig Estey Dec 19 '18 at 22:14
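
The behaviour described in this comment can be checked with a small experiment. The sketch below is only an illustration, assuming a POSIX system (it relies on `os.fork` and a `MAP_SHARED` mapping); the helper names `write_then_die` and `survives_kill` are made up for this example. A child process stores bytes into a shared file mapping and then kills itself with `SIGKILL` without flushing, closing, or unmapping anything. The parent then reads the file through the ordinary file API.

```python
import mmap
import os
import signal
import tempfile


def write_then_die(path: str, payload: bytes) -> None:
    """Child side: store payload into a shared mapping, then SIGKILL itself."""
    fd = os.open(path, os.O_RDWR)
    m = mmap.mmap(fd, len(payload), flags=mmap.MAP_SHARED,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    m[:len(payload)] = payload            # plain memory store into the page cache
    os.kill(os.getpid(), signal.SIGKILL)  # no flush(), no close(), no munmap()


def survives_kill(payload: bytes = b"hello from a dead process") -> bytes:
    """Parent side: return the file contents after the child has been killed."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * len(payload))     # the file must be as large as the mapping
        path = f.name
    try:
        pid = os.fork()
        if pid == 0:
            try:
                write_then_die(path, payload)
            finally:
                os._exit(1)               # only reached if something went wrong
        _, status = os.waitpid(pid, 0)
        # Confirm the child really died from SIGKILL rather than exiting normally.
        assert os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(survives_kill())
```

The parent sees the child's bytes because the dirty pages belong to the kernel's page cache, not to the dead process. This says nothing about a power failure or kernel panic, where those pages may never reach the disk.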
-
I thought there was no guarantee that the page cache is written to the hardware without calling msync(), @CraigEstey. Also, what happens if the data was only half written to the page cache when the program crashed? – Yi Lin Liu Dec 19 '18 at 22:24
-
Re. `msync`: _Without use of this call, there is no guarantee that changes are written back before munmap(2) is called_. But `munmap` is [implicitly/effectively] called when the program is terminated. `msync` is only needed if you wish to _force_ the flush early. As to partial data, that is different. If you write byte 0 to page A but get killed _before_ writing byte 1 to page B (or page A for that matter), only the first byte will be flushed. – Craig Estey Dec 19 '18 at 22:31
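
The partial-write case can be reproduced the same way. This is a rough sketch, assuming a POSIX system; `torn_update` is a hypothetical helper, not part of any library. The child writes the first half of an update into the mapping, calls `mmap.flush()` (which on Unix issues `msync`) to force that half out early, and is killed before it writes the second half.

```python
import mmap
import os
import signal
import tempfile


def torn_update(old: bytes, new: bytes) -> bytes:
    """Return the file contents after a writer is killed halfway through an update."""
    assert len(old) == len(new)
    half = len(new) // 2
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(old)
        path = f.name
    try:
        pid = os.fork()
        if pid == 0:
            try:
                fd = os.open(path, os.O_RDWR)
                m = mmap.mmap(fd, len(new), flags=mmap.MAP_SHARED,
                              prot=mmap.PROT_READ | mmap.PROT_WRITE)
                m[:half] = new[:half]    # first half of the update
                m.flush()                # msync(): force that half to storage now
                os.kill(os.getpid(), signal.SIGKILL)
                m[half:] = new[half:]    # never executed: the process is already dead
            finally:
                os._exit(1)
        os.waitpid(pid, 0)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(torn_update(b"AAAAAAAA", b"BBBBBBBB"))
```

The file ends up holding a mix of new and old bytes. Nothing that was written is lost, but the kernel cannot know that the update was incomplete, so keeping the file consistent is the application's job.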
-
Loosely, in general, to prevent data loss, databases write things out in a precise order, do a sync, then write more data. E.g. they write a journal record, force a sync to disk, then write the actual data to a block. The data is written to a _different_ block [so the original data is still in the old block], and then the block _mapping_ is changed and written to disk. So the database stays whole. It sees either the old state or the new state, but never a partial mix. And recovery is possible because of the journal. This is mostly for _system_ crashes. – Craig Estey Dec 19 '18 at 22:40
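
The "write to a different block, then switch the mapping" idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular database does it: the file layout (a one-byte header naming the live slot, followed by two fixed-size slots), the names `create`, `read`, and `update`, and the `crash_before_flip` flag are all invented for the example. The journal step is left out, and the sketch assumes that a one-byte header write is never torn.

```python
import os
import struct
import tempfile

SLOT_SIZE = 16                # hypothetical fixed record size
HEADER = struct.Struct("B")   # one byte: index of the live slot (0 or 1)


def _slot_offset(index: int) -> int:
    return HEADER.size + index * SLOT_SIZE


def create(path: str, value: bytes) -> None:
    """Create the file with `value` in slot 0 and slot 0 marked live."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))
        f.write(value.ljust(SLOT_SIZE, b"\0"))
        f.write(b"\0" * SLOT_SIZE)
        f.flush()
        os.fsync(f.fileno())


def read(path: str) -> bytes:
    """Return the value stored in whichever slot the header marks as live."""
    with open(path, "rb") as f:
        (live,) = HEADER.unpack(f.read(HEADER.size))
        f.seek(_slot_offset(live))
        return f.read(SLOT_SIZE).rstrip(b"\0")


def update(path: str, value: bytes, crash_before_flip: bool = False) -> None:
    """Write `value` to the spare slot, sync, then flip the header and sync."""
    if len(value) > SLOT_SIZE:
        raise ValueError("value does not fit in a slot")
    fd = os.open(path, os.O_RDWR)
    try:
        (live,) = HEADER.unpack(os.pread(fd, HEADER.size, 0))
        spare = 1 - live
        os.pwrite(fd, value.ljust(SLOT_SIZE, b"\0"), _slot_offset(spare))
        os.fsync(fd)                          # new data is durable before it is live
        if crash_before_flip:
            return                            # simulated crash: old slot stays live
        os.pwrite(fd, HEADER.pack(spare), 0)  # switch the "mapping"
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.db")
    create(path, b"old")
    update(path, b"new", crash_before_flip=True)
    print(read(path))   # still the old value
    update(path, b"new")
    print(read(path))   # now the new value
```

Because the live slot is never overwritten in place, a reader that starts from the header finds either the complete old value or the complete new one. The ordering of the two `fsync` calls is what makes that hold across a system crash as well as a process crash.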