I was doing some research on how databases prevent data loss when writing to the page cache using mmap. I don't understand how data written to the memory-mapped region is not lost upon crashing. Could anyone tell me how it really works?

Cinder Biscuits
Yi Lin Liu
  • Upon *what* crashing? The database program? Some hardware component? The whole machine? – John Bollinger Dec 19 '18 at 21:09
  • @JohnBollinger I'm sorry. I mean the database program crashes (like "kill -9"), not crashes of hardware or the whole machine. – Yi Lin Liu Dec 19 '18 at 21:13
  • Duplicate of https://stackoverflow.com/questions/5877797/how-does-mmap-work ? Maybe? – John Dec 19 '18 at 21:26
  • Possible duplicate of [How does mmap work?](https://stackoverflow.com/questions/5877797/how-does-mmap-work) – John Dec 19 '18 at 21:26
  • I'm not convinced that this is a dupe, but I remain a little uncertain about what is being asked. Why *should* a program crash cause data loss via mapped memory? – John Bollinger Dec 19 '18 at 21:34
  • @JohnBollinger Suppose the program crashes while writing to the page cache. Will the data that was supposed to be written to the file be lost? – Yi Lin Liu Dec 19 '18 at 21:46
  • If the _system_ crashes (e.g. kernel panic, hardware failure/freezeup, etc) all bets are off. But, if a program is terminated (even with `kill -9`) _anything_ written into a mapped page will be flushed (eventually--by the kernel) to the backing store (e.g. the mapped file). Remember the _kernel_ isn't crashing, just the _program_. The kernel is still around to do cleanup. From the app's perspective it is a crash. From the kernel's perspective, this is just an app exit/termination – Craig Estey Dec 19 '18 at 22:14
  • I thought there was no guarantee that the cached pages would be written to hardware without calling msync() @CraigEstey . Also, what happens if the data was only written halfway to the page cache when the program crashed? – Yi Lin Liu Dec 19 '18 at 22:24
  • Re. `msync`: _Without use of this call, there is no guarantee that changes are written back before munmap(2) is called_. But, `munmap` is [implicitly/effectively] called when the program is terminated. `msync` is only if you wish to _force_ the flush early. As to partial data, that is different. If you write byte 0 to page A but get killed _before_ writing byte 1 to page B (or page A for that matter), only the first byte will be flushed – Craig Estey Dec 19 '18 at 22:31
  • Loosely, in general, to prevent data loss, databases write out things in a precise order, do a sync, then write more data. (e.g.) they write a journal record, force a sync to disk, then rewrite the actual data to a block. The data is written to a _different_ block [so the original data is still in the old block], and then the block _mapping_ is changed and written to disk. So, the database stays whole. It either sees the old state or the new state but not a partial mixing. And, recovery is possible because of the journal. This is mostly for _system_ crashes – Craig Estey Dec 19 '18 at 22:40
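
The point made in the comments, that a `kill -9` ends the program but not the kernel, can be checked with a small experiment. The following is a minimal sketch, assuming a Linux-like system where `read`/`pread` and `MAP_SHARED` mappings share one page cache. The function name `mapped_write_survives_kill` and the scratch file path are hypothetical. A child process writes through a shared mapping and is killed with `SIGKILL` without ever calling `msync` or `munmap`; the parent then reads the file back.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if bytes stored through a MAP_SHARED mapping by a child that is
 * then SIGKILLed are visible in the file, 0 if they are not, -1 on a setup
 * error. `path` is a hypothetical scratch file; it is created and truncated. */
int mapped_write_survives_kill(const char *path)
{
    const char msg[] = "written before kill -9";
    const size_t len = 4096;
    assert(sizeof msg <= len);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return -1; }

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0) {
        /* Child: store into the mapping, then die abruptly.
         * No msync(), no munmap(), no exit handlers run. */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) _exit(1);
        memcpy(p, msg, sizeof msg);
        kill(getpid(), SIGKILL);
        _exit(2); /* not reached */
    }

    /* Parent: confirm the child really died from SIGKILL. */
    int status;
    if (waitpid(pid, &status, 0) < 0) { close(fd); return -1; }
    if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGKILL) {
        close(fd);
        return -1;
    }

    /* The dirty page still lives in the kernel's page cache, so an
     * ordinary read of the file sees the child's data. */
    char buf[sizeof msg];
    ssize_t n = pread(fd, buf, sizeof buf, 0);
    close(fd);
    if (n != (ssize_t)sizeof buf) return -1;
    return memcmp(buf, msg, sizeof msg) == 0;
}
```

Calling `mapped_write_survives_kill("/tmp/mmap_kill_demo.bin")` exercises the scenario. Note that this only shows the data surviving a process crash; it says nothing about when the kernel writes the page to disk, so a power failure shortly afterwards could still lose it.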

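The ordering described in the last comment (journal record first, sync, then the data) can be sketched roughly as below. This is an illustration under stated assumptions, not how any particular database does it: the record layout `struct journal_rec`, the function `journaled_update`, and the fixed `DATA_LEN` are all hypothetical, the copy-to-a-different-block step is left out, and recovery (replaying the journal after a crash) is not implemented.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define DATA_LEN 4096 /* hypothetical fixed size of the data file */

/* Hypothetical journal record: where the update goes and what it contains. */
struct journal_rec {
    uint32_t offset;
    uint32_t length;
    char payload[64];
};

/* Returns 0 on success, -1 on error. */
int journaled_update(const char *journal_path, const char *data_path,
                     uint32_t offset, const void *buf, uint32_t n)
{
    struct journal_rec rec;
    if (n > sizeof rec.payload || offset > DATA_LEN - n) return -1;
    memset(&rec, 0, sizeof rec);
    rec.offset = offset;
    rec.length = n;
    memcpy(rec.payload, buf, n);

    /* Step 1: append the intended change to the journal and force it to
     * stable storage before touching the data file. */
    int jfd = open(journal_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (jfd < 0) return -1;
    if (write(jfd, &rec, sizeof rec) != (ssize_t)sizeof rec ||
        fsync(jfd) != 0) {
        close(jfd);
        return -1;
    }
    close(jfd);

    /* Step 2: apply the change through a shared mapping of the data file. */
    int dfd = open(data_path, O_RDWR | O_CREAT, 0600);
    if (dfd < 0) return -1;
    if (ftruncate(dfd, DATA_LEN) != 0) { close(dfd); return -1; }
    char *p = mmap(NULL, DATA_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, dfd, 0);
    close(dfd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;
    memcpy(p + offset, buf, n);

    /* Step 3: msync(MS_SYNC) blocks until the dirty pages are written back,
     * instead of leaving the timing to the kernel. */
    int rc = msync(p, DATA_LEN, MS_SYNC);
    munmap(p, DATA_LEN);
    return rc == 0 ? 0 : -1;
}
```

If the machine goes down between step 1 and step 3, the journal already holds the full record, so a recovery pass could reapply it. If it goes down before step 1 completes, the data file was never touched and still holds the old state.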