10

Are mmap calls atomic in their effect?

That is, does a mapping change made by mmap appear atomically to other threads accessing the affected region?

As a litmus test, consider the case where you mmap a file of all zeros (from thread T1, which is at this point the only thread), then start a second thread T2 that reads from the region. Then, again on T1 (the original thread), make a second mmap call for the same region, replacing the mapping with a new one backed by a file of all ones.

Is it possible for the reader thread to read a one from some page (i.e., see the second mmap in effect) and then subsequently read a zero from some page (i.e., see the first mapping in effect)?

You may assume that the reads on the reader thread are properly fenced, i.e., that the effect above does not occur solely due to CPU/coherency level memory access reordering.
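
A minimal sketch of this litmus test in C, assuming Linux or another POSIX system with pthreads. The names `run_litmus`, `make_file`, `struct litmus`, and the 64-page region size are hypothetical choices for illustration, not part of any API. A run that reports zero new-then-old observations does not prove atomicity; it only fails to find a counterexample.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

enum { NPAGES = 64 };                 /* arbitrary region size for the sketch */

struct litmus {
    volatile unsigned char *base;     /* start of the mapped region */
    size_t page;                      /* page size in bytes */
    atomic_int ready;                 /* reader has finished one full sweep */
    atomic_int stop;                  /* T1 asks the reader to exit */
    long new_then_old;                /* times a 0 was read after a 1 */
};

/* Create an unlinked temporary file of `len` bytes, all equal to `fill`. */
static int make_file(size_t len, unsigned char fill) {
    char path[] = "/tmp/mmap_litmus_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);
    unsigned char *buf = malloc(len);
    if (!buf) { close(fd); return -1; }
    memset(buf, fill, len);
    ssize_t n = write(fd, buf, len);
    free(buf);
    if (n != (ssize_t)len) { close(fd); return -1; }
    return fd;
}

/* T2: sweep the first byte of every page, with a fence after each read. */
static void *reader(void *arg) {
    struct litmus *l = arg;
    int seen_new = 0;                 /* kept across sweeps: only one remap happens */
    while (!atomic_load(&l->stop)) {
        for (size_t i = 0; i < NPAGES; i++) {
            unsigned char v = l->base[i * l->page];
            atomic_thread_fence(memory_order_seq_cst);
            if (v == 1) seen_new = 1;
            else if (seen_new) l->new_then_old++;
        }
        atomic_store(&l->ready, 1);
    }
    return NULL;
}

/* T1: returns the number of new-then-old observations, or -1 on failure. */
long run_litmus(void) {
    struct litmus l = {0};
    l.page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = NPAGES * l.page;
    int zeros = make_file(len, 0), ones = make_file(len, 1);
    if (zeros < 0 || ones < 0) return -1;

    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, zeros, 0);
    if (p == MAP_FAILED) return -1;
    l.base = p;

    pthread_t t2;
    if (pthread_create(&t2, NULL, reader, &l) != 0) return -1;
    while (!atomic_load(&l.ready)) sched_yield();

    /* Second mmap over the same range, replacing zeros with ones. */
    void *q = mmap(p, len, PROT_READ, MAP_SHARED | MAP_FIXED, ones, 0);
    int ok = (q != MAP_FAILED);
    if (ok) assert(q == p);           /* MAP_FIXED returns the requested address */

    /* The calling thread should see only the new mapping after return. */
    for (size_t i = 0; ok && i < NPAGES; i++)
        ok = (l.base[i * l.page] == 1);

    atomic_store(&l.stop, 1);
    pthread_join(t2, NULL);
    munmap(p, len);
    close(zeros);
    close(ones);
    return ok ? l.new_then_old : -1;
}
```

Calling `run_litmus()` repeatedly (compiled with `-pthread`) and checking for a nonzero return is one way to hunt for the ordering described above on a particular kernel and architecture.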

BeeOnRope
  • 1
    *Is it possible for any of the reader threads to read a one from some page (i.e., see the second mmap in effect) and then subsequently read a zero from some page (i.e., see the first mapping in effect)?* Without putting enough thought into this to actually formulate an answer, I don't think you can rule out pages getting replaced in any order. If multiple pages get replaced, I suspect there is no atomicity nor any ordering guarantees. – Andrew Henle Jan 21 '20 at 17:12
  • 1
    @AndrewHenle - indeed, unless the kernel were to suspend all process threads while it updates the mapping, or if it were to create an entirely new mapping with the changes offline and then swap the page table pointer (e.g., CR3 on x86) to the new mapping, it's hard to see how it could be atomic, but I am ready to be surprised... – BeeOnRope Jan 21 '20 at 17:15
  • Re, two conflicting, unsynchronized mmap calls from two different threads, both attempting to map the same VM region. I certainly _hope_ that one of the two mmap calls would fail. But personally, I would not worry much about the precise details of _how_ it would fail, because I would never intentionally write a program that depended on that race being resolved in any particular way. – Solomon Slow Jan 21 '20 at 17:49
  • 1
    @SolomonSlow - that's not the scenario: the two `mmap` calls are from the same thread, only one thread ever calls `mmap` here. Clearly I would expect the `mmap` calls to appear atomic to the thread making the call (i.e., the `mmap` has fully taken effect from the POV of the code after the return), but the question is about a second thread reading from (or writing to) the region affected by the `mmap` call. I'll try to clarify the question. – BeeOnRope Jan 21 '20 at 17:55
  • 2
    I don't think it's legal for one thread to access a chunk of virtual address space while the mapping for that address space might be changing. As far as I know, no guarantees are made whatsoever and the operation could fault or even corrupt things. It's not only not atomic, it's permitted to unmap all the pages first and then start mapping the new ones in any order or otherwise operate in any way it wants to so long as it doesn't break pages not altered by the operation. – David Schwartz Jan 21 '20 at 18:10

2 Answers

1

mmap(2) is atomic with respect to the mappings across all threads; in part, at least, because munmap(2) also is. To break it down, the scenario described looks something like:

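MapRegion(from, to, obj) {
     // Serialize against other mappers and against faulting threads.
     Lock(&CurProc->map)
     // Tear down every existing mapping that overlaps the requested range.
     while MapIntersect(&CurProc->map, from, to, &range) {
            MapUnMap(&CurProc->map, range.from, range.to)
            MapObjectRemove(&CurProc->map, range.from, range.to)
     }
     // Only now install the new mapping.
     MapInsert(&CurProc->map, from, to, obj)
     UnLock(&CurProc->map)
}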

Following this, MapUnMap has to ensure that while it is removing the mappings, no thread can access them. Notice the Lock(&CurProc->map) taken in MapRegion.

MapUnMap(map, from, to) {
    foreach page in map.mmu[from .. to] {
         update page structure to invalidate mapping
    }
    foreach cpu in map.HasUsed {
         cause cpu to invoke tlb cache invalidation for (map, from, to)
    }
}

The first phase is to re-write the processor specific page tables to invalidate the area(s).

The second phase is to force every cpu that has ever loaded this map into its translation cache to invalidate that cache. This bit is highly architecture dependent. On an older x86, rewriting cr3 is typically enough, so the HasUsed is really CurrentlyUsing; whereas a newer amd64 might be able to cache multiple address space identifiers, so would be HasUsed. On an ARM, local tlb invalidation is broadcast to the local cluster; so HasUsed would refer to cluster ids rather than cpu ones. For more detail, search for "TLB shootdown", as this is colloquially known.

Once these two phases are complete, no thread can access this address range. Any attempt to do so will cause a fault, which will cause the faulting thread to Lock its mapping structure, which is already locked by the mapping thread, so it will wait until the mapping is complete. When the mapping is complete, all of the old mappings have been removed and replaced by new mappings, so there is no way to retrieve a previous mapping after this point.

What if another thread references the address range during the update? It will either continue with stale data or fault. In this respect stale data isn't an inconsistency; it is as if the data had been referenced just before the mapping thread entered mmap(2). The faulting case is the same as for the faulting thread above.
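
To make the ordering argument concrete, here is a toy single-threaded model in C. It is a sketch under stated assumptions, not kernel code: `struct model`, `model_read`, `model_set_all`, `model_shootdown`, and the two scenario functions are hypothetical names, and each scenario replays one chosen interleaving rather than exploring all of them. It compares overwriting valid page-table entries in place against the invalidate, shootdown, insert sequence described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_PAGES = 2 };
enum pte { INVALID, OLD, NEW };   /* INVALID in the TLB means "not cached" */

struct model {
    enum pte table[MODEL_PAGES];  /* page-table entries shared by all CPUs */
    enum pte tlb[MODEL_PAGES];    /* the reader CPU's cached translations */
};

/* One access by the reader CPU. A TLB hit wins even if it is stale; a miss
 * walks the page table. A result of INVALID stands for a page fault. */
enum pte model_read(struct model *m, size_t page) {
    assert(page < MODEL_PAGES);
    if (m->tlb[page] == INVALID)
        m->tlb[page] = m->table[page];
    return m->tlb[page];
}

void model_set_all(struct model *m, enum pte v) {
    for (size_t i = 0; i < MODEL_PAGES; i++) m->table[i] = v;
}

void model_shootdown(struct model *m) {
    for (size_t i = 0; i < MODEL_PAGES; i++) m->tlb[i] = INVALID;
}

/* Start state: old mapping installed, reader has only page 1 cached. */
static struct model model_start(void) {
    struct model m = { { OLD, OLD }, { INVALID, INVALID } };
    (void)model_read(&m, 1);
    return m;
}

/* True if some read of NEW is later followed by a read of OLD. */
static bool new_then_old(const enum pte *seen, size_t n) {
    bool seen_new = false;
    for (size_t i = 0; i < n; i++) {
        if (seen[i] == NEW) seen_new = true;
        else if (seen[i] == OLD && seen_new) return true;
    }
    return false;
}

/* Hypothetical broken protocol: overwrite valid entries, shoot down later. */
bool direct_replace_new_then_old(void) {
    struct model m = model_start();
    enum pte seen[2];
    model_set_all(&m, NEW);           /* PTEs rewritten, TLB not yet flushed */
    seen[0] = model_read(&m, 0);      /* TLB miss: walks the table, gets NEW */
    seen[1] = model_read(&m, 1);      /* stale TLB hit: gets OLD */
    model_shootdown(&m);
    return new_then_old(seen, 2);
}

/* Protocol from the answer: invalidate, shoot down, then insert. */
bool invalidate_first_new_then_old(void) {
    struct model m = model_start();
    enum pte seen[4];
    model_set_all(&m, INVALID);       /* phase 1: invalidate the PTEs */
    seen[0] = model_read(&m, 0);      /* fault; a real thread would block on the map lock */
    seen[1] = model_read(&m, 1);      /* stale TLB hit: OLD, as if read before mmap */
    model_shootdown(&m);              /* phase 2: flush the reader's TLB */
    model_set_all(&m, NEW);           /* MapInsert */
    seen[2] = model_read(&m, 0);      /* NEW */
    seen[3] = model_read(&m, 1);      /* NEW */
    return new_then_old(seen, 4);
}
```

In this model `direct_replace_new_then_old()` returns true and `invalidate_first_new_then_old()` returns false: the new mapping only becomes visible after every cached translation for the old one is gone, so the reader can see stale followed by new but not the reverse.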

In summary, an update to the mappings is implemented as a series of transactions that ensure a consistent view of the address space. The cost of these transactions is architecture specific. The code to implement this can be quite intricate, as it needs to guard against implicit operations, such as speculative fetching, as well as explicit ones.

mevets
  • I'm not sure how the `munmap` case addresses `mmap`, though. If we are replacing one valid page table entry with another, there is no case in which the thread would fault and thus no ability to take a lock. In between the page table update and the TLB shootdown, why can't we have a situation where the thread accesses one page, misses the TLB, walks the page table and gets the new mapping with the new data; then accesses another page which is still in its TLB and gets the old mapping with the old data? – Nate Eldredge Aug 09 '21 at 14:01
  • 1
    It's true that stale data in itself isn't an inconsistency, but new data followed by stale data is. – Nate Eldredge Aug 09 '21 at 14:02
  • It doesn't; it follows a strict transaction of invalidate; add. Any other behaviour is indisputably broken. New followed by stale doesn't happen; only stale followed by new. – mevets Aug 09 '21 at 14:29
  • 1
    I see, and with a shootdown after the invalidate. That makes sense. And an access while the pages are invalidated will trigger a fault, and the thread will block in the fault handler (waiting for the lock) until the new mapping is ready. – Nate Eldredge Aug 09 '21 at 14:51
-3

Memory-mapping occurs at the process level and is therefore instantaneously seen by all threads that are part of that same process.

Mike Robinson
  • Huh, how is that implemented in practice when the mapping for multiple pages need to be changed? Is the entire process paused by the kernel? – BeeOnRope Jan 21 '20 at 18:32
  • @BeeOnRope Probably, since it has to modify the process's page table. – Barmar Jan 21 '20 at 18:52
  • That seems like a giant scalability bottleneck in heavily threaded processes! – BeeOnRope Jan 21 '20 at 18:54
  • 2
    @BeeOnRope `mmap()` is not generally found in performance-critical inner loops. – Barmar Jan 21 '20 at 20:50
  • 3
    @Barmar, yes, it's not like anyone would base malloc on it, or something. – mevets Jan 22 '20 at 14:45
  • 1
    I don't think this answer is technical enough to be considered an answer. If you are able to expand more precisely what you mean by "process level" and how that manifests as the guarantees needed by the question, that would be great. Right now it looks awfully like hand-waving, much like people slapping `volatile` on things and declaring it thread safe. – GManNickG Mar 31 '20 at 17:51