There are many resources describing the NUMA architecture from a hardware perspective and the performance implications of writing NUMA-aware software, but I have not yet found any information on how the mapping between virtual pages and physical frames is decided with respect to NUMA.
More specifically, an application running on modern Linux still sees a single contiguous virtual address space.
How can the application tell which parts of the address space are mapped onto local memory and which are mapped onto the memory of another NUMA node?
If the answer is that the application cannot tell, how does the OS decide when to map a virtual page to the physical memory of another NUMA node rather than the local physical memory?
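For context, the closest thing I am aware of for inspecting placement after the fact is `/proc/<pid>/numa_maps` (described in the `numa(7)` man page), where each line describes one mapped range and carries `N<node>=<pages>` fields. Below is a minimal sketch of how those fields could be tallied. The function names are my own and purely illustrative, the sample line in the usage note is made up, and the file only exists on kernels built with NUMA support. It shows where pages currently reside, not how the kernel made that decision.

```python
import re

# Matches per-node page counts such as "N0=4" or "N1=2".
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def parse_numa_maps_line(line):
    """Return (start_address, policy, {node: pages}) for one numa_maps line.

    Returns None for lines that are too short to hold an address and a policy.
    """
    fields = line.split()
    if len(fields) < 2:
        return None
    start = int(fields[0], 16)   # start of the mapped range, in hex
    policy = fields[1]           # memory policy string, e.g. "default"
    pages = {}
    for field in fields[2:]:
        match = NODE_FIELD.match(field)
        if match:
            pages[int(match.group(1))] = int(match.group(2))
    return start, policy, pages


def pages_per_node(text):
    """Sum the N<node>=<pages> counts over all lines of numa_maps text."""
    totals = {}
    for line in text.splitlines():
        parsed = parse_numa_maps_line(line)
        if parsed is None:
            continue
        for node, count in parsed[2].items():
            totals[node] = totals.get(node, 0) + count
    return totals


def current_process_pages_per_node(path="/proc/self/numa_maps"):
    """Tally pages per node for this process, or None if the file is unavailable."""
    try:
        with open(path) as handle:
            return pages_per_node(handle.read())
    except OSError:
        return None
```

As a usage example, `parse_numa_maps_line("00400000 default file=/bin/cat mapped=6 N0=4 N1=2 kernelpagesize_kB=4")` yields the start address `0x400000`, the policy `"default"`, and the mapping `{0: 4, 1: 2}`, meaning four pages of that range sit on node 0 and two on node 1.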