Confusion of virtual memory

Question

Consider a sample below.

char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';

The 4 KB of memory is allocated by calling malloc(). As I understand it, the OS handles the memory request from the user-space program. First, the OS requests a memory allocation from RAM, and RAM gives a physical memory address to the OS. Once the OS receives the physical address, it maps the physical address to a virtual address and returns that virtual address, which is the value of p, to the user program.

I wrote some values ('a' and 'b') to a virtual address, and they really are written to main memory (RAM). I'm confused: I wrote the values to a virtual address, not a physical address, yet they end up in main memory even though I never dealt with physical addresses.

What happens behind the scenes? What does the OS do for me? I couldn't find relevant material in the books I checked (OS, system programming). Could you give some explanation? (Please leave out caches to keep it simple.)

The subject is vast, there is a lot going on under the hood when you call `malloc`, and the details depend entirely on the platform. For the moment you don't need to worry: just take it for granted that the memory returned by `malloc` is "yours", that it is somewhere in the RAM of your computer, and that you're free to write anything you want to this memory and to read from it. — Jabberwocky, Feb 01 '21 at 16:49
Maybe this helps: https://stackoverflow.com/questions/29247670/how-does-malloc-work-in-details — Jabberwocky, Feb 01 '21 at 16:50

score 1 · Answer 1 · answered Feb 01 '21 at 17:01

You have to understand that virtual memory is virtual: a process's virtual address space can be larger than the physical RAM, so virtual addresses are mapped onto physical ones rather than being identical to them. In the end, though, the data lives in the same physical storage.

Your programs use virtual memory addresses, and it is the OS that decides which pages are kept in RAM. If RAM fills up, the OS uses some space on the hard drive (swap) to keep working.

But the hard drive is slower than RAM. That's why your OS uses a page-replacement algorithm (typically some approximation of least-recently-used, LRU) to exchange pages of memory between the hard drive and RAM, depending on the work being done, trying to keep the data that is most likely to be used in fast memory. To swap pages back and forth, the OS does not need to modify virtual memory addresses.

This is a summary that overlooks a lot of details.

score 1 · Answer 2 · answered Feb 01 '21 at 17:04

You want to understand how virtual memory works. There are lots of online resources about this. Here's one I found that seems to do a fair job of explaining it without going too deep into technical details, while not glossing over important terms.

https://searchstorage.techtarget.com/definition/virtual-memory

score 1 · Answer 3 · answered Feb 01 '21 at 18:03

For Linux on 32-bit x86 platforms, the assembly-level way of asking the kernel for memory is basically a system call made with int 0x80, with the call number and its parameters placed in registers. The OS sets up the handler for this interrupt at boot by installing an entry in the IDT (Interrupt Descriptor Table).

An IDT descriptor for 32-bit systems looks like:

#include <stdint.h>

// One 8-byte gate descriptor in the 32-bit IDT.
struct IDTDescr {
   uint16_t offset_1; // handler address, bits 0..15
   uint16_t selector; // a code segment selector in GDT or LDT
   uint8_t zero;      // unused, set to 0
   uint8_t type_attr; // gate type, privilege level (DPL) and present bit
   uint16_t offset_2; // handler address, bits 16..31
};

The offset is the address of the entry point of the handler for that interrupt. So interrupt 0x80 has an entry in the IDT, and this entry points to the address of the handler (also called an ISR). Note that malloc() itself is not a system call: it is a C library function that hands out memory from a heap it manages in user space, and it only makes a system call (on Linux, brk or mmap) when it needs more memory from the kernel. The system call returns its result, such as the address of the newly mapped memory, in a register. On modern systems the library typically enters the kernel with the sysenter instruction (syscall on x86-64) rather than int 0x80. That instruction works together with MSRs (Model Specific Registers) that hold the kernel entry address, so user mode can only jump into kernel mode at the address the OS specified.

Once in kernel mode, all instructions can be executed and access to all hardware is unlocked. To satisfy the request, the OS doesn't "ask RAM for memory". RAM isn't aware of what memory the OS uses; it just blindly responds to the signals asserted on its DIMM pins and stores information. At boot, the OS uses firmware-provided information (such as the memory map and the ACPI tables built by the BIOS) to determine how much RAM there is and which devices are connected to the computer, so that it avoids treating MMIO (memory-mapped I/O) regions as RAM. Once the OS knows how much RAM is available (and which parts are usable), it uses its own algorithms to decide which parts of the available RAM each process should get.

When you compile C code, the compiler and linker determine the virtual addresses of code and static data at build time. When you launch the executable, the OS reads those layout requirements from the executable file and sets up the page tables for the process accordingly. When you ask for memory dynamically using malloc() and the allocator needs more memory from the kernel, the OS decides which physical memory your process should get and updates the page tables at run time (many systems do this lazily, only when a page is first touched).

As to paging itself, you can always read some articles. The short version is 32-bit paging. In 32-bit paging, each CPU core has a CR3 register. This register contains the physical address of the base of the Page Global Directory (PGD). The PGD contains the physical base addresses of several page tables, which themselves contain the physical base addresses of physical pages (https://wiki.osdev.org/Paging). A virtual address is split into 3 parts. The 12 least significant bits are the offset within the physical page. The 10 bits in the middle are the index into the page table, and the 10 most significant bits are the index into the PGD.
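
The 10/10/12 split described above can be sketched in a few lines of C. This is only an illustration for classic 32-bit x86 paging with 4 KiB pages; the struct va_parts type and the split_virtual_address() helper are names made up for this example, not part of any kernel API.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address under two-level paging
   with 4 KiB pages (10 + 10 + 12 bits). Hypothetical type for this sketch. */
struct va_parts {
    uint32_t pgd_index;   /* bits 31..22: which entry of the page directory */
    uint32_t pt_index;    /* bits 21..12: which entry of the page table */
    uint32_t page_offset; /* bits 11..0:  byte offset inside the page */
};

/* Hypothetical helper: decompose a virtual address into its three fields. */
struct va_parts split_virtual_address(uint32_t va)
{
    struct va_parts parts;
    parts.pgd_index   = (va >> 22) & 0x3FFu; /* top 10 bits */
    parts.pt_index    = (va >> 12) & 0x3FFu; /* middle 10 bits */
    parts.page_offset = va & 0xFFFu;         /* low 12 bits */
    return parts;
}
```

For instance, the address 0x08049ABC splits into PGD index 32, page table index 73 and page offset 0xABC. The hardware performs this split on every memory access; the program never sees it.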

So when you write

char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';

you declare a pointer of type char* and call malloc() to ask for 4096 bytes of memory. malloc() returns the first address of that chunk in a conventional register, determined by the platform's calling convention (the ABI). Keep in mind that the C language is just a specification; it is up to the compiler and C library written for a given OS to implement it, which is why they know which register holds a return value and how to make a system call on that OS. The compiled code thus takes the address from that register and stores it into p at run time. The second line stores 'a' in the char at the first address, and the third line stores 'b' in the second char. You could equivalently write:

char* p = (char*)malloc(4096);
*p = 'a';
*(p + 1) = 'b';

The p is a variable containing an address. The + operator on a pointer advances this address by the size of the type the pointer points to. In this case the pointer points to a char, so + 1 advances it by one char (one byte). If it pointed to an int, it would advance by sizeof(int) bytes (4 on most current platforms). The size of the pointer itself depends on the system: it is typically 32 bits wide on a 32-bit system and 64 bits wide on a 64-bit system, because it holds an address. A statically sized equivalent of what you did is

char p[4096];
p[0] = 'a';
p[1] = 'b';

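Now the compiler knows at compile time how much memory this array needs, so no call to malloc() is involved (the array gets static or stack storage, depending on where it is declared). Here p is an array, not a pointer, but in most expressions it is converted to a pointer to its first char. It means you could write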

char p[4096];
*p = 'a';
*(p + 1) = 'b';

It would have the same result.
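
The pointer arithmetic described above can be checked with a small sketch. The function names (char_step, int_step, index_and_deref_agree) are invented for this example; the only library calls are the standard malloc() and free().

```c
#include <stddef.h>
#include <stdlib.h>

/* Number of bytes the address moves when 1 is added to a char pointer. */
size_t char_step(void)
{
    char buf[2] = {0};
    char *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Same measurement for an int pointer: the step is sizeof(int) bytes. */
size_t int_step(void)
{
    int buf[2] = {0};
    int *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Writes through p[i] and *(p + i) and reads back through the other form.
   Returns 1 if both notations touch the same bytes, -1 if malloc failed. */
int index_and_deref_agree(void)
{
    char *p = malloc(4096);
    if (p == NULL)
        return -1;
    p[0] = 'a';
    *(p + 1) = 'b';
    int same = (*p == 'a') && (p[1] == 'b');
    free(p);
    return same;
}
```

char_step() always returns 1, while int_step() returns sizeof(int), which is 4 on most current platforms but is not fixed by the C standard.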

score 1 · Answer 4 · answered Feb 01 '21 at 18:39

First, OS requests memory allocation to RAM,…

The OS does not have to request memory. It has access to all of memory the moment it boots. It keeps its own database of which parts of that memory are in use for what purposes. When it wants to provide memory for a user process, it uses its own database to find some memory that is available (or does things to stop using memory for other purposes and then make it available). Once it chooses the memory to use, it updates its database to record that it is in use.
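
As a rough illustration of such a "database", here is a toy physical frame allocator that tracks 64 frames of 4 KiB each in a bitmap. Real kernels use far more elaborate structures; every name here (PHYS_FRAMES, alloc_frame, free_frame, and so on) is an assumption made for this sketch.

```c
#include <stdint.h>

#define PHYS_FRAMES     64u        /* toy machine: 64 frames = 256 KiB of RAM */
#define PHYS_FRAME_SIZE 4096u
#define NO_FRAME        UINT32_MAX /* returned when every frame is in use */

/* The "database": bit i is set when frame i has been handed out. */
static uint64_t frame_in_use;

/* Pick the first free frame, record it as used, return its physical address.
   Note that nothing is "requested" from the RAM itself. */
uint32_t alloc_frame(void)
{
    for (uint32_t i = 0; i < PHYS_FRAMES; i++) {
        if ((frame_in_use & (UINT64_C(1) << i)) == 0) {
            frame_in_use |= UINT64_C(1) << i;
            return i * PHYS_FRAME_SIZE;
        }
    }
    return NO_FRAME;
}

/* Mark the frame containing phys_addr as available again. */
void free_frame(uint32_t phys_addr)
{
    uint32_t i = phys_addr / PHYS_FRAME_SIZE;
    if (i < PHYS_FRAMES)
        frame_in_use &= ~(UINT64_C(1) << i);
}
```

Allocation and release are pure bookkeeping inside the OS: the RAM is never consulted, only the bitmap changes.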

… then RAM gives physical memory address to OS.

RAM does not give addresses to the OS except that, when starting, the OS may have to interrogate the hardware to see what physical memory is available in the system.

Once OS receives physical address, OS maps the physical address to virtual address…

Virtual memory mapping is usually described as mapping virtual addresses to physical addresses. The OS has a database of the virtual memory addresses in the user process, and it has a database of physical memory. When it is fulfilling a request from the process to provide virtual memory and it decides to back that virtual memory with physical memory, the OS informs the hardware of the mapping it chose. How this is done depends on the hardware, but a typical method is that the OS updates some page table entries that describe which virtual addresses get translated to which physical addresses.

I wrote some value(a and b) in virtual address and they are really written into main memory(RAM).

When your process writes to virtual memory that is mapped to physical memory, the processor will take the virtual memory address, look up the mapping information in the page table entries or other database, and replace the virtual memory address with a physical memory address. Then it will write the data to that physical memory.
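
The translation step can be imitated in software with a toy single-level page table. This is only a model of what the processor does in hardware, under the assumption of 4 KiB pages and a 16-page address space; toy_map(), toy_translate() and the other names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u        /* 4 KiB pages */
#define TOY_PAGE_MASK  0xFFFu
#define TOY_NUM_PAGES  16u        /* tiny virtual address space: 64 KiB */
#define TOY_FAULT      UINT32_MAX /* stands in for a page fault */

/* One page table entry: is the page mapped, and to which physical frame? */
struct toy_pte {
    bool present;
    uint32_t frame;
};

static struct toy_pte toy_page_table[TOY_NUM_PAGES];

/* What the OS does: record that virtual page vpage is backed by frame. */
void toy_map(uint32_t vpage, uint32_t frame)
{
    if (vpage < TOY_NUM_PAGES) {
        toy_page_table[vpage].present = true;
        toy_page_table[vpage].frame = frame;
    }
}

/* What the hardware does on every access: swap the virtual page number
   for the frame number and keep the offset within the page unchanged. */
uint32_t toy_translate(uint32_t va)
{
    uint32_t vpage = va >> TOY_PAGE_SHIFT;
    if (vpage >= TOY_NUM_PAGES || !toy_page_table[vpage].present)
        return TOY_FAULT;
    return (toy_page_table[vpage].frame << TOY_PAGE_SHIFT)
         | (va & TOY_PAGE_MASK);
}
```

After toy_map(2, 7), a write to virtual address 0x2001 would land at physical address 0x7001, while an access to an unmapped page such as 0x3000 yields TOY_FAULT, the point at which a real processor would raise a page fault for the OS to handle.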

Thanks for the kind explanation. The OS initializes hardware components (the hardware abstraction layer), then gets the base address and available size of physical memory. After that is done in the kernel, my understanding was that the OS sends requests related to physical memory management to a physical memory manager (RAM), so the actual available physical memory is chosen by RAM firmware. But your answer seems to say that the OS manages it fully by itself after hardware initialization. — progr, Feb 02 '21 at 04:03

Support Ukraine · Accepted Answer · 2021-02-01T20:21:12.563

A detailed answer to your question will be very long - and too long to fit here at StackOverflow.

Here is a very simplified answer to a little part of your question.

You write:

I'm confusing that I wrote some value in virtual address, not physical address, but it is really written to main memory

Seems you have a very fundamental misunderstanding here.

There is no memory directly "behind" a virtual address. Whenever you access a virtual address in your program, it is automatically translated to a physical address and the physical address is then used for access in main memory.

The translation happens in HW, i.e. inside the processor in a block called "MMU - Memory management unit" (see https://en.wikipedia.org/wiki/Memory_management_unit).

The MMU holds a small but very fast look-up table that tells how a virtual address is to be translated into a physical address. The OS configures this table but after that, the translation happens without any SW being involved and - just to repeat - it happens whenever you access a virtual memory address.

The MMU also takes some kind of process ID as input in order to do the translation. This is needed because two different processes may use the same virtual address, which must then be translated to two different physical addresses.

As mentioned above, the MMU look-up table (the TLB) is small, so the MMU can't hold all translations for a complete system. When the MMU can't do a translation, it can raise an exception of some kind so that OS software is triggered. The OS will then re-program the MMU so that the missing translation gets into the MMU and the process can continue executing. Note: Some processors can do this in HW, i.e. without involving the OS.
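
The interplay between the small TLB and the full translation table can be modeled roughly as follows. This is a software toy, not how any particular MMU is built: it assumes a 4-entry direct-mapped TLB, two address spaces, 4 KiB pages, and a frame number of 0 meaning "unmapped". All names (sim_map, sim_access, sim_tlb_misses) are made up for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_TLB_ENTRIES 4u         /* the TLB is deliberately tiny */
#define SIM_ASIDS       2u         /* two processes / address spaces */
#define SIM_PAGES       64u
#define SIM_FAULT       UINT32_MAX

/* One cached translation, tagged with the address-space (process) ID. */
struct sim_tlb_entry {
    bool valid;
    uint32_t asid;
    uint32_t vpage;
    uint32_t frame;
};

static struct sim_tlb_entry sim_tlb[SIM_TLB_ENTRIES];
/* Full translation tables kept in "RAM"; frame 0 means unmapped here. */
static uint32_t sim_frames[SIM_ASIDS][SIM_PAGES];
static unsigned sim_miss_count;

/* OS side: record a mapping in the full table of one process. */
void sim_map(uint32_t asid, uint32_t vpage, uint32_t frame)
{
    if (asid < SIM_ASIDS && vpage < SIM_PAGES)
        sim_frames[asid][vpage] = frame;
}

/* Translate va for process asid, refilling the TLB on a miss. */
uint32_t sim_access(uint32_t asid, uint32_t va)
{
    uint32_t vpage = va >> 12;
    if (asid >= SIM_ASIDS || vpage >= SIM_PAGES)
        return SIM_FAULT;

    struct sim_tlb_entry *e = &sim_tlb[vpage % SIM_TLB_ENTRIES];
    if (!(e->valid && e->asid == asid && e->vpage == vpage)) {
        /* TLB miss: consult the full table, as the OS (or a hardware
           page walker) would, then load the translation into the TLB. */
        uint32_t frame = sim_frames[asid][vpage];
        if (frame == 0)
            return SIM_FAULT;
        sim_miss_count++;
        e->valid = true;
        e->asid = asid;
        e->vpage = vpage;
        e->frame = frame;
    }
    return (e->frame << 12) | (va & 0xFFFu);
}

/* Number of successful TLB refills so far. */
unsigned sim_tlb_misses(void)
{
    return sim_miss_count;
}
```

With sim_map(0, 1, 5) and sim_map(1, 1, 9), the same virtual address 0x1010 translates to 0x5010 for process 0 and to 0x9010 for process 1. A repeated access by the same process hits the TLB and does not increase the miss count.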

Thanks for the kind explanation. I have one more question, if that's OK. Are both the page directories and the page tables in RAM, or does the MMU have its own physical memory for them? Also, are both managed by the MMU (hardware) rather than by the OS? If so, does the OS just call the MMU for memory management? — progr, Feb 02 '21 at 04:13
@progr As already mentioned, this topic is too huge to be answered here in detail. So this is rather simplified (i.e. not correct for all systems/details) but perhaps it will give you a "picture". When a system starts, the OS owns all RAM. When a user process is started, the OS assigns a part of the RAM to the process. The OS keeps a big translation table in a part of the RAM. This table holds all translations between virtual and physical addresses. The OS also programs a (small) subset of this table into the MMU (aka TLB), i.e. into dedicated memory in the MMU (e.g. a CAM). Once that is ... — Support Ukraine, Feb 02 '21 at 07:02
... done, the MMU can do the translations (for known addresses) automatically. For unknown addresses the MMU generates an exception (aka interrupt) which invokes the OS, so that the OS can bring the missing translation into the MMU memory. The OS does not "call for memory management": the OS controls the MMU. It's rather the other way around, i.e. when the MMU can't do a translation, the MMU calls the OS for help using an exception/interrupt. And just to be clear, the MMU does not execute any code. It's "simply" some HW that can do translations based on rules programmed into it by the OS. — Support Ukraine, Feb 02 '21 at 07:08
Thanks a lot! @4386427. It would be sufficient. Thanks again! — progr, Feb 02 '21 at 11:34
