
I understand how programs in machine code can load values from memory into registers, perform jumps, or store values from registers to memory, but I don't understand how this works for multiple processes. A process is allocated memory on the fly, so must it use relative addressing? Is this done automatically (meaning there are assembly instructions that perform relative jumps, etc.), or does the program have to "manually" add the correct offset to every memory position it addresses?

I have another question regarding multitasking that is somewhat related. How does the OS, which isn't running, stop a thread and move on to the next? Is this done with timed interrupts? If so, then how can the values in registers be preserved for a thread? Are they saved to memory before control is given to a different thread? Or, rather than timed interrupts, does the thread simply choose a good time to give up control? In the case of timed interrupts, what happens if a thread is given processor time and it doesn't need it? Does it have to waste it, can it call the interrupt manually, or does it alert the OS that it doesn't need much time?

Edit: Or are executables edited before being run to compensate for the correct offsets?

Void Star
  • Uh, I'm not 100% sure of my answer. I'm not sure I'm up-to-date with the latest/greatest assembly/processors/compilers. I'd let my answer sit for a bit so that other time zones can get a chance to answer. :-) – Gray Apr 17 '12 at 05:17
  • Okay, I unchecked it. I'll come back in 24 hours to see the results. Thanks for your help :) I'm thinking about designing (not necessarily building) my own, simple CPU. – Void Star Apr 17 '12 at 05:25

3 Answers


That's not how it works. All modern operating systems virtualize the available memory, giving every process the illusion that it has 2 gigabytes of memory (or more) and doesn't have to share it with anybody. The key component in a machine that does this is the MMU, nowadays built into the processor itself. Another core feature of this virtualization is that it isolates processes: one misbehaving process cannot bring another one down with it.

Yes, a clock tick interrupt is used to interrupt the currently running code. Processor state is simply saved on the stack. The operating system scheduler then checks whether any other thread is ready to run and has a high enough priority to get first in line. Some extra code ensures that everybody gets a fair share. Then it is just a matter of reprogramming the MMU and resuming execution on the other thread. If no thread is ready to run, the CPU is halted with the HLT instruction, to be woken again by the next clock interrupt.

This is the ten-thousand-foot view; it is well covered in any book about operating system design.

Hans Passant
  • But which stack is the processor state saved to? Each process will be using its own stack frame. Does the state get saved to the operating system's stack, or does it go into the stack of the process that was previously running? Is every process automatically given 2GB+? Can a process ask for more memory? In C, what does malloc do? Does the MMU keep track of what pages of memory for programs were stored where or is it the job of the OS to tell the MMU to move pages around? – Void Star Apr 18 '12 at 00:10
  • Erm, what was the part in the middle? Click the "Ask Question" button. – Hans Passant Apr 18 '12 at 00:19
  • I'm confused... This is on the same topic, I shouldn't ask a NEW question for clarification. Basically I am still confused over these questions: Do programs have the illusion that they have 2GB of memory that they COULD ask for, or if they are given 2GB and can later ask for more in addition to that. Which stack frame is the processor state saved to, the stack frame for the OS or the stack frame for the process most recently being run? Does the MMU just swap pages and map memory while the OS does all the heavy lifting or does the MMU also keep track of where pages were swapped to? – Void Star Apr 18 '12 at 01:07
  • Okay, I guess I'll put this in a new question soon. You provided more clear information but Gray addressed the majority of my queries. I'm checking his post for the answer but anybody learning from this discussion should definitely read both answers. Thank you – Void Star Apr 19 '12 at 05:08

A process is allocated memory on the fly, so must it use relative addressing?

No, it can use relative or absolute addressing depending on what it is trying to address.

At least historically, the various addressing modes were more about local versus remote memory. Relative addressing was for memory addresses close to the current address, while absolute addressing was more expensive but could address anything. With modern virtual memory systems, these distinctions may no longer be necessary.

A process is allocated memory on the fly, so must it use relative addressing? Is this done automatically (meaning there are assembly instructions that perform relative jumps, etc.), or does the program have to "manually" add the correct offset to every memory position it addresses.

I'm not sure about this one. This is normally taken care of by the compiler. Again, modern virtual memory systems make this complexity unnecessary.

Are they saved to memory before control is given to a different thread?

Yes. Typically all of the state (registers, etc.) is stored in a process control block (PCB), a new context is loaded, the registers and other context are loaded from the new PCB, and execution begins in the new context. The PCB can be stored on the stack or in kernel memory, or the kernel can utilize processor-specific operations to optimize this process.
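A rough sketch of what a PCB might hold and how a switch shuffles state through it. The field names here are invented for illustration; real PCBs (e.g. Linux's `task_struct`) are far larger:

```c
/* Toy process control block: the scheduler saves the "CPU" state of
 * the outgoing thread into its PCB and loads the incoming one. */
#include <stdint.h>

typedef struct {
    uint64_t regs[16];         /* general-purpose registers */
    uint64_t pc;               /* program counter */
    uint64_t sp;               /* stack pointer */
    uint64_t page_table_root;  /* root of this process's address space */
} pcb_t;

void context_switch(pcb_t *cpu, pcb_t *outgoing, const pcb_t *incoming) {
    *outgoing = *cpu;          /* save the preempted thread's state */
    *cpu = *incoming;          /* restore the next thread's saved state */
}
```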

Or, rather than timed interrupts, does the thread simply choose a good time to give up control.

The thread can yield control -- put itself back at the end of the run queue. It can also wait for some IO or sleep. Thread libraries then put the thread in wait queues and switch to another context. When the IO is ready or the sleep expires, the thread is put back into the run queue. The same happens with mutex locks. It waits for the lock in a wait queue. Once the lock is available, the thread is put back into the run queue.

In the case of timed interrupts, what happens if a thread is given processor time and it doesn't need it. Does it have to waste it, can it call the interrupt manually, or does it alert the OS that it doesn't need much time?

Either the thread can run (perform CPU instructions) or it is waiting -- either on IO or a sleep. It can ask to yield but typically it is doing so by [again] sleeping or waiting on IO.

Gray
  • Oh cool, this answers my question very nicely. I hadn't thought about pushing the register values to the stack, that's handy to know about. I think I was mixing up threads and processes a little bit, but you clarified that for me as well. – Void Star Apr 17 '12 at 05:17
  • People learning from this should also read the next answer as it addresses some aspects of my questions more clearly. – Void Star Apr 19 '12 at 05:09
  • hi @Gray + 1 , when a thread reads a global variable it stores it in a register and hence it will always be worth the same for the life of the thread, right? - https://stackoverflow.com/questions/75390627/does-the-context-switch-retain-keep-state-of-variables-the-value-of-the-variab#comment133029662_75390627 –  Feb 09 '23 at 01:14
  • @Gray I asked a question about this, could you explain to me how global variables are saved in a thread? THANKS IN ADVANCE –  Feb 09 '23 at 01:15
  • 1
    Not a register. CPU memory cache. It might never change for the life of the thread _if_ the thread never crosses a memory barrier and never gets repurposed by the OS causing it's memory to be refreshed. Both of these are unlikely @Coder23. – Gray Feb 09 '23 at 19:44
  • thanks @Gray +1, so does it mean that a thread can see a "fresh" global variable every time **it reads it** inside the thread? –  Feb 09 '23 at 20:09
  • 1
    No. I needs to cross a memory barrier to force the updating of the local cache variables to be sure of the most up-to-date value. This depends on what language you are using. Maybe ask a question about this? – Gray Feb 09 '23 at 20:10
  • great @Gray, **I already did it yesterday**, I already have answers, but I need the answer from someone top (expert) like you, this is the question https://stackoverflow.com/questions/75390627/does-the-context-switch-retain-keep-state-of-variables-the-value-of-the-variab#comment133029662_75390627 –  Feb 09 '23 at 20:13
  • THANKS IN ADVANCE @Gray I would like to see your response/comment there :) –  Feb 09 '23 at 20:13
  • Sorry @Coder23. I know more about Java than general cache coherency. Thought this was a Java question. Whether or not the values get written to main memory depends on whether the memory is marked as being write-back and other details. I wouldn't consider myself to be an expert in this area. – Gray Feb 09 '23 at 20:28
  • thanks @Gray +1, you mean that for a update made by a thread on a global (shared) variable to be visible in other threads there has to be a memory barrier, e.g. a mutex, right? –  Feb 10 '23 at 13:25

I probably walked into this question quite late, but then, it may be of use to some other programmers. First - the theory.

The modern-day operating system virtualizes memory. To do so, it maintains, within its system memory area, a series of page-table entries. Each page is of a fixed size (usually 4 KB), and when any program asks for memory, the addresses it is allocated are virtual addresses translated through these page-table entries. This approximates the behaviour of the "segment" registers in the prior generation of processors.
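A toy, single-level version of that translation (real x86-64 tables are multi-level and owned by the kernel; this is only a model of the arithmetic):

```c
/* Toy single-level page table with 4 KiB pages: a virtual address is
 * split into a page number (indexes the table) and an offset. */
#include <stdint.h>

#define PAGE_SIZE 4096u

/* page_table[vpn] holds the physical frame number for virtual page vpn */
uint32_t translate(const uint32_t *page_table, uint32_t vaddr) {
    uint32_t vpn    = vaddr / PAGE_SIZE;   /* which virtual page */
    uint32_t offset = vaddr % PAGE_SIZE;   /* position inside the page */
    return page_table[vpn] * PAGE_SIZE + offset;
}
```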

Now, when the scheduler decides to run another process, it may or may not keep the previous process in memory. If it keeps it in memory, then all the scheduler does is save the entire register snapshot (now including the YMM registers; this was a complex issue earlier, as there was no single instruction that saved the entire context: read up on XSAVE), which has a fixed format (available in the Intel SW manual). This is stored in the memory space of the scheduler itself, along with information on the memory pages that were in use.

If, however, the scheduler needs to "dump" to the hard disk the context of the process about to go to sleep (this usually arises when the process that is waking up needs an extraordinary amount of memory), then it writes the memory pages out to disk blocks in a reserved area of disk called the pagefile (also the source of the old grandmother's wisdom that the pagefile must equal the size of real memory), and it preserves the page addresses as offsets into the pagefile. When the process wakes up, the scheduler reads the offsets from the pagefile, allocates real memory, populates the page-table entries, and then loads the contents back from the disk blocks.

Now, to answer your specific questions:

  1. Do you need to use only relative addressing, or can you use absolute?

Ans. You may use either; whatever you perceive as absolute is also relative, since the page-table entry relativizes that address invisibly. There is no truly absolute memory address anywhere (including I/O device memory) except within the kernel of the operating system itself. To test this, you may disassemble any .EXE program and see that the entry point is always CALL 0010, which implies that each thread gets a different "0010" to start its execution.

  2. How do threads get their time slices, and what happens if a thread surrenders the unused part of a slice?

Ans. Threads usually get a time slice (20 ms is the usual standard on modern systems, though this is sometimes changed in special-purpose builds for servers that do not have many hardware interrupts to deal with), in order of their position in the run queue. A thread usually surrenders its slice by calling sleep(), which is the formal (and very nice) way to surrender the balance of your time slice. Most libraries implementing asynchronous reads or interrupt actions call sleep() internally, but in many instances top-level programs also call sleep(), e.g. to create a time gap. An invocation of sleep will certainly change the process context; the CPU is not given the liberty to idle by spinning on NOP.

The other method is to wait for an IO operation to complete, and this is handled differently. The program, on asking for an IO operation, will cede its time slice, and the process scheduler flags the thread as being in a "WAITING FOR IO" state; the thread will not be given a time slice until its intended IO is completed or timed out. This feature helps programmers, as they do not have to explicitly write a sleep_until_IO() kind of interface.

Trust this sets you going further in your explorations.

quasar66