Unexpected behaviour in simple pointer arithmetic in kernel-space C code

Question

I am trying to follow this and this tutorial as a learning exercise, and I have run into some weird behavior that I still don't understand.

I have the following piece of C code:

void print(const char* str, char* vidptr, TerminalColor color){
    for (unsigned i = 0; str[i] != '\0'; ++i){
        unsigned int j = i*2;
        vidptr[j] = str[i];
        vidptr[j+1] = color;
    }
    return;
}

When I call it with the following parameters:

const char* str = "My string";
char* vidptr = (char*) 0xb8000;
print(str, vidptr, GREY); // GREY is equal to 7

memory looks like this at the end of the first iteration of the for loop body:

0xb8000: 0x07  0x00  0x00  0x00  0x00  0x00  0x00  0x00

when I expected:

0xb8000: 0x4D  0x07  0x00  0x00  0x00  0x00  0x00  0x00

Running the program under gdb, connected to the qemu-system-x86_64 gdbserver, I noticed that everything runs fine up to vidptr[j] = str[i]; then the execution of vidptr[j+1] = color; overwrites the first byte, as if vidptr[j] were the same location as vidptr[j+1]. The disassembly (as shown by gdb in split layout) is the following:

; vidptr[j] = str[i]
0x100144 <print+35>     mov    edx,DWORD PTR [rbp-0x8]
0x100147 <print+38>     mov    rax,QWORD PTR [rbp-0x18]
0x10014b <print+42>     add    rax,rdx
0x10014e <print+45>     mov    ecx,DWORD PTR [rbp-0x4]
0x100151 <print+48>     mov    rdx,QWORD PTR [rbp-0x20]
0x100155 <print+52>     add    rdx,rcx
0x100158 <print+55>     movzx  eax,BYTE PTR [rax]
0x10015b <print+58>     mov    BYTE PTR [rdx],al
; vidptr[j+1] = color
0x10015d <print+60>     mov    eax,DWORD PTR [rbp-0x4]
0x100160 <print+63>     add    eax,0x1
0x100163 <print+66>     mov    edx,eax  ;eax = 0x1
0x100165 <print+68>     mov    rax,QWORD PTR [rbp-0x20] ;rax = 0xb8000
0x100169 <print+72>     add    rax,rdx ;here I observe the following: after I give the ni command to gdb the pc goes to the 
                                       ;equivalent instruction `add eax,edx` but I oddly see the value of rax going to 0xb7fff. 
                                       ;Then I send another ni command to gdb which shows again the instruction pointer to the 
                                       ;next instruction (print+75).
0x10016c <print+75>     mov    edx,DWORD PTR [rbp-0x24] ; here the value of rax is again 0xb8000 instead of 0xb8001
0x10016f <print+78>     mov    BYTE PTR [rax],dl ; here the value written in the previous instruction is overwritten
0x100171 <print+80>     add    DWORD PTR [rbp-0x8],0x1
0x100175 <print+84>     mov    edx,DWORD PTR [rbp-0x8]
0x100178 <print+87>     mov    rax,QWORD PTR [rbp-0x18]
0x10017c <print+91>     add    rax,rdx
0x10017f <print+94>     movzx  eax,BYTE PTR [rax]
0x100182 <print+97>     test   al,al
0x100184 <print+99>     jne    0x100139 <print+24>

Any clue on why this is happening?

For reference, here is the output of gdb:

1: x/i $pc
=> 0x100166 <print+69>: add    rax,rdx
2:x/8xb 0xb8000
0xb8000:        0x71    0x00    0x20    0x00    0x20    0x00    0x20    0x00
4: i = 0
5: j = 0
6: /x $rax = 0xb8000
(gdb) ni
1: x/i $pc
=> 0x100167 <print+70>: add    eax,edx
2:x/8xb 0xb8000
0xb8000:        0x71    0x00    0x20    0x00    0x20    0x00    0x20    0x00
6: /x $rax = 0xb7fff
(gdb) p/x $eax
$3 = 0xb7fff
(gdb) p/x $edx
$4 = 0x1
(gdb) ni
1: x/i $pc
=> 0x100169 <print+72>: mov    edx,DWORD PTR [rbp-0x24]
2:x/8xb 0xb8000
0xb8000:        0x71    0x00    0x20    0x00    0x20    0x00    0x20    0x00
6: /x $rax = 0xb8000
(gdb) ni
1: x/i $pc
=> 0x10016c <print+75>: mov    BYTE PTR [rax],dl
2:x/8xb 0xb8000
0xb8000:        0x71    0x00    0x20    0x00    0x20    0x00    0x20    0x00
6: /x $rax = 0xb8000
(gdb) ni
1: x/i $pc
=> 0x10016e <print+77>: add    DWORD PTR [rbp-0x8],0x1
2:x/8xb 0xb8000
0xb8000:        0x07    0x00    0x20    0x00    0x20    0x00    0x20    0x00
6: /x $rax = 0xb8000

This might seem like a very stupid question, but are you sure you are in 64-bit long mode? You'd see things like this if you were running 64-bit long mode code in 32-bit protected mode. — Michael Petch, Mar 24 '18 at 22:44
It is not a stupid question as this is exactly the situation, I am still in protected mode, but I don't understand why such a thing should happen anyways. — fsalvio, Mar 24 '18 at 22:48
The reason it is happening is that when you write assembly targeting 64-bit long mode, the instructions are not encoded the same way as 32-bit code. The result is that you have 64-bit encoded instructions running in 32-bit protected mode. Enough of the instructions will behave as expected to make it appear that things are working when they probably aren't. Often (coincidentally) this is first observed as strange, unexpected output on the video display. That's what led me to the question I posed. — Michael Petch, Mar 24 '18 at 22:52
Change GDB to debug (disassemble) as i386 code rather than 64-bit code and you should see how the processor is decoding the 64-bit instructions in 32-bit protected mode. That doesn't solve the problem, but you get an idea of what instructions are actually being executed. — Michael Petch, Mar 24 '18 at 22:52
Try issuing `set architecture i386:intel` in GDB before stepping through the code. To actually fix the problem you'll have to either switch into long mode or generate 32-bit code instead of 64-bit code. — Michael Petch, Mar 24 '18 at 22:56
I believe you'll find that my comments above are reflected in the answer I gave to a duplicate question. In particular, read the section _Primary Issue Causing Unusual Behaviour_ of that answer. Although the question is worded differently, it comes down to the issue of 64-bit code running in 32-bit mode. The person asking that question was also getting unexpected video output that they couldn't explain. — Michael Petch, Mar 24 '18 at 23:07
Thank you for the clarification. Then maybe my question is: how should I compile and link the code that brings the machine into long mode for an x86-64 architecture? I tried to compile it as 32-bit code (with -m32) in elf64 format, but (not too surprisingly) gcc refused to. — fsalvio, Mar 24 '18 at 23:11
You have two choices. If you remain in 32-bit protected mode, you create a 32-bit OS: you would have to compile/assemble and link as 32-bit code and objects (my answer to the other question explains how to do that as well). The alternative is to keep building 64-bit code, but then you have to manually modify your bootloader to switch into 64-bit long mode. That includes having to set up paging. The links you gave seem to discuss how to enter 64-bit long mode. — Michael Petch, Mar 24 '18 at 23:14
Your bootloader has to do the switch to long mode; it isn't something the 64-bit compiled/assembled code does for you. The assumption is that you have already put the processor in the right mode (64-bit long mode) before trying to execute it. — Michael Petch, Mar 24 '18 at 23:15
If you have a question about how to enter 64-bit long mode you can ask a new question, although the answer is in the tutorial you linked to in your question: https://os.phil-opp.com/entering-longmode/ (although it is a Rust tutorial, the code to switch still applies). I assume you are booting with GRUB or QEMU's `-kernel` option? — Michael Petch, Mar 24 '18 at 23:18
OS Dev Wiki also has a section on entering long mode: https://wiki.osdev.org/Setting_Up_Long_Mode — Michael Petch, Mar 24 '18 at 23:21
@MichaelPetch does this mean there is no way I can have some sections of my executable that are 32-bit and others that are 64-bit? — fsalvio, Mar 24 '18 at 23:24
Anyways thank you very much @MichaelPetch, the issue is clear now. Sorry for not spotting it before while searching among the previous questions. — fsalvio, Mar 24 '18 at 23:27
You can have sections that are 32-bit protected mode and 64-bit long mode if the processor is in 32-bit protected mode when running the code in the 32-bit sections, and when running code in the 64-bit sections the processor is in long mode. You have to be in the right mode before executing whatever code is in the sections. The 64-bit ELF format supports 32-bit code sections. — Michael Petch, Mar 24 '18 at 23:28
@MichaelPetch: @prl asked a question about your last comment, namely how you get gcc/binutils to actually link `gcc -m32` and `-m64` output into an ELF64 with a 32-bit code section: https://stackoverflow.com/questions/49473061/linking-32-and-64-bit-code-together-into-a-single-binary — Peter Cordes, Mar 25 '18 at 13:28
@PeterCordes : You don't link a 32-bit elf file and a 64-bit elf file. You use elf64 files to contain 32-bit and 64-bit code. Normally in these OSDev questions the 32-bit sections are inside an assembly file where you can use (with NASM) the `bits 32` directive, but you still produce an elf64 object. — Michael Petch, Mar 25 '18 at 14:02
@MichaelPetch: can you post that comment on prl's question? If you aren't already answering it with some way to get gcc to put compiler-generated 32-bit code in a 64-bit ELF object :P — Peter Cordes, Mar 25 '18 at 14:20
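
To make the explanation in the comments concrete, here is a minimal sketch in C of how the same three bytes are interpreted in the two modes. It assumes the instruction at print+69 is encoded as `48 01 d0`, the usual encoding of `add rax,rdx`, which is consistent with the three-byte span between 0x100166 and 0x100169 in the gdb output. In long mode `0x48` is a REX.W prefix; in 32-bit protected mode there are no REX prefixes and `0x48` is the one-byte instruction `dec eax`. The names `cpu_mode`, `regs`, `run` and `ADD_RAX_RDX` are hypothetical, and the toy interpreter understands only these two byte patterns. It is an illustration, not a real decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum cpu_mode { PROTECTED_32, LONG_64 };

/* Only the two registers the example touches. */
struct regs { uint64_t rax, rdx; };

/* Assumed encoding of `add rax,rdx` as emitted for 64-bit code. */
const uint8_t ADD_RAX_RDX[] = { 0x48, 0x01, 0xd0 };

/*
 * Toy interpreter that knows only the byte 0x48 and the pair 01 d0.
 * Returns the number of instructions executed, or -1 on unknown bytes.
 */
int run(const uint8_t *code, size_t len, enum cpu_mode mode, struct regs *r)
{
    int executed = 0;
    int rex_w = 0;      /* set when a REX.W prefix precedes the opcode */
    size_t i = 0;

    assert(code != NULL && r != NULL);

    while (i < len) {
        if (code[i] == 0x48) {
            if (mode == LONG_64) {
                /* Long mode: 0x48 is a prefix, not an instruction. */
                rex_w = 1;
                i += 1;
                continue;
            }
            /* Protected mode: 0x48 is `dec eax`. */
            r->rax = (uint32_t)((uint32_t)r->rax - 1u);
            i += 1;
            executed++;
        } else if (code[i] == 0x01 && i + 1 < len && code[i + 1] == 0xd0) {
            if (rex_w) {
                /* 48 01 d0: add rax,rdx (64-bit operands). */
                r->rax += r->rdx;
            } else {
                /* 01 d0: add eax,edx (32-bit operands). */
                r->rax = (uint32_t)((uint32_t)r->rax + (uint32_t)r->rdx);
            }
            rex_w = 0;
            i += 2;
            executed++;
        } else {
            return -1;  /* anything else is outside this sketch */
        }
    }
    return executed;
}
```

Starting from rax = 0xb8000 and rdx = 1, as in the trace in the question, `run(ADD_RAX_RDX, sizeof ADD_RAX_RDX, LONG_64, &r)` executes one instruction and leaves rax at 0xb8001. With `PROTECTED_32` it executes two instructions: rax first drops to 0xb7fff and then returns to 0xb8000, which matches the values gdb showed while stepping and explains why `vidptr[j+1]` ended up aliasing `vidptr[j]`.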
