Triple fault when jumping to 64-bit longmode

Question

The following code, which transitions from 32-bit protected mode (with A20 enabled) to 64-bit long mode, is giving me issues. I identity map a 1 GiB page covering 0x00000000 to 0x3fffffff; enable PAE; set the long mode enable bit in the EFER MSR; enable paging; install a GDT; and then do a simulated FAR JMP to my 64-bit entry point:

lea eax, [PML4]
mov cr3, eax

mov eax, cr4
or eax, 100000b
mov cr4, eax

mov ecx, 0xc0000080
rdmsr
or eax, 100000000b
wrmsr

mov eax, cr0
mov ebx, 0x1
shl ebx, 31
or eax, ebx
mov cr0, eax

call gdt64_install
push 8
push longmode
retf ;<===================== faults here

The program triple faults in BOCHS when the RETF instruction is executed, but BOCHS doesn't seem to report any error. If I type info tab before this jump I get:

0x00000000-0x3fffffff -> 0x000000000000-0x00003fffffff

It appears to me paging is working. This is sreg output:

es:0x0010, dh=0x00cf9300, dl=0x0000ffff, valid=1
    Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
cs:0x0008, dh=0x00cf9b00, dl=0x0000ffff, valid=1
    Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, Accessed, 32-bit
ss:0x0010, dh=0x00cf9300, dl=0x0000ffff, valid=31
    Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
ds:0x0010, dh=0x00cf9300, dl=0x0000ffff, valid=31
    Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
fs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
    Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
gs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
    Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
ldtr:0x0000, dh=0x00008200, dl=0x0000ffff, valid=1
tr:0x0000, dh=0x00008b00, dl=0x0000ffff, valid=1
gdtr:base=0x0000000000008252, limit=0x1f
idtr:base=0x0000000000000000, limit=0x3ff

My GDT entry is:

gdt64_install:
    lgdt[GDT_addr]
    ret


    GDT_addr:
    dw (GDT64_end - GDT64) - 1
    dd GDT64

    GDT64:
    dd 0, 0

    dd 0xffff  ; segment limit
    dd 0xef9a00

    dd 0xffff  ; segment limit
    dd 0xef9200

    dd 0, 0
    GDT64_end:

My page table structure using a PML4 and PDP is defined as:

align 4096 ;;align to 4 KB
    PML4:
        dq 0 or 1b or 10b or PDP;;preset bit, r/w bit
        dq 511 dup(PDP or 10b)
    PDP:
        dq 0 or 1b or 10000000b ;;dq zero, because we map memory from start so 0x0000, present bit
        ;;PDPE.PS to indicate 1gb pages
        dq 511 dup(10000000b)

Any ideas why it might be triple faulting?

A copy of my project can be found on GitHub.

oh wow, someone actually writing to the `cr` registers. Respect. — Mike Nakis, Apr 16 '17 at 14:46
@harold with a FAR RET in 32-bit protected mode a 32-bit DWORD will be popped from the stack into _CS_ and the top 16 bits are discarded. Then the next DWORD is popped into _EIP_. Although I'd just use _JMP_ like you say, I don't consider the bug to be the construction of the return address with his FAR RET. — Michael Petch, Apr 16 '17 at 15:53
Do you have a project you can make available with all your code? As it is you don't present us a minimal complete verifiable example so as it is it can be difficult to trouble shoot. — Michael Petch, Apr 16 '17 at 15:56
@MichaelPetch everything is located at my friend's github here, we have commited the newest code today: https://github.com/cuaox/RIOS he asked if i want to help and i am trying to do some paging but well, something is not working — Marcus Dem, Apr 16 '17 at 16:30

Michael Petch · Accepted Answer · 2017-04-17T21:19:02.697

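The primary problem is that your GDT appears to have been designed with 32-bit in mind. For 64-bit descriptors you'll want to set the 64-bit descriptor bit (the 'L' bit) and clear the 'Sz' bit. In your code descriptor, dd 0xef9a00 puts 0xE in the flags nibble, which sets the granularity, 'Sz' and 'L' bits all at once. The OSDev wiki documents the layout of a GDT entry as well as the flags and access bits.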

As described in the wiki these changes apply to 64-bit descriptors:

x86-64 Changes

'L' bit (bit 21, next to 'Sz') is used to indicate x86-64 descriptor

'Sz' bit (bit 22) has to be 0 when the 'L' bit is set, as the combination Sz = 1, L = 1 is reserved for future use (and will throw an exception if you try to use it)
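To make those bit positions concrete, here is a minimal sketch in FASM syntax that builds the upper dword of a code descriptor from named bits. The DESC_* constant names and the two labels are made up for illustration and are not part of your project. The first dword reproduces the original 0x00ef9a00, and the second is the value with only the 'L' bit set in the flags.

```asm
; Bit positions in the upper dword of a segment descriptor.
; All names below are hypothetical, chosen for this illustration only.
DESC_ACCESS_CODE = 10011010b shl 8   ; access byte: present, ring 0, code, readable
DESC_L           = 1 shl 21          ; 'L': 64-bit code segment
DESC_SZ          = 1 shl 22          ; 'Sz': 32-bit default size, must be 0 when L = 1
DESC_G           = 1 shl 23          ; granularity: limit counted in 4 KiB units

; Original upper dword: assembles to 0x00ef9a00.
; G + Sz + L + limit bits 19:16 = 0xF + access 0x9A.
; Sz = 1 together with L = 1 is the reserved combination.
bad_code_hi:
        dd DESC_G or DESC_SZ or DESC_L or (0xF shl 16) or DESC_ACCESS_CODE

; Upper dword with only 'L' set in the flags: assembles to 0x00209a00.
; Base and limit bits are all zero.
good_code_hi:
        dd DESC_L or DESC_ACCESS_CODE
```

Writing the flags this way makes it harder to set 'Sz' and 'L' together by accident than packing them into a single hex literal.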

Intel also recommends aligning the GDT on an 8-byte boundary for performance reasons. In 64-bit descriptors the base and limit should be set to 0. If you ever intend to load this GDT from 64-bit mode later on, you'll want to change dd GDT64 to a quadword (dq), because LGDT in 64-bit mode reads a 64-bit base. With these things in mind I modified your GDT to be a bit more readable:

    GDT_addr:
        dw (GDT64_end - GDT64) - 1
        dq GDT64                     ; Use quadword so we can use this GDT table
                                     ;     from 64-bit mode if necessary

align 8                              ; Intel suggests GDT should be 8 byte aligned

    GDT64:                           ; Global Descriptor Table (64-bit).

    ; 64-bit descriptors should set all limit and base to 0
    ; NULL Descriptor
        dw 0                         ; Limit (low).
        dw 0                         ; Base (low).
        db 0                         ; Base (middle)
        db 0                         ; Access.
        db 0                         ; Flags.
        db 0                         ; Base (high).

    ; 64-bit Code descriptor
        dw 0                         ; Limit (low).
        dw 0                         ; Base (low).
        db 0                         ; Base (middle)
        db 10011010b                 ; Access (present/exec/read).
        db 00100000b                 ; Flags 64-bit descriptor
        db 0                         ; Base (high).

    ; 64-bit Data descriptor    
        dw 0                         ; Limit (low).
        dw 0                         ; Base (low).
        db 0                         ; Base (middle)
        db 10010010b                 ; Access (present/read&write).
        db 00100000b                 ; Flags 64-bit descriptor.
        db 0                         ; Base (high).
    GDT64_end:

Other Observations

You use this to transition into 64-bit long mode:

push 8
push longmode
retf

While this works, if you are using FASM or NASM it is much easier to use a FAR JMP while you are still in 32-bit mode:

jmp 0x08:longmode

There is an issue with doing a FAR JMP once you are in 64-bit code, since some early AMD64 processor types didn't support JMP mem16:64. Using the PUSH/RETF method there makes the code more universal. That said, a FAR JMP from within 64-bit long mode is only needed in very rare cases.
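For reference, a minimal sketch of the PUSH/RETF pattern when it is executed from 64-bit code, written in FASM syntax. The selector constant SEL_CODE64 and the label far_target are hypothetical names, and I'm assuming the 64-bit code descriptor sits at selector 0x08 as in the GDT above.

```asm
use64                           ; FASM: assemble what follows as 64-bit code

SEL_CODE64 = 0x08               ; hypothetical name: selector of the 64-bit code descriptor

        push SEL_CODE64         ; new CS, pushed as a 64-bit stack slot
        lea rax, [far_target]   ; address to continue at
        push rax                ; new RIP
        retfq                   ; far return with 64-bit operand size: pops RIP, then CS
far_target:
```

The 64-bit form of the far return matters here because each stack slot is 8 bytes wide in long mode.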

You have another issue in your code with regard to reading sectors: not all of your code and data was being read into memory. In your exread.inc you define:

SECTOREAD equ 20

When I built your floppy disk image, the file size was 13976 bytes. That is 28 sectors' worth (512*28=14336), so your value of 20 wasn't reading enough. Check whether this affects you and read more sectors if necessary.
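One way to avoid a stale hard-coded count is to derive it from the image size at assemble time, rounding up to whole sectors. This is only a sketch in FASM syntax: SECTOR_BYTES and IMAGE_BYTES are hypothetical names, 13976 is the size I measured from my build, and I'm assuming the count should cover the whole image file.

```asm
SECTOR_BYTES = 512
IMAGE_BYTES  = 13976            ; assumption: size measured from my build of the image

; Round up to whole sectors: (13976 + 511) / 512 = 28
SECTOREAD    = (IMAGE_BYTES + SECTOR_BYTES - 1) / SECTOR_BYTES
```

The size still has to be updated by hand (or generated by the build) whenever the image grows.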

Not related to the issues at hand, I notice in your Makefile you have:

qrun: deploy_all
    qemu-system-i386 kernel.bin

If you want to run 64-bit code in QEMU you'll want to use qemu-system-x86_64 not qemu-system-i386. I found this to be more useful:

qrun: deploy_all
    qemu-system-x86_64 -fda floppy.bin -no-shutdown -no-reboot -d int

The -no-shutdown -no-reboot -d int options are useful for debugging. The first two stop QEMU from rebooting or shutting down on a triple fault, and -d int logs the interrupts and exceptions that are raised.
