I am writing my own kernel. When the first page fault handler returns, executing IRET raises interrupt 13 (general protection fault) with error code 0x18. I don't know what is wrong; the contents of the stack were pushed by the CPU.

Below are the register state when the interrupt occurs and the memory where the registers were stored. The IRET in question is the return from a page fault handler.

I am sure that %ESP has the same value just before IRET executes as it had when the interrupt occurred.

[Two screenshots, not reproduced here: the register state at the time of the interrupt, and the stack memory where the registers were stored.]

Ciro Santilli OurBigBook.com
venus.w

3 Answers

If the exception comes from IRET itself, then most likely IRET is failing to restore one of the saved segment registers because the value (8 or 0x18, by the way?) is somehow wrong. It can be wrong because you never (re)initialized the register in protected mode, because your handler set it to a bad value before doing IRET, or because something happened to the GDT.

EDIT: From the picture it's apparent that the page fault handler didn't remove the error code (the value 4 at the address in ESP) before executing IRET. As a result, IRET took 4 as the new value for EIP, 0x1000018 as the new value for CS and 0x23 as the new value for EFLAGS, whereas it should have used 0x1000018, 0x23 and 0x3206 for those three registers. A data segment selector (which is what 0x1000018 becomes after truncation to 0x0018) cannot be loaded into CS, and this causes #GP(0x18).
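The misreading described in this answer can be modelled in plain C. This is only a sketch: `iret_frame`, `simulate_iret` and `fault_stack` are hypothetical names introduced for illustration, the stack words are the ones quoted from the screenshot, and a real IRET also performs privilege checks and may pop ESP and SS, which are ignored here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three values a 32-bit IRET pops first. */
struct iret_frame {
    uint32_t eip;
    uint16_t cs;      /* only the low 16 bits of the popped word are used */
    uint32_t eflags;
};

/* Pop EIP, CS and EFLAGS starting at stack[top], in the order IRET does. */
static struct iret_frame simulate_iret(const uint32_t *stack, size_t top)
{
    struct iret_frame f;
    f.eip    = stack[top];
    f.cs     = (uint16_t)(stack[top + 1] & 0xFFFFu);
    f.eflags = stack[top + 2];
    return f;
}

/* Stack words quoted in the answer: the error code, then the real frame. */
static const uint32_t fault_stack[] = { 0x4, 0x1000018, 0x23, 0x3206 };
```

Calling `simulate_iret(fault_stack, 0)` models the buggy handler and yields CS = 0x0018, the selector reported by #GP(0x18). Calling `simulate_iret(fault_stack, 1)` models a handler that discards the error code first and yields EIP = 0x1000018, CS = 0x23 and EFLAGS = 0x3206.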

Alexey Frunze
  • "It can be wrong because you never (re)initialized the register in protected mode": which register should I initialize? – venus.w May 14 '12 at 14:56
  • All segment registers should be properly initialized in protected mode (CS,DS,ES,SS,FS,GS). Actually, at this moment it's kind of hard to say what exactly is wrong without knowing more about your code. Post the code. – Alexey Frunze May 14 '12 at 15:01
  • The `IRET` description should give you some specific hints as to what can be wrong. Look at all the conditions leading to `#GP(selector)`. – Alexey Frunze May 14 '12 at 15:08
  • I have put the register states just now. – venus.w May 14 '12 at 15:17
  • That is good, but what's in the memory, on the stack, from where `IRET` gets register values? – Alexey Frunze May 14 '12 at 15:26
  • Looks like you didn't remove the page fault error code from the stack (that 4 value). `IRET` doesn't remove it for you. So, `IRET` tried to take this error code (4) for `EIP` (instead of taking 0x1000018), then it tried to take 0x1000018 for `CS` and... there you go, your #GP(0x18). – Alexey Frunze May 14 '12 at 15:41
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/11235/discussion-between-venus-w-and-alex) – venus.w May 14 '12 at 15:55
  • I debug my little kernel on QEMU, but I ran into a problem when implementing the fork() system call. I don't know whether the emulator has anything to do with it. – venus.w May 26 '12 at 09:41
  • Try other emulators and a dedicated test PC. You can flush out bugs trying different execution environments just like you can find some bugs when using different compilers. – Alexey Frunze May 26 '12 at 09:43
  • Do you have any to recommend? – venus.w May 26 '12 at 09:45
  • None is perfect. Try all that you can get and use (possibly with changes in your code): Bochs, DosBox, VirtualPC, VMWare, Hyper-V. Use a real PC too. – Alexey Frunze May 26 '12 at 09:49

Expanding on Alexey:

For some exceptions (but not others), the CPU automatically pushes a 4-byte error code onto the stack. Page fault is one of them.

This error code contains extra information about the interrupt.

Intel Manual Volume 3 System Programming Guide - 325384-056US September 2015 Table 6-1. "Protected-Mode Exceptions and Interrupts" column "Error Code" tells us exactly which interrupts push the error code, and which do not.

38.9.2.2 "Page Fault Error Codes" explains what the error means.
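As a rough illustration of what the error code carries, here is a small C sketch that decodes the low bits of a page-fault error code. The macro, struct and function names are made up for this example; only the bit positions come from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PRESENT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE    (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER     (1u << 2)  /* 0: supervisor-mode access, 1: user-mode access */
#define PF_RESERVED (1u << 3)  /* a reserved bit was set in a paging structure */
#define PF_INSTR    (1u << 4)  /* the fault was caused by an instruction fetch */

/* Hypothetical decoded view of the error code. */
struct pf_info {
    bool present;
    bool write;
    bool user;
    bool reserved;
    bool instr_fetch;
};

/* Split the raw error code into one flag per bit. */
static struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;
    info.present     = (code & PF_PRESENT)  != 0;
    info.write       = (code & PF_WRITE)    != 0;
    info.user        = (code & PF_USER)     != 0;
    info.reserved    = (code & PF_RESERVED) != 0;
    info.instr_fetch = (code & PF_INSTR)    != 0;
    return info;
}
```

For the error code 4 seen in the question, `decode_pf_error(4)` reports `user` set with `present` and `write` clear, that is, a user-mode read of a page that is not present.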

So you will need either:

pop %eax
/* Do something with %eax */
iret

Or if you want to ignore the error code:

add $4, %esp
iret

For a minimal example, see this page handler and try commenting out the pop.

Compare the above with a Division error exception which does not need to pop the stack.

Note that if you simply do int $14, no error code gets pushed: that only happens when the CPU raises the actual exception.

A neat way to deal with this is to push a dummy error code 0 on the stack for the interrupts that don't do this to make things uniform. James Molloy's tutorial does exactly that.
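The dummy error code idea can be sketched in C as follows. This is a model, not how a real stub is written (real stubs do the push in assembly before calling into C). `trap_frame`, `make_frame` and `cpu_pushes_error_code` are hypothetical names, and the vector list covers only the classic IA-32 exceptions; newer processors define further vectors that also push an error code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Classic IA-32 vectors for which the CPU pushes an error code:
 * #DF(8), #TS(10), #NP(11), #SS(12), #GP(13), #PF(14), #AC(17). */
static bool cpu_pushes_error_code(unsigned vector)
{
    switch (vector) {
    case 8: case 10: case 11: case 12: case 13: case 14: case 17:
        return true;
    default:
        return false;
    }
}

/* Hypothetical uniform layout that every handler would see. */
struct trap_frame {
    uint32_t vector;
    uint32_t error_code;   /* real code, or dummy 0 when the CPU pushed none */
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
};

/* Build the uniform frame from the words the CPU left on the stack
 * (lowest address first). */
static struct trap_frame make_frame(unsigned vector, const uint32_t *cpu_stack)
{
    struct trap_frame f;
    size_t i = 0;
    f.vector     = vector;
    f.error_code = cpu_pushes_error_code(vector) ? cpu_stack[i++] : 0;
    f.eip        = cpu_stack[i++];
    f.cs         = cpu_stack[i++];
    f.eflags     = cpu_stack[i++];
    return f;
}
```

With this layout a common return path can always skip exactly one error-code slot before IRET, whichever vector was taken.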

The Linux kernel 4.2 seems to do something similar. In arch/x86/entry/entry_64.S it models interrupts with has_error_code:

trace_idtentry page_fault do_page_fault has_error_code=1

and then uses it in the same file as:

.ifeq \has_error_code
pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif

which does the push when has_error_code=0.

Related question: Do I have to pop the error code pushed to stack by certain exceptions before returning from the interrupt handler?

Community
Ciro Santilli OurBigBook.com

I know this is a very old question, but adding my answer here for the sake of completeness.

Depending on your assembler, you may need to explicitly use iretd (the 32-bit variant) instead of iret (which it may encode as the 16-bit variant). The CPU will raise #GP if you try to use the 16-bit variant to return to a 32-bit segment.
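To see why the operand size matters, the following C sketch models what a 16-bit IRET would pop from a frame that was pushed as three 32-bit words. The names are hypothetical, the frame values are the ones from the question, and later steps such as the descriptor checks that would actually raise the fault are not modelled.

```c
#include <stdint.h>

/* Hypothetical view of what a 16-bit IRET loads. */
struct iret16_result {
    uint16_t ip;
    uint16_t cs;
    uint16_t flags;
};

/* frame[0..2] hold the 32-bit EIP, CS and EFLAGS words as pushed by the CPU.
 * A 16-bit IRET pops three consecutive 16-bit values instead; on little-endian
 * x86 these are the low half of EIP, the high half of EIP, and the low half
 * of the CS word. */
static struct iret16_result simulate_iret16(const uint32_t frame[3])
{
    struct iret16_result r;
    r.ip    = (uint16_t)(frame[0] & 0xFFFFu);
    r.cs    = (uint16_t)(frame[0] >> 16);
    r.flags = (uint16_t)(frame[1] & 0xFFFFu);
    return r;
}
```

For the frame { 0x1000018, 0x23, 0x3206 } this gives IP = 0x0018, CS = 0x0100 and FLAGS = 0x0023, so the selector loaded into CS is really the upper half of the saved EIP.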

B. Reynolds