
I've been looking at the output of cat /proc/iomem and noticed a 1 kB section of reserved addresses at the end of the first block of System RAM. At first, I thought that this was a fluke of my installation, but my research seems to indicate that it's fairly common. However, I haven't found a satisfactory explanation for why it exists. Here's an example: [CentOS Deployment Guide][1]

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-07ffffff : System RAM
  00100000-00291ba8 : Kernel code
  00291ba9-002e09cb : Kernel data
e0000000-e3ffffff : VIA Technologies, Inc. VT82C597 [Apollo VP3]
e4000000-e7ffffff : PCI Bus #01
  e4000000-e4003fff : Matrox Graphics, Inc. MGA G200 AGP
  e5000000-e57fffff : Matrox Graphics, Inc. MGA G200 AGP
e8000000-e8ffffff : PCI Bus #01
  e8000000-e8ffffff : Matrox Graphics, Inc. MGA G200 AGP
ea000000-ea00007f : Digital Equipment Corporation DECchip 21140 [FasterNet]
  ea000000-ea00007f : tulip
ffff0000-ffffffff : reserved

My questions:

  1. Why does the first block of System RAM end at 0009fbff? The page size for the system is 4 kB so shouldn't it end at xxxxxfff to make the block a multiple of page size? If the page sizes are not consistent for a resource struct, how would the OS keep track of which are 4 kB and which are smaller?
  2. What is the reserved section at 0009fc00? I've seen a comment about it relating to [real-mode computing][2], but I'd appreciate an explanation in this context if possible.
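As a quick sanity check on the numbers behind both questions, the arithmetic can be worked through directly (a small sketch; the addresses are taken from the listing above):

```python
# Size of the reserved region at the end of the first System RAM block.
start, end = 0x9FC00, 0x9FFFF
size = end - start + 1          # inclusive range
print(size)                     # 1024 bytes = 1 kB

# The System RAM block ends at 0x9FBFF, so the reserved region does not
# begin on a 4 kB page boundary: the page 0x9F000-0x9FFFF is split
# between usable RAM and the reserved area.
PAGE = 0x1000                   # 4 kB page size on x86
print(hex(start % PAGE))        # 0xc00 -> the boundary falls mid-page
```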

Thanks! I've been a long-time reader of Stack Overflow and this is my first time as a participant :)

EDIT: The information provided by @MichaelPetch answers my second question. Reading more about the EBDA section led me to this useful article on real mode and how the segmentation granularity changes from 1 B to 4 kB: Memory Map x86

It seems to me that after switching to protected mode, the address range 00000000-0009fbff was reclaimed by the kernel to address System RAM. This leaves me with further questions:

  1. In protected mode, does each of the addresses in the range designated for System RAM correspond to a 4 kB frame rather than 1 B of main memory? If that is the case, it makes sense that the range does not have to be 4 kB page aligned because each physical address now corresponds to a 4 kB frame rather than an individual byte of main memory. This would also explain why the Kernel Code and Kernel Data address ranges in the remainder of System RAM are not aligned to 4 kB.
  2. Why is the range 00000000-0009fbff available to the kernel for addressing main memory while the range used for the EBDA section is still reserved? This may be an OS-specific question, but I was curious whether there is a general OS design principle that requires the EBDA section to remain available.
  3. I understand from looking through resource.c, which exports the iomem_resource symbol, that /proc/iomem essentially walks a tree of resource structs that describe the address ranges. I'm now curious whether iomem_resource is the definitive structure by which the Linux kernel manages the assignment of physical addresses. By definitive, I mean that adding a node to the tree effectively reserves that physical address range for a device. Or is iomem_resource derived from another mechanism within the kernel that actually defines physical address range assignment?
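For context on question 3: the kernel's struct resource holds a start, an end, a name, and parent/sibling/child pointers, so /proc/iomem output is produced by a depth-first walk of that tree. A rough Python model of such a walk (an illustration only, not kernel code; the nodes mirror the listing above) might look like:

```python
class Resource:
    """Toy stand-in for the kernel's struct resource: an address
    range with a name and nested child resources."""
    def __init__(self, start, end, name, children=None):
        self.start, self.end, self.name = start, end, name
        self.children = children or []

def walk(res, depth=0, out=None):
    """Depth-first walk, formatting each node the way /proc/iomem does:
    children are indented two spaces per nesting level."""
    out = out if out is not None else []
    out.append("%s%08x-%08x : %s" % ("  " * depth, res.start, res.end, res.name))
    for child in res.children:
        walk(child, depth + 1, out)
    return out

root = Resource(0x00100000, 0x07FFFFFF, "System RAM", [
    Resource(0x00100000, 0x00291BA8, "Kernel code"),
    Resource(0x00291BA9, 0x002E09CB, "Kernel data"),
])
print("\n".join(walk(root)))
```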
  • That 1 kB (and on some systems more) is the Extended BIOS Data Area (EBDA). http://www.matrix-bios.nl/system/ebda.html – Michael Petch Sep 15 '15 at 00:28
  • In protected mode, OSes often set the granularity of the segment descriptors to 4 kB so that 4 GB of memory can be addressed (limited by the 20-bit segment limit): 2^20 * 4096 = 4 GB (assuming a segment limit of 0xFFFFF). However, there is a granularity bit that allows a segment descriptor to be set to 1-byte granularity: 2^20 * 1 = 1 MB (assuming the max segment limit of 0xFFFFF). In traditional real mode, granularity is 1 byte. – Michael Petch Sep 15 '15 at 01:17
  • As a new user, I had to remove my first two links in my initial question. I'm posting them as a reply for future readers. [1]: https://www.centos.org/docs/5/html/5.2/Deployment_Guide/s2-proc-iomem.html [2]: http://superuser.com/questions/480451/what-kind-of-memory-addresses-are-the-ones-shown-by-proc-ioports-and-proc-iomem – user5335880 Sep 15 '15 at 15:34
  • #3 depends on the granularity specified in a segment descriptor, which depends on the OS implementation (these can only be set while in protected mode). It can be either 1 B or 4096 B (4 kB) on a 32-bit x86 processor. You could write a protected-mode OS with all 1-byte granularity (very restrictive). – Michael Petch Sep 15 '15 at 16:53
  • The EBDA area (if it is present) is considered unusable since many computers have SMM firmware that will write to this memory area (making it ineffective for general purpose use). See https://en.wikipedia.org/wiki/System_Management_Mode – Michael Petch Sep 15 '15 at 17:14
  • If you jump into real mode (or V8086 mode) from protected mode, then it is usually good practice not to use 0x0000 to 0x04FF. Generally the interrupt vector table runs from 0x0000 to 0x03FF, and the original BDA occupies 0x0400-0x04FF. If you don't need the BDA info, then you can in theory use the memory from 0x0400 onwards. Overwriting the real-mode (or V8086-mode) interrupt vector table, although allowable, can be problematic for obvious reasons. – Michael Petch Sep 15 '15 at 17:24
  • While in real mode (usually before a kernel enters protected mode), the system memory map is acquired by making BIOS calls to map all the holes. That data is then made available for use during protected mode. See http://wiki.osdev.org/Detecting_Memory_(x86). The GRUB bootloader (or any bootloader written to the Multiboot spec) can do this memory-mapping step before entering protected mode and calling a kernel. A Multiboot loader makes this data structure available upon request. – Michael Petch Sep 15 '15 at 17:34
  • This article has a reasonably good intro to Segmentation/Paging/Descriptors etc http://www.internals.com/articles/protmode/protmode.htm . – Michael Petch Sep 15 '15 at 18:40
  • Going through the code of 2.4.18 and of the 2.6, 3.x and 4.x kernels, it is clear that iomem displays the raw, unpaged region values. When it comes to actually reserving the memory and generating page table entries for them, they are rounded down to the nearest page boundary on x86. Another interesting observation: because of bugs on different hardware, different kernels may report different sizes for the EBDA even when run on the same hardware. Newer kernels take a more conservative approach to determining the EBDA region to avoid buggy reporting by certain hardware. – Michael Petch Sep 16 '15 at 21:07
  • @MichaelPetch Thanks for the thoroughly researched answer(s)! – user5335880 Sep 30 '15 at 02:27
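The page-boundary rounding described in the comments above can be sketched as follows (assumed helper names; the masking idiom is standard, not taken from the kernel source):

```python
PAGE_SIZE = 0x1000  # 4 kB pages on x86

def page_align_down(addr):
    """Round an address down to the start of its 4 kB page."""
    return addr & ~(PAGE_SIZE - 1)

def page_align_up(addr):
    """Round an address up to the next 4 kB page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)

# The reserved region 0x9FC00-0x9FFFF is not page aligned; rounding its
# start down lands on 0x9F000, the page that straddles the RAM/reserved split.
print(hex(page_align_down(0x9FC00)))   # 0x9f000
print(hex(page_align_up(0x9FFFF + 1)))  # 0xa0000
```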
