I'm doing bare-metal programming (I'm developing a kernel) on an ARM Cortex-A53, SoC BCM2837 (in other words, a Raspberry Pi 3). I'm currently writing the piece of software responsible for handling the mini UART (a sort of hello world, as described on the OSDev wiki: https://wiki.osdev.org/ARM_RaspberryPi_Tutorial_C). I've written a set of functions to handle the mini UART. Consider the following one, since the problem occurs with all of the other functions as well:

void miniUartSendByte(unsigned char byte){
  // FIFO can accept at least one byte
  while(*AUX_MU_LSR_REG & 0b100000);

  // write byte to buffer
  *AUX_MU_IO_REG = byte;
  return;
}

where each AUX_MU_* is of type volatile unsigned int*. This is the disassembly of the code above:

  1000ac:       d10043ff        sub     sp, sp, #0x10
  1000b0:       39003fe0        strb    w0, [sp, #15]
  1000b4:       d503201f        nop
  1000b8:       d28a0a80        mov     x0, #0x5054                     // #20564
  1000bc:       f2afc420        movk    x0, #0x7e21, lsl #16
  1000c0:       b9400000        ldr     w0, [x0]
  1000c4:       121b0000        and     w0, w0, #0x20
  1000c8:       7100001f        cmp     w0, #0x0
  1000cc:       54ffff61        b.ne    1000b8 <miniUartSendByte+0xc>   // b.any
  1000d0:       d28a0800        mov     x0, #0x5040                     // #20544
  1000d4:       f2afc420        movk    x0, #0x7e21, lsl #16
  1000d8:       39403fe1        ldrb    w1, [sp, #15]
  1000dc:       b9000001        str     w1, [x0]
  1000e0:       d503201f        nop
  1000e4:       910043ff        add     sp, sp, #0x10
  1000e8:       d65f03c0        ret
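For reference, the mov/movk pairs in the listing build the register addresses 16 bits at a time. The sketch below models that in plain C so the resulting addresses can be read off directly; the helper names (movz, movk, lsr_address, io_address) are made up for illustration and are not part of the kernel code.

```c
#include <stdint.h>

/* MOVZ: place a 16-bit immediate at the given shift and zero every other bit. */
static uint64_t movz(uint16_t imm16, unsigned shift) {
    return (uint64_t)imm16 << shift;
}

/* MOVK: overwrite one 16-bit field and keep the remaining bits of the register. */
static uint64_t movk(uint64_t reg, uint16_t imm16, unsigned shift) {
    uint64_t mask = (uint64_t)0xFFFF << shift;
    return (reg & ~mask) | ((uint64_t)imm16 << shift);
}

/* mov x0, #0x5054 ; movk x0, #0x7e21, lsl #16  (address polled by the ldr) */
static uint64_t lsr_address(void) {
    return movk(movz(0x5054, 0), 0x7e21, 16);
}

/* mov x0, #0x5040 ; movk x0, #0x7e21, lsl #16  (address written by the str) */
static uint64_t io_address(void) {
    return movk(movz(0x5040, 0), 0x7e21, 16);
}
```

So the load in the polling loop targets 0x7E215054 and the store targets 0x7E215040.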

and this is the execution trace reported by QEMU:

----------------
IN: kernel_main
0x00100050:  a9bf7bfd  stp      x29, x30, [sp, #-0x10]!
0x00100054:  910003fd  mov      x29, sp
0x00100058:  52800c60  movz     w0, #0x63
0x0010005c:  94000012  bl       #0x1000a4 // jump to miniUartSendByte

----------------
IN: miniUartSendByte
0x001000a4:  d10043ff  sub      sp, sp, #0x10
0x001000a8:  39003fe0  strb     w0, [sp, #0xf]
0x001000ac:  d503201f  nop      
0x001000b0:  d28a0a80  movz     x0, #0x5054
0x001000b4:  f2afc420  movk     x0, #0x7e21, lsl #16
0x001000b8:  b9400000  ldr      w0, [x0]
0x001000bc:  121b0000  and      w0, w0, #0x20
0x001000c0:  7100001f  cmp      w0, #0
0x001000c4:  54ffff61  b.ne     #0x1000b0

----------------
IN: 
0x00000200:  00000000  .byte    0x00, 0x00, 0x00, 0x00 // ??

As you can see, when the machine executes the jump, it takes an exception and jumps to address 0x200, where the exception handler would be placed (note that no handler has been configured; I haven't implemented one yet), and it gets stuck at address 0x200 in an infinite loop (the default behaviour when no handler is present). From QEMU I was able to capture the type of exception:

Taking exception 1 [Undefined Instruction]
...from EL3 to EL3
...with ESR 0x0/0x2000000
...with ELR 0x200
...to EL3 PC 0x200 PSTATE 0x3cd
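The syndrome value in that log can be split into its fields. This is a minimal sketch, assuming the second number on the ESR line (0x2000000) is the raw syndrome value and using the ESR_ELx field layout from the ARMv8-A architecture; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of ESR_ELx (ARMv8-A layout). */
struct esr_fields {
    uint32_t ec;   /* exception class, bits [31:26] */
    uint32_t il;   /* instruction length flag, bit 25 (1 = 32-bit instruction) */
    uint32_t iss;  /* instruction specific syndrome, bits [24:0] */
};

/* Split a raw 32-bit syndrome value into its three fields. */
static struct esr_fields decode_esr(uint32_t esr) {
    struct esr_fields f;
    f.ec  = (esr >> 26) & 0x3F;
    f.il  = (esr >> 25) & 0x1;
    f.iss = esr & 0x1FFFFFF;
    return f;
}
```

For 0x2000000 this gives EC = 0 ("unknown reason", the class used for undefined instructions), IL = 1 and ISS = 0. Together with ELR 0x200, this suggests that the logged exception was raised by the all-zero word at 0x200 itself, so it may not be the first exception that was taken.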

I'm compiling with the following command:

aarch64-elf-gcc -Wall -O0 -ffreestanding -nostdinc -nostdlib -nostartfiles -mcpu=cortex-a53 -g -c ... -o ...

To check whether this was a "compiler" problem, I also tried running the following totally useless code:

void a(){
  for(int j=0; j<10; j++);
  return;
}

void b(char* string){
  for(int i = 0; i<10; i++){
    a();
  }
  return;
}

void kernel_main(){

  a();
  b("test");

  while(1);

  return;
}

but it executes without any problem. Now I can't figure out what's going wrong. There is nothing wrong in the C code that produces that assembly, and the addresses in the assembly code seem fine. Any idea where the problem comes from? Why the Undefined Instruction exception? I can provide more details if needed.

AlePalu
  • Is there a reason why you are using `unsigned int *AUX_MU_IO_REG` to write a byte? This will write at least 16 bits. What else will be written to as a consequence? – Weather Vane May 28 '19 at 16:15
  • well, unsigned int* refers to the address the register AUX_MU_IO_REG is mapped to; in other words, to write to AUX_MU_IO_REG I need to write to that memory location. Furthermore, according to the BCM2837 datasheet, even though the register is 32 bits wide, only the first 8 bits are accessible, so I can only write one byte at a time and all the rest is simply ignored. Moreover, every code snippet I've seen uses volatile unsigned int* to access registers (take a look, for example, at the OsDev link in the question: the function mmio_write() does exactly what I do, except that it uses uint32_t instead of unsigned int) – AlePalu May 28 '19 at 16:27
  • Presumably then the processor is little-endian. – Weather Vane May 28 '19 at 16:29
  • ok, according to the ARM documentation (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0500e/CHDCGHEA.html) instructions are always little-endian. Can you please explain how this impacts the execution? I mean, why does miniUartSendByte() cause the exception on the jump while functions like a() or b() don't? Also, if the problem were in the data stored in registers, shouldn't I see problems in the UART behaviour (say, I want to send 'c' but something totally different is sent) rather than in the code execution? Sorry, but I'm a bit confused... – AlePalu May 28 '19 at 16:45
  • I've tried compiling with the -mlittle-endian flag in gcc, but the result doesn't change. Actually gcc already compiles in little-endian format (I think it is able to detect the endianness of the target by looking at the -mcpu argument); indeed, if I compile with -mbig-endian, the following is returned: "compiled for a big endian system and target is little endian". I don't think this is the problem... – AlePalu May 28 '19 at 16:56
  • It perhaps has nothing to do with the error, except I wondered if it could cause something indirectly, for example triggering an interrupt that wasn't expected by accidentally configuring another register, but you say this can't be the case. The relevance of little-endian is that if the register is mapped to the first byte of the four-byte range, then it needs to be little-endian for the 8-bit value to align. – Weather Vane May 28 '19 at 17:05
  • You seem to be missing the initialization of the UART. You need to disable interrupts explicitly somewhere. Are you doing that? – Marcos G. May 28 '19 at 17:10
  • yes, before calling any routine related to the UART, I call a function miniUartInit() responsible for the initialization, but this also causes the same exception to occur... actually, using gdb I was able to discover that the exception occurs at the very first statement executed, `*(AUX_ENABLES) |= 0x1`. And if I comment out this line, the exception occurs at the line after it.... – AlePalu May 28 '19 at 17:20
  • @WeatherVane its a 32 bit register, pretty sure, no reason in this case to use a byte sized write. well versed in the pi's and doing an str vs strb is not the problem. – old_timer May 28 '19 at 17:21
  • I'm able to reproduce the error directly in main by inserting, for example, `*(AUX_ENABLES) |= 0x1`. maybe @WeatherVane the problem is exactly what you're saying, there may be something wrong in how the memory is accessed – AlePalu May 28 '19 at 17:23
  • you are not using interrupts (yet) correct? are you using the pi foundations code to sort the cores I assume or are you taking care of that? that should not cause a fault like this it will instead just multiply the number of characters come out. so come out of reset, someone sorts the cores, one core inits the uart which is a handful/dozen writes, then write this register but none of that works? – old_timer May 28 '19 at 17:23
  • Have you looked at the countless examples at the raspberry pi bare metal forum? lots of working code, arm/aarch32, aarch64, hyp mode svc mode, they manage the sorting of cores. you manage the sorting of cores execution layer 3, execution layer 2, etc... – old_timer May 28 '19 at 17:25
  • I see .org in those examples you linked....find other examples. – old_timer May 28 '19 at 17:26
  • If this is a pi3 with an led on a gpio, try blinking an led first, then try the uart later. somewhere between 10 and 20 lines of C code to do that. my uart init is 15 lines of C then a similar poll and poke putchar as yours. two lines. you could post your whole project and the disassembly and filename used and if you have a config.txt and if so whats in it. – old_timer May 28 '19 at 17:31
  • actually only core0 is running at the moment. I've written a bootloader which, before launching the kernel, parks all the cores with an ID other than 0 (they wait in a WFI instruction). I will manage multicore later... hence you can consider the system single-core, with no concurrency or anything like that for now – AlePalu May 28 '19 at 17:31
  • so you have already done work on this platform? but now you have an elementary ldr/str problem? confused. did you start on a pi-zero before coming to the pi3 to get your feet wet with baremetal and/or the peripherals here and/or arm. And you are using qemu not hardware? start with something simpler... – old_timer May 28 '19 at 17:33
  • if on qemu you likely dont need to init the uart, you can often cheat and poke the register, so like 3 or 4 lines of assembly for the whole project. one line of C – old_timer May 28 '19 at 17:33
  • problem solved, and it was a totally stupid problem... I had set the wrong base address for the peripherals: I trusted the BCM datasheet (which reports 0x7E000..0 as the base address), but the right one is 0x3F000..0. Thank you to all of you guys! – AlePalu May 28 '19 at 17:47
  • having code at 0x0010xxxx looks wrong as well you making a kernel8.img? – old_timer May 28 '19 at 20:44

1 Answer

The problem was the wrong base address used for accessing the peripherals' registers. Once I set it to 0x3F000000, no exception was raised anymore.

I had set the base address to 0x7E000000, misreading what the BCM datasheet reports:

Physical addresses range from 0x3F000000 to 0x3FFFFFFF for peripherals. The bus addresses for peripherals are set up to map onto the peripheral bus address range starting at 0x7E000000. Thus a peripheral advertised here at bus address 0x7Ennnnnn is available at physical address 0x3Fnnnnnn.

But later it adds:

The peripheral addresses specified in this document are bus addresses. Software directly accessing peripherals must translate these addresses into physical or virtual addresses
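The translation described in those two passages can be sketched as follows. This is only an illustration: the macro and function names are hypothetical, and it assumes the MMU is off (or the peripheral range is identity-mapped), so that physical addresses can be dereferenced directly.

```c
#include <stdint.h>

#define PERIPHERAL_BUS_BASE  0x7E000000u  /* base used in the datasheet's register tables */
#define PERIPHERAL_PHYS_BASE 0x3F000000u  /* base seen by the ARM cores on the BCM2837 */

/* Translate a datasheet bus address (0x7Ennnnnn) to a physical address (0x3Fnnnnnn). */
static uint32_t bus_to_phys(uint32_t bus_addr) {
    return bus_addr - PERIPHERAL_BUS_BASE + PERIPHERAL_PHYS_BASE;
}

/* Build a register pointer from a bus address; valid only without address translation. */
static volatile unsigned int *mmio_reg(uint32_t bus_addr) {
    return (volatile unsigned int *)(uintptr_t)bus_to_phys(bus_addr);
}
```

With this, the two addresses from the disassembly in the question, 0x7E215054 and 0x7E215040, become 0x3F215054 and 0x3F215040, which are the locations the code should actually access.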

AlePalu