I'm looking at a snippet of ARM assembly in which the add instruction is used. The snippet, shown below, simply says: add to the program counter the offset needed to reach the string stored at L_.str, where L_.str is the symbol (the address) of a string contained in the data segment.
movw r0, :lower16:(L_.str-(LPC1_0+4))
movt r0, :upper16:(L_.str-(LPC1_0+4))
LPC1_0:
add r0, pc
The first two instructions (movw and movt) load the 32-bit number, the offset, that is used to reach the address of that string. I'm in Thumb mode, right?
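To make the movw/movt step concrete, here is a minimal Python sketch of how the two 16-bit halves end up as one 32-bit value in r0. The addresses are made up for illustration (they are not taken from any real binary), and `movw_movt` is a hypothetical helper name, not part of any toolchain.

```python
def movw_movt(value: int) -> int:
    """Model loading a 32-bit constant into a register with movw then movt."""
    lower16 = value & 0xFFFF          # the half selected by :lower16:
    upper16 = (value >> 16) & 0xFFFF  # the half selected by :upper16:
    reg = lower16                     # movw: sets bits 0-15, clears bits 16-31
    reg = (upper16 << 16) | (reg & 0xFFFF)  # movt: sets bits 16-31, keeps bits 0-15
    return reg

# Hypothetical addresses, chosen only for illustration.
LPC1_0 = 0x00001008  # assumed address of the `add r0, pc` instruction
L_STR = 0x00012F40   # assumed address of the string labelled L_.str

# The expression handed to the assembler: L_.str-(LPC1_0+4)
offset = (L_STR - (LPC1_0 + 4)) & 0xFFFFFFFF
r0 = movw_movt(offset)
print(hex(r0))
```

With these assumed addresses, r0 holds the distance from LPC1_0+4 to the string after the two instructions, not the string's address itself.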
That said, I have difficulty figuring out the overall memory layout. Is the following the right representation of the code segment in memory? In addition, are LPC1_0 and L_.str the base addresses of add r0, pc and of the "A simple string" string, respectively? What is the size of each box? Is it 32 bits or 64 bits, depending on the architecture?
--------------------------------------------
| movw r0, :lower16:(L_.str-(LPC1_0+4)) |
--------------------------------------------
| movt r0, :upper16:(L_.str-(LPC1_0+4)) |
-------------------------------------------- LPC1_0
| add r0, pc |
--------------------------------------------
.
.
.
-------------------------------------------- L_.str
| "A simple string" |
--------------------------------------------
If so, I could just obtain the offset (the one that will be added to the pc) as the difference L_.str-LPC1_0. But here +4 is also taken into account.
ADD Rd, Rp, #expr
If Rp is the pc, the value used is: (the address of the current instruction + 4) AND &FFFFFFFC.
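As a sanity check on the arithmetic, the rule quoted above can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the addresses are invented, LPC1_0 is assumed to be 4-byte aligned (so the AND mask changes nothing here), and `thumb_pc_read` is a hypothetical helper name.

```python
def thumb_pc_read(instr_addr: int) -> int:
    """Value obtained when pc is read as an operand, per the quoted rule."""
    return (instr_addr + 4) & 0xFFFFFFFC

# Hypothetical addresses, chosen only for illustration.
LPC1_0 = 0x00001008  # assumed (4-byte aligned) address of `add r0, pc`
L_STR = 0x00012F40   # assumed address of the string labelled L_.str

# movw/movt: r0 = L_.str-(LPC1_0+4)
r0 = (L_STR - (LPC1_0 + 4)) & 0xFFFFFFFF

# add r0, pc: pc reads as LPC1_0+4, not LPC1_0
r0 = (r0 + thumb_pc_read(LPC1_0)) & 0xFFFFFFFF

print(hex(r0), hex(L_STR))
```

Under these assumptions the +4 inside the movw/movt expression cancels the +4 contributed by reading the pc, so r0 ends up equal to the address of the string. Using L_.str-LPC1_0 instead would leave r0 four bytes past it.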
So it appears that if Rp is the pc, I also need to account for 4 more bytes in the offset. OK, so where are these bytes added? Why are these 4 bytes taken into account in the mov instructions and not just before the add instruction? Is this an optimization introduced by the compiler?