
I'm building a Windows Phone project with some bits of it in assembly. My assembly file is in ARM mode (CODE32), and it tries to jump to a C function that I know is compiled to Thumb. The code goes like this:

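    ldr r12, [pFunc]      ; load the pointer stored at pFunc
    mov pc, r12           ; jump by writing PC directly (not BX)
pFunc
    dcd My_C_Function     ; address filled in at link time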

Here's the weird thing. The value at pFunc in the snippet is a pointer to a function thunk, plus one. That is, bit 0 is set, as if the jump target is meant to be Thumb and the jump instruction is meant to be BX. But the thunk is clearly ARM! The thunk loads the address of the function body plus one and executes a BX to it, properly switching modes.

Trying to BX to that address would probably crash, because that would switch modes and trying to execute ARM code in Thumb mode is not a good idea. Trying to simply jump to that address (as the current code does) would probably crash too, because PC would end up unaligned.

I could, in theory, manually clear bit 0 and then jump, but there's gotta be some error in my thinking. The thunk is generated by the C compiler, right? The C compiler knows that the thunk is ARM code. The address under pFunc is generated by the linker, since it's a cross-module call. So the low bit is placed there by the linker; why doesn't the linker know that those thunks are ARM?

Any explanation, please?

I don't have a WP8 device now, so I can't try it on real hardware. Staring hard at the generated code is the only debugging technique that I have :(

EDIT: but what if those thunks are not ARM, but Thumb-2? Thumb-2 supports some 32-bit instructions, IIRC. Is their encoding the same as in ARM mode? How does Thumb-2 decode instructions, anyway?

Seva Alekseyev
  • If you have defined all of your labels right in assembly (with the GNU assembler you precede Thumb labels with .thumb_func; not sure how to do it with other toolchains), the C should take care of itself. From there the assembler, compiler, and linker will take care to bx to an even or odd address depending on the destination. – old_timer Aug 16 '13 at 01:33
  • The assembler is the ARM flavor of MASM. I'll check if I can decorate a label accordingly. But on general grounds, I may not, in theory, know what modes other object files are in. What if they are third party objects? I'd expect the linker to be smart about it. – Seva Alekseyev Aug 16 '13 at 02:03
  • That is my point: I would expect the linker to be smart about it. Use the disassembler, if you have one, to check and make sure the tools did it right. – old_timer Aug 16 '13 at 03:10
  • That's exactly what I'm doing, lacking a device. – Seva Alekseyev Aug 16 '13 at 13:03

3 Answers


The details you want are specified in section "A2.3.2 Pseudocode details of operations on ARM core registers" of "ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition". Here is the relevant pseudocode (from the above manual) about writes to PC register:

BXWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ThumbEE then
        if address<0> == '1' then
            BranchTo(address<31:1>:'0');  // Remaining in ThumbEE state
        else
            UNPREDICTABLE;
    else
        if address<0> == '1' then
            SelectInstrSet(InstrSet_Thumb);
            BranchTo(address<31:1>:'0');
        elsif address<1> == '0' then
            SelectInstrSet(InstrSet_ARM);
            BranchTo(address);
        else // address<1:0> == '10'
            UNPREDICTABLE;

If the low bit of the address (bit 0) is set, the processor will clear this bit, switch to Thumb mode, and perform a jump to the new address.
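
To make the three cases concrete, here is a minimal C sketch of the non-ThumbEE branch of the pseudocode above. The names `instr_set`, `branch_result`, `bx_write_pc` and `bx_write_pc_demo`, as well as the sample address, are made up for illustration; this only models the decision and is not how any real tool or CPU is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; models only the non-ThumbEE half of BXWritePC. */
typedef enum { ISET_ARM, ISET_THUMB, ISET_UNPREDICTABLE } instr_set;

typedef struct {
    instr_set state; /* instruction set selected by the branch */
    uint32_t  pc;    /* address actually branched to */
} branch_result;

branch_result bx_write_pc(uint32_t address)
{
    branch_result r;
    if (address & 1u) {               /* address<0> == '1' */
        r.state = ISET_THUMB;
        r.pc = address & ~UINT32_C(1); /* bit 0 is dropped, not executed */
    } else if ((address & 2u) == 0) { /* address<1> == '0' */
        r.state = ISET_ARM;
        r.pc = address;
    } else {                          /* address<1:0> == '10' */
        r.state = ISET_UNPREDICTABLE;
        r.pc = 0;                     /* placeholder; no defined target */
    }
    return r;
}

/* Usage example with a made-up pointer value that has bit 0 set. */
void bx_write_pc_demo(void)
{
    branch_result r = bx_write_pc(UINT32_C(0x00401235));
    assert(r.state == ISET_THUMB);
    assert(r.pc == UINT32_C(0x00401234));
}
```

So a pointer with bit 0 set never produces an unaligned PC here: the bit is consumed as the Thumb flag and the branch goes to the even address.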

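This behaviour applies to ARMv7 and later (i.e. to all Windows Phone devices, but not all Android/iOS devices). Note that on ARMv7 it is not limited to BX: a `mov pc, r12` executed in ARM state also goes through BXWritePC, so the code in the question switches to Thumb instead of ending up with an unaligned PC. On earlier architectures such a MOV to PC is a plain branch with no mode switch.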

Marat Dukhan
  • This confirms my suspicion that the code is way off, but doesn't explain why the linker would write a Thumby pointer to an ARMy destination. It is the linker, right? – Seva Alekseyev Aug 16 '13 at 02:06
  • The misunderstood part in your question is `Trying to simply jump to that address (as the current code does) would probably crash too, because PC would end up unaligned.`. As suggested by the above pseudo-code, the processor will clear the lowest bit automatically, and only use it as an indication to switch to Thumb mode. – Marat Dukhan Aug 16 '13 at 04:02
  • I don't know which tool is responsible for setting bit 0 in function pointers on Windows Phone. It could be either the static linker (addresses of exported Thumb functions will have bit 0 set) or the loader (it can use metainfo about Thumb functions to set bit 0 in their pointers in the import table). Practically, there is little difference. – Marat Dukhan Aug 16 '13 at 04:02
  • I'm looking at the DLL code in the file, not at the executing code. The loader didn't have a chance yet :) – Seva Alekseyev Aug 16 '13 at 13:07

You may well have a real problem on your hands. As far as I know, the Windows Phone environment is required to be Thumb-2 only, so the linker you are using might not be able to handle calls to ARM mode. See the same link for some considerations when you are mixing assembly and C.

If this were a Linux / ELF question, I would have answered differently:

"The address under pFunc is generated by the linker, since it's a cross-module call. So the low bit is placed there by the linker."

pFunc is generated by the compiler, and the linker fixes it up when you build a loadable image, in which statically relocated calls are already fully resolved. All portable object files should contain a table describing function modes, so that later linkers can process them, perform relocation, and update call sequences accordingly.

See ELF for the ARM Architecture - 4.6 Relocation for how this is done via ELF files.
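
As a rough illustration of the ELF case only (not of what the Windows Phone toolchain does), the ARM ELF specification defines the result of an absolute 32-bit data relocation as `(S + A) | T`, where T is 1 when the target symbol is a function that addresses a Thumb instruction. A minimal C sketch follows, assuming S is given with bit 0 clear; `symbol_info`, `resolve_abs32` and `resolve_abs32_demo` are hypothetical names, not real ELF structures or linker APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified symbol record; not a real ELF structure. */
typedef struct {
    uint32_t value;       /* S: symbol address, assumed to have bit 0 clear */
    int      is_function; /* nonzero if the symbol has function type */
    int      is_thumb;    /* nonzero if it addresses a Thumb instruction */
} symbol_info;

/* Computes (S + A) | T for a word such as the one emitted by "dcd". */
uint32_t resolve_abs32(const symbol_info *sym, uint32_t addend)
{
    uint32_t t = (sym->is_function && sym->is_thumb) ? 1u : 0u;
    return (sym->value + addend) | t;
}

/* Usage example with made-up addresses. */
void resolve_abs32_demo(void)
{
    symbol_info thumb_fn = { UINT32_C(0x00010200), 1, 1 };
    symbol_info arm_fn   = { UINT32_C(0x00010300), 1, 0 };
    assert(resolve_abs32(&thumb_fn, 0) == UINT32_C(0x00010201));
    assert(resolve_abs32(&arm_fn, 0) == UINT32_C(0x00010300));
}
```

Under this model the low bit in a stored function pointer comes from the per-symbol mode information, which is why the object file needs to record it.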

auselen
  • Relevant quote: "The assembler sets the LSB of each word of a symbol in a code section because it interprets them as destination instructions." Sounds like this might be it. – Seva Alekseyev Aug 16 '13 at 13:07

The disassembler (IDA Demo, specifically) misled me by merging several instructions into one line. It made a single mov line out of a mov-orr-orr sequence that was meant to assign a full 32-bit constant to a register. The thunks were Thumb after all. The linker works as designed.

IDA is otherwise great. I knew in the back of my mind about this particular behavior regarding ARM, but this time it slipped.

My bad, thanks and upvotes to everyone who tried to help.

Seva Alekseyev