Fast way to get a valid imm num for arm mov?

Question

The ARM mov instruction has a limitation: its immediate must be an 8-bit value rotated right by an even number of bits. So we can write:

mov ip, #0x5000

But we cannot write:

mov ip, #0x5001

0x5001 can be split as 0x5000 + 1, that is, the sum of a valid immediate and a small number.

So for a given 32-bit number, how can I quickly find the closest valid immediate? Something like this:

uint32 find_imm(uint32 src, bool less_than_src) {
...
}

// x is 0x5000
uint32 x = find_imm(0x5001, true);
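
For illustration, one way to fill in such a function without testing every 32-bit number is to enumerate only the 16 × 256 = 4096 encodable immediates and keep the closest one on the requested side. This is a minimal sketch under some assumptions: it uses `uint32_t` from `<stdint.h>` in place of `uint32`, the helper name `ror32` is made up here, "less than" is read as "less than or equal", and returning 0 when no valid immediate at or above `src` exists is a convention chosen only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 <= n < 32). */
uint32_t ror32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Closest valid mov immediate on one side of src.
 * less_than_src == true : largest valid immediate <= src (0 always qualifies).
 * less_than_src == false: smallest valid immediate >= src, or 0 if none exists
 *                         (assumed convention; the caller can test result < src). */
uint32_t find_imm(uint32_t src, bool less_than_src) {
    uint32_t best = 0;
    bool have_best = less_than_src;
    for (unsigned rot = 0; rot < 32; rot += 2) {          /* even rotations only */
        for (uint32_t imm8 = 0; imm8 < 256; imm8++) {
            uint32_t v = ror32(imm8, rot);                /* one encodable value */
            if (less_than_src) {
                if (v <= src && v > best)
                    best = v;
            } else if (v >= src && (!have_best || v < best)) {
                best = v;
                have_best = true;
            }
        }
    }
    return best;
}
```

With this sketch, `find_imm(0x5001, true)` gives 0x5000 and `find_imm(0x47111723, true)` gives 0x47000000. The loop always runs 4096 iterations, whatever the input.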

So, exactly what are you asking? Seems like a fairly straightforward case of "check if there is anything in the lower bits [that can trivially be added back in], use AND to mask that off and see if the rest can be shifted down to a small number". — Mats Petersson, Jul 20 '13 at 09:59
My question is: I want to move 0x5001 into ip, but I cannot write `mov ip, #0x5001`. I can write `mov r4, #1; add ip, r4, #0x5000`, so I want an algorithm that gets 0x5000 from 0x5001. And I don't want to write an ARM pseudo-instruction... — scvyao, Jul 20 '13 at 10:10
I understand your goal of loading 0x5001 as "(5 << 12) + 1", but I don't understand what you are asking for? — Mats Petersson, Jul 20 '13 at 10:12
Splitting 0x5001 into 0x5000 + 1 is easy, but how do I split **any** 32-bit number? — scvyao, Jul 20 '13 at 10:14
Well, presumably "any" doesn't work, since 0x47111723, for example, can't be made from shifting. But the method I described above, "mask off the low part and shift", works for 0x4200012 and 0x13ff (assuming the value is 8 bits, a shift and another 8 bits) — Mats Petersson, Jul 20 '13 at 10:47
Why can't 0x47111723 be split? 0x47111723 = 0x47000000 + 0x111723, and 0x47000000 is the closest valid immediate to 0x47111723. But my method tests all numbers, which is inefficient... — scvyao, Jul 20 '13 at 11:09
Oh, now I'm lost again. What are the limits for each number? E.g "base shifted + addend" - what are the limits for the "base" and "addend"? Typically, when you do these things, it's because you can't load a 32-bit number, but you can load a 16 bit or 8 bit number. — Mats Petersson, Jul 20 '13 at 11:11
Maybe I can keep using this method to handle 0x111723... Anyway, please write the algorithm you described above in C, so I can understand it more clearly... — scvyao, Jul 20 '13 at 11:22

score 2 · Answer 1 · answered Jul 20 '13 at 13:11

It is quite simple: look at the distance between the ones. 0x5001 = 0b101000000000001 has 15 significant digits, so it will take you two instructions at 8 bits of immediate each. Also remember to put a rotate in your test: if there are enough zeros, as in 0x80000001, and you rotate that around to 0xC0000000 or 0x00000003, that is only two significant digits by a distance-between-the-ones measurement. So take the immediate, perform a distance-between-the-ones type test, rotate one step, perform the test again, and repeat until all the possible (counter-)rotations have happened. Then go with one of the candidates that needs the smallest number of instructions/immediates.
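
The rotate-and-test loop described above could be sketched in C roughly as follows. This is one possible reading, not code taken from gas: for each even starting rotation it greedily peels off 8-bit chunks and keeps the rotation that needs the fewest. The names `ror32`, `rol32` and `split_imm` are made up for this sketch, and it assumes the chunks are later combined with a `mov` followed by `orr` or `add` instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right / left by n bits (0 <= n < 32). */
uint32_t ror32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << ((32 - n) & 31));
}

uint32_t rol32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Split value into at most four valid immediates whose bitwise OR is value.
 * Returns how many were written to parts[] (0 when value is 0). */
unsigned split_imm(uint32_t value, uint32_t parts[4]) {
    unsigned best = 5;                        /* larger than any real result */
    for (unsigned start = 0; start < 32; start += 2) {
        uint32_t v = ror32(value, start);     /* bit `start` moves to bit 0 */
        uint32_t tmp[4];
        unsigned n = 0;
        for (unsigned i = 0; i < 32; ) {
            if ((v >> i) & 3u) {              /* a one in this even-aligned pair */
                /* take the 8-bit window starting here, rotated back in place */
                tmp[n++] = rol32(v & ((uint32_t)0xFF << i), start);
                i += 8;
            } else {
                i += 2;                       /* skip a pair of zeros */
            }
        }
        if (n < best) {                       /* keep the cheapest rotation */
            best = n;
            for (unsigned k = 0; k < n; k++)
                parts[k] = tmp[k];
        }
    }
    return best;
}
```

For 0x5001 this yields two parts (0x1 and 0x5000), for 0x80000001 a single part because the value wraps around, and for 0x47111723 four parts.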

GNU as already does this, and gas is open source, so you can just go get their code if you prefer. When you use the load address trick:

ldr rd,=const

If that const can be resolved in a single move immediate instruction, then gas encodes it as:

mov rd,#const

If it can't, then it tries to find a location to put the word and encodes it as a PC-relative load:

ldr rd,[pc,#offset]
...
.word const

Also, the *gnu assembler* takes the `mvn` form as well. If there is a single instruction it is used, if not then a `ldr rd,[pc,#offset]` is used. This is faster than the multiple computations as ALU units are not used. Only the *load unit* is busy to fetch the constant and the constant is the same size as an ARM instruction. — artless noise, Jul 21 '13 at 19:18
Right, you also have to take the ones' complement of each candidate and see if that fits within a mvn... — old_timer, Jul 21 '13 at 21:27
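
The mov/mvn check from these two comments might look like this in C. It is a minimal sketch: `rol32`, `is_arm_imm`, `classify_const` and the enum are hypothetical names, not anything taken from the gas source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by n bits (0 <= n < 32). */
uint32_t rol32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x << n) | (x >> ((32 - n) & 31));
}

/* True if v is an 8-bit value rotated right by an even amount. */
bool is_arm_imm(uint32_t v) {
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rol32(v, rot) <= 0xFF)   /* undoing the rotation leaves 8 bits */
            return true;
    return false;
}

typedef enum { ENC_NONE, ENC_MOV, ENC_MVN } arm_enc;

/* Decide whether one instruction can load v. */
arm_enc classify_const(uint32_t v) {
    if (is_arm_imm(v))
        return ENC_MOV;              /* mov rd, #v  */
    if (is_arm_imm(~v))
        return ENC_MVN;              /* mvn rd, #~v */
    return ENC_NONE;                 /* literal pool or several instructions */
}
```

For example, 0x5000 classifies as `ENC_MOV`, 0xFFFF00FF as `ENC_MVN` (its complement is 0x0000FF00), and 0x5001 as `ENC_NONE`.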

score 1 · Accepted Answer · answered Jul 20 '13 at 12:45

There is not a straightforward rule or function for finding ways to construct values. Once a value exceeds what can be loaded easily from immediate values, you usually load it by defining it in the data section and loading it from memory, rather than constructing it from immediate values.

If you do want to construct a value from two immediate values, you must consider a variety of operations, including:

Adding two immediates.
Subtracting two immediates.
Multiplying two immediates.
More esoteric instructions, such as some of the “SIMD” instructions that split 32-bit registers into multiple lanes.

If you must go to three immediate values, there are more combinations. One can find some patterns in the possibilities that reduce the search, but some portion of it remains a “brute force” search. Generally, there is no point in using complicated instruction sequences, since you can simply load the data from a prepared location in memory.
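
To give a feel for the two-immediate case, here is a small brute-force sketch in C that covers only the add and subtract options from the list above (multiplication and SIMD forms are left out). All names are hypothetical. It walks the 4096 encodable immediates and checks whether the remainder is also encodable, using 32-bit wraparound arithmetic as ARM `add` and `sub` do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate right / left by n bits (0 <= n < 32). */
uint32_t ror32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << ((32 - n) & 31));
}

uint32_t rol32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x << n) | (x >> ((32 - n) & 31));
}

/* True if v is an 8-bit value rotated right by an even amount. */
bool is_arm_imm(uint32_t v) {
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rol32(v, rot) <= 0xFF)
            return true;
    return false;
}

/* Look for valid immediates a and b with a + b == v (first pass) or
 * a - b == v (second pass). On success *subtract says which one holds. */
bool split_two(uint32_t v, uint32_t *a, uint32_t *b, bool *subtract) {
    for (int pass = 0; pass < 2; pass++) {
        for (unsigned rot = 0; rot < 32; rot += 2) {
            for (uint32_t imm8 = 0; imm8 < 256; imm8++) {
                uint32_t first = ror32(imm8, rot);
                uint32_t second = (pass == 0) ? v - first : first - v;
                if (is_arm_imm(second)) {
                    *a = first;
                    *b = second;
                    *subtract = (pass == 1);
                    return true;
                }
            }
        }
    }
    return false;   /* needs three or more immediates, or a memory load */
}
```

For 0x5001 this finds a = 1 and b = 0x5000 with `*subtract` false, which matches a `mov` followed by an `add`.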

The ARM assembler has an instruction form to assist with this:

LDR Rd, =const

When the assembler sees this, it places the const value in the literal pool and generates an instruction to load the value from the pool. If you are using a different assembler, it might not have the same instruction form, but you can write the necessary code manually.

Thank you. ARM asm gives me more headaches than x86. — scvyao, Jul 20 '13 at 13:09
