Why does the REP LODS AL instruction exist?

Question

In other words, is there any case I might need this instruction?

According to the Intel instruction manual, this is what the instruction does:

Load (E)CX bytes from DS:[(E)SI] to AL.

Take the following example in NASM:

section .data
    src: db 0, 1, 2, 3

section .text
    mov esi, src
    mov ecx, 4
    rep lodsb    ; same AL as mov al, byte [esi + ecx - 1]; also esi += ecx, ecx = 0 (DF = 0)

While rep lodsb is executing, I have no control over the value loaded into al. All I can do is wait until the instruction terminates to get the last value of al, which, of course, I could have loaded directly without the loop.
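To make the point concrete, here is a minimal C sketch that models what `rep lodsb` does to the registers it touches, assuming DF = 0 (forward direction). It is a simulation, not real assembly: the `cpu_regs` struct and the helper names `rep_lodsb` and `last_byte` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file: only the registers `rep lodsb` touches. */
typedef struct {
    const uint8_t *esi;  /* source pointer (DS:ESI) */
    uint32_t ecx;        /* repeat count */
    uint8_t al;          /* destination register */
} cpu_regs;

/* Model of `rep lodsb` with DF = 0: every byte is loaded, only the last one survives. */
static void rep_lodsb(cpu_regs *r) {
    while (r->ecx != 0) {
        r->al = *r->esi;  /* lodsb: load one byte into AL */
        r->esi += 1;      /* lodsb: advance ESI */
        r->ecx -= 1;      /* rep: count down to zero */
    }
}

/* The loop-free alternative: mov al, byte [esi + ecx - 1]. Requires ecx != 0. */
static uint8_t last_byte(const uint8_t *esi, uint32_t ecx) {
    return esi[ecx - 1];
}
```

Running `rep_lodsb` on the four-byte `src` array from the question leaves AL holding the final byte, ESI pointing just past the array, and ECX at zero, while `last_byte` yields the same AL without touching ESI or ECX. With ECX = 0 on entry the modeled instruction does nothing and AL keeps its old value.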

The same question goes for the rest of the family: REP LODS AX, REP LODS EAX and REP LODS RAX.

My ancient Microsoft MASM "Programmer's Guide" says that `LODS` is the only one in that family of instructions which does not take a repeat prefix. — Weather Vane, Jun 11 '17 at 17:20
It exists because it was too expensive to make it not exist. You can't afford niceties when you have to build a 16-bit processor from only 20000 transistors. — Hans Passant, Jun 11 '17 at 17:32
@WeatherVane, I think it's not up to MASM to decide for us, LOL — Bite Bytes, Jun 11 '17 at 17:36
@BiteBytes certainly it assembles with inline assembly in my current MSVC C compiler, and runs correctly too. — Weather Vane, Jun 11 '17 at 17:43
Note that `lodsX` has side effects: it changes `r/e/si` and the direction of the loop is determined by the `DF` flag. Furthermore, it can trigger an exception and will (inefficiently) cache the data (if caching is enabled). — Margaret Bloom, Jun 11 '17 at 18:51
@MargaretBloom it changes `al` too, but we can't do anything, besides watching it changing. — Bite Bytes, Jun 11 '17 at 19:04
If you went through the instruction set thoroughly, you would find several instructions that feel like "this is not really needed". While I have learned to appreciate the design of the x86 instruction set over the years, and give its designer(s) more credit than I initially thought, you still have to consider the actual HW layout of transistors on the chip a major factor shaping the content of the instruction set. How convenient it would be to code for was important, but sometimes compromises happened. As movs/lods/stos surely share many transistors, I guess you can't remove `rep lods` easily. — Ped7g, Jun 11 '17 at 21:26
If the MASM manual says that, @Weather, it is incorrect. Intel's documentation for `LODS` clearly says: *"The `LODS`, `LODSB`, `LODSW`, and `LODSD` instructions can be preceded by the `REP` prefix for block loads of `ECX` bytes, words, or doublewords. More often, however, these instructions are used within a `LOOP` construct because further processing of the data moved into the register is usually necessary before the next transfer can be made."* So either the manual means that `REP` prefixes are rarely *useful*, or it means `REPE`/`REPNE` prefixes (that test `ZF`) can't be used with `LODS`. — Cody Gray - on strike, Jun 12 '17 at 11:45
@CodyGray my comment was based on a table "Requirements for String Instructions" with headings for instruction, repeat prefix(es), source/destination and register pair(s) involved. The repeat prefix for `LODS` is given as *None*. Further on in the discussion of `LODS` it says *Unlike other string instructions, `LODS` is not normally used with a repeat prefix*. So it was a careless paraphrase, my bad, I missed out the word "usually". — Weather Vane, Jun 12 '17 at 17:34
@WeatherVane: MASM could have chosen not to assemble `rep lods`, on the assumption that nobody would ever use it on purpose (since you could always code it with `db` or whatever). Apparently that's not the case, but both the Intel and MASM docs *could* have been correct in saying what you claimed. — Peter Cordes, Jun 28 '17 at 06:16

score 13 · Accepted Answer · answered Jun 11 '17 at 17:22

Reading from memory can have side effects in some cases. For example, this can happen if the memory is mapped to hardware registers. The exact side effect would depend on the hardware. So rep lodsb can be useful in very rare cases where you need the side-effect to happen, but don't need the values that were read.
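As a rough illustration of "read for the side effect, ignore the value", the following C sketch simulates a bank of read-to-clear status bytes. Everything here is an assumption made for the example: `fake_device`, `device_read`, and `read_and_discard` are hypothetical names, and the read-to-clear behavior stands in for whatever a real memory-mapped device might do. No actual hardware is accessed.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4

/* Hypothetical device: consecutive status bytes that clear themselves when read. */
typedef struct {
    uint8_t status[NUM_REGS];
    unsigned reads;  /* how many reads the device has seen */
} fake_device;

/* One simulated bus read; the side effect is that the register is cleared. */
static uint8_t device_read(fake_device *dev, size_t offset) {
    uint8_t value = dev->status[offset];
    dev->status[offset] = 0;
    dev->reads++;
    return value;
}

/* Model of `rep lodsb` (DF = 0) over the register bank: each register is read
 * once, and only the last value read is left in AL. `al` is the prior AL value,
 * returned unchanged when count is zero. */
static uint8_t read_and_discard(fake_device *dev, size_t start, size_t count,
                                uint8_t al) {
    while (count != 0) {
        al = device_read(dev, start);
        start++;
        count--;
    }
    return al;
}
```

Calling `read_and_discard` across all `NUM_REGS` registers clears every one of them, which is the whole purpose in this scenario; the value that ends up in AL is incidental. A single load of the last register would leave the earlier ones untouched.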

score 8 · Answer 2 · answered Jun 11 '17 at 17:49

...is there any case I might need this instruction?

You might not need it per se, but coding in the 16-bit world, where space conservation is important, I've more than once preferred

rep lodsb

over the longer (in bytes)

add si, cx
mov al, [si-1]

Of course this is all very situation dependent.

Note that your replacement instruction to fetch without the loop is actually mov al, byte[esi + ecx -1]
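The size difference can be checked by counting machine-code bytes. The sketch below stores hand-assembled 16-bit-mode encodings in C arrays; the array names and the `bytes_saved` helper are hypothetical, and the encodings assume default operand and address sizes (an assembler may pick the equally long `03 F1` form for `add si, cx`).

```c
#include <stddef.h>
#include <stdint.h>

/* rep lodsb: REP prefix (F3) + LODSB opcode (AC) */
static const uint8_t rep_lodsb_code[] = { 0xF3, 0xAC };

/* add si, cx      -> 01 CE
 * mov al, [si-1]  -> 8A 44 FF */
static const uint8_t add_then_mov_code[] = { 0x01, 0xCE, 0x8A, 0x44, 0xFF };

/* Bytes saved by using `rep lodsb` in place of the two-instruction sequence. */
static size_t bytes_saved(void) {
    return sizeof add_then_mov_code - sizeof rep_lodsb_code;
}
```

Under these assumptions the two-instruction sequence takes 5 bytes against 2 for `rep lodsb`. The two are not fully interchangeable, though: `rep lodsb` also leaves CX at zero and reads every byte along the way.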

Sep Roland

It seems it's very difficult to need it. Thanks for the correction too. – Bite Bytes Jun 11 '17 at 18:09
