-1

I'm writing a program in real-mode x86 assembly and I'm trying to do this:

xor %ds, %ds

instead of the alternative:

mov $0, %ax
mov %ax, %ds

However, I'm getting an assembler error:

Error: operand type mismatch for `xor'

Why is this? Is there a way to get around this to decrease code size?

S.S. Anne
  • There is no `xor` with segment registers for operands; refer to the instruction set reference for details. If you want to shorten the code size, you can do `pushw $0; popw %ds`. – fuz Apr 11 '19 at 19:44
  • That would still take two bytes. – S.S. Anne Apr 11 '19 at 20:23
  • Three bytes, actually. I don't think there is a shorter solution. – fuz Apr 11 '19 at 20:25
  • For my purpose, there is a shorter version: `mov %cs, %ds`. As the data for the program resides in the same space that the program does, they should essentially be the same value. – S.S. Anne Apr 11 '19 at 20:28
  • That won't work either because there is no encoding for a MOV instruction where both operands are segment registers. You could do `push %cs; pop %ds` which is two bytes. Note that if you're writing a bootsector it's a common mistake to assume CS is 0. – Ross Ridge Apr 11 '19 at 20:31
  • Thanks. Does that mean that I have to set up the data segment after I initialize the stack? – S.S. Anne Apr 11 '19 at 20:34
  • @JL2210 No, do as you like. – fuz Apr 11 '19 at 20:38
  • Is there any way to make this question less "primarily opinion-based" without invalidating the current answers? – S.S. Anne Apr 14 '19 at 21:16
  • @JL2210 Well, you may reword it as "I wonder why this is. Is there a way to get around this to decrease code size?" This way your question is about the constructive part but also allows commentary on the first part. – YakovL Jun 16 '19 at 09:02

2 Answers

5

Is there a way to get around this to decrease code size?

The shortest way seems to be the following:

xor %ax, %ax
mov %ax, %ds

Unless I've made a mistake, this is one byte shorter than the variant using mov $0, %ax: in 16-bit code, xor %ax, %ax takes two bytes, while mov $0, %ax takes three.
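
For comparison, here is a sketch of the variants mentioned on this page with their usual 16-bit encodings, assuming GNU as in `.code16` mode. The "Variant" labels are only for illustration, and an assembler may choose an equivalent encoding of the same length for `xor`.

```asm
        .code16                 # assemble for 16-bit real mode

        # Variant A: 5 bytes, clobbers AX
        mov   $0, %ax           # B8 00 00
        mov   %ax, %ds          # 8E D8

        # Variant B: 4 bytes, clobbers AX and the flags
        xor   %ax, %ax          # 31 C0
        mov   %ax, %ds          # 8E D8

        # Variant C: 3 bytes, needs a usable stack and an 80186 or later
        pushw $0                # 6A 00
        pop   %ds               # 1F

        # Variant D: 2 bytes, needs a usable stack; DS becomes CS, not 0
        push  %cs               # 0E
        pop   %ds               # 1F
```

Variants C and D write through SS:SP, so they are only safe once the stack has been set up. Variant D gives a zero DS only if CS happens to be zero.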

Do you know why they chose to not allow segment registers as operands for xor? That would be interesting to know.

I doubt that the designers of the CPU intended xor %ax, %ax to be used for zeroing a register. Rather, they wanted it to be possible to xor the ax register with any other register.

Because forbidding xor-ing ax with itself would have been more difficult than allowing it, the CPU accepts not only operations on two different registers (like xor %bx, %ax) but also operations that use the same register for both operands.

For segment registers this is different:

The only purpose of segment registers is to store a memory segment; these registers are not intended to store any other kind of information.

There are only very few situations where an arithmetic (or bit-wise) operation with memory segments makes sense. One example would be an addition in the case of arrays that are longer than 64 KiB; in such cases an add operation might have been useful.
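
As a rough sketch of that huge-array case, assuming GNU as in `.code16` mode and assuming AX is free to clobber, stepping DS forward by 64 KiB has to go through a general-purpose register:

```asm
        .code16                 # assemble for 16-bit real mode

        # Advance DS to the next 64 KiB block of a large array.
        mov   %ds, %ax          # copy the segment value into a general-purpose register
        add   $0x1000, %ax      # 0x1000 paragraphs * 16 bytes each = 64 KiB
        mov   %ax, %ds          # write the adjusted segment back
```

A direct `add $0x1000, %ds` does not exist, so the round trip through AX costs two extra instructions.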

However, in most cases you don't perform arithmetic operations on values that represent memory segments.

I think this is why the designers of the CPU decided not to provide any arithmetic operations on segment registers, which let them design a less expensive CPU (see Margaret Bloom's comments).

S.S. Anne
Martin Rosenau
  • This doesn't actually answer the question itself; it just addresses one of the comments I made. – S.S. Anne Apr 11 '19 at 22:05
  • The answer to the question is in the very first comment by @fuz. The rest is just an extra bonus :) – tum_ Apr 11 '19 at 23:36
  • Not so they could design a *less expensive* CPU, but because they didn't want to waste opcode coding space on very-special-case instructions whose functionality can be replicated in a couple extra bytes with any GP integer register. Original 8086 left a bit of room for more opcodes and kept things simple. `0F` escape bytes for 2-byte opcodes came later, IIRC, so yes that was a cost-saving measure for 8086, but the reason no later x86 CPUs added segment-register add/sub/xor is that it's so rarely needed. `mov SR, imm16` would have been the first priority, but we don't even have that. – Peter Cordes Apr 11 '19 at 23:49
  • @JL2210 I'm sorry. I thought that your comment was an "extension" to your question which was not answered, yet. – Martin Rosenau Apr 12 '19 at 05:30
  • @JL2210 I added the first part of the answer: You can save one additional byte. – Martin Rosenau Apr 12 '19 at 05:33
  • @PeterCordes They could have designed the 8086 in a way that the `repz` prefix replaces `ax-dx` by `cs-ss` and `repnz` replaces `si-bp`. The instruction `repz add cx,si` would be interpreted as `add ds,si` and `repnz add si,cx` would be interpreted as `add ds,cx`. This would not have wasted any byte in the opcode, and a lot of operations (e.g. with "huge" arrays) would be possible with fewer instructions and fewer bytes, but the CPU would have been much, much more expensive. – Martin Rosenau Apr 12 '19 at 05:41
  • Oh good point, prefixes that don't apply to some opcodes actually create a lot of wasted coding space that doesn't fault with #UD, so it's easier to forget about it. – Peter Cordes Apr 12 '19 at 05:43
  • @MartinRosenau Thanks. I ended up copying the code segment to the data segment. – S.S. Anne Apr 13 '19 at 13:47
3

The encodings for the XOR instruction do not permit the segment registers as either the first or second operand. You may only use general purpose registers, memory locations, and immediate values.
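
To illustrate, here is a short sketch of operand combinations that `xor` does accept, assuming GNU as in `.code16` mode. The registers and addresses are arbitrary choices for the example.

```asm
        .code16                 # assemble for 16-bit real mode

        xor   %bx, %ax          # general-purpose register with register
        xor   (%si), %ax        # memory operand with register
        xor   $1, %cx           # immediate with register
        xorw  $0x00ff, (%di)    # immediate with memory (size suffix required)

        # xor %ds, %ds          # rejected by the assembler:
                                #   operand type mismatch for `xor'
```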

S.S. Anne
Govind Parmar
  • Do you know why they chose to not allow segment registers as operands for `xor`? That would be interesting to know. – S.S. Anne Apr 11 '19 at 20:25
  • It's relatively easy to answer the why part, though. That would require one more bit for encoding a register, which would not fit in nicely in the space available (modrm, sib, opcode). Plus "a quarter" of that bit would be wasted as there are only six segment registers. All of this to achieve a useless feature: segment registers are seldom changed, even in the old days. Doing arithmetic on them would promote them to GP registers but still you couldn't change `cs`, `ds` and `ss` as these are constantly used. That would leave `es`, `fs` and `gs` as the only benefit. It's not worth it. ... – Margaret Bloom Apr 11 '19 at 20:34
  • ... on a more practical side, it's possible that the segment registers of the 8086 were not connected to the ALU but only used for computing the linear address with a dedicated adder+shifter. This is just speculation though. – Margaret Bloom Apr 11 '19 at 20:35
  • @MargaretBloom Thanks. That makes sense. – S.S. Anne Apr 11 '19 at 20:38
  • Since the folks that made the decision are possibly not around or not on Stack Overflow (or just not answering), it is not possible to answer the "why" question. It's like me trying to guess, 30 years from now, why you bought the shoes you are wearing today. It has no value, and most of the why questions asked here cannot be answered. It's like finding a 100-year-old photo of some stranger at a garage sale and wondering why they have a spoon on the table in the background but no fork. Not a Stack Overflow question. – old_timer Apr 11 '19 at 21:05
  • They could have connected or microcoded anything they wanted with the ALU or an ALU. You can't have everything you could possibly wish for: it wouldn't fit, would never be debugged, and you wouldn't be able to afford it. x86 is not an architecture you want to admire as far as instruction sets go; there are many that are worth studying, pretty much all the others. The thing to admire is how long they have kept it around and how they keep re-inventing the guts to keep it relevant and performing somewhat well for the power it uses. – old_timer Apr 11 '19 at 21:10
  • Or even better, Intel's role in inventing and continuing to drive the tech, despite the flagship product they use that tech for. visual6502, reading up on the PDP-11 and its predecessors, and some others: those are interesting things to ponder. Why did Commodore not continue to dominate? Why didn't Xerox take over and own the world of computing? They pretty much invented it. Atari, etc. Those are interesting why questions. – old_timer Apr 11 '19 at 21:12