memcmp in DMD vs. GDC and std.parallelism: parallel

Question

I'm implementing a struct with a pointer to some manually managed memory. It all works great with DMD, but when I test it with GDC, it fails on the opEquals operator overload. I've narrowed it down to memcmp. In opEquals I compare the pointed-to memory with memcmp, which behaves as I expect in DMD but fails with GDC.

If I go back and write the opEquals method by comparing each value stored in the manually managed memory one at a time using == on the builtin types, it works in both compilers. I prefer the memcmp route because it was shorter to write and seems like it should be faster (less indirection, iteration, etc.).

Why? Is this a bug?

(My experience with C was 10 years ago and I've been using Python/Java since. I never had this kind of problem in C, but I didn't use it that much.)

Edit:

The memory I'm comparing represents a 2-D array of 'real' values. I just wanted it to be allocated in one chunk so I didn't have to deal with jagged arrays. I'll be using the structs a lot in tight loops. Basically I'm rolling my own matrix struct that will (eventually) cache some frequently used values (trace, determinant) and offer an alternate read-only view into the transpose that doesn't require copying it. I plan to work with matrices of about 10x10 to about 1000x1000 (though not always square).

I also plan on implementing a version that allocates memory with the GC via a ubyte[] and profiling the two implementations.

Edit 2:

Ok, I tried a couple of things. I also have some parallel loops, and I had a hunch that might be the problem. So I added some version statements to make a parallel and non-parallel version. In order to get it to work with GDC, I had to use the non-parallel version AND change real to double.

All cases compiled under GDC. But the unit tests failed, not always consistently on the same line, but consistently at an opEquals call when I used real or parallel. In DMD all cases compiled and ran no problem.

Thanks,

Can you post any more info? It might have to do with something like uninitialized padding. — Adam D. Ruppe, Jan 03 '15 at 20:03
@AdamD.Ruppe Both arrays are initialized with values in them before they are compared with opEquals (at least in the unittests), but I'm not sure what you mean by padding. — Ryan, Jan 03 '15 at 20:32
`real` has a kinda strange size so I was thinking the compilers might pack arrays of them differently. But I just did a quick test between dmd and gdc on my computer and memcmp worked on both compilers. What happens if you change real to double, just to rule that out? — Adam D. Ruppe, Jan 03 '15 at 20:40
The size of `real` also depends on the platform. On Windows and OSX I believe it's 10 bytes, but it's 12 bytes on Linux. At least they are when passed as parameters, I don't know if the size is different when stored as a local or accessed indirectly though. — Orvid King, Jan 03 '15 at 21:16
@AdamD.Ruppe, since I never said anything about the parallel before and you were right about the doubles too, if you post that as an answer I'll happily accept it. I'm still confused, though. Does memcmp work different in different compilers? Why would it matter how they are packed? As long as the areas in memory they are comparing are the same, should be no problem. — Ryan, Jan 04 '15 at 00:04
GDC developer here: I remember we had a similar issue when two reals were compared with `is` instead of `==` but I can't find the bug report right now. IIRC the problem is that GCC backend treats real as 80bit data + xbit-'garbage' and usually doesn't copy or initialize the garbage part. Therefore the 'garbage' part is often simply random data from whatever used to be at that memory location previously, causing bit-by-bit comparisons to fail. I'm not sure if this is actually defined in the D spec or how hard this is to fix in GDC, but please file a bug report on http://bugzilla.gdcproject.org/ — jpf, Jan 04 '15 at 10:53

score 2 · Accepted Answer · answered Jan 06 '15 at 04:07

real has a bit of a strange size: it holds 80 bits of data, but if you check real.sizeof, you'll see it is bigger than that (at least on Linux; I think it is 10 bytes on Windows, so I betcha you wouldn't see this bug there). The reason is to make sure each element is aligned on a word boundary - a multiple of four bytes - so the processor can load it more efficiently in arrays.
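To see the gap between the 80 bits of data and the stored size, here is a minimal sketch in C rather than D, on the assumption that C's `long double` is the same x87 type as D's `real` on x86 with GCC. The helper names `value_bytes` and `padding_bytes` are made up for illustration, and the rule "a 64-bit mantissa means 10 bytes of data" is an assumption that holds for the x87 extended format, not something taken from any spec for your platform.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Bytes that actually carry the value.
   Assumption: LDBL_MANT_DIG == 64 identifies the x87 80-bit extended
   format (10 bytes of data); any other format is assumed to use the
   whole object with no padding. */
size_t value_bytes(void) {
    return (LDBL_MANT_DIG == 64) ? 10 : sizeof(long double);
}

/* Bytes the compiler adds to each element purely for alignment. */
size_t padding_bytes(void) {
    assert(value_bytes() <= sizeof(long double));
    return sizeof(long double) - value_bytes();
}
```

Printing `sizeof(long double)` next to `padding_bytes()` on your own machine shows how many bytes per element a raw memory comparison would look at without them belonging to the value.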

The bytes between each data element are called padding, and their content is not always defined. I haven't confirmed this myself, but @jpf's comment on the question said the same thing my gut does, so I'm posting it as an answer now.

The is operator in D does the same as memcmp(&data1, &data2, data1.sizeof), so @jpf's comment and your memcmp would be the same thing. It checks the data AND the padding, whereas == only checks the data (and does something a bit special for floating-point types, by the way, because it also has to handle NaN, so the exact bit pattern matters for those checks; actually, my first gut feeling when I saw the question title was that it was NaN related, but that's not the case here).

Anyway, apparently dmd initializes the padding bytes as well, whereas gdc doesn't, leaving it as garbage which doesn't always match.
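The effect can be imitated on purpose. The following is only a sketch, again in C with `long double` standing in for D's `real` (an assumption that fits x86 with GCC); all function names are hypothetical. It writes the same value into two arrays but fills the padding bytes with different junk, then compares them both ways.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a 64-bit mantissa means the x87 format with 10 data bytes;
   otherwise the value is assumed to fill the whole object. */
static size_t payload_bytes(void) {
    return (LDBL_MANT_DIG == 64) ? 10 : sizeof(long double);
}

/* Store `value` in dst[0..n), setting every padding byte to `junk`
   to imitate memory whose padding was never initialized. */
void fill_with_junk_padding(long double *dst, size_t n,
                            long double value, unsigned char junk) {
    assert(payload_bytes() <= sizeof(long double));
    for (size_t i = 0; i < n; i++) {
        unsigned char *raw = (unsigned char *)&dst[i];
        memset(raw, junk, sizeof(long double));
        /* On little-endian x86 the data bytes come first. */
        memcpy(raw, &value, payload_bytes());
    }
}

/* Bitwise comparison: looks at data AND padding, like memcmp in opEquals. */
bool equal_bytes(const long double *a, const long double *b, size_t n) {
    return memcmp(a, b, n * sizeof(long double)) == 0;
}

/* Value comparison: element by element with !=, which ignores padding.
   Note it follows floating-point rules, so NaN never equals NaN. */
bool equal_values(const long double *a, const long double *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i]) return false;
    return true;
}
```

With one array filled using junk `0x00` and the other using `0xFF`, `equal_values` reports a match while `equal_bytes` does not whenever the type has padding. That is the same split seen in the question, and it is why the element-by-element opEquals is the safer choice for `real` data.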
