I'm trying to find a way to read a file into an array with "gaps":
That is, the data read should end up in the byte array buffer at the positions buffer[0], buffer[2], ..., buffer[2*i], without any significant speed loss.
More specifically, I want to read it int-wise (i.e. b[0], b[4], ..., b[i * 4]).
Is that in any way possible (C#, C++) or should I look for another approach?
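To make the target layout concrete, here is a minimal C++ sketch of the "gapped" placement described above. The helper names scatter_words and interleave4 are hypothetical, and the sketch assumes four blocks of equal size; it only illustrates where each int should land, not a faster way to get it there.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies `count` 32-bit words from `src` so that word j
// lands at dst[offset + j * stride]. Slots in between (the "gaps") are untouched.
void scatter_words(const std::uint32_t* src, std::size_t count,
                   std::uint32_t* dst, std::size_t offset, std::size_t stride) {
    for (std::size_t j = 0; j < count; ++j) {
        dst[offset + j * stride] = src[j];
    }
}

// Interleaves four blocks (assumed to be equally sized) so that
// result[4 * j + k] == blocks[k][j].
std::vector<std::uint32_t> interleave4(const std::vector<std::uint32_t> (&blocks)[4]) {
    const std::size_t words = blocks[0].size();
    std::vector<std::uint32_t> out(4 * words);
    for (std::size_t k = 0; k < 4; ++k) {
        scatter_words(blocks[k].data(), words, out.data(), k, 4);
    }
    return out;
}
```

With this layout, word j of all four blocks sits in four consecutive ints starting at index 4 * j.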
A bit more background:
I'm trying to speed up a hash algorithm (it hashes the file blockwise, concatenates the block hashes, hashes that concatenation, and takes the resulting hash).
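For readers unfamiliar with this scheme, a rough C++ sketch of the hash-of-block-hashes structure follows. The actual block hash is not named here, so toy_hash (64-bit FNV-1a) is only a stand-in, and hash_of_block_hashes and the byte order used for concatenation are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in hash (64-bit FNV-1a); the real block hash is a different algorithm.
std::uint64_t toy_hash(const std::uint8_t* data, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hashes each block, concatenates the block hashes, then hashes that
// concatenation. block_size must be greater than zero.
std::uint64_t hash_of_block_hashes(const std::vector<std::uint8_t>& file,
                                   std::size_t block_size) {
    std::vector<std::uint8_t> concatenated;
    for (std::size_t pos = 0; pos < file.size(); pos += block_size) {
        const std::size_t len = std::min(block_size, file.size() - pos);
        const std::uint64_t h = toy_hash(file.data() + pos, len);
        // Append the 8 hash bytes, least significant byte first (an assumption).
        for (int b = 0; b < 8; ++b) {
            concatenated.push_back(static_cast<std::uint8_t>(h >> (8 * b)));
        }
    }
    return toy_hash(concatenated.data(), concatenated.size());
}
```

Because the block hashes are independent of each other, several blocks can be hashed side by side, which is what makes the scheme a candidate for SIMD.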
The idea is/was to use SSE3 to process 4 blocks in "parallel", which is why I need the data laid out that way: it lets me load the data into the registers easily.
The (P/Invoke-able) library I wrote in C++ gives nice results (i.e. 4 times as fast), but reordering the data eats up the speed gains.
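As an illustration of why the interleaved layout is convenient, here is a small C++ sketch of the register load. It is not the actual library code: load_word_of_all_blocks is a hypothetical name, and it uses the SSE2 intrinsics _mm_loadu_si128 and _mm_storeu_si128 (available on any SSE3-capable CPU), with a plain copy as a fallback where SSE2 is not detected at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Reads word j of all four blocks from an interleaved buffer, where
// interleaved[4 * j + k] holds word j of block k. With SSE2 this is a single
// unaligned 128-bit load; otherwise it falls back to a plain copy.
void load_word_of_all_blocks(const std::uint32_t* interleaved, std::size_t j,
                             std::uint32_t lanes[4]) {
#if defined(__SSE2__) || defined(_M_X64)
    const __m128i v = _mm_loadu_si128(
        reinterpret_cast<const __m128i*>(interleaved + 4 * j));
    // A real hash would keep working on `v`; it is stored here only so the
    // four lanes can be inspected.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), v);
#else
    std::memcpy(lanes, interleaved + 4 * j, 4 * sizeof(std::uint32_t));
#endif
}
```

Without the interleaving, the same four words would have to be gathered from four separate memory locations before each step.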
Currently I'm reading the file blockwise and then reordering the ints (C#):
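    unsafe {
        // Offset by chunkIndex so this chunk fills its own slot in every group of four.
        uint* b = (uint*)buffer.ToPointer() + chunkIndex;
        fixed(byte* blockPtr = chunk) {
            uint* blockIntPtr = (uint*)blockPtr;
            // Chunk size is 9500 KiB, counted here in 4-byte ints.
            for(int i = 0; i < 9500 * 1024 / 4; i += 4) {
                // Four consecutive source ints go to every fourth destination int.
                *(b + 00) = blockIntPtr[i + 0];
                *(b + 04) = blockIntPtr[i + 1];
                *(b + 08) = blockIntPtr[i + 2];
                *(b + 12) = blockIntPtr[i + 3];
                b += 16;
            }
        }
    }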
chunk is a byte array and chunkIndex is an int; both are passed as method parameters.
buffer is a uint32_t* pointer which is allocated by my C++ code.
The problem is that this takes too long: calling the above code 4 times takes around 90 ms, while the hashing itself takes 3 ms.
The big discrepancy strikes me as a bit odd, but it produces correct hashes.