I have been developing a C++ project from existing Java code. The following C++ and Java snippets read the same test file, which consists of millions of 4-byte integers.
C++:
int *arr = new int[len]; // len is larger than the largest int in the data
fill_n(arr, len, -1);    // fill with -1
long loadFromIndex = 0;
struct stat sizeResults;
long size;
if (stat(fileSrc, &sizeResults) == 0) {
    size = sizeResults.st_size; // size is ~551950000 for the 552MB test file
}
mmapFile = (char *)mmap(NULL, size, PROT_READ, MAP_SHARED, fd, pageNum * pageSize);
long offset = loadFromIndex % pageSize;
while (offset < size) {
    int i = htonl(*((int *)(mmapFile + offset)));
    offset += sizeof(int);
    int j = htonl(*((int *)(mmapFile + offset)));
    offset += sizeof(int);
    swapElem(i, j, arr);
}
return arr;
Java:
IntBuffer bb = srcFile.getChannel()
        .map(MapMode.READ_ONLY, loadFromIndex, size)
        .asIntBuffer().asReadOnlyBuffer();
while (bb.hasRemaining()) {
    int i = bb.get();
    int j = bb.get();
    swapElem(i, j, arr); // arr is an int[] of the same size as in the C++ version, filled with -1
}
return arr;
The swapElem(i, j, arr) functions in C++ and Java are identical. They compare and modify values in the array, but the original code is too long to post here. For testing purposes, I replaced it with the following stub so the loop is not dead code:
void swapElem(int i, int j, int * arr){ // int[] in Java
arr[i] = j;
}
I assumed the C++ version would outperform the Java version, but the test shows the opposite result: the Java code is almost twice as fast as the C++ code. Is there any way to improve the C++ code?
I suspect that mmapFile + offset in C++ is computed too many times, so the loop performs O(n) additions for that plus O(n) additions for offset += sizeof(int), where n is the number of integers to read. Java's IntBuffer.get() reads directly at the buffer's current index, so no addition is needed beyond the O(n) increments of that index. Counting the index increments, the C++ version therefore performs O(2n) additions while Java performs O(n). With millions of values, this might cause a significant performance difference.
Following this idea, I modified the C++ code as follows:
mmapBin = (char *)mmap(NULL, size, PROT_READ, MAP_SHARED, fd, pageNum * pageSize);
int len = size - loadFromIndex % pageSize;
char *offset = loadFromIndex % pageSize + mmapBin;
int index = 0;
while (index < len) {
    int i = htonl(*((int *)offset));
    offset += sizeof(int);
    int j = htonl(*((int *)offset));
    offset += sizeof(int);
    index += 2 * sizeof(int);
    swapElem(i, j, arr); // same stub as above, so the loop is still not dead code
}
I expected at least a slight performance gain from this, but there wasn't any.
Can anyone explain why the C++ code runs slower than the Java code? Thanks.
Update:
I have to apologize: when I said -O2 did not help, there was a problem on my end. I had messed up the Makefile, so the C++ code was not actually being recompiled with -O2. I have updated the benchmark below, and the C++ version built with -O2 now outperforms the Java version. That settles the question, but if anyone would like to share how to improve the C++ code further, I will follow up (see the sketch at the end of this post for the kind of change I have in mind). Generally I would expect it to be about twice as fast as the Java code, but currently it is not. Thank you all for your input.
Compiler: g++
Flags: -Wall -c -O2
Java Version: 1.8.0_05
Size of File: 552MB, all 4-byte integers
Processor: 2.53 GHz Intel Core 2 Duo
Memory: 4GB 1067 MHz DDR3
Updated Benchmark:
Version                          Time (ms)
C++                              ~1100
Java                             ~1400
C++ (without the while loop)     ~35
Java (without the while loop)    ~40
There is some code before the loop that accounts for the ~35 ms baseline (mostly filling the array with -1), but that is not important here.
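
For reference, here is the kind of rewrite I have in mind for the C++ side. This is only a sketch, assuming the file is a plain sequence of big-endian 4-byte integers and reusing the stub swapElem from above; the loadArray wrapper, its parameters, and its error handling are just for illustration, not my actual code:

#include <arpa/inet.h>   // ntohl
#include <fcntl.h>       // open
#include <sys/mman.h>    // mmap, madvise, munmap
#include <sys/stat.h>    // fstat
#include <unistd.h>      // close
#include <algorithm>     // fill_n

static void swapElem(int i, int j, int *arr) { // same stand-in as above
    arr[i] = j;
}

int *loadArray(const char *fileSrc, long len) {
    int fd = open(fileSrc, O_RDONLY);
    if (fd < 0) return nullptr;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return nullptr; }
    size_t size = static_cast<size_t>(st.st_size);

    void *map = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { close(fd); return nullptr; }
    madvise(map, size, MADV_SEQUENTIAL); // hint: the mapping is read front to back once

    int *arr = new int[len];
    std::fill_n(arr, len, -1);

    const int *p   = static_cast<const int *>(map);
    const int *end = p + size / sizeof(int);
    while (end - p >= 2) {
        int i = ntohl(*p++); // file data is big-endian, convert to host order
        int j = ntohl(*p++);
        swapElem(i, j, arr);
    }

    munmap(map, size);
    close(fd);
    return arr;
}

Walking an int pointer over the mapping avoids recomputing mmapFile + offset on every iteration, and the madvise hint tells the kernel the mapping will be read sequentially, which may help readahead. Whether either actually matters after -O2 is exactly what I would like to find out.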