I am trying to reduce the execution time of the if-statement shown below (second block of code). It involves a bit mask: the masks array contains 8 integers used as masks and is set up as follows:
    static unsigned int masks[8];

    void setupMasks() {
        int mask = 3; // 0000 0000 0000 0000 0000 0000 0000 0011
        for(unsigned int i = 0; i < 8; i++) {
            masks[i] = (mask << (i * 4));
        }
    }
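For reference, here is a minimal sketch of the same mask table written out with unsigned arithmetic. The helper name `makeMasks` is hypothetical and not part of the original code. It only illustrates the values `setupMasks()` should produce: 0x3, 0x30, 0x300, and so on up to 0x30000000.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not in the original code): builds the same eight
// masks as setupMasks(), one per 4-bit result, selecting the lower two bits.
std::array<std::uint32_t, 8> makeMasks() {
    std::array<std::uint32_t, 8> m{};
    for (unsigned int i = 0; i < 8; ++i) {
        // Shift an unsigned 3 (binary 0011) into nibble i.
        m[i] = static_cast<std::uint32_t>(3u) << (i * 4);
    }
    return m;
}

// Hypothetical helper: ORs all eight masks into one combined mask.
std::uint32_t combinedMask() {
    std::uint32_t all = 0;
    for (std::uint32_t m : makeMasks()) {
        all |= m;
    }
    return all;
}
```

OR-ing the eight masks together gives 0x33333333, which covers the lower two bits of every result in one 32-bit word.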
Each integer in testarr below actually contains 8 results. Each result occupies 4 bits of the 32-bit int, and I only want to know whether either of the lower two of those 4 bits is a 1. The code below is executed inside another for-loop that updates resultnum. failcount is a locally defined int array. I would like to avoid masking, but the data in testarr comes from an API that I cannot change. In any case, I think the if-statement consumes more time than the masking, but I could be wrong. Does anyone see a way to optimize this?
    for(unsigned int i = 0; i < 8 && dumped < numtodump; i++, dumped++) { //8 results per 32-bit value
        unsigned int fails = 0;
        for(unsigned int j = 0; j < 32; j++) {
            if((testarr[j * numintsperpin + resultnum] & masks[i]) && failcount[j]++ <= 10000) { //have a fail
                failingpins[fails++] = &pins[j];
            }
        }
    }
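To make the data layout concrete, here is a minimal sketch of the per-result test in isolation. The names `resultFails` and `anyResultFails` are hypothetical and do not appear in the original code. The sketch assumes result i occupies bits 4*i through 4*i+3 of the word, as described above. It is an illustration of the bit test, not a measured optimization.

```cpp
#include <cstdint>

// Hypothetical helper: true if either of the lower two bits of result i
// (0..7) is set. Equivalent to (word & masks[i]) != 0 for the masks above.
bool resultFails(std::uint32_t word, unsigned int i) {
    return ((word >> (i * 4)) & 0x3u) != 0;
}

// Hypothetical helper: true if any of the 8 results packed in the word has
// one of its lower two bits set. A word for which this returns false cannot
// contribute a fail for any i.
bool anyResultFails(std::uint32_t word) {
    return (word & 0x33333333u) != 0;
}
```

For example, the word 0x00000020 has bit 1 of result 1 set, so `resultFails(0x00000020u, 1)` is true while `resultFails(0x00000020u, 0)` is false. The word 0xCCCCCCCC has only the upper two bits of each result set, so `anyResultFails` returns false for it.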
Sorry if my previous post was not clear. I tried to simplify the problem statement as much as possible earlier and may have left out useful details. Below is the full function.
    void process(vector<int> &testarr, vector<unsigned int> &failcount, vector<pin> &pins, vector<unsigned int> &muxaddr, unsigned int base, StopWatch &acc1) {
        unsigned int labeloffset = 400;
        unsigned int startindex = 50;
        unsigned int numtodump = 1000;
        unsigned int numintsperpin = testarr.size() / pins.size();
        // Scratch buffer of failing pins; assumes pins.size() <= 32.
        pin** failingpins = new pin*[32];
        acc1.start();
        int count = 0;
        unsigned int dumped = 0;
        unsigned int resultnum = 0;
        while(dumped < numtodump) {
            for(unsigned int i = 0; i < 8 && dumped < numtodump; i++, dumped++) { //8 results per 32-bit value
                unsigned int currentindex = labeloffset + dumped + startindex;
                unsigned int fails = 0;
                for(unsigned int j = 0; j < pins.size(); j++) {
                    if((testarr[j * numintsperpin + resultnum] & masks[i]) && failcount[j]++ <= 10000) { //have a fail
                        failingpins[fails++] = &pins[j];
                    }
                }
                unsigned int saddr = muxaddr[currentindex];
                if(fails > 0) {
                    logFails(fails, muxaddr[currentindex] - base, failingpins);
                }
            }
            resultnum++;
        }
        acc1.accumulate();
        delete[] failingpins; // release the scratch buffer (it was leaked before)
    }