
I'm writing an application for my research that deals with very big arrays and many iterations over their values. To reduce the calculation time, I've implemented several improvements in the code. After those improvements, the calculation time is still a bit high, and the array_keys() function alone consumes almost half of it. Is there a better substitute for it?

Here is a simplified example (without the iteration loops) of what I need to improve:

$bigArray = array_fill(0, 1000000, 0);
for ($i = 0; $i < 10; $i++) {
    $rnd = mt_rand(0, 999999); // 999999: stay inside the filled 0..999999 range
    $bigArray[$rnd] = 1;
}

$start = microtime(true);
$list = array_keys($bigArray, 1);
$end = microtime(true);

echo $end - $start;

The outcome is something like 0.021490097045898 seconds. Does anybody know a faster way to do this? Even a very small improvement would help, since this kind of calculation runs for hundreds of thousands of rounds; the total time sometimes reaches 30 seconds, and half of that is spent in the array_keys() function as described above.

BTW, I'm running the script on a dual-core Intel E8500 @ 3.16 GHz with 8 GB of RAM; the OS is Windows 7 64-bit (just in case).

Thanks in advance.

SAVAFA
  • 0.02 is pretty fast for PHP considering that array size :) – nice ass May 14 '13 at 21:46
  • Yes I know, but as I mentioned, due to my huge calculations, I'm looking for any possible substitute that is faster :). – SAVAFA May 14 '13 at 21:48
  • Considering you've got an array of 1 million elements, and are searching for 10 elements with the value 1, 0.02 seconds is pretty good. – Marc B May 14 '13 at 21:48
  • Try using an opcode cache like APC, it should speed up your script – nice ass May 14 '13 at 21:52
  • How about a totally different approach? Don't create the big array, just add 10 (your loop size) random elements (in your range) to an array (just append, random is the value `$array[] = mt_rand()`) and then `array_unique`. This is cheap because your array is small. – Joost May 14 '13 at 21:52
  • As a result you get list of maximum ten random numbers, right? Couldn't you just generate random number ten times and put each into an array (as a key to be sure there is no duplicates)? :) – Wiktor May 14 '13 at 21:53
  • Crystal balling: if (1) you are only interested in keys with a certain value, not grouping the keys for all possible values, and (2) you know the values beforehand, it stands to reason that speed could be gained from checking the value directly after calculation and storing it separately. Other than that, this seems about what you can get, but under certain circumstances and conditions using an [SplFixedArray](http://nl3.php.net/manual/en/class.splfixedarray.php) may provide some speedup. – Wrikken May 14 '13 at 21:54
  • would an indexed db lookup be faster? HEAP, or not. –  May 14 '13 at 21:55
  • As I said, I just created this example to show you what the content looks like. My arrays are fed by user input and I have no control over them, nor over the 1s and 0s in them. But I need to know where those 1s are in those arrays. – SAVAFA May 14 '13 at 21:57
  • @Dagon, it's not! I have done a performance test on that and the array was faster (similar test: http://stackoverflow.com/questions/16430539/mysqli-query-vs-php-array-which-is-faster). – SAVAFA May 14 '13 at 21:58
  • was that based on the HEAP storage engine? –  May 14 '13 at 22:01
  • @Dagon, no it was not on `HEAP`. I only have these on my machine: `MRG_MYISAM, MyISAM, BLACKHOLE, CSV, MEMORY, ARCHIVE, InnoDB, PERFORMANCE_SCHEMA` – SAVAFA May 14 '13 at 22:03
  • @Wrikken, I'll check `SPLFixedArray` and let you know the results. – SAVAFA May 14 '13 at 22:05
  • The key lookup in a database table with 1 million records would be very fast if the table is properly indexed, probably milliseconds. So unless you're inserting a million records on each request, that would be the fastest solution – nice ass May 14 '13 at 22:07
  • @OneTrickPony, the problem is that, as you said, to use the database I need to insert all values into the tables first, then do the key look up. That inserting adds more time that makes the entire process slower. – SAVAFA May 14 '13 at 22:12
  • 1
    @SAVAFA: if it's really just 1's and 0's, `array_filter()` performs about twice as fast as the `array_keys()` here BTW. – Wrikken May 14 '13 at 22:13
  • 1
    Is PHP the best approach to this project? It's hardly the traditional domain of a language designed for web development. –  May 14 '13 at 22:23
  • @Wrikken, Thanks. by adding `array_filter()` to the actual calculation, the time went down from `10.186648130417` to `8.5778539180756` – SAVAFA May 14 '13 at 22:37
  • Might want to have a look at https://github.com/facebook/hiphop-php/wiki – Michel Feldheim May 23 '13 at 22:53
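A minimal sketch of Wrikken's write-time bookkeeping idea from the comments above (assuming, as in the question's example, that values are only ever set to 1 and the code writing the array is under your control, which the asker notes is not always the case for user-supplied input): record the keys as they are written instead of scanning the whole array afterwards.

```php
<?php
// Track the keys that receive a 1 as they are assigned, so no
// full array_keys() scan of a million elements is needed later.
$bigArray = array_fill(0, 1000000, 0);
$ones = array();                      // keys whose value is 1

for ($i = 0; $i < 10; $i++) {
    $rnd = mt_rand(0, 999999);
    if ($bigArray[$rnd] !== 1) {      // skip duplicates from repeated hits
        $ones[] = $rnd;
    }
    $bigArray[$rnd] = 1;
}

sort($ones); // match the ascending key order array_keys() would produce
// $ones should now equal array_keys($bigArray, 1).
```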

1 Answer


Credit goes to @Wrikken from the comments:

if it's really just 1's and 0's, array_filter() performs about twice as fast as the array_keys() here.

It is really helpful and performs the job fast. It reduced the time by 20% in my case. Thanks, Wrikken!
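For reference, here is a sketch of how the array_filter() variant might replace the original array_keys() call. This assumes the values really are only 0s and 1s: array_filter() with no callback keeps every truthy element while preserving its keys, so the keys of the filtered array are exactly the positions holding a 1.

```php
<?php
$bigArray = array_fill(0, 1000000, 0);
for ($i = 0; $i < 10; $i++) {
    $bigArray[mt_rand(0, 999999)] = 1;
}

$start = microtime(true);
// array_filter() drops the 0s (falsy) and keeps the keys of the
// remaining elements, so array_keys() then only walks a tiny array.
$list = array_keys(array_filter($bigArray));
$end = microtime(true);

echo $end - $start;
```

Note that this shortcut returns the keys of all non-zero values; if the arrays could contain values other than 0 and 1, the original array_keys($bigArray, 1) form is the safe choice.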

SAVAFA