I have been working on a project to simulate biologically inspired neural networks using ArrayFire. When I got to the point of running some timing tests, I was disappointed with the results. I decided to use one of the fastest, dirt-simple models as a timing test case: the Izhikevich model. When I ran the new test with that model, the results were even worse. The code I am using is below. It is not doing anything fancy. It is just standard matrix algebra. However, it takes over 5 seconds to do a single evaluation of the equations for just 10 neurons! Every step after that takes roughly the same amount of time as well.
Code:
unsigned int neuron_count = 10;
array a = af::constant(0.02, neuron_count);
array b = af::constant(0.2, neuron_count);
array c = af::constant(-65.0, neuron_count);
array d = af::constant(6, neuron_count);
array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);
double tau = 0.2;
void StepIzhikevich()
{
v = v + tau*(0.04*pow(v, 2) + 5 * v + 140 - u + i);
//af_print(v);
u = u + tau*a*(b*v - u);
//Leaving off spike threshold checks for now
}
void TestIzhikevich()
{
StepIzhikevich();
timer::start();
StepIzhikevich();
printf("elapsed seconds: %g\n", timer::stop());
}
Here are the timing results for different numbers of neurons.
results:
neurons seconds
10 5.18275
100 5.27969
1000 5.20637
10000 4.86609
Increasing the number of neurons does not appear to have a large effect. If anything, the time goes down a little. Am I doing something wrong here? Is there a better way to optimize things with ArrayFire to get better results?
When I switched the v equation to use v*v instead of pow(v, 2), the time required for a step went down to 3.75762 seconds. That is still extremely slow though, so something odd is happening.
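For comparison, here is a minimal plain C++ sketch of the same two update equations, with no ArrayFire involved. It is only meant as a reference for checking values and getting a rough CPU baseline, not as a replacement for the code above. The names IzhikevichState, step_izhikevich_cpu, and make_test_state are made up for this illustration, and a, b, and the input current are passed as scalars on the assumption that every neuron uses the same constants, as in my test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the per-neuron state; not part of ArrayFire.
struct IzhikevichState {
    std::vector<double> v;  // membrane potential
    std::vector<double> u;  // recovery variable
};

// One step of the same two update equations as StepIzhikevich(),
// without spike threshold checks. a, b, and input are scalars here
// because every neuron in the test uses the same constants.
void step_izhikevich_cpu(IzhikevichState& s, double a, double b,
                         double input, double tau) {
    assert(s.v.size() == s.u.size());
    for (std::size_t n = 0; n < s.v.size(); ++n) {
        double v = s.v[n];
        double u = s.u[n];
        // v*v is used instead of pow(v, 2), as in the faster variant.
        v = v + tau * (0.04 * v * v + 5.0 * v + 140.0 - u + input);
        // u is updated with the new v, matching the order in the original code.
        u = u + tau * a * (b * v - u);
        s.v[n] = v;
        s.u[n] = u;
    }
}

// Builds a state with the initial values used in the test (v = -70, u = -20).
IzhikevichState make_test_state(std::size_t neuron_count) {
    return IzhikevichState{std::vector<double>(neuron_count, -70.0),
                           std::vector<double>(neuron_count, -20.0)};
}
```

With the initial values from my test (a = 0.02, b = 0.2, input 14, tau = 0.2), one call should leave every neuron at v = -66 and u close to -19.9728, which can be compared against the af_print(v) output.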
[EDIT] I tried to split the processing up into pieces and found something new. Here is the code I am using now.
Code:
unsigned int neuron_count = 10;
array a = af::constant(0.02, neuron_count);
array b = af::constant(0.2, neuron_count);
array c = af::constant(-65.0, neuron_count);
array d = af::constant(6, neuron_count);
array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);
array g = af::constant(0.0, neuron_count);
double tau = 0.2;
void StepIzhikevich()
{
array j = tau*(0.04*pow(v, 2));
//af_print(j);
array k = 5 * v + 140 - u + i;
//af_print(k);
array l = v + j + k;
//af_print(l);
v = l; //If this line is here time is long on second loop
//g = l; //If this is here then time is short.
//u = u + tau*a*(b*v - u);
//Leaving off spike threshold checks for now
}
void TestIzhikevich()
{
timer::start();
StepIzhikevich();
printf("elapsed seconds: %g\n", timer::stop());
timer::start();
StepIzhikevich();
printf("elapsed seconds: %g\n", timer::stop());
}
When I run it without assigning the result back to v, or when I assign it to a new variable g instead, the time for the step is small on both the first and second run.
results:
elapsed seconds: 0.0036143
elapsed seconds: 0.00340621
However, when I put v = l; back in, the first step is fast, but every step after that is slow.
results:
elapsed seconds: 0.0034497
elapsed seconds: 2.98624
Any ideas on what is causing this?
[EDIT 2]
I still do not know why it is doing this, but I have found a workaround by copying the v array before using it again.
Code:
unsigned int neuron_count = 100000;
array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);
double tau = 0.2;
void StepIzhikevich()
{
//array vp = v;
array vp = v.copy();
//af_print(vp);
array j = tau*(0.04*pow(vp, 2));
//af_print(j);
array k = 5 * vp + 140 - u + i;
//af_print(k);
array l = vp + j + k;
//af_print(l);
v = l; //If this line is here time is long on second loop
}
void TestIzhikevich()
{
for (int i = 0; i < 10; i++)
{
timer::start();
StepIzhikevich();
printf("loop: %d ", i);
printf("elapsed seconds: %g\n", timer::stop());
}
}
Here are the results now. The first two steps are still slow, but every step after that is fast. That is a huge improvement over before.
Results:
loop: 0 elapsed seconds: 0.657355
loop: 1 elapsed seconds: 0.981287
loop: 2 elapsed seconds: 0.000416182
loop: 3 elapsed seconds: 0.000415045
loop: 4 elapsed seconds: 0.000421014
loop: 5 elapsed seconds: 0.000413339
loop: 6 elapsed seconds: 0.00041675
loop: 7 elapsed seconds: 0.000412202
loop: 8 elapsed seconds: 0.000473321
loop: 9 elapsed seconds: 0.000677432
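Since the first couple of steps are so much slower than the rest, it may be fairer to keep them out of the measurement altogether. Below is a minimal sketch of a timing helper in plain C++ that runs a few warm-up calls first and then records each measured call separately. The name time_steps and its parameters are hypothetical, and it uses std::chrono rather than the ArrayFire timer so that it stands on its own.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: calls `step` warmup + measured times and returns the
// wall-clock seconds of each measured call. Warm-up calls are not recorded.
std::vector<double> time_steps(const std::function<void()>& step,
                               std::size_t warmup, std::size_t measured) {
    assert(static_cast<bool>(step) && "step must be callable");
    using clock = std::chrono::steady_clock;

    // Run the warm-up calls so one-time startup costs are paid up front.
    for (std::size_t n = 0; n < warmup; ++n) {
        step();
    }

    std::vector<double> seconds;
    seconds.reserve(measured);
    for (std::size_t n = 0; n < measured; ++n) {
        const auto start = clock::now();
        step();
        const auto stop = clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

It could be used as time_steps(StepIzhikevich, 2, 10) to skip the two slow steps seen above. One caveat I am assuming rather than have confirmed: if the library defers work on the device, the step would need to force the result inside the timed call (for example with v.eval() followed by af::sync()), otherwise the helper would only measure how long it takes to queue the work.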