This question stems from looking at the answers provided to this question regarding counting the number of zero crossings. Several answer were provided that solve the problem, but the NumPy appproach destroyed the others with respect to time.
When I compared four of the answers however I notice that the NumPy solution provides a different result for large sequences. The four answers in question are loop and simple generator, better generator expression , and NumPy solution.
Question: Why does the NumPy solution provide a different result than the other three? (and which is correct?)
Here are the results for counting the number of zero crossings:
Blazing fast NumPy solution
total time: 0.303605794907 sec
Zero Crossings Small: 8
Zero Crossings Med: 54464
Zero Crossings Big: 5449071
Loop solution
total time: 15.6818780899 sec
Zero Crossings Small: 8
Zero Crossings Med: 44960
Zero Crossings Big: 4496847
Simple generator expression solution
total time: 16.3374049664 sec
Zero Crossings Small: 8
Zero Crossings Med: 44960
Zero Crossings Big: 4496847
Modified generator expression solution
total time: 13.6596589088 sec
Zero Crossings Small: 8
Zero Crossings Med: 44960
Zero Crossings Big: 4496847
And the code used to get the results:
import time
import numpy as np
def zero_crossings_loop(sequence):
s = 0
for ind, _ in enumerate(sequence):
if ind+1 < len(sequence):
if sequence[ind]*sequence[ind+1] < 0:
s += 1
return s
def print_three_results(r1, r2, r3):
print 'Zero Crossings Small:', r1
print 'Zero Crossings Med:', r2
print 'Zero Crossings Big:', r3
print '\n'
small = [80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3, 89.2, -154.1, 121.4, -85.1, 96.8, 68.2]
med = np.random.randint(-10, 10, size=100000)
big = np.random.randint(-10, 10, size=10000000)
print 'Blazing fast NumPy solution'
tic = time.time()
z1 = (np.diff(np.sign(small)) != 0).sum()
z2 = (np.diff(np.sign(med)) != 0).sum()
z3 = (np.diff(np.sign(big)) != 0).sum()
print 'total time: {0} sec'.format(time.time()-tic)
print_three_results(z1, z2, z3)
print 'Loop solution'
tic = time.time()
z1 = zero_crossings_loop(small)
z2 = zero_crossings_loop(med)
z3 = zero_crossings_loop(big)
print 'total time: {0} sec'.format(time.time()-tic)
print_three_results(z1, z2, z3)
print 'Simple generator expression solution'
tic = time.time()
z1 = sum(1 for i, _ in enumerate(small) if (i+1 < len(small)) if small[i]*small[i+1] < 0)
z2 = sum(1 for i, _ in enumerate(med) if (i+1 < len(med)) if med[i]*med[i+1] < 0)
z3 = sum(1 for i, _ in enumerate(big) if (i+1 < len(big)) if big[i]*big[i+1] < 0)
print 'total time: {0} sec'.format(time.time()-tic)
print_three_results(z1, z2, z3)
print 'Modified generator expression solution'
tic = time.time()
z1 = sum(1 for i in xrange(1, len(small)) if small[i-1]*small[i] < 0)
z2 = sum(1 for i in xrange(1, len(med)) if med[i-1]*med[i] < 0)
z3 = sum(1 for i in xrange(1, len(big)) if big[i-1]*big[i] < 0)
print 'total time: {0} sec'.format(time.time()-tic)
print_three_results(z1, z2, z3)