
I have three versions of a function that does an element-wise comparison of two lists and outputs a count of the results. The first uses a for loop (simple function), the second uses a list comprehension, and the third uses NumPy. I expected NumPy to be much faster, especially when the list sizes are large, but I find that it is not consistently faster.

Running this on Google Colab with different array sizes gives the following results (each row is one run; runtime ratios are normalised so the simple loop is 1):

  SN   ArraySize   Simple : Optimised : Numpy
   1          25   1 : 0.43 : 2.61
   2          25   1 : 0.22 : 0.46
   3          25   1 : 0.63 : 0.29
   4          25   1 : 0.75 : 1.18
   5        2500   1 : 0.89 : 3.07
   6        2500   1 : 0.84 : 1.51
   7        2500   1 : 0.59 : 0.79
   8        2500   1 : 0.75 : 2.19
   9      250000   1 : 1.26 : 2.64
  10      250000   1 : 1.23 : 2.18
  11      250000   1 : 1.25 : 2.22
  12      250000   1 : 0.90 : 1.56
  13    25000000   1 : 1.40 : 2.25
  14    25000000   1 : 1.32 : 2.22
  15    25000000   1 : 1.29 : 2.17
  16    25000000   1 : 1.28 : 2.19

Any ideas on what's happening or what I am doing wrong?

The code:

import numpy as np
import time
import random

def solution_list_simple(a, b):
  answer = 0
  for aval, bval in zip(a, b):
    if aval > bval:
      answer += 1
  return answer
###########
def solution_list_opti(a, b):
  return [ a_ele > b_ele for a_ele, b_ele in zip(a, b) ].count(True)
###########
def solution_np(a, b):
  # Converts both lists to NumPy arrays, then does a vectorised comparison and sums the True values.
  return np.sum( np.array(a) > np.array(b) )
###########
random.seed(30)
howmanyvalue = 250000
A = [ random.randint(1, 100) for _ in range(howmanyvalue) ]
B = [ random.randint(1, 100) for _ in range(howmanyvalue) ]

## list version - simple
start_time = time.time()
print(f"")
print(f"\nanswer list simple = {solution_list_simple(A, B)}")
runtime_list_simple = time.time() - start_time
print(f"Runtime list simple = {runtime_list_simple}")

## list version - optimised
start_time = time.time()
print(f"")
print(f"\nanswer list_opti = {solution_list_opti(A, B)}")
runtime_list_opti = time.time() - start_time
print(f"Runtime list optimised = {runtime_list_opti}")

## numpy version
start_time = time.time()
print(f"")
print(f"\nanswer numpy = {solution_np(A, B)}")
end_np = time.time()
runtime_numpy = time.time() - start_time
print(f"Runtime numpy = {runtime_numpy}")

print(f"\n\nRelative ratios\nlist_simple : list_optimised : numpy = {1} : {(runtime_list_opti / runtime_list_simple):.2f} : {(runtime_numpy / runtime_list_simple):.2f}")
  • 3
    Converting a list to array takes time. – hpaulj Sep 05 '20 at 22:01
  • @hpaulj: True, but I thought the idea of a numpy array was that it is contiguous memory, as opposed to a list, so the vectorised (SIMD) operations would be much faster, especially with a large size like 25000000. Even at this huge size it is more than twice as slow. – rbewoor Sep 05 '20 at 23:06
  • 2
  • Did you time the `np.array(a)` by itself? And `np.sum( A > B )` using arrays? `numpy` compiled operations on whole arrays are indeed fast, but only if you start with arrays. – hpaulj Sep 05 '20 at 23:45
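
For reference, here is a small sketch (assuming the `A` and `B` lists from the question's code are already defined) that times the list-to-array conversion and the array comparison separately, along the lines of the comment above:

import time
import numpy as np

# Assumes A and B are the integer lists built in the question's code.
t0 = time.time()
A_arr = np.array(A)                  # list -> ndarray conversion, a Python-level O(n) pass
B_arr = np.array(B)
t1 = time.time()
count = int(np.sum(A_arr > B_arr))   # compiled, vectorised comparison and sum
t2 = time.time()

print(f"conversion time = {t1 - t0:.6f} s")
print(f"comparison time = {t2 - t1:.6f} s")
print(f"count = {count}")

On large inputs the conversion time is expected to dwarf the comparison time, which is the point the comment is making.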

1 Answer


Simple: don't include the creation of the NumPy arrays (the `np.array(a)` and `np.array(b)` conversions) in the timed section. Convert the lists to arrays before starting the timer and time only the comparison.
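
For example, a minimal sketch of the corrected timing (reusing the question's list setup; the `A_arr`/`B_arr` names are just illustrative):

import time
import random
import numpy as np

random.seed(30)
howmanyvalue = 250000
A = [ random.randint(1, 100) for _ in range(howmanyvalue) ]
B = [ random.randint(1, 100) for _ in range(howmanyvalue) ]

# Build the arrays once, outside the timed region.
A_arr = np.array(A)
B_arr = np.array(B)

start_time = time.time()
answer_np = int(np.sum(A_arr > B_arr))   # only the vectorised comparison and sum are timed
runtime_numpy = time.time() - start_time
print(f"answer numpy = {answer_np}")
print(f"Runtime numpy (arrays pre-built) = {runtime_numpy}")

Measured this way, the NumPy version should come out well ahead of both list versions at the larger sizes, because only the compiled work is inside the timer.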