3

I'm trying to speed up the following function using numba.

import numpy as np
from numba import jit, prange

@jit(nopython=True, parallel=True)
def find_reg_numba(states):
    reg = []    
    states_sum = np.sum(states, axis=1)
    for i in prange(states.shape[0]):
        if states_sum[i] > 0 and states_sum[i] < 5:
            reg.append(states[i])
    return reg

The `states` array is generated using the following function:

import itertools

def generate_states(size):
    # size is a natural number
    states = np.array(list(map(list, itertools.product([0., 1.], repeat=size))))
    return states

When I call `find_reg_numba` on the result, the process aborts with the following error:

double free or corruption (!prev)
Aborted (core dumped)

My numba version is 0.48.0.

How can I solve this issue?

papabiceps
  • You have `reg = []; reg = []`. Did you want: `reg = []; reg_size = []`. Also `find_reg_numba` doesn't return anything. Do you want: `return reg, reg_size`? – DarrylG Feb 13 '20 at 12:26
  • I've edited my code, I only want `reg` since `reg_size` can be calculated later. – papabiceps Feb 13 '20 at 12:31
  • Strangely, most runs of your code are okay, but occasionally I get the error `double free or corruption (!prev) repl process died unexpectedly: signal: aborted (core dumped)`. This doesn't happen in the non-Numba version (i.e. without the decorator). – DarrylG Feb 13 '20 at 12:37
  • I'm getting `double free or corruption (!prev)Segmentation fault (core dumped)` for all runs. My numba version is `0.48.0`. – papabiceps Feb 13 '20 at 12:42

2 Answers

2

I'm not sure why your code produces an error. I looked at related posts about this error, but they proved unhelpful.

Here's an alternative Numba version of find_reg_numba which:

  1. Runs without errors
  2. Produces the same result as the original code run without Numba (the original fails only when the Numba decorator is applied).

Code Refactoring

import numpy as np
from numba import jit
import itertools

@jit(nopython=True, parallel=True)
def find_reg_numba(states):
    states_sum = np.sum(states, axis=1)

    # Find indexes satisfying condition using np.where as described https://www.geeksforgeeks.org/numpy-where-in-python/
    indexes = np.where((states_sum > 0) & (states_sum < 5))
    return states[indexes]

def generate_states(size):
    # size is a natural number
    states = np.array(list(map(list, itertools.product([0., 1.], repeat = size))))
    return states

Tests

for size in range(10):
    s = generate_states(size)
    r = find_reg_numba(s)
    print(f'Size: {size}\n Result: \n{r}')

Results

 Size: 0
 Result:
[]
Size: 1
 Result:
[[1.]]
Size: 2
 Result:
[[0. 1.]
 [1. 0.]
 [1. 1.]]
Size: 3
 Result:
[[0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Size: 4
 Result:
[[0. 0. 0. 1.]
 [0. 0. 1. 0.]
 [0. 0. 1. 1.]
 [0. 1. 0. 0.]
 [0. 1. 0. 1.]
 [0. 1. 1. 0.]
 [0. 1. 1. 1.]
 [1. 0. 0. 0.]
 [1. 0. 0. 1.]
 [1. 0. 1. 0.]
 [1. 0. 1. 1.]
 [1. 1. 0. 0.]
 [1. 1. 0. 1.]
 [1. 1. 1. 0.]
 [1. 1. 1. 1.]]
Size: 5
 Result:
[[0. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 1. 1.]
 [0. 0. 1. 0. 0.]
 [0. 0. 1. 0. 1.]
 [0. 0. 1. 1. 0.]
 [0. 0. 1. 1. 1.]
 [0. 1. 0. 0. 0.]
 [0. 1. 0. 0. 1.]
 [0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 1.]
 [0. 1. 1. 0. 0.]
 [0. 1. 1. 0. 1.]
 [0. 1. 1. 1. 0.]
 [0. 1. 1. 1. 1.]
 [1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 1.]
 [1. 0. 0. 1. 0.]
 [1. 0. 0. 1. 1.]
 [1. 0. 1. 0. 0.]
 [1. 0. 1. 0. 1.]
 [1. 0. 1. 1. 0.]
 [1. 0. 1. 1. 1.]
 [1. 1. 0. 0. 0.]
 [1. 1. 0. 0. 1.]
 [1. 1. 0. 1. 0.]
 [1. 1. 0. 1. 1.]
 [1. 1. 1. 0. 0.]
 [1. 1. 1. 0. 1.]
 [1. 1. 1. 1. 0.]]
Size: 6
 Result:
[[0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1. 1.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0. 1.]
 [0. 0. 0. 1. 1. 0.]
 [0. 0. 0. 1. 1. 1.]
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 1.]
 [0. 0. 1. 0. 1. 0.]
 [0. 0. 1. 0. 1. 1.]
 [0. 0. 1. 1. 0. 0.]
 [0. 0. 1. 1. 0. 1.]
 [0. 0. 1. 1. 1. 0.]
 [0. 0. 1. 1. 1. 1.]
 [0. 1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 1.]
 [0. 1. 0. 0. 1. 0.]
 [0. 1. 0. 0. 1. 1.]
 [0. 1. 0. 1. 0. 0.]
 [0. 1. 0. 1. 0. 1.]
 [0. 1. 0. 1. 1. 0.]
 [0. 1. 0. 1. 1. 1.]
 [0. 1. 1. 0. 0. 0.]
 [0. 1. 1. 0. 0. 1.]
 [0. 1. 1. 0. 1. 0.]
 [0. 1. 1. 0. 1. 1.]
 [0. 1. 1. 1. 0. 0.]
 [0. 1. 1. 1. 0. 1.]
 [0. 1. 1. 1. 1. 0.]
 [1. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 1.]
 [1. 0. 0. 0. 1. 0.]
 [1. 0. 0. 0. 1. 1.]
 [1. 0. 0. 1. 0. 0.]
 [1. 0. 0. 1. 0. 1.]
 [1. 0. 0. 1. 1. 0.]
 [1. 0. 0. 1. 1. 1.]
 [1. 0. 1. 0. 0. 0.]
 [1. 0. 1. 0. 0. 1.]
 [1. 0. 1. 0. 1. 0.]
 [1. 0. 1. 0. 1. 1.]
 [1. 0. 1. 1. 0. 0.]
 [1. 0. 1. 1. 0. 1.]
 [1. 0. 1. 1. 1. 0.]
 [1. 1. 0. 0. 0. 0.]
 [1. 1. 0. 0. 0. 1.]
 [1. 1. 0. 0. 1. 0.]
 [1. 1. 0. 0. 1. 1.]
 [1. 1. 0. 1. 0. 0.]
 [1. 1. 0. 1. 0. 1.]
 [1. 1. 0. 1. 1. 0.]
 [1. 1. 1. 0. 0. 0.]
 [1. 1. 1. 0. 0. 1.]
 [1. 1. 1. 0. 1. 0.]
 [1. 1. 1. 1. 0. 0.]]
Size: 7
 Result:
[[0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 1. 1.]
 [0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1. 0. 1.]
 [0. 0. 0. 0. 1. 1. 0.]
 [0. 0. 0. 0. 1. 1. 1.]
 [0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 1.]
 [0. 0. 0. 1. 0. 1. 0.]
 [0. 0. 0. 1. 0. 1. 1.]
 [0. 0. 0. 1. 1. 0. 0.]
 [0. 0. 0. 1. 1. 0. 1.]
 [0. 0. 0. 1. 1. 1. 0.]
 [0. 0. 0. 1. 1. 1. 1.]
 [0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0. 1. 1.]
 [0. 0. 1. 0. 1. 0. 0.]
 [0. 0. 1. 0. 1. 0. 1.]
 [0. 0. 1. 0. 1. 1. 0.]
 [0. 0. 1. 0. 1. 1. 1.]
 [0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 1. 1. 0. 0. 1.]
 [0. 0. 1. 1. 0. 1. 0.]
 [0. 0. 1. 1. 0. 1. 1.]
 [0. 0. 1. 1. 1. 0. 0.]
 [0. 0. 1. 1. 1. 0. 1.]
 [0. 0. 1. 1. 1. 1. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 1.]
 [0. 1. 0. 0. 0. 1. 0.]
 [0. 1. 0. 0. 0. 1. 1.]
 [0. 1. 0. 0. 1. 0. 0.]
 [0. 1. 0. 0. 1. 0. 1.]
 [0. 1. 0. 0. 1. 1. 0.]
 [0. 1. 0. 0. 1. 1. 1.]
 [0. 1. 0. 1. 0. 0. 0.]
 [0. 1. 0. 1. 0. 0. 1.]
 [0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 1.]
 [0. 1. 0. 1. 1. 0. 0.]
 [0. 1. 0. 1. 1. 0. 1.]
 [0. 1. 0. 1. 1. 1. 0.]
 [0. 1. 1. 0. 0. 0. 0.]
 [0. 1. 1. 0. 0. 0. 1.]
 [0. 1. 1. 0. 0. 1. 0.]
 [0. 1. 1. 0. 0. 1. 1.]
 [0. 1. 1. 0. 1. 0. 0.]
 [0. 1. 1. 0. 1. 0. 1.]
 [0. 1. 1. 0. 1. 1. 0.]
 [0. 1. 1. 1. 0. 0. 0.]
 [0. 1. 1. 1. 0. 0. 1.]
 [0. 1. 1. 1. 0. 1. 0.]
 [0. 1. 1. 1. 1. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 1.]
 [1. 0. 0. 0. 0. 1. 0.]
 [1. 0. 0. 0. 0. 1. 1.]
 [1. 0. 0. 0. 1. 0. 0.]
 [1. 0. 0. 0. 1. 0. 1.]
 [1. 0. 0. 0. 1. 1. 0.]
 [1. 0. 0. 0. 1. 1. 1.]
 [1. 0. 0. 1. 0. 0. 0.]
 [1. 0. 0. 1. 0. 0. 1.]
 [1. 0. 0. 1. 0. 1. 0.]
 [1. 0. 0. 1. 0. 1. 1.]
 [1. 0. 0. 1. 1. 0. 0.]
 [1. 0. 0. 1. 1. 0. 1.]
 [1. 0. 0. 1. 1. 1. 0.]
 [1. 0. 1. 0. 0. 0. 0.]
 [1. 0. 1. 0. 0. 0. 1.]
 [1. 0. 1. 0. 0. 1. 0.]
 [1. 0. 1. 0. 0. 1. 1.]
 [1. 0. 1. 0. 1. 0. 0.]
 [1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 1. 0.]
 [1. 0. 1. 1. 0. 0. 0.]
 [1. 0. 1. 1. 0. 0. 1.]
 [1. 0. 1. 1. 0. 1. 0.]
 [1. 0. 1. 1. 1. 0. 0.]
 [1. 1. 0. 0. 0. 0. 0.]
 [1. 1. 0. 0. 0. 0. 1.]
 [1. 1. 0. 0. 0. 1. 0.]
 [1. 1. 0. 0. 0. 1. 1.]
 [1. 1. 0. 0. 1. 0. 0.]
 [1. 1. 0. 0. 1. 0. 1.]
 [1. 1. 0. 0. 1. 1. 0.]
 [1. 1. 0. 1. 0. 0. 0.]
 [1. 1. 0. 1. 0. 0. 1.]
 [1. 1. 0. 1. 0. 1. 0.]
 [1. 1. 0. 1. 1. 0. 0.]
 [1. 1. 1. 0. 0. 0. 0.]
 [1. 1. 1. 0. 0. 0. 1.]
 [1. 1. 1. 0. 0. 1. 0.]
 [1. 1. 1. 0. 1. 0. 0.]
 [1. 1. 1. 1. 0. 0. 0.]]
Size: 8
 Result:
[[0. 0. 0. ... 0. 0. 1.]
 [0. 0. 0. ... 0. 1. 0.]
 [0. 0. 0. ... 0. 1. 1.]
 ...
 [1. 1. 1. ... 1. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]]
Size: 9
 Result:
[[0. 0. 0. ... 0. 0. 1.]
 [0. 0. 0. ... 0. 1. 0.]
 [0. 0. 0. ... 0. 1. 1.]
 ...
 [1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]]
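
As a cross-check on the results above, the same selection can be written in plain NumPy with a boolean mask, without Numba. This is only a sketch: `find_reg_numpy` is a hypothetical name, not part of the original code, and the expected row count assumes every row is a 0/1 vector, so the rows kept are exactly those containing between 1 and 4 ones.

```python
import itertools
from math import comb

import numpy as np


def generate_states(size):
    # All 0/1 vectors of length `size`, one per row.
    return np.array(list(itertools.product([0., 1.], repeat=size)))


def find_reg_numpy(states):
    # Hypothetical helper: same filter as find_reg_numba, in plain NumPy.
    states_sum = states.sum(axis=1)
    mask = (states_sum > 0) & (states_sum < 5)
    return states[mask]


for size in range(1, 10):
    rows = find_reg_numpy(generate_states(size)).shape[0]
    # Rows with exactly k ones number C(size, k); keep k = 1..4.
    expected = sum(comb(size, k) for k in range(1, 5))
    print(size, rows, rows == expected)
```

Since the filter is already vectorized, it may be worth timing this against the Numba version before assuming that JIT compilation helps here.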
DarrylG
0
  1. The problem is that the `list.append` method is not thread safe. If you comment out that line, you won't get the error.

  2. Intuitively, it makes sense that `append` is not thread safe. Think about what happens when two threads try to append to the end of the same list: both write to the same memory address, and both may try to resize the list's underlying buffer at the same time.

  3. Admittedly, the Numba documentation can be sparse, but it does mention that when using `prange`, it's up to the user to ensure there are no "cross iteration dependencies".

  4. You can fix the error by pre-initialising the list and then setting the values via indexing.

For example:

import numpy as np
from numba import jit, prange

@jit(nopython=True, parallel=True)
def find_reg_numba(states):
    # Pre-fill reg with dummy arrays of the same dtype as the rows that will replace them
    reg = [np.empty(1).astype(states[0].dtype) for i in range(states.shape[0])]
    states_sum = np.sum(states, axis=1)
    for i in prange(states.shape[0]):
        if states_sum[i] > 0 and states_sum[i] < 5:
            # Each iteration writes only to its own slot, so no two threads touch the same element
            reg[i] = states[i]
    # Slots whose row failed the condition still hold the dummy array
    return reg
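
Note that the indexed version above returns one entry per input row, so rows that fail the condition are left as dummy arrays. A possible variation, sketched here under assumptions, keeps the same "one slot per iteration" idea but writes into a boolean mask and does the actual selection afterwards in ordinary NumPy. The names `row_mask` and `find_reg_mask` are hypothetical, and the `try`/`except` fallback exists only so the sketch still runs where Numba is not installed.

```python
import itertools

import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption for this sketch only): run as plain Python.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def row_mask(states):
    n = states.shape[0]
    keep = np.zeros(n, dtype=np.bool_)
    for i in prange(n):
        # Row sum computed locally; iteration i only ever writes keep[i].
        total = 0.0
        for j in range(states.shape[1]):
            total += states[i, j]
        keep[i] = total > 0 and total < 5
    return keep


def find_reg_mask(states):
    # Selection happens outside the parallel loop, in ordinary NumPy.
    return states[row_mask(states)]


def generate_states(size):
    return np.array(list(itertools.product([0., 1.], repeat=size)))


reg = find_reg_mask(generate_states(6))
print(reg.shape)
```

Because each iteration touches only its own element of `keep`, there is no shared state that grows inside the `prange` loop, and the result is a regular 2-D array rather than a list.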
Arran Duff