
I would like to JIT-compile a Python function with numba, but my code contains a dictionary that is too complex for that. Its keys are of type int, and its values are lists of int with different lengths.

I have a network graph of connected points (Fig. https://www.data-to-viz.com/graph/network_files/figure-html/unnamed-chunk-2-1.png). I use the dictionary to store the neighbor points of each point. Because points can have different numbers of neighbors, the lists inside the dictionary have different lengths (see the code example below). Numba does not support this, and I am having a hard time finding an alternative.

My ideas:

  1. Extract the neighbor points from the pairs list (see below) every time I need the neighbors of some point. But this would slow down my code, which is the opposite of what I am trying to achieve with numba.
  2. Use a 2D numpy array. However, the rows then cannot have variable length, so every row must be as long as the largest neighbor list, which results in quite large memory usage.

I would be grateful for any ideas.

Minimal code example:

from numba.typed import Dict
from numba.typed import List
from numba import types
from numba import jit


# @jit(nopython=True) # works only without jit...
def f():
    # 3 points: 0, 1, 2
    # Only 0 and 1 as well as 0 and 2 are connected. 1 and 2 are not connected
    pairs = [(0, 1), (2, 0)]
    # Save the neighbor of each point in a dictionary
    d = dict()

    def add_element_to_dict_list(key: int, element: int, dict_list: dict):
        if key in dict_list:
            dict_list[key].append(element)
        else:
            dict_list[key] = [element]

    for p1, p2 in pairs:
        add_element_to_dict_list(p1, p2, d)
        add_element_to_dict_list(p2, p1, d)

    print(d)

f()
cakelover

1 Answer

Your example probably fails because you are asking Numba not to use Python objects (nopython=True). In that mode Numba cannot handle containers whose element types it has no way of inferring, such as a plain dict that starts out empty. You can either remove nopython=True and allow Numba to fall back to Python objects, or use Numba's experimental typed containers, described at https://numba.pydata.org/numba-doc/0.43.0/reference/pysupported.html#dict.

The following example should print the same thing as your non-jitted function.

import numba
import numba.typed  # explicit import of the typed containers used below
import numba.types


list_type = numba.types.ListType(numba.types.int64)


@numba.jit(nopython=True)
def f():
    pairs = [(0, 1), (2, 0)]

    d = numba.typed.Dict.empty(numba.types.int64, list_type)

    for p1, p2 in pairs:
        if p1 not in d:
            d[p1] = numba.typed.List.empty_list(numba.types.int64)
        d[p1].append(p2)
        if p2 not in d:
            d[p2] = numba.typed.List.empty_list(numba.types.int64)
        d[p2].append(p1)
    return d

d = f()

{k: list(v) for k, v in d.items()}
# returns {0: [1, 2], 1: [0], 2: [0]}

See also https://stackoverflow.com/a/61077948/8483989
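
If the typed containers turn out to be too slow or too awkward, a different technique that avoids both the dictionary and the padded 2D array from the question is a compressed (CSR-style) adjacency layout: one flat array holding all neighbors, plus an offsets array marking where each point's neighbors start and end. The following is only a minimal sketch in plain NumPy. The names build_csr_neighbors and neighbors_of are hypothetical, it assumes the points are numbered 0 to n_points - 1, and it has not been tested under Numba.

```python
import numpy as np


def build_csr_neighbors(pairs, n_points):
    """Build a flat neighbor array plus per-point offsets from undirected pairs.

    Hypothetical helper: assumes points are labelled 0 .. n_points - 1.
    """
    pairs = np.asarray(pairs, dtype=np.int64)

    # Count how many neighbors each point has.
    degree = np.zeros(n_points, dtype=np.int64)
    for p1, p2 in pairs:
        degree[p1] += 1
        degree[p2] += 1

    # neighbors[offsets[i]:offsets[i + 1]] holds the neighbors of point i.
    offsets = np.zeros(n_points + 1, dtype=np.int64)
    offsets[1:] = np.cumsum(degree)

    neighbors = np.empty(offsets[-1], dtype=np.int64)
    cursor = offsets[:-1].copy()  # next free slot for each point
    for p1, p2 in pairs:
        neighbors[cursor[p1]] = p2
        cursor[p1] += 1
        neighbors[cursor[p2]] = p1
        cursor[p2] += 1
    return neighbors, offsets


def neighbors_of(point, neighbors, offsets):
    """Return the neighbors of `point` as a slice (a view) of the flat array."""
    return neighbors[offsets[point]:offsets[point + 1]]


neighbors, offsets = build_csr_neighbors([(0, 1), (2, 0)], n_points=3)
print(neighbors_of(0, neighbors, offsets).tolist())
```

With this layout the total memory is two entries per pair plus n_points + 1 offsets, regardless of how unevenly the neighbors are distributed, and a lookup is a single slice.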

lagru