
I have some Python code, and I'm wondering what I can do to speed up creation of the array using Cython. Note that I have already tried other methods; see my earlier question, "Counting Algorithm Performance Optimization in Pypy vs Python (Numpy vs List)".

It seems like Cython is significantly faster than anything I've tried before right out of the box. I am wondering if I can get even more performance.

#!/usr/bin/env python
def create_array(size=4):
    """
    Creates a multi-dimensional array from size
    """
    array = [(x, y, z)
             for x in xrange(size)
             for y in xrange(size)
             for z in xrange(size)]
    return array

Thanks in advance!

Brian Bruggeman

1 Answer


I won't help with the Cython code, but I believe this operation can still be done efficiently in NumPy; you just haven't looked deep enough yet.

import numpy as np

def variations_with_repetition(alphabetlen):
    """Return an array of all possible ordered triples (len=3) with
    elements chosen from range(alphabetlen), one triple per row."""

    a = np.arange(alphabetlen)
    z = np.vstack((
        np.repeat(a, alphabetlen**2),                       # first element: changes slowest
        np.tile(np.repeat(a, alphabetlen), alphabetlen),    # second element
        np.tile(a, alphabetlen**2))).T                      # third element: changes fastest
    return z
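As a quick sanity check, the sketch below compares this function against the list comprehension from the question for a small size. It assumes Python 3 (so `range` stands in for `xrange`), and the `expected`, `result`, and `same` names are only illustrative.

```python
import numpy as np

def variations_with_repetition(alphabetlen):
    # Same construction as in the answer: one row per (x, y, z) triple.
    a = np.arange(alphabetlen)
    return np.vstack((
        np.repeat(a, alphabetlen**2),
        np.tile(np.repeat(a, alphabetlen), alphabetlen),
        np.tile(a, alphabetlen**2))).T

def create_array(size=4):
    # The question's list comprehension, with range instead of xrange
    # (assumption: Python 3).
    return [(x, y, z)
            for x in range(size)
            for y in range(size)
            for z in range(size)]

size = 4
expected = np.array(create_array(size))
result = variations_with_repetition(size)

# Both approaches should produce the same triples in the same order.
same = np.array_equal(result, expected)
print(result.shape, same)
```

The rows come out in the same order as the nested loops, so the NumPy array can stand in for the list wherever only the values matter.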

Absolute execution speed is not very meaningful here: you only mention that you want it below 2 ms for alphabetlen=32, and whether that is met depends on your CPU. But I can compare your own proposed method to this one:

In [4]: %timeit array = [(x, y, z) for x in xrange(size) for y in xrange(size) for z in xrange(size)]
100 loops, best of 3: 3.3 ms per loop

In [5]: %timeit variations_with_repetition(32)
1000 loops, best of 3: 348 µs per loop

That's well below your desired 2 ms. But once again, your mileage may vary depending on the CPU.
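To repeat the comparison outside IPython, something like the following sketch with the standard `timeit` module should work. It assumes Python 3, and the helper names `list_version`, `numpy_version`, and `best_ms` are made up for illustration. The numbers it prints will differ from the ones above depending on hardware and interpreter version.

```python
import timeit
import numpy as np

def list_version(size):
    # The question's approach (range instead of xrange; assumes Python 3).
    return [(x, y, z)
            for x in range(size)
            for y in range(size)
            for z in range(size)]

def numpy_version(alphabetlen):
    # The answer's approach.
    a = np.arange(alphabetlen)
    return np.vstack((
        np.repeat(a, alphabetlen**2),
        np.tile(np.repeat(a, alphabetlen), alphabetlen),
        np.tile(a, alphabetlen**2))).T

def best_ms(func, arg, repeat=3, number=20):
    # Best-of-`repeat` average time per call, in milliseconds.
    times = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(times) / number * 1000.0

list_ms = best_ms(list_version, 32)
numpy_ms = best_ms(numpy_version, 32)
print("list comprehension: %.3f ms per call" % list_ms)
print("numpy:              %.3f ms per call" % numpy_ms)
```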

Oliver W.
  • Thank you for this. It's not exactly what I wanted, but I will accept it. In addition, I had really wanted to go the numpy route since I need to do transformations on the data later. – Brian Bruggeman Feb 23 '15 at 20:27