
I want to process a big list of uint numbers (test1), and I do that in chunks of "length". I need them as signed ints, and then I need the absolute length (magnitude) of each pair of even- and odd-indexed values in this list.

But I want to get rid of two problems:

  1. it uses a lot of RAM
  2. it takes ages!

So how could I make this faster? Any trick? I could also use numpy, no problem in doing so.

Thanks in advance!

test2 = -127 + test1[i:i+length*2048000*2 + 2048000*2*1]  # shift the chunk to signed values
test3 = (test2[::2]**2 + test2[1::2]**2)**0.5  # magnitude of each (even, odd) pair
  • What do you mean by 'I could also use numpy'? The expressions you give use `numpy`. They aren't using Python lists. – hpaulj Jan 16 '16 at 23:11
  • have a look at [numba](http://numba.pydata.org/numba-doc/0.12.2/tutorial_firststeps.html) or [PyPy](http://pypy.org/) just-in-time compilers – Aprillion Jan 17 '16 at 16:15
  • So do you have 8 bit `uint`'s? Because you're subtracting 127. If that's the case you can (probably) improve a lot by doing `test2 = np.int8(-127) + test1[...]`. Maybe convert `test1` to a `int8` dtype first. –  Jan 17 '16 at 16:42
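
Following up on the last comment, a small demonstration of how the intermediate dtype drives memory use (a hypothetical sketch, assuming 8-bit unsigned input; note that int8 itself cannot hold 255 - 127 = 128, so int16 is the smallest safe signed type):

import numpy as np

a = np.random.randint(0, 256, 1000000).astype(np.uint8)  # stand-in chunk of 8-bit data

wide = a.astype(np.int64) - 127  # 8 bytes per element
slim = a.astype(np.int16) - 127  # 2 bytes per element, still holds -127..128
print(wide.nbytes, slim.nbytes)  # 8000000 vs 2000000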

1 Answer


An efficient way is to use Numpy functions, e.g.:

import numpy as np

n = 10
ff = np.random.randint(0, 255, n)  # generate some data

ff2 = ff.reshape(n // 2, 2)  # new view on ff (reshape only copies if needed)
l_ff = np.linalg.norm(ff2, axis=1)  # calculate the vector length of each row

Note that modifying an entry in ff2 changes ff as well, and vice versa.
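
A quick check of the shared buffer, using ff and ff2 from above:

ff2[0, 0] = 999
print(ff[0])  # 999: ff2 is a view into ff's memory, not a copy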

Internally, Numpy stores data in contiguous memory blocks, so there are further methods besides np.reshape() that exploit that structure. For efficient conversion between data types, you can try:

dd_s = np.arange(-5, 10, dtype=np.int8)  # signed 8-bit data
dd_u = dd_s.astype(np.uint8)  # conversion from signed to unsigned (values wrap modulo 256)
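
Putting both ideas together for the question's use case, here is a minimal sketch (assuming the raw data are 8-bit unsigned samples, as the comments suggest; `test1` below is random stand-in data, not the asker's actual array):

import numpy as np

test1 = np.random.randint(0, 256, 2048000 * 2).astype(np.uint8)  # stand-in data

# int16 is the smallest signed dtype that holds -127..128 without overflow
test2 = test1.astype(np.int16) - 127
pairs = test2.reshape(-1, 2).astype(np.float32)  # float32 keeps the temporary small
test3 = np.linalg.norm(pairs, axis=1)  # magnitude of each (even, odd) pair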