Consider a NumPy array
a = array([1,2,25,13,10,9,4,5])
containing an even number of elements. I need to keep exactly one element from each consecutive pair, chosen at random: either the first or the second, then either the third or the fourth, and so on. For example, with a, valid results include:
c = array([1,13,9,5])
d = array([2,13,10,4])
e = array([2,25,10,5])
I have to do this on long arrays of hundreds of elements, and on thousands of such arrays inside large loops. What would be the fastest algorithm, faster than iterating over the elements and keeping or deleting one out of every two using pair_index+random.randint(0,1)?
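For reference, the element-by-element approach described above could look roughly like this. It is only a sketch of the baseline to beat; the function name `pick_one_per_pair_loop` is made up for illustration.

```python
import random

import numpy as np


def pick_one_per_pair_loop(a):
    """Loop baseline: from each pair, keep a[pair_index + random.randint(0, 1)]."""
    picked = []
    for pair_index in range(0, len(a) - 1, 2):
        # random.randint is inclusive on both ends, so the offset is 0 or 1.
        picked.append(a[pair_index + random.randint(0, 1)])
    return np.array(picked)


a = np.array([1, 2, 25, 13, 10, 9, 4, 5])
print(pick_one_per_pair_loop(a))  # output varies from run to run
```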
A generalised method that keeps one element out of every three, four, etc. would be nice ;-)
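A generalised version might look like the sketch below. The function name `pick_one_per_group` and the optional `rng` parameter are illustrative assumptions, and the sketch assumes the array length is an exact multiple of k (it raises an error otherwise).

```python
import numpy as np


def pick_one_per_group(a, k, rng=None):
    """Keep one randomly chosen element from each consecutive group of k elements.

    Assumes a is 1-D and len(a) is a multiple of k; raises ValueError otherwise.
    """
    a = np.asarray(a)
    if a.size % k != 0:
        raise ValueError("len(a) must be a multiple of k")
    rng = np.random.default_rng() if rng is None else rng
    n_groups = a.size // k
    # Start index of each group plus a random offset in [0, k).
    idx = np.arange(n_groups) * k + rng.integers(0, k, size=n_groups)
    return a[idx]


a = np.array([1, 2, 25, 13, 10, 9, 4, 5])
print(pick_one_per_group(a, 2))              # one element per pair
print(pick_one_per_group(np.arange(12), 3))  # one element per triple
```

Building the index array once and using a single fancy-indexing step keeps everything vectorised, so no Python-level loop over elements is needed.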
Thanks
Results of timing the three proposed solutions:
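import timeit
import numpy

def soluce1():
    k = 2
    a = numpy.array([1,2,25,13,10,9,4,5])
    # One row per group of k elements
    aa = a.reshape(-1, k)
    # Random column index for each row
    i = numpy.random.randint(k, size=aa.shape[0])
    return numpy.choose(i, aa.T)

def soluce2():
    k = 2
    a = numpy.array([1,2,25,13,10,9,4,5])
    w = len(a) // k
    # Random offset (0 or 1) added to the start index of each pair;
    # note that the pair size 2 is hardcoded here rather than using k
    i = numpy.random.randint(0, 2, w) + numpy.arange(0, 2 * w, 2)
    return a[i]

def random_skip():
    a = numpy.array([1,2,25,13,10,9,4,5])
    k = 2
    # Start index of each group, then a random offset in [0, k)
    idx = numpy.arange(0, len(a), k)
    idx += numpy.random.randint(k, size=len(idx))
    # Keep indices in range when len(a) is not a multiple of k
    idx = numpy.clip(idx, 0, len(a)-1)
    return a[idx]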
> ts1=timeit.timeit(stmt='soluce1()',setup='from __main__ import soluce1',number=10000)
> --> 161 µs
> ts2=timeit.timeit(stmt='soluce2()',setup='from __main__ import soluce2',number=10000)
> --> 159 µs
> ts3=timeit.timeit(stmt='random_skip()',setup='from __main__ import random_skip',number=10000)
> --> 166 µs
The three proposals seem to be equivalent in speed. Thanks again, all.
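One caveat about the measurement: each function above rebuilds the 8-element array on every call, so part of the measured time is array construction. A variant that builds a longer array once and times only the selection step might look like this. It is a sketch: the array size of 500, the random test data, and the names `by_choose` and `by_index` are assumptions chosen for illustration. Note that `timeit.timeit` returns the total time in seconds for all `number` calls, so it is divided by the call count here.

```python
import timeit

import numpy as np

# Hypothetical test data, built once outside the timed code.
a = np.random.default_rng().integers(0, 100, size=500)
k = 2


def by_choose(a, k):
    # Reshape into rows of k elements and pick one column per row.
    aa = a.reshape(-1, k)
    i = np.random.randint(k, size=aa.shape[0])
    return np.choose(i, aa.T)


def by_index(a, k):
    # Start index of each group plus a random offset in [0, k).
    idx = np.arange(0, len(a), k) + np.random.randint(k, size=len(a) // k)
    return a[idx]


n_calls = 10_000
for name, fn in [("choose", by_choose), ("index", by_index)]:
    total = timeit.timeit(lambda: fn(a, k), number=n_calls)
    print(f"{name}: {total / n_calls * 1e6:.2f} us per call")
```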