You could use some broadcasting
magic there -
(x.cumsum(1) > sample[:,None]).argmax(1)-1
Steps involved :
I. Perform cumsum along each row.
II. Use broadcasted comparison for each cumsum row against each sample value and look for the first occurrence of sample being lesser than cumsum values, signalling that the element before that in x
is the index we are looking for.
Step-by-step run -
In [64]: x
Out[64]:
array([[ 0.1 , 0.1 , 0.1 , 0.1 , 0.1 , 0.1 , 0.1 ],
[ 0.8 , 0.96, 0.88, 0.36, 0.5 , 0.68, 0.71],
[ 0.37, 0.56, 0.5 , 0.01, 0.77, 0.88, 0.36],
[ 0.62, 0.08, 0.37, 0.93, 0.65, 0.4 , 0.79]])
In [65]: sample # one elem per row of x
Out[65]: array([ 0.5, 2.2, 1.9, 2.2])
In [78]: x.cumsum(1)
Out[78]:
array([[ 0.1 , 0.2 , 0.3 , 0.4 , 0.5 , 0.6 , 0.7 ],
[ 0.8 , 1.76, 2.64, 2.99, 3.49, 4.18, 4.89],
[ 0.37, 0.93, 1.43, 1.45, 2.22, 3.1 , 3.47],
[ 0.62, 0.69, 1.06, 1.99, 2.64, 3.04, 3.83]])
In [79]: x.cumsum(1) > sample[:,None]
Out[79]:
array([[False, False, False, False, False, True, True],
[False, False, True, True, True, True, True],
[False, False, False, False, True, True, True],
[False, False, False, False, True, True, True]], dtype=bool)
In [80]: (x.cumsum(1) > sample[:,None]).argmax(1)-1
Out[80]: array([4, 1, 3, 3])
# A loopy solution to verify results against
In [81]: [np.digitize( sample[i], np.cumsum(x[i]))-1 for i in range(x.shape[0])]
Out[81]: [4, 1, 3, 3]
Boundary cases :
The proposed solution automatically handles the cases where sample
values are lesser than smallest of cumulative summed values -
In [113]: sample[0] = 0.08 # editing first sample to be lesser than 0.1
In [114]: [np.digitize( sample[i], np.cumsum(x[i]))-1 for i in range(x.shape[0])]
Out[114]: [-1, 1, 3, 3]
In [115]: (x.cumsum(1) > sample[:,None]).argmax(1)-1
Out[115]: array([-1, 1, 3, 3])
For cases where a sample
value is greater than largest of cumulative summed values, we need one extra step -
In [116]: sample[0] = 0.8 # editing first sample to be greater than 0.7
In [121]: mask = (x.cumsum(1) > sample[:,None])
In [122]: idx = mask.argmax(1)-1
In [123]: np.where(mask.any(1),idx,x.shape[1]-1)
Out[123]: array([6, 1, 3, 3])
In [124]: [np.digitize( sample[i], np.cumsum(x[i]))-1 for i in range(x.shape[0])]
Out[124]: [6, 1, 3, 3]