
This is a simplified example of the arrays I have:

a = np.array([ 1, 12, 60, 80, 90, 210])
b = np.array([11, 30, 79, 89, 99, 232])

How can I use `a` as the start of each range and `b` as the end of each range, and quickly compute the array of all numbers in between?

So `c` would look like:

c = np.array([1,2,3,...,11, 12,13,14,...,29,30, 
              60,61,62,...79, ..., 210,211,...,231,232])

Ideally, this would be done in a vectorised way (using numpy/pandas) rather than with a plain Python loop.

A H
  • You should be able to use `zip()` here. Are `a` and `b` always the same size? – pault Jan 16 '18 at 13:50
  • 2
    If you import `add` from `operator`, you can do the following: `c = np.array(reduce(add, [range(x, y) for x, y in zip(a, b)]))` – pault Jan 16 '18 at 13:57
  • 3
    You can try doing it this way: `c= np.array(np.concatenate([np.arange(a[i],b[i]+1) for i in range(len(a))]))`. – Vasilis G. Jan 16 '18 at 13:58
  • I don't know about the speed difference in `np.concatenate()` vs using `reduce()` and `add()`, but I like @VasilisG.'s solution because it doesn't require any additional imports. – pault Jan 16 '18 at 14:00
  • 3
    You can also use a combination of Vasilis' and pault's answer, `c = np.concatenate([np.arange(x,y+1) for x,y in zip(a,b)])` – Thomas Kühn Jan 16 '18 at 14:02
  • @ThomasKühn thank you, I was about to say that. Using `zip` is better. – Vasilis G. Jan 16 '18 at 14:03
  • Excellent answers, much more readable than the 'duplicated' answer provided by Divakar, although his is quicker. Thanks guys.(Still checking if Divakar's answer gives me the correct result - got a large dataset to check through) – A H Jan 16 '18 at 14:08
  • 2
    @AH Would love to hear about the timings on your dataset. – Divakar Jan 16 '18 at 14:18
  • 1
    My array of ~50,000 items (in each a & b) Using your answer Divakar: 1.92 ms ± 48.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) Using the other answer: 97.6 ms ± 3.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) So 2 orders of magnitude, something to be expected of doing it vectorised. Thank you all for your answers :). – A H Jan 16 '18 at 15:25
  • 1
    Yeah, NumPy is good with those vectorized ones :) – Divakar Jan 16 '18 at 15:40
  • 1
    It's often the case that for small examples, list operations are faster. The array version may have larger overhead, and thus only has the advantage when the problem becomes large. – hpaulj Jan 16 '18 at 17:55
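The timings above compare the list-comprehension answer against a fully vectorised solution. Divakar's linked answer is not reproduced here, but the usual fully vectorised technique for this problem can be sketched with a cumsum trick (a sketch of the general approach, not necessarily his exact code):

```python
import numpy as np

a = np.array([1, 12, 60, 80, 90, 210])
b = np.array([11, 30, 79, 89, 99, 232])

# Length of each inclusive range [a[i], b[i]]
lengths = b - a + 1

# Step array: mostly ones, but at each range boundary the step jumps
# from the previous range's end to the next range's start.
steps = np.ones(lengths.sum(), dtype=a.dtype)
steps[0] = a[0]
steps[np.cumsum(lengths)[:-1]] = a[1:] - b[:-1]

# Cumulative sum of the steps reconstructs all the ranges at once,
# with no Python-level loop over the pairs.
c = np.cumsum(steps)
```

This avoids creating one temporary array per pair, which is where the two-orders-of-magnitude speedup on large inputs comes from.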

1 Answer


Summarizing the comments above: one way is to use `zip()` and `np.concatenate()`.

c = np.concatenate([np.arange(x, y+1) for x, y in zip(a,b)])

HT to @VasilisG. and @ThomasKühn.
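For reference, a self-contained version using the arrays from the question (note the `y + 1`, which makes each range inclusive of its end value, as the expected `c` requires):

```python
import numpy as np

a = np.array([1, 12, 60, 80, 90, 210])
b = np.array([11, 30, 79, 89, 99, 232])

# One inclusive arange per (start, end) pair, concatenated into a single array
c = np.concatenate([np.arange(x, y + 1) for x, y in zip(a, b)])

print(c[0], c[-1], c.size)  # 1 232 93
```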

pault