I have an array x with data like this: [3.1, 3.0, 3.3, 3.5, 3.8, 3.75, 4.0], etc. I have another variable y with the corresponding 0s and 1s: [0, 1, 0], etc. I want to split both into separate arrays, one per histogram bin.

freq, bins = np.histogram(X, 5)

That tells me the cutoffs for each bin. But how do I actually get the data that falls into each one? For example, if I have two bins (3 to 3.5 and 3.5 to 4), I want to get two arrays in return like this: [3.1, 3.2, 3.4, ...] and [3.6, 3.7, 4, ...]. I also want the variable y to be split and grouped in the same way.

Summary: I am looking for code to break x into bins with corresponding y values.

I thought about doing something using the bins variable, but I am not sure how to split the data based on the cutoffs. I appreciate any help.

If I graph a normal histogram of X, I get this: [image: histogram of X]

Using code:

d=plt.hist(X, 5, facecolor='blue', alpha=0.5)

Working Code:

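from itertools import tee

import numpy as np


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def getLists(a, b, bin_obj):
    index_list = []
    for left, right in pairwise(bin_obj):
        # Half-open bins [left, right). Note: the maximum of a, which
        # np.histogram counts in the last bin, is excluded by this test.
        indices = np.where((a >= left) & (a < right))
        index_list += [indices[0]]  # np.where returns a tuple; keep the index array
    X_ret = [a[i] for i in index_list]
    Y_ret = [b[i] for i in index_list]
    return (X_ret, Y_ret)


freq, bins = np.histogram(X[:, 0], 5)

Xnew, Ynew = getLists(X[:, 0], Y, bins)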
shurup

1 Answer

There's a handy Python recipe in the itertools documentation (available directly as itertools.pairwise since Python 3.10).

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

It lets you iterate over consecutive pairs of bin edges and get the indices of the elements that fall between them.

for left, right in pairwise(bins):
    # x and y must be NumPy arrays (not lists) for this indexing to work
    indices = np.where((x >= left) & (x < right))[0]
    print(x[indices], y[indices])
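
An alternative that skips the pairwise helper is to label every element with its bin number via np.digitize and then select by label. This is only a minimal sketch: split_by_bins is a hypothetical helper name, x and y are assumed to be equal-length 1-D sequences, and the y values are made up for illustration. Unlike the x < right test above, it keeps the maximum of x in the last bin, which is how np.histogram counts it.

```python
import numpy as np


def split_by_bins(x, y, n_bins=5):
    """Group x and y by the equal-width bins np.histogram would use for x.

    Hypothetical helper for illustration; not part of NumPy.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    edges = np.histogram_bin_edges(x, bins=n_bins)
    # Digitize against the interior edges only, so labels run 0..n_bins-1
    # and the maximum of x falls into the last bin.
    labels = np.digitize(x, edges[1:-1])
    x_groups = [x[labels == k] for k in range(n_bins)]
    y_groups = [y[labels == k] for k in range(n_bins)]
    return x_groups, y_groups, edges


x = np.array([3.1, 3.0, 3.3, 3.5, 3.8, 3.75, 4.0])
y = np.array([0, 1, 0, 1, 1, 0, 1])  # made-up labels, assumed for this example
x_groups, y_groups, edges = split_by_bins(x, y, n_bins=2)

for xs, ys in zip(x_groups, y_groups):
    print(xs, ys)
```

With two bins on this sample, the first group holds 3.1, 3.0 and 3.3 and the second holds 3.5, 3.8, 3.75 and 4.0, each paired with its y values in the original order.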
tidylobster
  • The print statement did not work because indices contain some other data other than the indexes. I put the working code in my question. Thanks! – shurup Dec 18 '18 at 00:19