3

When you apply pandas.cut to a DataFrame column, the output includes the bin of each element along with Name:, Length:, dtype:, and Categories. I just want the Categories array printed so that I get only the ranges of the bins I asked for. For example, with bins=4 applied to a DataFrame holding the numbers 1, 2, 3, 4, 5, I would want the output to show only the ranges of the four bins, i.e. (1, 2], (2, 3], (3, 4], (4, 5].

Is there any way I can do this? Any approach is fine, even if it doesn't involve printing "Categories".

Jon Clements
Phoebe
    Would you care to share your attempt at the problem? That way you are more likely to obtain answers – Sheldore Sep 16 '18 at 17:07
    Please post a [**Minimal**, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). – Alex Sep 16 '18 at 17:30

2 Answers

5

I assume you just want to get the bins from pd.cut(). If so, you can set retbins=True (see the pd.cut documentation). For example:

In[01]:

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
# retbins=True makes pd.cut return the bin edges alongside the binned values
cats, bins = pd.cut(data.a, 4, retbins=True)

Out[01]:

cats:

0    (0.996, 2.0]
1    (0.996, 2.0]
2      (2.0, 3.0]
3      (3.0, 4.0]
4      (4.0, 5.0]
Name: a, dtype: category
Categories (4, interval[float64]): [(0.996, 2.0] < (2.0, 3.0] < (3.0, 4.0] < (4.0, 5.0]]

bins:

array([0.996, 2.   , 3.   , 4.   , 5.   ])

You can then reuse the bins as you please, e.g.:

lst = [1, 2, 3]
category = pd.cut(lst, bins)
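
If what you want is only the interval labels rather than the bin edges, one option is to read them from the `.cat.categories` attribute of the categorical Series that pd.cut returns. The following is a minimal sketch under that assumption; the variable names `categories`, `labels` and `rebuilt` are made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 4, 5]})
cats, bins = pd.cut(data["a"], 4, retbins=True)

# The categories alone, as an IntervalIndex (no per-row values, Name or dtype lines)
categories = cats.cat.categories
print(categories)

# One interval per line, as plain strings
labels = [str(interval) for interval in categories]
for label in labels:
    print(label)

# Alternatively, rebuild right-closed intervals from the returned bin edges
rebuilt = pd.IntervalIndex.from_breaks(bins)
print(rebuilt)
```

Note that the lowest interval starts slightly below the minimum value (0.996 in the output above) because pd.cut extends the range a little so that the minimum falls inside the first bin.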
Chenglong Ma
  • 345
  • 3
  • 13
0

For anyone who has come here to see how to select a particular bin produced by pd.cut: you can compare against a pd.Interval object.

df['bin'] = pd.cut(df['y'], [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
print(df["bin"].value_counts())

Output
(0.2, 0.3]    697
(0.4, 0.5]    156
(0.5, 0.6]    122
(0.3, 0.4]     12
(0.6, 0.7]      8
(0.7, 0.8]      4
(0.1, 0.2]      0
(0.8, 0.9]      0
print(df.loc[df['bin'] == pd.Interval(0.7, 0.8)])
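
Here is a self-contained sketch of the same idea. The data and bin edges are hypothetical (the original `df` is not shown), and the edges are chosen to be exactly representable as floats so that the hand-built interval matches a category exactly.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"y": [0.1, 0.3, 0.6, 0.7, 0.9]})
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
df["bin"] = pd.cut(df["y"], edges)

# pd.Interval is closed on the right by default, which matches pd.cut's default
target = pd.Interval(0.5, 0.75)

# Keep only the rows whose bin equals the target interval
selected = df.loc[df["bin"] == target]
print(selected)
```

The interval you compare against has to match a category exactly, including which side is closed; if pd.cut was called with right=False, the interval would presumably need closed='left' as well.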
Alex Punnen
  • 5,287
  • 3
  • 59
  • 71