I am working with some data and want to bin a column into labelled levels with pd.cut, like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': 10000*[1], 'B': np.random.randint(0, 1001, 10000)})
df['level'] = pd.cut(df.B, bins = [0, 200, 400, 600, 800, 1000],
                     labels = ['i', 'ii', 'iii', 'iv', 'v'])
When I count the number of values in each level, the following two approaches give different answers:
df.level.value_counts(sort = False)
i 1934
ii 1994
iii 2055
iv 2056
v 1952
Name: level, dtype: int64
df.pivot_table(index = 'A', columns = 'level', values = 'B', aggfunc = 'count').loc[1]
level
i 1994
ii 2056
iii 1934
iv 1952
v 2055
Name: 1, dtype: int64
Shouldn't both methods give the same results? The same five counts appear in both outputs, but they are attached to different levels.
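For anyone trying to reproduce this, here is a minimal sketch that puts the counts side by side. It assumes a reasonably recent pandas in which `pivot_table` accepts `observed`. The fixed seed, the `manual` count built from plain boolean masks, and the `comparison` frame are my additions for illustration and are not part of the original data, so the numbers will differ from the ones above.

```python
import numpy as np
import pandas as pd

# Fixed seed (an assumption, added only so the run is repeatable).
rng = np.random.RandomState(0)
df = pd.DataFrame({'A': 10000 * [1], 'B': rng.randint(0, 1001, 10000)})

labels = ['i', 'ii', 'iii', 'iv', 'v']
edges = [0, 200, 400, 600, 800, 1000]
df['level'] = pd.cut(df.B, bins=edges, labels=labels)

# Method 1: value_counts on the categorical column.
by_value_counts = df.level.value_counts(sort=False)

# Method 2: pivot_table with the categorical column as the columns.
by_pivot = df.pivot_table(index='A', columns='level', values='B',
                          aggfunc='count', observed=False).loc[1]

# Independent check that does not rely on the categorical at all:
# count B in each right-closed interval (lo, hi], which is what pd.cut uses by default.
manual = {
    lab: int(((df.B > lo) & (df.B <= hi)).sum())
    for lab, lo, hi in zip(labels, edges[:-1], edges[1:])
}

# Look every count up by label so the comparison does not depend on output order.
comparison = pd.DataFrame({
    'value_counts': [int(by_value_counts.loc[lab]) for lab in labels],
    'pivot_table': [int(by_pivot.loc[lab]) for lab in labels],
    'manual': [manual[lab] for lab in labels],
}, index=labels)
print(comparison)
```

If the `pivot_table` column disagrees with the other two for the same label, the counts are being attached to the wrong level rather than miscounted. Note that values of exactly 0 fall outside the first bin `(0, 200]` and become NaN, so the counts sum to slightly less than 10000.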