For a single series I can already plot this, following "Plot Normal distribution with Matplotlib".

For example, I have a pandas DataFrame like the one below:

name,distance
Peter,13
Sam,14
Peter,15
Sam,12
Sam,13
Peter,14

With df.groupby('name').describe() I can display the min/max/mean for each user.

However, I want to draw a normal distribution based on the existing data. I tried df.sort_values(by='name').groupby('name').plot()

but it won't draw a PDF or normal distribution. How can I use numpy to achieve that?

Thanks

Koukou
Kenneth Yeung

1 Answer

IIUC, what you want is to plot a distance histogram for both name values in the same plot.

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'name': ['Peter', 'Sam', 'Peter', 'Sam', 'Sam', 'Peter'],
                   'distance': [13, 14, 15, 12, 13, 14]})

# Draw one histogram per name on the same axes
for name in df['name'].unique():
    plt.hist(df.loc[df['name'] == name, 'distance'], label=name)
plt.legend()


UPDATE:

As OP suggested in the comments, it's possible to draw these without a for loop.

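# Histogram of distance for each name (counts)
df.groupby('name').distance.plot.hist()
# Kernel density estimate for each name (needs SciPy installed)
df.groupby('name').distance.plot.kde()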

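If what you are after is a fitted normal curve per name rather than a histogram or KDE, here is a minimal sketch using numpy. It assumes the sample mean and sample standard deviation of each group are reasonable estimates of the normal parameters, which is a strong assumption with only three points per name. The helper normal_pdf, the fitted dict, and the x range are names and choices I made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'name': ['Peter', 'Sam', 'Peter', 'Sam', 'Sam', 'Peter'],
                   'distance': [13, 14, 15, 12, 13, 14]})

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma (hypothetical helper)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Common x grid, padded by 3 units on each side (an arbitrary choice)
x = np.linspace(df['distance'].min() - 3, df['distance'].max() + 3, 200)

fitted = {}  # name -> (mean, std), kept so the estimates can be inspected
for name, group in df.groupby('name'):
    mu = group['distance'].mean()
    sigma = group['distance'].std()  # pandas default: sample std (ddof=1)
    fitted[name] = (mu, sigma)
    plt.plot(x, normal_pdf(x, mu, sigma),
             label=f"{name} (mean={mu:.1f}, std={sigma:.2f})")

plt.xlabel('distance')
plt.ylabel('density')
plt.legend()
# Call plt.show() when running outside a notebook.
```

With more data per name you could overlay these curves on plt.hist(..., density=True) to check how well the normal assumption holds.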

akilat90