
Hi there. My dataset is as follows:

username    switch_state    time    
abcd         sw-off         07:53:15 +05:00 
abcd         sw-on          07:53:15 +05:00

Using this, I need to find how many times on a given day the switch state is changed, i.e. switched on or switched off. My test code is given below:

# Keep only the rows where the switch was turned off
switch_off = df.loc[df['switch_state'] == 'sw-off']
# Count events per time and username, then pivot usernames into columns
groupy_result = switch_off.groupby(['time', 'username']).count()['switch_state'].unstack()

The result of this groupby is:

print(groupy_result)
username  abcd
time             
05:08:35        3
07:53:15        3
07:58:40        1

As you can see, the count is attached to the time column. I need to separate them so that I can plot the data with a Seaborn scatter plot. I need x and y values, which in my case are x=time and y=count. How can I plot this column?


Jeff

1 Answer


You can try the following to get the counts as a column of the DataFrame itself:

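# Keep only the switch-off rows; .copy() avoids a SettingWithCopyWarning on the next line
df = df.loc[df['switch_state']=='sw-off'].copy()
# Attach the per-(username, time) row count to every row
df['count'] = df.groupby(['username','time'])['username'].transform('count')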

These two lines give you an updated DataFrame df with a new column called count.

df = df.drop_duplicates(subset=['username', 'time'], keep='first')

The above line will remove the duplicate rows. Then you can plot df['time'] and df['count'].

import matplotlib.pyplot as plt
plt.scatter(df['time'], df['count'])
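
As a rough alternative sketch (not part of the original steps), the count and the time can also be obtained as separate columns in one pass with `size()` and `reset_index()`, which skips the `transform` plus `drop_duplicates` combination. The sample rows and the `counts` name below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data shaped like the dataset in the question
df = pd.DataFrame({
    "username": ["abcd", "abcd", "abcd", "abcd"],
    "switch_state": ["sw-off", "sw-off", "sw-on", "sw-off"],
    "time": ["07:53:15", "07:53:15", "07:53:15", "07:58:40"],
})

# Keep only the switch-off events
switch_off = df.loc[df["switch_state"] == "sw-off"]

# size() counts the rows in each group; reset_index() turns the group keys
# back into ordinary columns and names the count column "count"
counts = (
    switch_off.groupby(["username", "time"])
    .size()
    .reset_index(name="count")
)

print(counts)
```

Each (username, time) pair then appears once, with `time` and `count` as ordinary columns, so something like `sns.scatterplot(data=counts, x="time", y="count")` should be able to take them as x and y, assuming the time values are of a type the plotting library accepts.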
Van Peer
  • The output is "abcd 07:53:15 3.0" followed by "abcd 07:53:15 3.0". As you can see, the time is not grouped correctly because it is repeated. Shouldn't there be only one count entry per time after the groupby statement, e.g. both rows grouped into a single "abcd 07:53:15 3.0"? – Jeff Jul 27 '18 at 19:01
  • Great answer, thanks, but the plotting raises an error: float() argument must be a string or a number, not 'datetime.time'. I could try converting the datetime to a float, but what would be the best way to tackle this? – Jeff Jul 28 '18 at 09:09
  • Thanks! You can try plt.plot_date(<>). I think the best approach depends on other factors as well. – Van Peer Jul 28 '18 at 13:22
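
One possible way around the `float()` error mentioned in the comments, sketched under the assumption that the `time` column holds `datetime.time` objects: convert each value either to a full timestamp on an arbitrary date or to seconds since midnight. The `counts` frame, the `placeholder_date`, and the new column names are hypothetical.

```python
import datetime as dt
import pandas as pd

# Hypothetical counts table holding datetime.time values
counts = pd.DataFrame({
    "time": [dt.time(5, 8, 35), dt.time(7, 53, 15), dt.time(7, 58, 40)],
    "count": [3, 3, 1],
})

# Option 1: attach an arbitrary placeholder date so each time becomes a
# full datetime (assumption: all rows belong to the same day)
placeholder_date = dt.date(2018, 7, 27)
counts["timestamp"] = [
    dt.datetime.combine(placeholder_date, t) for t in counts["time"]
]

# Option 2: express each time as seconds since midnight (plain integers)
counts["seconds"] = [
    t.hour * 3600 + t.minute * 60 + t.second for t in counts["time"]
]

print(counts)
```

Either new column should be usable as the x values in `plt.scatter`. The timestamp version keeps a readable time axis, while the seconds version is plain numeric data.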