I have a data frame named StaffHours_df that looks similar to the following:


Name          Hours                  Description
Maria         5 hours 10 minutes     Volunteer
Taylor        2 hours 4 minutes      Employee
Ben           4hrs 30mins            Employee
Gary          8 hours 40 mins        Volunteer

I want to extract the hours and minutes to get a total work time, but only for people classed as 'Employee' rather than 'Volunteer'. I would like this total stored as a separate value outside the data frame. For example, the table above should give timeWorked = [6, 34], or minutesWorked = 394, or similar. I have to account for discrepancies in how staff enter their hours, but I figure this won't be a problem if I use .isdigit.

Here is the train of thought I'm going down for the code, but it's as far as I've gotten:

StaffHours_df[StaffHours_df['Description'].str.contains('Employee')]

s = [int(s) for s in str.split() if s.isdigit()]

Marnie

1 Answer

This should give you what you need:

import re

df_emp = df[df['Description'] == 'Employee'].copy()  # filter for employees; .copy() avoids SettingWithCopyWarning
df_emp['total_minutes'] = (df_emp['Hours']
                          .map(lambda x: [int(i) for i in re.findall(r"[0-9]+", x)])  # get list of integers
                          .map(lambda x: 60 * x[0] + x[1])  # convert to minutes
                          )
print(df_emp.to_string())

     Name              Hours Description  total_minutes
1  Taylor  2 hours 4 minutes    Employee            124
2     Ben        4hrs 30mins    Employee            270
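
To get the single figure the question asks for, the per-row minutes can be summed and split back into hours and minutes. The following is only a minimal sketch, assuming the column names from the question and that the hours figure always comes before the minutes figure in each entry. The helper name `to_minutes` is hypothetical, and treating an entry with only one number as whole hours is an assumption, not something the question specifies.

```python
import re

import pandas as pd

# Sample data copied from the table in the question
StaffHours_df = pd.DataFrame({
    "Name": ["Maria", "Taylor", "Ben", "Gary"],
    "Hours": ["5 hours 10 minutes", "2 hours 4 minutes",
              "4hrs 30mins", "8 hours 40 mins"],
    "Description": ["Volunteer", "Employee", "Employee", "Volunteer"],
})


def to_minutes(text):
    """Hypothetical helper: read 'H hours M minutes' style text as minutes."""
    numbers = [int(n) for n in re.findall(r"[0-9]+", text)]
    hours = numbers[0] if len(numbers) > 0 else 0
    # Assumption: a lone number means whole hours, so minutes default to 0
    minutes = numbers[1] if len(numbers) > 1 else 0
    return 60 * hours + minutes


# Keep only employees, convert each entry, then total the minutes
is_employee = StaffHours_df["Description"] == "Employee"
minutesWorked = int(StaffHours_df.loc[is_employee, "Hours"].map(to_minutes).sum())

# divmod splits the total into whole hours and leftover minutes
timeWorked = list(divmod(minutesWorked, 60))

print(minutesWorked)  # 394
print(timeWorked)     # [6, 34]
```

A regular expression is used here instead of `.isdigit()` on whitespace-split tokens because an entry such as `4hrs 30mins` splits into `'4hrs'` and `'30mins'`, and `.isdigit()` returns False for both.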
Brandon