1

I asked this question and I got an answer which works for a general case with sequential and non missing data but not for my case specifically. I have a DF that looks as follows.

eventTime       MeteredEnergy Demand RunningHoursLamps 
6/7/2018 0:00   67.728           64  1037.82
6/7/2018 1:00   67.793           64  1038.82
6/7/2018 2:00   67.857           64  1039.82
6/7/2018 3:00   67.922           64  1040.82
6/7/2018 4:00   67.987           64  1041.82
6/7/2018 5:00                    64  1042.82
6/7/2018 6:00                        1043.43
6/7/2018 23:00  68.288
6/8/2018 0:00   67.728           64  1037.82
6/8/2018 23:00  67.793           64  1097.82

I need a DF that finds the difference between RunningHoursLamps values at hour 0 and hour 23 for each unique date in "eventTime" If data is missing for hour 0 or hour 23, the resultant DF can have NaN

Expected output

    Date        00:00       23:00       Difference 
    6/7/2018    1037.82     NaN         NaN
    6/8/2018    1037.82     1097.82     60
pynoob
  • 41
  • 5
  • So basically, you want to have the daily hours the lamp was running? What code did you try? Where are you stuck at the moment? – albert Mar 16 '20 at 16:31
  • Exactly. Based on suggestion here, I used the groupby.assign (first, last) but that works only when I have all data. I am trying to loop through the dataframe and just sample the data at the specific date and time that matches my criteria. `bill3 = pd.DataFrame() each = unique_dates[5] for each in unique_dates: date = each hour0 = df[(df['eventTime'] == each) & (df['hour'] == 0)] ['RunningHoursLamp'] hour23 = df[(df['eventTime'] == each) & (df['hour'] == 23)] ['RunningHoursLamp'] bill3 = pd.DataFrame([date,hour0,hour23]) bill3` – pynoob Mar 16 '20 at 16:41
  • What's the reason for using `0:00` to `23:00` of the same day instead of a span from `0:00` to `0:00` of the following day? – albert Mar 17 '20 at 18:40
  • @albert: you are right. It should be 0 through 0 of the next day. I figured this when I was working through the prb. Thanks – pynoob Mar 18 '20 at 19:57

1 Answers1

0

Update: For those that are interested: I found a way to do this. I parsed out a separate column with dates and hours from the eventTime column and looped through it and handled exceptions when I did not have data for the required DateTime. Thanks.

#for loop to build the bill dataframe
bill = pd.DataFrame()

for i in range(len(unique_dates)):
    try :
        if i == 0:
            hour0 = np.nan
        else:
            hour0 = df.loc[((df['date'] == unique_dates[i]) & (df['hour'] == 0)),'RunningHoursLamp'].values[0]
    except IndexError:
        hour0 = np.nan

    try :
        hour24 = df.loc[((df['date'] == unique_dates[i+1]) & (df['hour'] == 0)),'RunningHoursLamp'].values[0]
    except IndexError:
        hour24 = np.nan

    temp = pd.DataFrame([[unique_dates[i],hour0,hour24]],columns=['Date','Hour_0','Hour_24'])  
    bill = bill.append(temp,ignore_index=True)

bill
pynoob
  • 41
  • 5