How can I obtain overnight data for a dynamic dataset over a specified period using Pandas?

I want to extract the data recorded between 23:00 (Day 1) and 07:00 (Day 2) for each night in the dataset.

I am using the following filter to collect all the night-time rows into one dataframe, but I am unsure how to then split that result into the individual overnight periods.

night = df_data[(df_data['Hour'] >= 23) | (df_data['Hour'] < 7)]
print(night)

Resulting dataframe

rickid123
  • This sounds like a job for pd.cut. See https://stackoverflow.com/questions/43500894/pandas-pd-cut-binning-datetime-column-series – pinegulf Feb 18 '21 at 12:32

1 Answer

I came up with the following solution, which seems to work fine. I imagine there is a much simpler method.

Ignore the measure_interval part at the top. That is for future integration with other software. I have left it in for clarity in the rest of the code.

import pandas as pd

#
# Number of iterated periods depends on 5min or 15min runs
# (8 hours from 23:00 to 07:00)
#

df_123.set_index('Datetime', inplace=True)

measure_interval = "15mins"

if measure_interval == "5mins":
  num_measure_periods = 96   # 8 hours * 12 readings per hour
  measure_period = "5min"
elif measure_interval == "15mins":
  num_measure_periods = 32   # 8 hours * 4 readings per hour
  measure_period = "15min"

myList = []
dateList = []

# Gets list of dates in dataset and appends '23:00' for measure start
# Creates datetime like object

for idx, day in df_123.groupby(df_123.index.date):
  dateList.append(str(idx) + ' 23:00')

# For each date in list, try to filter data between 23:00 (Day 1) - 07:00 (Day 2)

for iDate in dateList:

  try:
    filt = pd.date_range(start = iDate, periods=num_measure_periods,freq=measure_period)
    var = df_123.loc[filt]
    myList.append(var)

  # If the full 23:00 - 07:00 range is not available (e.g. data finishes at 01:00),
  # grow the range one period at a time until a lookup fails,
  # then keep the longest range that succeeded (if any).

  except KeyError:
    var = None
    for x in range(1, num_measure_periods):
      filt = pd.date_range(start=iDate, periods=x, freq=measure_period)
      try:
        var = df_123.loc[filt]
      except KeyError:
        break
    # Append whatever was found, including the case where only the
    # final timestamp of the night is missing and the loop never fails
    if isinstance(var, pd.DataFrame):
      myList.append(var)

print(myList)

Resulting dataframes
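
For comparison, here is a minimal sketch of what a simpler approach might look like. It is not the method used above: it filters on the hour of a DatetimeIndex and then labels each row with the date on which its night started, by shifting timestamps back 7 hours before taking the date. The frame name `df`, the column `Value` and the sample dates are made up for illustration, and it assumes the index is already a DatetimeIndex.

```python
import pandas as pd

# Hypothetical sample data: 15-minute readings over three days.
# The names `df` and 'Value' are illustrative only.
idx = pd.date_range("2021-02-15 00:00", "2021-02-17 23:45", freq="15min")
df = pd.DataFrame({"Value": range(len(idx))}, index=idx)

# Keep rows from 23:00 up to (but not including) 07:00.
hours = df.index.hour
night = df[(hours >= 23) | (hours < 7)]

# Shift each timestamp back 7 hours so that 23:00 (Day 1) through
# 06:45 (Day 2) all land on Day 1, then use that date as the night label.
night_label = (night.index - pd.Timedelta(hours=7)).normalize()

# One DataFrame per night, keyed by the date on which the night started.
nights = {label: frame for label, frame in night.groupby(night_label)}

for label, frame in nights.items():
    print(label.date(), frame.index.min(), frame.index.max(), len(frame))
```

Nights that are only partly covered by the data (for example, when the recording stops at 01:00) simply come out as shorter frames, so no try/except is needed in this version.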

rickid123