I have an 80,000-row CSV file made up of four columns: ID, Date, Time and Flow. Flow measurements are taken every 15 minutes. If a flow measurement is ever missing, no row is written for that timestep; the file simply skips ahead until the next flow value is recorded and then continues as normal.
Example:
USGS 2/12/2023 0:45 167
USGS 2/12/2023 1:00 170
USGS 2/12/2023 1:15 177
USGS 2/12/2023 1:45 170
USGS 2/12/2023 2:00 164
USGS 2/12/2023 2:15 177
USGS 2/12/2023 2:30 170
USGS 2/12/2023 2:45 180
Here, 1:30 is missing from the Feb 12th 2023 record. A gap like this can be a one-off or can span multiple hours or days.
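To make "missing" concrete, here is a minimal sketch that flags gaps in the sample rows above. It assumes pandas is available, that the columns are named ID, Date, Time and Flow, and that the sample is whitespace-separated as displayed (the real file would presumably be read with a plain `pd.read_csv(path)`). A row is flagged when more than 15 minutes have passed since the previous row.

```python
import io
import pandas as pd

# Sample rows from above. Whitespace-separated here, which is an assumption
# about the layout; a real comma-separated file would not need `sep`.
sample = """USGS 2/12/2023 0:45 167
USGS 2/12/2023 1:00 170
USGS 2/12/2023 1:15 177
USGS 2/12/2023 1:45 170
USGS 2/12/2023 2:00 164
"""

df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None,
                 names=["ID", "Date", "Time", "Flow"])

# Combine the Date and Time strings into a single timestamp column.
df["Timestamp"] = pd.to_datetime(df["Date"] + " " + df["Time"],
                                 format="%m/%d/%Y %H:%M")

# Time elapsed since the previous row; anything over 15 minutes marks
# the first row recorded after a gap.
step = df["Timestamp"].diff()
gaps = df[step > pd.Timedelta(minutes=15)]
print(gaps[["Timestamp", "Flow"]])
```

This only reports where a gap ends; it does not insert anything.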
I'm trying to write a Python script that searches for the missing timesteps and, whenever it finds a skipped row, inserts a replacement row at that position with the correct ID, date and time, and NA for flow.
Example:
USGS 2/12/2023 0:45 167
USGS 2/12/2023 1:00 170
USGS 2/12/2023 1:15 177
USGS 2/12/2023 1:30 NA
USGS 2/12/2023 1:45 170
USGS 2/12/2023 2:00 164
USGS 2/12/2023 2:15 177
USGS 2/12/2023 2:30 170
USGS 2/12/2023 2:45 180
or
USGS 1/16/2023 23:00 329
USGS 1/16/2023 23:15 329
USGS 1/16/2023 23:30 329
USGS 1/16/2023 23:45 NA
USGS 1/17/2023 0:00 NA
USGS 1/17/2023 0:15 NA
USGS 1/17/2023 0:30 329
USGS 1/17/2023 0:45 329
USGS 1/17/2023 1:00 329
USGS 1/17/2023 1:15 329
USGS 1/17/2023 1:30 329
So far, I've only been able to find solutions that replace single values within a dataset, such as a time value of 1:30 or a flow value. Nothing I've found describes inserting an entire row for each missing timestep.
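One possible direction, offered as a sketch rather than a verified solution: build a complete 15-minute index with pandas and reindex the data onto it, so that every absent timestep becomes a row with an empty flow. The function name `fill_missing_rows` is hypothetical, and the code assumes the column names ID, Date, Time and Flow, a single ID throughout the file, no duplicate timestamps, and dates written as month/day/year.

```python
import io
import pandas as pd

def fill_missing_rows(df, freq="15min"):
    """Return a copy of df with one row per `freq` step (hypothetical helper).

    Rows that were absent from the input get the forward-filled ID,
    a rebuilt Date and Time, and NaN for Flow.
    """
    out = df.copy()
    out["Timestamp"] = pd.to_datetime(out["Date"] + " " + out["Time"],
                                      format="%m/%d/%Y %H:%M")
    # reindex needs a unique index, so duplicate timestamps are assumed absent.
    out = out.set_index("Timestamp").sort_index()

    # Every expected timestep between the first and last recorded row.
    full_index = pd.date_range(out.index.min(), out.index.max(), freq=freq)
    out = out.reindex(full_index)

    # Assumes one ID per file; inserted rows take the ID of the row before.
    out["ID"] = out["ID"].ffill()

    # Rebuild Date and Time in the same unpadded style as the input.
    out["Date"] = [f"{t.month}/{t.day}/{t.year}" for t in out.index]
    out["Time"] = [f"{t.hour}:{t.minute:02d}" for t in out.index]
    return out.reset_index(drop=True)[["ID", "Date", "Time", "Flow"]]

# Small demonstration using rows that straddle midnight.
sample = """ID,Date,Time,Flow
USGS,1/16/2023,23:30,329
USGS,1/17/2023,0:30,329
USGS,1/17/2023,0:45,329
"""
filled = fill_missing_rows(pd.read_csv(io.StringIO(sample)))

# na_rep writes the missing flows as the literal text NA.
print(filled.to_csv(index=False, na_rep="NA"))
```

Because the inserted rows hold NaN, the Flow column becomes floating point, so existing values are written as 329.0 rather than 329. For the real file, `to_csv` could be given an output path instead of printing.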