Add rows for missing hourly data in a pandas dataframe

Question

I have a pandas dataframe with two columns: `Created` (formatted as %Y-%m-%d %H) and `Count`, which is an integer.

It counts the number of "tickets" registered per hour.

The problem is that there are many hours in the day in which no tickets are registered.

I would like to add these hours as new rows with a Count of 0. The dataframe looks like this:

        Created        Count
    0   2020-10-26 10     11
    1   2020-10-26 09    123
    2   2020-10-26 08     36
    3   2020-10-26 07     28
    4   2020-10-26 06      7

But I would need it to add rows like this:

        Created        Count
    0   2020-10-26 10     11
    1   2020-10-26 09    123
    2   2020-10-26 08     36
    3   2020-10-26 07     28
    4   2020-10-26 06      7
    5   2020-10-26 05      0
    6   2020-10-26 04      0

It also needs to keep working as new dates are added to the original dataframe.

[Add missing dates to pandas dataframe](https://stackoverflow.com/q/19324453/7259176) might be helpful. — upe, Nov 17 '20 at 14:33

Renaud · Answer 1 · 2020-11-18T13:02:27.633

You can reindex the dataframe against a complete hourly range:

import pandas as pd

df = pd.DataFrame({
    'Created': ['2020-10-26 10', '2020-10-26 08', '2020-10-26 09', '2020-10-26 07', '2020-10-26 06'],
    'count': [11, 10, 14, 16, 20]})

# Parse the hour strings and use them as a sorted index
df['Created'] = pd.to_datetime(df['Created'], format='%Y-%m-%d %H')
df.sort_values(by=['Created'], inplace=True)
df.set_index('Created', inplace=True)

# Build every hour from midnight of the first day up to the last recorded hour
# ('h' is the hourly alias; recent pandas versions deprecate the uppercase 'H')
df_Date = pd.date_range(start=df.index.min().replace(hour=0), end=df.index.max(), freq='h')

# Hours missing from the data get a count of 0
df = df.reindex(df_Date, fill_value=0)
df.reset_index(inplace=True)
df.rename(columns={'index': 'Created'}, inplace=True)
print(df)

Result:

               Created  count
0  2020-10-26 00:00:00      0
1  2020-10-26 01:00:00      0
2  2020-10-26 02:00:00      0
3  2020-10-26 03:00:00      0
4  2020-10-26 04:00:00      0
5  2020-10-26 05:00:00      0
6  2020-10-26 06:00:00     20
7  2020-10-26 07:00:00     16
8  2020-10-26 08:00:00     10
9  2020-10-26 09:00:00     14
10 2020-10-26 10:00:00     11
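
Since the question asks for something that keeps working as new rows arrive, the same steps could be wrapped in a function and re-run whenever the source dataframe changes. This is only a sketch: the function name `fill_missing_hours` and its parameters are made up for illustration, and it assumes each hour appears at most once in the input.

```python
import pandas as pd


def fill_missing_hours(df, time_col="Created", count_col="count"):
    """Return a copy of df with one row per hour; hours without data get 0."""
    out = df[[time_col, count_col]].copy()
    out[time_col] = pd.to_datetime(out[time_col], format="%Y-%m-%d %H")
    out = out.set_index(time_col).sort_index()
    # Every hour from midnight of the first day up to the last recorded hour
    full_range = pd.date_range(out.index.min().normalize(), out.index.max(), freq="h")
    out = out.reindex(full_range, fill_value=0)
    out.index.name = time_col
    return out.reset_index()


# Hypothetical sample data with gaps at 07:00 and 09:00
raw = pd.DataFrame({
    "Created": ["2020-10-26 10", "2020-10-26 08", "2020-10-26 06"],
    "count": [11, 10, 20],
})
filled = fill_missing_hours(raw)
print(filled)
```

Calling `fill_missing_hours` again after appending rows to `raw` rebuilds the hourly range from the new minimum and maximum, so the original dataframe is never modified in place.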

Somehow this gives me some odd values when I try to add it on the whole dataframe. It works perfectly fine when I set the df as you did above though. I get a lot of repetitive values, as well as a lot of 1's. — Tina Marie, Nov 17 '20 at 13:00
@Tina Marie Try [`df.reindex()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex.html) without specifying the `ffill` method: `df=df.reindex(df_Date,fill_value=0)` — upe, Nov 17 '20 at 14:24
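
If the full dataframe contains the same hour more than once, which is only a guess since the real data is not shown, one alternative would be to aggregate and fill in a single step with `resample`. A minimal sketch on made-up sample data:

```python
import pandas as pd

# Hypothetical data: 06:00 appears twice, 07:00 and 08:00 are missing
raw = pd.DataFrame({
    "Created": ["2020-10-26 06", "2020-10-26 06", "2020-10-26 09"],
    "count": [3, 4, 5],
})
raw["Created"] = pd.to_datetime(raw["Created"], format="%Y-%m-%d %H")

# Sum the counts within each hour; hours with no rows sum to 0
hourly = raw.set_index("Created")["count"].resample("h").sum().reset_index()
print(hourly)
```

Unlike the `reindex` approach above, this starts at the first recorded hour rather than at midnight, so leading hours of the first day are not added.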
