Goal: From an excel file, I want to get all the records which have dates that fall within a range and write them to a new excel file. The infile I'm working with has 500K+ rows and 21 columns.
What I've tried:
I've read the infile to a Pandas dataframe then returned the DatetimeIndex
. If I print the range
variable I get the desired records.
import pandas as pd
in_excel_file = r'path\to\infile.xlsx'
out_excel_file = r'path\to\outfile.xlsx'
df = pd.read_excel(in_excel_file)
range = (pd.date_range(start='1910-1-1', end='2021-1-1'))
print(range)
##prints
DatetimeIndex(['1990-01-01', '1990-01-02', '1990-01-03', '1990-01-04',
'1990-01-05', '1990-01-06', '1990-01-07', '1990-01-08',
'1990-01-09', '1990-01-10',
...
'2020-12-23', '2020-12-24', '2020-12-25', '2020-12-26',
'2020-12-27', '2020-12-28', '2020-12-29', '2020-12-30',
'2020-12-31', '2021-01-01'],
dtype='datetime64[ns]', length=11324, freq='D')
Where I'm having trouble is getting the above DatetimeIndex
to the outfile. The following gives me an error:
range.to_excel(out_excel_file, index=False)
AttributeError: 'DatetimeIndex' object has no attribute 'to_excel'
I'm pretty sure that when writing to excel it has to be a dataframe. So, my question is how do I get the range
variable to a dataframe object?