I have a few sentences, each stored in its own row of a dataframe, and I want to extract the dates from them. I came across the package "datefinder".
When I pass a single sentence as 'string_with_dates', it correctly extracts and returns all the dates.
import datefinder
string_with_dates = ''' They have released Proposals for period October 1, 2018 ’ September 30, 2019. Manufacturers are encouraged to submit proposals for stores located basis throughout the fiscal year ending September 30, 2018, pending availability of funds., '''
matches = datefinder.find_dates(string_with_dates)
for match in matches:
    match = str(match)
    print(match)
Output:
2018-10-01 00:00:00
2019-09-30 00:00:00
2018-09-30 00:00:00
But when I loop over multiple sentences in a dataframe with a "for" loop, the result gets messed up: when a sentence contains more than one date, the dates are not all stored in the corresponding cell. description_df is the name of my dataframe. Column 9 holds the sentences, and I want to store the extracted dates in column 13.
import datefinder
for i in range(len(description_df)):
    string_with_dates = description_df.iloc[i, 9]
    matches = datefinder.find_dates(string_with_dates)
    for match in matches:
        match = str(match)
        print(match)
        description_df.iloc[i, 13] = match
The extracted-date column of the dataframe ends up as:
2019-09-30 00:00:00
2019-05-07 00:00:00
""
0310-08-07 00:00:00
2019-08-07 00:00:00
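For reference, here is a minimal sketch of the behaviour I am after: every date found in a sentence is joined into one string, so a later match does not overwrite an earlier one in the same cell. The names join_dates, iso_finder, rows and the "; " separator are hypothetical and only for illustration. iso_finder is a small stand-in for datefinder.find_dates so the snippet runs without extra packages; it only recognises YYYY-MM-DD dates.

```python
import re
from datetime import datetime


def join_dates(text, find_dates, sep="; "):
    """Return every date found in `text` as a single string.

    `find_dates` is any callable that takes a string and yields datetime
    objects (for example datefinder.find_dates). Non-string cells such as
    None or NaN give an empty string.
    """
    if not isinstance(text, str):
        return ""
    # Collect all matches first, then join them, so nothing is overwritten.
    return sep.join(str(d) for d in find_dates(text))


def iso_finder(text):
    """Hypothetical stand-in finder: yields datetimes for YYYY-MM-DD only."""
    for m in re.finditer(r"\d{4}-\d{2}-\d{2}", text):
        yield datetime.strptime(m.group(), "%Y-%m-%d")


# Stand-in for the sentences held in column 9 of description_df.
rows = [
    "From 2018-10-01 to 2019-09-30.",
    "No dates here.",
    "Due 2019-08-07.",
]

extracted = [join_dates(r, iso_finder) for r in rows]
print(extracted)
# ['2018-10-01 00:00:00; 2019-09-30 00:00:00', '', '2019-08-07 00:00:00']
```

With the real data, I assume the same function could be applied as description_df.iloc[:, 9].apply(lambda s: join_dates(s, datefinder.find_dates)) and the result assigned to column 13, but I have not verified that on my dataframe. datefinder.find_dates also takes a strict=True argument, which may help with odd matches like 0310-08-07.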