datefinder module doesn't find dates when there is ':' before date.

There is a similar question here: Datefinder Module Stranger behavior on particular string

string = "Assessment Date 17-May-2017 at 13:31"

list(datefinder.find_dates(string.lower()))
#Returns [datetime.datetime(2017, 5, 17, 13, 31)]

However, when I add a ':' after "Date", as in "Assessment Date:", it fails:

string = "Assessment Date: 17-May-2017 at 13:31"
list(datefinder.find_dates(string.lower()))
#returns []
Narahari B M

1 Answer

This is the delimiter pattern in datefinder: DELIMITERS_PATTERN = r"[/:-\,\s_+@]+"

':' is one of those delimiters, so the colon in 'Date:' causes a problem when datefinder tries to parse the string.

You could preclean the string using a regular expression.

import re as regex
import datefinder

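def preclean_input_text(text):
  # Replace "<letter>: " with a single space so the label's colon no longer precedes the date.
  # Note: the letter immediately before the colon is removed as well.
  cleaned_text = regex.sub(r'[a-z]:\s', ' ', text, flags=regex.IGNORECASE)
  return cleaned_text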

def parse_date_information(text):
  date_info = list(datefinder.find_dates(text.lower()))
  return date_info

string = "Assessment Date: 17-May-2017 at 13:31"
cleaned_string = preclean_input_text(string)
print(parse_date_information(cleaned_string))
# output
[datetime.datetime(2017, 5, 17, 13, 31)]
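
If you would rather keep the label word intact, a lookbehind can remove only the colon. This is a minimal sketch using just the standard re module; strip_label_colons and LABEL_COLON are hypothetical names for illustration and are not part of datefinder. Colons that follow a digit, such as the one in "13:31", are left alone.

```python
import re

# Hypothetical helper (not part of datefinder): match a colon that directly
# follows a letter, plus any whitespace after it. The lookbehind keeps the
# letter itself out of the match, so "Date:" becomes "Date" rather than "Dat".
LABEL_COLON = re.compile(r"(?<=[a-z]):\s*", flags=re.IGNORECASE)

def strip_label_colons(text):
    # Replace the matched colon (and trailing whitespace) with one space.
    return LABEL_COLON.sub(" ", text)

cleaned = strip_label_colons("Assessment Date: 17-May-2017 at 13:31")
print(cleaned)
# Assessment Date 17-May-2017 at 13:31
```

The cleaned string is the same one that parses correctly in the question, so it can be passed to datefinder.find_dates in place of the original.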
Life is complex