I am using the Python wrapper for Stanford NLP's SUTime. So far, compared with other date parsers such as duckling, dateparser's search_dates, parsedatetime, and natty, SUTime gives the most reliable results.
However, it fails to capture some obvious dates. The following are the two types of documents I am having difficulty parsing with SUTime:
- I am out and I won't be available until 9/19
- I am out and I won't be available between (September 18-September 20)
For the first document, SUTime returns no results. For the second, it captures only the month, not the day or the date range.
I tried to work through the Java code to see whether I could alter or add rules to make this work, but I couldn't figure it out.
If someone can suggest a way to make this work with SUTime, that would be really helpful.
I also tried dateparser's search_dates, but it is unreliable because it matches far too much. For the first document, for example, it parses a date from the text "am out" (which is not wanted) as well as from "9/19" (which is correct). A way to control this behavior would work for me as well.
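
To make the kind of control I mean concrete, here is a rough sketch of a post-filter over the (matched text, datetime) pairs that search_dates returns. The helper name `keep_plausible_dates` and the "must contain a digit" rule are my own assumptions, not part of dateparser, and the sample list is a hand-written stand-in for the real output with illustrative datetime values.

```python
import re
from datetime import datetime

# Assumed heuristic (not part of dateparser): a match is only worth keeping
# if its text contains at least one digit.
_HAS_DIGIT = re.compile(r"\d")


def keep_plausible_dates(matches):
    """Filter (matched_text, datetime) pairs; also accepts None for 'no matches'."""
    return [(text, dt) for text, dt in (matches or []) if _HAS_DIGIT.search(text)]


# Hand-written stand-in for what search_dates gives me on the first document.
# The datetime values are illustrative only.
sample = [
    ("am out", datetime(2019, 9, 1, 0, 0)),
    ("9/19", datetime(2019, 9, 19, 0, 0)),
]

filtered = keep_plausible_dates(sample)
print(filtered)
```

This drops the "am out" match and keeps "9/19", but it is only a heuristic: it would also discard legitimate matches without digits, such as "tomorrow" or "next Friday", so I would still prefer a proper setting or rule if one exists.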