I am trying to identify dates from a column containing text entries and output the dates to a text file. However, my code is not returning any output. I can't seem to figure out what I did wrong in my code. I'd appreciate some help on this.

My Code:

import csv

from dateutil.parser import parse

with open('file1.txt', 'r') as f_input, open('file2.txt', 'w') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)

    for row in csv_input:
        x = str(row[3])

        def is_date(x):
            try:
                parse(x)
                csv_output.writerow([row[0], row[1], row[2], row[3], row[4]])
            # no return value in case of success
            except ValueError:
                return False

        is_date(x)
mikuszefski
user8929822
  • You aren't calling `is_date` – Paul Mar 01 '18 at 07:14
  • I have added is_date(x) to the bottom of the code, but it still does not produce any output. – user8929822 Mar 01 '18 at 07:38
  • Can you provide a few sample lines of `f_input`? Your code needs to be cleaned up quite a bit, and it would be easier if we knew what the data actually looks like. – mikuszefski Mar 01 '18 at 13:12
  • Just for the record: the function definition inside a `for` loop is horrible. I am sure you could put it outside and pass the csv_reader object. But even there...why. Why does the parse function not return the parsed value, which you then write in the output file? – mikuszefski Mar 01 '18 at 13:54
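
The restructuring suggested in the comments above could look roughly like the following. This is only a minimal sketch: `copy_date_rows` and the in-memory sample data are hypothetical names introduced for illustration, and the choice to keep a row only when the column at index 3 parses as a date is an assumption about what the question intends.

```python
import csv
import io

from dateutil.parser import parse


def is_date(text):
    """Return True if dateutil can parse `text` as a date, else False."""
    try:
        parse(text)
        return True
    except (ValueError, OverflowError):
        return False


def copy_date_rows(f_input, f_output, column=3):
    """Copy rows whose `column` parses as a date; return how many were written.

    Hypothetical helper: it takes open file objects so it can be tried
    without touching the disk.
    """
    writer = csv.writer(f_output)
    written = 0
    for row in csv.reader(f_input):
        # Skip short rows instead of raising IndexError.
        if len(row) > column and is_date(row[column].strip()):
            writer.writerow(row)
            written += 1
    return written


# Made-up sample data standing in for file1.txt.
sample = io.StringIO("a,b,c,2018-03-01,x\na,b,c,hello world,x\n")
result = io.StringIO()
count = copy_date_rows(sample, result)
```

Here `is_date` is defined once, outside the loop, and always returns a boolean, so the decision to write a row is made by the caller rather than hidden inside the `try` block.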

1 Answer

Guessing that your input looks something like this:

1,2,3, This is me on march first of 2018 at 2:15 PM, 2015
3,4,5, She was born at 12pm on 9/11/1980, 2015

A version of what you want could be:

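from dateutil.parser import parse

with open("input.txt", 'r') as inFilePntr, open("output.txt", 'w') as outFilePntr:
    for line in inFilePntr:
        # Split on commas; every column stays a plain string.
        clmns = line.split(',')
        # fuzzy_with_tokens=True returns (datetime, skipped_tokens); keep the datetime.
        clmns[3] = parse(clmns[3], fuzzy_with_tokens=True)[0].strftime("%Y-%m-%d %H:%M:%S")
        # The last column still ends with the newline read from the input.
        outFilePntr.write(', '.join(clmns))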

Note that, since you do not touch the other columns, I just leave them as text, so there is no need for `csv`. You never did anything with the return value of `parse`. I pass `fuzzy_with_tokens=True` because the column at index 3 has the date somewhat hidden in other text. The returned `datetime` object is formatted into a string of my liking with `strftime` and inserted at index 3, replacing the old value. I then rejoin the strings with commas and write the line to output.txt, which looks like:

1, 2, 3, 2018-03-01 14:15:00,  2015
3, 4, 5, 1980-09-11 12:00:00,  2015
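
To make the role of `fuzzy_with_tokens=True` a bit more concrete, here is a small sketch of what `parse` hands back for one of the sample strings. The `default` value is an assumption added for illustration: `parse` uses it to fill in any fields the string does not supply, and without it dateutil falls back to today's date for those fields.

```python
from datetime import datetime

from dateutil.parser import parse

text = "She was born at 12pm on 9/11/1980"

# Assumed fallback for fields missing from the text; any fixed date works.
default = datetime(2000, 1, 1)

# With fuzzy_with_tokens=True, parse returns a 2-tuple:
# the datetime it found and a tuple of the text fragments it skipped.
parsed, skipped = parse(text, fuzzy_with_tokens=True, default=default)

formatted = parsed.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)
print(skipped)
```

Passing an explicit `default` keeps the result reproducible when a string does not spell out every field in a form dateutil recognizes.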
mikuszefski