I'm reading in a series of CSV files then writing them back out. Some files have different date formats, but I want to write them all out in "%Y-%m-%d" format (YYYY-MM-DD).

How can I write if statements to handle the cases where the date format is "%m/%d/%Y" or "%m-%d-%Y"?

Here's what I've attempted:

reader = csv.reader(gzip.open(readfile), delimiter=',')
for line in reader:
    if datetime.datetime.strptime(line[10], "%m/%d/%Y") == "%m/%d/%Y"
        line[10] = datetime.datetime.strptime(line[26], "%m/%d/%Y").strftime("%Y-%m-%d")
    else:
        line[10] = datetime.datetime.strptime(line[26], "%m-%d-%Y").strftime("%Y-%m-%d")
Hana
  • If you want to write each date out in the same format it was read in, you could record the detected format per line and column (for example in an array) while reading, then look it up when writing. Does that make sense? – Janos Vinceller Sep 08 '20 at 20:23
  • Or I may have misunderstood your problem and you're looking for this one: https://stackoverflow.com/questions/14245029/parsing-a-date-that-can-be-in-several-formats-in-python – Janos Vinceller Sep 08 '20 at 20:30

2 Answers

You can use Python's dateutil library. It can parse both formats automatically, so you do not need to write any if statements at all.

>>> from dateutil.parser import parse
>>> parse("10/03/2017")
datetime.datetime(2017, 10, 3, 0, 0)
>>> parse("03-12-2010")
datetime.datetime(2010, 3, 12, 0, 0)

This only works if the month always comes first (but the same is true of the if-statement approach). If the day always came first instead, e.g. "%d/%m/%Y", you could call parse with the option dayfirst=True.

Combined with your example this would mean:

from dateutil.parser import parse

for line in reader:
    line[10] = parse(line[26]).strftime("%Y-%m-%d")

(PS: If you are not 100% sure that every line actually contains a valid date string, you should probably handle ValueError in case parse cannot understand the string you passed to it.)
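
A minimal sketch of that error handling, assuming a hypothetical helper named to_iso and assuming that unparseable values should simply be left unchanged (both are illustrative choices, not part of the question's code):

```python
from dateutil.parser import parse


def to_iso(value, dayfirst=False):
    """Return value as YYYY-MM-DD, or unchanged if dateutil cannot parse it.

    to_iso is a hypothetical helper name; keeping the original string on
    failure is an assumption about what the caller wants.
    """
    try:
        return parse(value, dayfirst=dayfirst).strftime("%Y-%m-%d")
    except (ValueError, OverflowError):
        # dateutil raises ValueError (or a subclass of it) for strings it
        # cannot interpret, and OverflowError for out-of-range values.
        return value
```

Inside the loop this would read line[10] = to_iso(line[26]). Logging or skipping the row instead of keeping the raw string may be a better fit, depending on how strict the output needs to be.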

buddemat

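Following Python's EAFP ("easier to ask forgiveness than permission") philosophy: if you are certain that your CSV only contains dates in those two formats, you can try the first format and fall back to the second when it raises ValueError.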

import datetime

for line in reader:
    try:
        # Try the slash-separated format first.
        line[10] = datetime.datetime.strptime(line[26], "%m/%d/%Y").strftime("%Y-%m-%d")
    except ValueError:
        # Fall back to the dash-separated format.
        line[10] = datetime.datetime.strptime(line[26], "%m-%d-%Y").strftime("%Y-%m-%d")
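
If more formats turn up later, nesting try/except blocks gets unwieldy. One possible generalization, sketched here with only the standard library, tries each candidate format in turn. The names INPUT_FORMATS and normalize_date are hypothetical, and the format list assumes only the two formats from the question:

```python
import datetime

# Candidate input formats, tried in order (assumed list; extend as needed).
INPUT_FORMATS = ("%m/%d/%Y", "%m-%d-%Y")


def normalize_date(value, formats=INPUT_FORMATS):
    """Return value reformatted as YYYY-MM-DD using the first matching format."""
    for fmt in formats:
        try:
            return datetime.datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            # This format did not match; try the next one.
            continue
    # No format matched, so surface the problem instead of guessing.
    raise ValueError(f"unrecognized date format: {value!r}")
```

The loop body would then become line[10] = normalize_date(line[26]). Raising on an unknown format is a design choice; it keeps a malformed row from being written out silently.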
Asav Patel