reading file & joining two columns in numpy array

Question

I have text data like:
16/12/2006;17:24:00;1.000;17.000
The first column is the date, the second is the time, and the rest are just floats. Right now I am reading the file like this:

np.genfromtxt(path,
              dtype=(np.datetime64, np.datetime64, np.float16, np.float16),
              delimiter=';',
              converters = {0: lambda x: datetime.datetime.strptime(x, "%d/%m/%Y"),
                            1: lambda x: datetime.datetime.strptime(x, "%H:%M:%S")})

This leads to a very basic problem: the date for the second column defaults to Jan 1, 1900. Is there some way to combine the date and time from the first two columns, either while reading the file or after reading it?

Unless you're already wedded to this approach I'd recommend using `pandas` instead. `pd.read_csv("twodate.csv", header=None, sep=";", parse_dates=[[0,1]])` Just Works(tm). — DSM, Feb 03 '13 at 21:00
@DSM this makes me curious about the current status of pandas for statistical computing with Python. I think scipy and numpy should have been flexible enough to handle such situations. — mrig, Feb 03 '13 at 22:09
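
For reference, the pandas route from the first comment could look roughly like the sketch below. It does not use the `parse_dates=[[0,1]]` form from the comment; instead it reads both columns as strings and joins them afterwards, which is a substitution on my part. The in-memory sample and the names `sample`, `df` and `timestamp` are assumptions made up for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the real file (same layout as the question).
sample = io.StringIO(
    "16/12/2006;17:24:00;1.000;17.000\n"
    "16/12/2006;17:25:00;2.500;16.000\n"
)

# No header row, so the columns are labelled 0, 1, 2, 3.
df = pd.read_csv(sample, header=None, sep=";")

# Columns 0 and 1 are read as strings; join them and parse once with an explicit format.
df["timestamp"] = pd.to_datetime(df[0] + " " + df[1], format="%d/%m/%Y %H:%M:%S")

# Drop the original date and time columns, keeping the floats and the combined timestamp.
df = df.drop(columns=[0, 1])
```

Giving an explicit `format` avoids any day/month ambiguity in dates such as 03/02/2013.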

score 2 · Answer 1 · answered Feb 03 '13 at 21:29

You could read it using plain Python commands and create the joined fields yourself. Then, if needed, you can run your converter over it:

from datetime import datetime

# Read the file and split every line into its ';'-separated fields.
with open("test.dat", "r") as fp:
    lines = [line.split(";") for line in fp]

# Join the date and time fields of each line; skip empty lines.
fulldates = [" ".join(line[0:2]) for line in lines if len(line) > 1]
converted = [datetime.strptime(date, "%d/%m/%Y %H:%M:%S")
             for date in fulldates]

The list fulldates will contain the joined date+time fields. The list converted will contain the corresponding datetime objects. (I added the if len(line) > 1 filter only to handle any empty lines in the file. If your file does not contain any, you can omit it.)
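
If you want to end up with numpy arrays again, the same idea can be extended as in the sketch below. It is only one possible way to do it, under the assumption that every non-empty line has exactly four fields as in the question; the in-memory sample and the names `stamps`, `values`, `timestamps` and `floats` are made up for illustration.

```python
import io
from datetime import datetime

import numpy as np

# Hypothetical in-memory stand-in for the real file, including one empty line.
sample = io.StringIO(
    "16/12/2006;17:24:00;1.000;17.000\n"
    "16/12/2006;17:25:00;2.500;16.000\n"
    "\n"
)

stamps, values = [], []
for line in sample:
    fields = line.strip().split(";")
    if len(fields) < 4:
        continue  # skip empty or incomplete lines
    # Join date and time before parsing, so no default date is ever filled in.
    stamps.append(datetime.strptime(fields[0] + " " + fields[1], "%d/%m/%Y %H:%M:%S"))
    values.append([float(fields[2]), float(fields[3])])

# One array for the combined timestamps, one 2-D array for the float columns.
timestamps = np.array(stamps, dtype="datetime64[s]")
floats = np.array(values, dtype=np.float64)
```

Keeping the timestamps and the floats in two separate arrays avoids a structured dtype; whether that fits depends on how the data is used afterwards.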
