I am trying to plot water-level hydrographs for multiple wells. The data are in a text file whose first column is a date in the format 'yyyymmdd'. In this particular case, there are 35 other columns of floating-point numbers.

I have been trying to use genfromtxt, but I don't want to have to define all 36 dtypes.

I tried dtype=None with converters, but then I get the message that the converter is locked and cannot be updated.

Pierre GM
  • What you have described is actually two different problems, and the solution may become apparent if you split it up. The first problem is, "how do I parse a text file into a usable python data structure?" (The obvious choice of data structure is a `list`.) The second problem is, "how do I convert this data structure into the types that I want?" (This can be done by iterating over the elements of the data structure using `for` loops, generator expressions/list comprehensions, or `map()`.) Post what you've tried. – Joel Cornett Aug 26 '12 at 15:15
  • Could you post an example line from the file? – Visgean Skeloru Aug 26 '12 at 15:20
  • What data format do you get if you try a plain `numpy.loadtxt(filename)`? – tsundoku Aug 26 '12 at 15:29
  • What converters did you try? The csv module in the python standard library is very helpful for reading columnar files with well defined formats, so long as you know what character is used to separate columns. – abought Aug 26 '12 at 15:31
  • If I try numpy.loadtxt, all of the columns are float. This is also what happens if I use genfromtxt without specifying any dtypes. – Lynette Brooks Aug 26 '12 at 15:59
  • `numpy.genfromtxt("hydro.txt", converters={0:numpy.datetime64})` works for me in numpy 1.6.2 -- 0th column datetime, all the rest floats. It would definitely help to see a minimal reproducible example. – DSM Aug 26 '12 at 15:59
  • Thanks, the numpy.datetime64 works. Now I just have to learn how to plot! – Lynette Brooks Aug 26 '12 at 16:27

1 Answer

I'm surprised you can't use np.genfromtxt with the converters argument to transform your first column into either:

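  • a np.datetime64 object (as @DSM suggests, provided your version of numpy is recent enough (>1.6.1))
  • a Python datetime object stored in an object column (np.object in older NumPy; that alias has since been removed in favor of the builtin object), with a converter such as the following (after import datetime):

    converters={0: lambda d: datetime.datetime.strptime(d, "%Y%m%d")}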
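The converter in the second bullet can also be written as a named function. This is only a minimal sketch: `parse_yyyymmdd` is a hypothetical name, and the bytes check is a precaution on the assumption that, depending on the NumPy version and the `encoding` argument, genfromtxt may hand the converter bytes rather than str.

```python
import datetime


def parse_yyyymmdd(value):
    """Convert a 'yyyymmdd' field to a datetime.datetime (hypothetical helper)."""
    # Decode first if the field arrives as bytes instead of str.
    if isinstance(value, bytes):
        value = value.decode("ascii")
    return datetime.datetime.strptime(value.strip(), "%Y%m%d")


print(parse_yyyymmdd("20120826"))  # 2012-08-26 00:00:00
```

It would then be passed as converters={0: parse_yyyymmdd}.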

If you don't want to define the dtype yourself, you could use dtype=None. It's not a great idea, though, as this option is noticeably slower than giving an explicit dtype. But as the documentation tells you, you can use a tuple to define your dtype, so something like:

dtype=tuple([np.datetime64] + [float]*35)

or

dtype=tuple([object] + [float]*35)

could work.
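If the number of columns is not always 36, the same two steps (parse the date, convert the rest to floats) can be done without fixing the column count. The following is a sketch using only the standard library, assuming whitespace-separated columns and no header line (neither is stated in the question); `read_hydrographs` and the sample values are made up for illustration.

```python
import datetime
import io


def read_hydrographs(lines):
    """Return (dates, columns) from lines of 'yyyymmdd v1 v2 ...' (hypothetical helper)."""
    dates, rows = [], []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # First field is the date, every remaining field is a float.
        dates.append(datetime.datetime.strptime(fields[0], "%Y%m%d"))
        rows.append([float(x) for x in fields[1:]])
    # Transpose the rows so that each well becomes one series.
    columns = [list(col) for col in zip(*rows)]
    return dates, columns


# Made-up sample data with two wells; a real file object works the same way.
sample = io.StringIO("20120101 1.5 2.5\n20120102 1.6 2.4\n")
dates, wells = read_hydrographs(sample)
print(dates[0], wells)
```

Each entry of `wells` can then be plotted against `dates`.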

Pierre GM
  • This worked better than datetime64 because Matplotlib doesn't recognize datetime64. I'm not using the tuple definition because I may not always have 36 columns. Thanks to everyone for the help. – Lynette Brooks Aug 27 '12 at 03:00
  • Hi, I tried what Pierre suggested, like this: `converters={7: lambda d: datetime.datetime.strptime(d, "%d-%m-%Y")}`, but it gives the error "global name 'datetime' is not defined". I need some help because I am new to Python, thanks. – Nielsen Rechia Aug 13 '15 at 16:46