I understand that with genfromtxt, the defaultfmt parameter can be used to generate default column names, which is useful if column names are not in the input data. And defaultfmt, if not provided, defaults to f%i. E.g.
>>> import numpy as np
>>> from io import StringIO
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int))
array([(1, 2.0, 3), (4, 5.0, 6)],
      dtype=[('f0', '<i8'), ('f1', '<f8'), ('f2', '<i8')])
So here we have autogenerated column names f0, f1, f2.
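To make the role of defaultfmt concrete, here is a small sketch of the same call with a custom format. The format string "col%i" is an arbitrary choice for this example and is not part of my actual code:

```python
from io import StringIO

import numpy as np

data = StringIO("1 2 3\n 4 5 6")
# "col%i" is an arbitrary format chosen for illustration; the default is "f%i".
arr = np.genfromtxt(data, dtype=(int, float, int), defaultfmt="col%i")

# The generated field names follow the format string, one per column.
print(arr.dtype.names)
```

With the default format the same call produces the f0, f1, f2 names shown earlier.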
But what if I want numpy to infer both the column names and the data types? I thought you do it with dtype=None. Like this:
>>> data3 = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data3, dtype=None, ???) # some parameter combo
array([(1, 2, 3), (4, 5, 6)],
dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<i8')])
I still want the automatically generated column names f0, f1, etc. And I want numpy to automatically determine the data types based on the data, which I thought was the whole point of dtype=None.
EDIT: But unfortunately that doesn't ALWAYS work.
This case works when I have both floats and ints.
>>> data3b = StringIO("1 2 3.0\n 4 5 6.0")
>>> np.genfromtxt(data3b, dtype=None)
array([(1, 2, 3.), (4, 5, 6.)],
dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<f8')])
So numpy correctly inferred a datatype of i8 for the first two columns and f8 for the last column.
But if I provide all ints, the inferred column names disappear and I get a plain 2D array instead of a structured one.
>>> data3c = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data3c, dtype=None)
array([[1, 2, 3],
[4, 5, 6]])
My identical code may or may not work depending on the input data? That doesn't sound right.
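The only workaround I have found so far is to pass names explicitly, which seems to keep the result structured in both cases but assumes I know the number of columns up front. A minimal sketch, where read_structured is a helper name I made up for this example:

```python
from io import StringIO

import numpy as np


def read_structured(text, ncols):
    """Hypothetical helper: build f0, f1, ... names and pass them explicitly."""
    names = ",".join("f%i" % i for i in range(ncols))
    # dtype=None still lets numpy infer each column's type from the data.
    return np.genfromtxt(StringIO(text), dtype=None, names=names)


all_ints = read_structured("1 2 3\n 4 5 6", 3)
mixed = read_structured("1 2 3.0\n 4 5 6.0", 3)

# Both results should carry the same field names.
print(all_ints.dtype.names)
print(mixed.dtype.names)
```

This duplicates what defaultfmt is supposed to do on its own, so I would still prefer a parameter combination that does not require knowing the column count.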
And yes I know there's pandas. But I'm not using pandas on purpose. So please bear with me on that.