I have a list of tuples that look like this:
>>> y
[(0,1,2,3,4,...,10000), ('a', 'b', 'c', 'd', ...), (3.2, 4.1, 9.2, 12., ...), ]
and so on. y has 7 tuples, and each tuple has 10,000 values. All 10,000 values of a given tuple have the same dtype, and I also have a list of these dtypes:
>>> dt
[('0', dtype('int64')), ('1', dtype('<U')), ('2', dtype('<U')), ('3', dtype('int64')), ('4', dtype('<U')), ('5', dtype('float64')), ('6', dtype('<U'))]
My intent is to do something like x = np.array(y, dtype=dt), but when I do that, I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not assign tuple of length 10000 to structure with 7 fields.
I understand that this is because the dtype treats each tuple as one record: the first value in the tuple must be an int64, the second value must be a string, and so on, so 7 fields cannot describe a tuple with 10,000 values.
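To make the mismatch concrete, here is a scaled-down sketch of what I think is happening, assuming 3 fields and 4 values per tuple instead of 7 and 10,000. The names y_small, dt_small, and rows, and all of the data, are made up for illustration:

```python
import numpy as np

# Scaled-down stand-in for y: 3 "column" tuples of 4 values each (made-up data).
y_small = [(0, 1, 2, 3), ('a', 'b', 'c', 'd'), (3.2, 4.1, 9.2, 12.0)]

# Scaled-down stand-in for dt: one field per column.
dt_small = [('0', 'int64'), ('1', '<U1'), ('2', 'float64')]

# Each tuple is read as ONE record, so a 4-value tuple cannot fill 3 fields.
try:
    np.array(y_small, dtype=dt_small)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# By contrast, tuples with one value per field are accepted.
rows = [(0, 'a', 3.2), (1, 'b', 4.1)]
records = np.array(rows, dtype=dt_small)
```

So the dtype seems to describe a single record with one value per field, whereas each of my tuples holds all of the values for one field.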
How can I communicate to the code that I mean that ALL values of the first tuple are int64s, and ALL values of the second tuple are strings, etc.?
I've also tried making y a list of lists instead of a list of tuples:
>>> y
[[0,1,2,3,4,...,10000], ['a', 'b', 'c', 'd', ...], ...]
and so on, and I get a different error, which I assume has the same underlying cause:
>>> x = np.array(y, dtype=dt)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'Supplier#000000001'
Any help is appreciated!
Edit: My goal is for x to be a NumPy array.