How can I use genfromtxt in numpy to get a 2D array instead of a tupled or 1-D array?

Question

a=np.genfromtxt("winequality-red.csv", delimiter=":", dtype=None, encoding=None,\
            skip_header=1, missing_values="??")


['7.40,0.70,0.00,1.90,0.08,11.00,34.00,1.00,3.51,0.56,9.40,5.00'
 '7.80,0.88,0.00,2.60,0.10,25.00,67.00,1.00,3.20,0.68,9.80,5.00'
 '7.80,0.76,0.04,2.30,0.09,15.00,54.00,1.00,3.26,0.65,9.80,5.00' ...
 '6.30,0.51,0.13,2.30,0.08,29.00,40.00,1.00,3.42,0.75,11.00,6.00'
 '5.90,0.65,0.12,2.00,0.08,32.00,44.00,1.00,3.57,0.71,10.20,5.00'
 '6.00,0.31,0.47,3.60,0.07,18.00,42.00,1.00,3.39,0.66,11.00,6.00']

I want to get a 2-D array. I know the dataset may not be homogeneous, but what trick can I use to deal with that and get an array which is easy to slice?

Why did you specify `delimiter=":"` for data that's delimited with commas? — user2357112, Aug 05 '23 at 05:27

score -1 · Accepted Answer · answered Aug 05 '23 at 05:47

The issue is the delimiter: the data is comma-separated, but `delimiter=":"` never matches, so each row is read as a single string. The following code reads each line as a string and then splits it on commas to build a 2D array:

import numpy as np

# Read the file as 1D array of strings
a = np.genfromtxt("winequality-red.csv", delimiter="\n", dtype=str, skip_header=1)

# Convert to a 2D array of floats
data = np.array([list(map(float, line.split(','))) for line in a])

# Resulting 2D array
print(data)

Make sure every row in your CSV file has the same number of columns; otherwise the rows cannot be stacked into a rectangular 2D array.
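For comparison, a more direct route is to pass `delimiter=","` to `genfromtxt` itself, as the comment on the question hints. This is only a sketch, assuming the file really is comma-separated as the printed output suggests. The inline sample below stands in for the real file; its header names and the `??` entry are made up for illustration.

```python
import io
import numpy as np

# Hypothetical stand-in for winequality-red.csv (header names are invented).
sample = io.StringIO(
    "c1,c2,c3,c4\n"
    "7.40,0.70,0.00,5.00\n"
    "7.80,0.88,??,5.00\n"
    "7.80,0.76,0.04,5.00\n"
)

# With the matching delimiter and the default float dtype, genfromtxt
# returns a 2D float array; entries marked "??" are filled with NaN.
data = np.genfromtxt(sample, delimiter=",", skip_header=1,
                     missing_values="??", filling_values=np.nan)

print(data.shape)   # (rows, columns)
print(data[:, 0])   # first column, sliced like any 2D array
```

With the real file, replace `sample` with `"winequality-red.csv"`. Leaving `dtype` at its default (float) rather than `dtype=None` is what keeps the result a plain 2D array instead of a structured one.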
