
I was writing code to plot some data in a Jupyter notebook, and when I tried to load data from a CSV file, I got a "could not convert string to float" error.

So here's my code:

phot_g = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(6), unpack=True)
phot_bp = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(7), unpack=True)
phot_rp = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(8), unpack=True)

phot_g = phot_g.astype(np.float64)
phot_bp = phot_bp.astype(np.float64)
phot_rp = phot_rp.astype(np.float64)

And here's my error:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_63/3948901710.py in <module>
---> 18 phot_g = phot_g.astype(np.float64)
     19 phot_bp = phot_bp.astype(np.float64)
     20 phot_rp = phot_rp.astype(np.float64)

ValueError: could not convert string to float: ''

I've tried searching for the error, but most of the solutions I've found are for numpy.loadtxt, and they don't seem to apply to my case. Any help would be greatly appreciated.

By the way, the error shows up for all three lines of code (phot_g, phot_bp, and phot_rp).

Arvin
  • Just print your phot_g. It could not be converted because it contains an empty string, as shown in the error: `could not convert string to float: ''` – amirhm Dec 04 '22 at 19:35
  • This isn't a problem with `genfromtxt`, since you specified the `dtype=str`. The error occurs after, when you try to convert the string dtype array to float. You will have to examine that array. It should be obvious what's in it that can't be converted. We certainly can't reach through to your computer and look at it for you! – hpaulj Dec 04 '22 at 19:47
  • Check your file for `,,` empty fields. Examine the file for anything that would make reading numbers awkward. You are smarter than the program, and should be able to identify funny stuff. – hpaulj Dec 04 '22 at 20:45

1 Answer


Is that the full error message? I get more information when I try to recreate the error:

This works:

In [104]: np.array(['1','2']).astype(float)
Out[104]: array([1., 2.])

This doesn't:

In [105]: np.array(['1','2','two']).astype(float)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [105], in <cell line: 1>()
----> 1 np.array(['1','2','two']).astype(float)

ValueError: could not convert string to float: 'two'

See the 'two'! That tells me exactly what string is causing the problem.

If one or more lines have two delimiters next to each other, the string array can end up containing '', which can't be converted to a float:

In [109]: np.array('1,2,,'.split(',')).astype(float)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [109], in <cell line: 1>()
----> 1 np.array('1,2,,'.split(',')).astype(float)

ValueError: could not convert string to float: ''
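To see where the empty strings are before converting, you can compare the string array against '' directly. This is only a rough sketch: `col` is a made-up stand-in for one of the arrays read with dtype='str', not the actual Gaia data, and it assumes the bad fields are exactly empty (no stray spaces).

```python
import numpy as np

# Hypothetical stand-in for a column read with dtype='str'
col = np.array(['1.5', '2.0', '', '3.25'])

# Positions of the entries that are empty strings
empty_idx = np.flatnonzero(col == '')
print(empty_idx)  # [2]

# Swap the empty strings for 'nan', which float conversion accepts
cleaned = np.where(col == '', 'nan', col).astype(np.float64)
print(cleaned)
```

If the fields might contain only whitespace, stripping them first with np.char.strip would probably be needed before the comparison.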

genfromtxt has some ability to fill missing data. The pandas CSV reader is even better at that.

With dtype=float (the default), genfromtxt puts np.nan in the array wherever it can't make a float of the input.
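As a small illustration of that behaviour, here is a sketch that reads from an in-memory string instead of a file. The three-column text and the names `g`, `bp`, `valid` and `g_filled` are invented for the example; your real file would use columns 6, 7 and 8.

```python
import io
import numpy as np

# Hypothetical CSV text with one empty field in each data column
text = "id,g,bp\n1,5.1,5.4\n2,,6.0\n3,7.2,\n"

# Default dtype is float, so the empty fields come back as nan
g, bp = np.genfromtxt(io.StringIO(text), delimiter=",",
                      skip_header=1, usecols=(1, 2), unpack=True)
print(g)   # second entry is nan
print(bp)  # third entry is nan

# Keep only the rows where both values are present
valid = ~np.isnan(g) & ~np.isnan(bp)
print(g[valid], bp[valid])

# Alternatively, pick your own fill value for missing fields
g_filled = np.genfromtxt(io.StringIO(text), delimiter=",",
                         skip_header=1, usecols=(1,), filling_values=-1.0)
print(g_filled)
```

Dropping dtype='str' and the later astype call this way avoids the conversion error entirely; whether nan or a fill value is the better choice depends on what the plots need.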

hpaulj