ValueError: could not convert string to float: 'Pregnancies'

Question

import csv

def loadCsv(filename):
    lines = csv.reader(open('diabetes.csv'))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

Hello, I'm trying to implement Naive Bayes, but it's giving me this error even though I've manually changed the type of each column to float. Above is the function that does the conversion.

Without any information on the content of `dataset[i]`, we can't help you. Print out the content of `dataset[i]` and check whether it has items that cannot be converted to float (e.g. alphabetic strings) — Toukenize, Apr 01 '20 at 01:20
df = pd.read_csv('diabetes.csv')
df.dtypes

Pregnancies                   int64
Glucose                       int64
BloodPressure                 int64
SkinThickness                 int64
Insulin                       int64
BMI                         float64
DiabetesPedigreeFunction    float64
Age                           int64
Outcome                       int64
dtype: object
— Salman Khan, Apr 01 '20 at 10:15
There are no strings; it just takes the name of the first column and says it cannot be converted to float. It's the Pima diabetes dataset, and Pregnancies is the name of the first column — Salman Khan, Apr 01 '20 at 10:18

score 2 · Accepted Answer · answered Apr 01 '20 at 13:45

The ValueError is because the code is trying to cast (convert) the items in the CSV header row, which are strings, to floats. You could just skip the first row of the CSV file, for example:

for i in range(1, len(dataset)):  # starting at 1 skips the header row
    dataset[i] = [float(x) for x in dataset[i]]

Note: that would leave the first item in dataset as the headers (str).
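If you would rather not keep the header row in dataset at all, one option is to consume it with next() before converting anything. The following is only a minimal sketch: load_csv is a hypothetical helper name, and the two-row sample is made-up data standing in for diabetes.csv.

```python
import csv
import io

def load_csv(file_obj):
    """Read CSV rows from an open file, skip the header, convert the rest to float."""
    reader = csv.reader(file_obj)
    header = next(reader)  # consume the header row so it never reaches float()
    # 'if row' skips blank lines, which csv.reader returns as empty lists
    rows = [[float(x) for x in row] for row in reader if row]
    return header, rows

# Made-up sample standing in for diabetes.csv (assumption, not the real data)
sample = io.StringIO("Pregnancies,Glucose,BMI\n6,148,33.5\n1,85,26.5\n")
header, dataset = load_csv(sample)
print(header)
print(dataset)
```

With the real file, you would pass an open file object instead, for example inside `with open('diabetes.csv', newline='') as f:`.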

Personally, I'd use pandas, which has a read_csv() function that loads the data directly into a dataframe.

For example:

import pandas as pd
dataset = pd.read_csv('diabetes.csv')

This will give you a dataframe though, not a list of lists. If you really want a list of lists, you could use dataset.values.tolist().
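As a rough sketch of the pandas route end to end, the snippet below reads a CSV, lets pandas treat the first row as column names, and converts the result to a list of lists of floats. The sample data is made up to stand in for diabetes.csv, and the astype(float) step is an addition of mine so that integer columns also come out as floats, matching what the original loadCsv was trying to produce.

```python
import io
import pandas as pd

# Made-up sample standing in for diabetes.csv (assumption, not the real data)
sample = io.StringIO("Pregnancies,Glucose,BMI\n6,148,33.5\n1,85,26.5\n")

df = pd.read_csv(sample)  # the header row becomes the column names
# Cast every column to float, then convert to a plain nested list
rows = df.astype(float).values.tolist()

print(df.dtypes)
print(rows)
```

With the real file, pd.read_csv('diabetes.csv') replaces the StringIO sample.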
