
Could I get sample code for reading data from a CSV file? I need to generate training and test data from a CSV in TensorFlow.

I have one CSV that contains both the training and test data: I want to take the first 10 rows for training and the next 10 for testing. Thanks in advance.

  • Possible duplicate of [How to \*actually\* read CSV data in TensorFlow?](https://stackoverflow.com/questions/37091899/how-to-actually-read-csv-data-in-tensorflow) –  Aug 18 '17 at 20:48
  • How are the columns laid out in your CSV? How many CSV files are you talking about (one with both train/test data, or separate ones)? – DarkCygnus Aug 18 '17 at 20:52
  • One CSV that contains both train and test data. I mean I take the first 10 rows for training and the next 10 for testing. – Android_programmer_office Aug 19 '17 at 14:47

1 Answer


The folks at TensorFlow have created an excellent tutorial that covers this. It shows how to read the census data from a CSV file, convert it into tensors, and fit and evaluate a machine learning model using the high-level Estimator API.

However, I got an error when I tried the urllib.urlretrieve calls, so I modified the code slightly to read the data directly with pandas.

Original code

import tempfile
import urllib
train_file = tempfile.NamedTemporaryFile()
test_file = tempfile.NamedTemporaryFile()
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", train_file.name)
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test", test_file.name)

import pandas as pd
CSV_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "gender",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket"]
df_train = pd.read_csv(train_file.name, names=CSV_COLUMNS, skipinitialspace=True)
df_test = pd.read_csv(test_file.name, names=CSV_COLUMNS, skipinitialspace=True, skiprows=1)
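The error itself isn't quoted here, so this is only a guess at one common cause: under Python 3, urlretrieve lives in urllib.request rather than directly on urllib. A minimal sketch of the download step under that assumption (download_to_temp is a hypothetical helper name, not part of the tutorial):

```python
import tempfile
import urllib.request  # In Python 3, urlretrieve is in urllib.request


def download_to_temp(url):
    """Download `url` into a new temporary file and return that file's path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so it can be reopened by name later (for example by pandas.read_csv).
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
        path = tmp.name
    urllib.request.urlretrieve(url, path)
    return path
```

The returned path can then be passed to pd.read_csv in place of train_file.name or test_file.name. The caller is responsible for deleting the temporary file afterwards.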

Modified code

import pandas as pd
COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
           "marital_status", "occupation", "relationship", "race", "gender",
           "capital_gain", "capital_loss", "hours_per_week", "native_country",
           "income_bracket"]

# Read the training and test sets straight from their URLs.
df_train = pd.read_csv(
    'http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data',
    names=COLUMNS,
    skipinitialspace=True)
# The first line of adult.test is not a data row, so skip it.
df_test = pd.read_csv(
    'http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.test',
    names=COLUMNS,
    skipinitialspace=True,
    skiprows=1)
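For the case in the question (a single CSV where the first 10 rows are training data and the next 10 are test data), the same pandas approach can be followed by a positional slice. This is a rough sketch, not the tutorial's method: the in-memory CSV and the column names feature and label are made up for illustration, and in practice you would pass your own file path to pd.read_csv.

```python
import io

import pandas as pd

# Stand-in for the single CSV file: a header plus 20 data rows.
# The column names "feature" and "label" are hypothetical.
csv_text = "feature,label\n" + "\n".join(f"{i},{i % 2}" for i in range(20))

# Replace io.StringIO(csv_text) with the path to your own CSV file.
df = pd.read_csv(io.StringIO(csv_text))

N_TRAIN = 10
N_TEST = 10

# iloc slices by row position: rows 0-9 for training, rows 10-19 for testing.
df_train = df.iloc[:N_TRAIN]
df_test = df.iloc[N_TRAIN:N_TRAIN + N_TEST].reset_index(drop=True)

# Split features from labels and convert to NumPy arrays,
# which TensorFlow models and input functions accept.
x_train = df_train[["feature"]].to_numpy()
y_train = df_train["label"].to_numpy()
x_test = df_test[["feature"]].to_numpy()
y_test = df_test["label"].to_numpy()
```

Slicing by position keeps the rows in file order. If the file is not already shuffled, a random split may be a better choice for real training.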
josiah