I am using a payments dataset from Austin Texas Open Data. I am trying to load the data with the following code:

library(ff)

asd <- read.table.ffdf(file = "~/Downloads/Fiscal_Year_2010_eCheckbook_Payments.csv", first.rows = 100, next.rows = 50, FUN = "read.csv", VERBOSE = TRUE)

This gives me the following error:

read.table.ffdf 301..Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : scan() expected 'an integer', got '7AHM'

This happens on the 339th line of the CSV file, in the 5th column of the dataset. I think the reason is that every value in the 5th column before that line looks like an integer, whereas this one is a string. The actual type of the column should be string.

So I want to know whether there is a way to set the column types explicitly.

Below are the types of all the columns, as a vector:

c("character","integer","integer","character","character", "character","character","character","character","character","integer","character","character","character","character","character","character","character","integer","character","character","character","character","character","integer","integer","integer","character","character","character","character","double","character","integer")

You can also find the type of each column from the description of the dataset.

Please also keep in mind that I am very new to this library. Practically just found out about it today.

Shawn Brar

1 Answer

You may need to convert the column types yourself. The following is only an example, with column names from a different dataset, that may help:

# Assumes `data` is a data frame that has already been loaded.
data <- transform(
  data,
  age=as.integer(age),
  sex=as.factor(sex),
  cp=as.factor(cp),
  trestbps=as.integer(trestbps),
  chol=as.integer(chol),
  fbs=as.factor(fbs),
  restecg=as.factor(restecg),
  thalach=as.integer(thalach),
  exang=as.factor(exang),
  oldpeak=as.numeric(oldpeak),
  slope=as.factor(slope),
  ca=as.factor(ca),
  thal=as.factor(thal),
  num=as.factor(num)
)
# Check the resulting class of every column.
sapply(data, class)
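
For the ff loader itself, another option might be to declare the column types up front through colClasses, which read.csv.ffdf passes on to read.csv. Below is a minimal sketch on a made-up two-column file (the file contents and the names csv_path and demo are hypothetical, not taken from the payments dataset). It assumes that text columns are declared as "factor" rather than "character", since ff has no character storage mode.

```r
library(ff)

# Hypothetical miniature of the problem: "code" looks like an integer
# in the first chunk and only later contains a string.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,code", "1,100", "2,200", "3,7AHM", "4,400"), csv_path)

# Declaring the types up front stops the first chunk from fixing
# "code" as integer; "factor" stands in for "character" here.
demo <- read.csv.ffdf(
  file = csv_path,
  colClasses = c("integer", "factor"),
  first.rows = 2,
  next.rows = 2,
  VERBOSE = TRUE
)

# Pull the data back into a regular data frame and inspect the classes.
sapply(demo[, ], class)
```

For the real file, the same idea would presumably be to pass the 34-entry vector from the question as colClasses, with "character" replaced by "factor" and "double" by "numeric".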
Adam