I am trying to import a CSV file into R to do fraud analysis with linear/logistic regression. What should have been straightforward has turned out to be complicated. The data set contains 26 variables and more than 2 million rows. I used this command to import the CSV file:
data <- read.csv('C:/Users/amartinezsistac/OneDrive/PROYECTO/decla_cata_filtrados.csv',header=TRUE,sep=";")
However, R imported all 2.3 million rows into a single variable. I attach an image of the View(data) output obtained after this step for more information. I then tried switching from sep=";" to sep="," using:
datos <- read.csv('C:/Users/amartinezsistac/OneDrive/PROYECTO/decla_cata_filtrados.csv',header=TRUE,sep=",")
But got this error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
I have also tried replacing read.csv with read.csv2 (result: 2.3 million rows and 1 variable) and adding the fill=TRUE option (same result), but the import is still not correct. I attach another image showing how the original CSV looks when opened in Excel.
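Since neither ";" nor "," gives a clean import, one way to narrow this down might be to look at the raw text and count the fields per line under a few candidate separators. The sketch below is only an illustration under stated assumptions: it runs on a small made-up tab-separated sample written to a temporary file, not on the real file, and the names sample_path, candidates, field_counts, best and data_check are hypothetical. The actual delimiter of the real file is unknown here.

```r
# Hypothetical stand-in for the real file: a tiny tab-separated sample.
# The real path would replace sample_path; the real delimiter is unknown.
sample_path <- tempfile(fileext = ".csv")
writeLines(c("id\tamount\tflag", "1\t10.5\tA", "2\t7.25\tB"), sample_path)

# Look at the raw text of the first lines, before any parsing.
first_lines <- readLines(sample_path, n = 3)
print(first_lines)

# Count the fields on each line under each candidate separator.
# Rows are lines of the file, columns are candidate separators.
candidates <- c(semicolon = ";", comma = ",", tab = "\t", pipe = "|")
field_counts <- sapply(candidates, function(s) {
  count.fields(sample_path, sep = s, quote = "\"")
})
print(field_counts)

# Pick the candidate whose smallest per-line field count is largest,
# then try the import with that separator.
best <- names(which.max(apply(field_counts, 2, min)))
data_check <- read.csv(sample_path, header = TRUE, sep = candidates[[best]])
str(data_check)
```

On the real file, reading only the first few lines with readLines keeps this cheap even with millions of rows. If the header line shows fewer fields than the data lines for a given separator, that would be consistent with the "more columns than column names" error.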
Thanks in advance for any suggestions or help in fixing this.