I am trying to convert a large .csv file into an .xdf file using the rxImport() function with the code below:
rxImport(inData = "/poc/revor/data/ext_roll36_chrg_vol.csv",
         outFile = "/poc/revor/data/ext_roll36_chrg_vol.xdf",
         overwrite = TRUE, rowsPerRead = 100000,
         colClasses = c(SE_NO = "character",
                        HIER_ROLLUP_CD = "character",
                        CUR_MO_CT = "numeric",
                        CUR_MO_AM = "numeric",
                        AD_LINE_1_TX = "character",
                        AD_LINE_2_TX = "character",
                        SUBMIT_DT = "character",
                        UPDT_TS = "character"),
         transforms = list(SUBMIT_DT = as.Date(SUBMIT_DT, format = "%d%b%Y")))
But this file contains many records like:
0200001097,SS,625,236899.000,"KRAV MAGA WORLDWIDE, INC.","KRAV MAGA WORLDWIDE, INC.",01MAY2014,07JUN2014:01:08:57.000000
As you can see, the columns AD_LINE_1_TX and AD_LINE_2_TX contain commas inside the double quotes.
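For comparison, here is a minimal sketch in base R (read.csv, not RevoScaleR) of the result I am expecting for that record. It assumes the sample line above is representative and writes it to a temporary file without a header row; the `cols` and `classes` vectors simply mirror the colClasses in my rxImport() call.

```r
# One sample record from the file, written to a temporary CSV.
# Assumption: no header row in this temporary file (the real file may have one).
rec <- '0200001097,SS,625,236899.000,"KRAV MAGA WORLDWIDE, INC.","KRAV MAGA WORLDWIDE, INC.",01MAY2014,07JUN2014:01:08:57.000000'
tmp <- tempfile(fileext = ".csv")
writeLines(rec, tmp)

# Column names and classes mirroring the colClasses argument used with rxImport().
cols <- c("SE_NO", "HIER_ROLLUP_CD", "CUR_MO_CT", "CUR_MO_AM",
          "AD_LINE_1_TX", "AD_LINE_2_TX", "SUBMIT_DT", "UPDT_TS")
classes <- c("character", "character", "numeric", "numeric",
             "character", "character", "character", "character")

# read.csv treats a comma inside double quotes as part of the field.
df <- read.csv(tmp, header = FALSE, col.names = cols,
               colClasses = classes, stringsAsFactors = FALSE)

str(df)
```

Here the record is split into eight fields, SE_NO stays a character string with its leading zero, and the quoted commas remain inside AD_LINE_1_TX and AD_LINE_2_TX.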
I have tried using the type = "text" argument, but then the first column, SE_NO, is read as numeric even though its type is shown as character. The same problem affects all the numeric-looking fields that I want to read as character.
And if I convert the column to character using the transforms argument, as in:
transforms = list(SE_NO = as.character(as.numeric(SE_NO)))
then the value of the SE_NO column changes from 0200001097 to 0200001000, because the conversion goes through the exponential representation 2.000011e+08.
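Part of this can be reproduced in base R alone. The sketch below only shows that a character-to-numeric-to-character round trip drops the leading zero; the further rounding to 0200001000 that I see with rxImport() is not reproduced here, and the variable names are just for illustration.

```r
# Identifier as it appears in the CSV, with its leading zero.
se_no <- "0200001097"

# Round trip through numeric, as in the transforms expression above.
round_trip <- as.character(as.numeric(se_no))
print(round_trip)

# The round trip does not give back the original string.
print(identical(se_no, round_trip))
```

This is why I want SE_NO to be read as character from the start rather than converted afterwards.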
So is there any other way to handle the commas inside the double quotes without affecting the other columns?
Please let me know if any further information is needed.