I can't get this to work. I want to replace every two-character code in the first field of a CSV file with that code plus an appended X, with any whitespace removed. For example, "SA" and "SA " (with a trailing space) should both map to "SAX" in the new file. Below is what I tried with sed (with help from an earlier question):
system( paste("sed ","'" ,' s/^GG/GGX/g; s/^GG\\s/GGX/g; s/^GP/GPX/g;
s/^GP\\s/GPX/g; s/^FG/FGX/g; s/^FG\\s/FGX/g; s/^SA/SAX/g; s/^SA\\s/SAX/g;
s/^TP/TPX/g; s/^TP\\s/TPX/g ',"'",' ./data/concat_csv.2 >
./data/concatenated_csv.2 ',sep=''))
I also tried using the sQuote() function, but that doesn't help either. I can't read the file with read.csv because some lines have too many separators and others have too few.
I could try reading in and editing the file in pieces, but I don't know how to do that as a streaming process.
I really just want to edit the first field of the file using a system() call. The file is about 30GB.
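To make the goal concrete, here is a minimal sketch of the mapping I'm after, written as a plain shell command rather than through system(). It assumes a comma separator, an unquoted first field at the start of each line, and a sed that accepts -E (GNU and BSD sed both do). The function name fix_first_field and the sample rows are made up for illustration, and it has only been checked against toy rows like these, not the real file.

```shell
# Hypothetical helper: rewrite the first CSV field when it is exactly one of
# the listed two-character codes, optionally padded with whitespace.
# Assumes "," is the separator and the first field is not quoted.
fix_first_field() {
  sed -E 's/^[[:space:]]*(GG|GP|FG|SA|TP)[[:space:]]*,/\1X,/'
}

# Toy input (made-up rows): "SA" and "SA " should both become "SAX",
# while a first field that merely starts with "SA" is left alone.
printf 'SA,1\nSA ,2\nSAB,3\n' | fix_first_field
```

Since sed works line by line, the same function should stream over a large file without loading it into memory, for example fix_first_field < ./data/concat_csv.2 > ./data/concatenated_csv.2.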