I extracted a mixed variable, containing both numeric and string values, from a data file using the strsplit function. The result is a list like the one below:

> sample3

[[1]]
[1] "1200" "A"  

[[2]]
[1] "1193" "A"  

[[3]]
[1] "1117" "B"  

[[4]]
[1] "5663" 

[[5]]
[1] "7003" "C" 

[[6]]
[1] "1205" "A"  

[[7]]
[1] "2100" "D"  

[[8]]
[1] "1000" "D"  

[[9]]
[1]  "D" 

[[10]]
[1] "1000" "B"

I need to split this into two variables/vectors (or convert it to a two-column matrix). I tried unlist(sample3) and then put all the values into a matrix with ncol=2, but because some elements are missing a value, the rows end up misaligned and the result is wrong. I think I need to handle the missing values before building the two-column matrix. Does anyone have an idea how to do this? Any help will be greatly appreciated.
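To make the misalignment concrete, here is a minimal sketch that rebuilds the list shown above by hand (the literal values are copied from the printed output; the name `m` is only for illustration) and applies the unlist-then-matrix approach described in the question. It assumes the matrix was filled row by row.

```r
# Rebuild the list from the printed output above
sample3 <- list(c("1200", "A"), c("1193", "A"), c("1117", "B"), "5663",
                c("7003", "C"), c("1205", "A"), c("2100", "D"),
                c("1000", "D"), "D", c("1000", "B"))

# unlist() drops the information about which piece each value came from,
# so the 18 values are simply poured into 9 rows of 2
m <- matrix(unlist(sample3), ncol = 2, byrow = TRUE)

# From the fourth row onwards, numbers and letters no longer line up
m[4, ]
m[5, ]
```

Once a piece with only one value is reached, every later value shifts by one position, which is why the pieces need to be padded or split individually before combining them.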

thelatemail
John Smith

1 Answer

Something like this will work:

# dummy data
x <- list(c('100','a'), '100', c('a'), c('1000','b'))

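# first element of each piece, coerced to numeric (non-numbers become NA, with a warning)
numeric_x <- unlist(lapply(x, function(el) as.numeric(head(el, 1))))

# last element of each piece, kept only if it does not coerce to numeric
character_x <- unlist(lapply(x, function(el) {
  last <- tail(el, 1)
  if (is.na(as.numeric(last))) last else NA
}))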

I am sure there will be a much nicer regex answer.
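One possible regex-based variant is sketched below. It is not the answer's own method: the helper `pick()` and the pattern `^[0-9]+$` are assumptions made for illustration, and the pattern only treats strings made entirely of digits as numbers. The list is rebuilt from the output printed in the question.

```r
# Rebuild the list from the question's printed output
sample3 <- list(c("1200", "A"), c("1193", "A"), c("1117", "B"), "5663",
                c("7003", "C"), c("1205", "A"), c("2100", "D"),
                c("1000", "D"), "D", c("1000", "B"))

# Hypothetical helper: return the first entry of a piece that is
# (is_number = TRUE) or is not (is_number = FALSE) all digits,
# or NA if the piece has no such entry
pick <- function(piece, is_number) {
  hit <- piece[grepl("^[0-9]+$", piece) == is_number]
  if (length(hit) == 0) NA_character_ else hit[1]
}

num <- as.numeric(vapply(sample3, pick, character(1), is_number = TRUE))
chr <- vapply(sample3, pick, character(1), is_number = FALSE)

# Two aligned columns, with NA where a value was missing
result <- data.frame(num = num, chr = chr, stringsAsFactors = FALSE)
```

Because the number test is done with a pattern rather than by coercion, this version should not raise the "NAs introduced by coercion" warning; negative or decimal numbers would need a wider pattern.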

mnel
  • I think this code has some errors, because when I apply it R gives warnings such as "NAs introduced by coercion". Could you please double-check it, or am I missing something? – John Smith Nov 14 '12 at 05:41
  • It will give these warnings because it relies on coercing to numeric and then finding the `NA` values (precisely what the warning is telling you it is doing). – mnel Nov 14 '12 at 05:42
  • Oops! I did not check the variables after applying your code. The variables look correct even though R gives warnings ;) Thanks again for your time and help. – John Smith Nov 14 '12 at 05:47
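The warning discussed in these comments can be reproduced in isolation. The sketch below uses illustrative variable names (`piece`, `quiet`) and assumes the warning is expected and harmless here, in which case wrapping the coercion in `suppressWarnings()` is one way to silence it.

```r
# One piece as produced by strsplit: a number and a letter
piece <- c("1200", "A")

# as.numeric(piece) alone would warn "NAs introduced by coercion",
# because "A" cannot be converted; the resulting NA is what the answer relies on
quiet <- suppressWarnings(as.numeric(piece))

# TRUE marks the entries that were not numbers
is.na(quiet)
```

Silencing warnings hides every warning raised inside the call, so it is best kept around the single coercion rather than a larger block of code.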