0

I have a data frame that contains the list of countries and it has been split using the csplit function.

The code is as follows:-

df <- data.frame(country = c("India, South Africa", "United Kingdom, United States, India",
                             "England, Australia, South Africa, Germany, United States"))
splitstackshape::cSplit(df, "country", sep = ", ")
 
#        country_1     country_2    country_3 country_4     country_5
#1:          India  South Africa         <NA>      <NA>          <NA>
#2: United Kingdom United States        India      <NA>          <NA>
#3:        England     Australia South Africa   Germany United States

I wish to rearrange the columns in a such a manner that country_1 column should contain either United States or <NA>. Similarly for country_2 and country_3, it should be India or <NA> and United Kingdom or <NA> respectively. From column_4 on wards, it can follow the order as it is in the row.

Expected output is as follows,

#Expected Output
# country_1    country_2    country_3        country_4     country_5   country_6     country_7
#1 <NA>            India     <NA>            South Africa  <NA>        <NA>          <NA>
#2 United States   India     United Kingdom  <NA>          <NA>        <NA>          <NA>
#3 United States   <NA>      <NA>            England       Australia   South Africa  Germany
Sri Sreshtan
  • 535
  • 3
  • 12

1 Answers1

1

A very ugly solution using apply :

df1 <- splitstackshape::cSplit(df, "country", sep = ", ")
n <- length(unique(na.omit(unlist(df1))))

as.data.frame(t(apply(df1, 1, function(x) {
      x1 <- rep(NA, n)
      if(any(x == 'United States', na.rm = TRUE)) x1[1] <- 'United States'
      if(any(x == 'India', na.rm = TRUE)) x1[2] <- 'India'
      if(any(x == 'United Kingdom', na.rm = TRUE)) x1[3] <- 'United Kingdom'
      temp <- setdiff(x, x1)
      if(length(temp)) x1[4:(4 + length(temp) - 1)] <- temp
      x1
})))

#             V1    V2             V3           V4        V5           V6      V7
#1          <NA> India           <NA> South Africa      <NA>         <NA>    <NA>
#2 United States India United Kingdom         <NA>      <NA>         <NA>    <NA>
#3 United States  <NA>           <NA>      England Australia South Africa Germany
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213