I have the following data set

id<-c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4)
s02<-c(001,002,003,004,001,002,003,004,005,001,002,003,004,005,006,007,001,002,003,004,005,006,007,008,009,010,011,012,013,014,015,016,017,018,019,020,021,022,023,024,025,026,027,028,029)
dat1<-data.frame(id,s02)

I would like to create a new data set from dat1. I want R code that automatically creates n s02 columns named s02__0, s02__1, s02__2, s02__3, s02__4 (so n == 5 here). Then, for each id in dat1, the code should assign each s02 value to the corresponding column s02__0 to s02__4. Each row is uniquely identified within its id by a second identifier, id_2, created from the row count. If an id has fewer s02 values than a row can hold, the remaining cells should be filled with ##N/A##. If it has more than n values, a new row with id_2 incremented by one is created to hold the extra s02 values, and any blank cells are again filled with ##N/A##. From the data set above, I would like the following output:

id<-c(1,2,3,3,4,4,4,4,4,4)
id_2<-c(1,1,1,2,1,2,3,4,5,6)
s02__0<-c(1,1,1,6,1,6,11,16,21,26)
s02__1<-c(2,2,2,7,2,7,12,17,22,27)
# NA below marks the cells that should display as ##N/A##
s02__2<-c(3,3,3,NA,3,8,13,18,23,28)
s02__3<-c(4,4,4,NA,4,9,14,19,24,29)
s02__4<-c(NA,5,5,NA,5,10,15,20,25,NA)

dat2<-data.frame(id,id_2,s02__0,s02__1,s02__2,s02__3,s02__4)
Sam Mwenda

1 Answer

This can produce what you want:

library(tidyverse)
#Data
id<-c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3)
s02<-c(001,002,003,004,001,002,003,004,005,001,002,003,004,005,006,007)
dat1<-data.frame(id,s02)
#Code
dat2 <- dat1 %>% group_by(id) %>% mutate(id2 = ifelse(s02<=5,1,2)) %>% ungroup() %>%
  group_by(id,id2) %>% mutate(val=1:n()-1,nid = cur_group_id()) %>% ungroup() %>%
  select(-id2) %>% mutate(id=paste0(id,'.',nid),val=paste0('s02','.',val)) %>% select(-nid) %>%
  pivot_wider(names_from = c(val),values_from = s02) %>%
  mutate(id=gsub("\\..*","", id)) %>% group_by(id) %>%
  mutate(id2=1:n()) %>% select(order(colnames(.)))
dat2

# A tibble: 4 x 7
# Groups:   id [3]
  id      id2 s02.0 s02.1 s02.2 s02.3 s02.4
  <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1         1     1     2     3     4    NA
2 2         1     1     2     3     4     5
3 3         1     1     2     3     4     5
4 3         2     6     7    NA    NA    NA
Duck
  • I am forever grateful for that output, as it is exactly what I want. However, when I try the code it returns the following error: "Error in cur_group_id() : could not find function "cur_group_id"". I have tried updating dplyr and tidyverse to the latest releases, but the error remains. Kindly assist, and thank you in advance. – Sam Mwenda Aug 01 '20 at 22:48
  • @SamMwenda It is a package version issue. I have `dplyr_1.0.0` and the function is present. Try updating to that version, or download it from CRAN and install it manually. Let me know if that solves the issue :) – Duck Aug 01 '20 at 22:54
  • @SamMwenda Also, restart `R` after installing everything. – Duck Aug 01 '20 at 22:55
  • Dear @Duck, the comment is very helpful. I have edited the data frame in the question because I am working with a large data frame, and in your code above `ifelse(s02<=5,1,2)` only works for a maximum of 2 rows per id. Suppose I have data that needs to fit into more than 2 rows, what would I need to do? For example, in my addition, id=4 needs to fit into 6 rows. Thank you in advance, this code is invaluable! – Sam Mwenda Aug 02 '20 at 06:09
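
On the follow-up about more than two rows per id: below is a minimal sketch, assuming the output row and column should be derived from each value's position within its `id` rather than from the value of `s02` itself. The helper columns `pos` and `col` and the chunk size `n` are names introduced here for illustration; this is one possible generalization, not the answerer's code. It uses `row_number()` instead of `cur_group_id()`, so it should not depend on dplyr 1.0.0, though `pivot_wider()` does need tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Full data from the question: ids 1 to 4 with 4, 5, 7 and 29 values
id   <- c(rep(1, 4), rep(2, 5), rep(3, 7), rep(4, 29))
s02  <- c(1:4, 1:5, 1:7, 1:29)
dat1 <- data.frame(id, s02)

n <- 5  # assumed number of s02__ columns per output row

dat2 <- dat1 %>%
  group_by(id) %>%
  mutate(
    pos  = row_number() - 1,            # 0-based position within each id
    id_2 = pos %/% n + 1,               # output row within the id: 1, 2, 3, ...
    col  = paste0("s02__", pos %% n)    # target column: s02__0 ... s02__(n-1)
  ) %>%
  ungroup() %>%
  select(-pos) %>%
  # id and id_2 identify the rows; cells with no value are left as NA
  pivot_wider(names_from = col, values_from = s02)

dat2
```

With the question's 45-row data this gives 10 rows, with id 4 spread over id_2 values 1 to 6. The empty cells are `NA`; if the literal text ##N/A## is needed in an exported file, something like `write.csv(dat2, "out.csv", na = "##N/A##")` (with a hypothetical file name) is one way to get it.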