I'm trying to isolate and join pairs of columns in a data set with 2m rows and ~400 columns.
The columns distX
and zX
pertain to each other and I'm interested in isolating pairs of cells based on the dependent variable, distX. I need to drop cells containing values which are not -10,-5,0,5,10. I can't drop the entire row as usually the pair I want has just been shifted along.
How would I go about cleaning the distX that I don't need, along with its 'z' value?
Ideally then I would like to have all the -10 dist1
in a single column with its pertaining z1
in column to the right. -5 dist2
with z2
etc. and so on.
> dist1 <- c('-10','-10','-10','-10','-10','-10','-9','-9','-9','-9','-9','-10','-10','10','-10','-10','-10','-10','-10','-10','-10')
> z1 <- c('5','5.1','5.1','5.4','5.3','5.5','5.8','5.7','5.7','5.8','5.7','5.9','5.8','6','5.9','6','5.9','5.8','5.7','6','5.9')
> dist2 <- c('-5','-5','-5','-4','-4','-5','-5','-5','-4','-4','-4','-4','-4','-5','-5','-5','-5','-5','-5','-5','-5')
> z2 <- c('5','5.1','5.1','5.4','5.3','5.5','5.8','5.7','5.7','5.8','5.7','5.9','5.8','6','5.9','6','5.9','5.8','5.7','6','5.9')
> dist3 <- c('0','0','0','0','0','1','1','0','0','0','1','1','1','1','1','1','1','1','1','0','0')
> z3 <- c('5','5.1','5.1','5.4','5.3','5.5','5.8','5.5','5.7','5.8','5.7','5.9','5.8','6','5.9','6','5.9','5.8','5.7','6','5.9')
> dist4 <- c('5','5','5','5','6','5','5','6','6','6','6','6','6','6','6','5','5','5','5','5','5')
> z4 <- c('6','6.1','6.1','6.4','6.3','6.6','6.8','6.7','6.7','6.8','6.7','6.5','6.8','6','6.9','6','6.9','6.8','6.7','6','6.9')
> dist5 <- c('10','10','10','10','10','9','9','10','10','10','10','10','10','10','10','10','10','10','10','10','10')
> z5 <- c('6','6.1','6.1','6.4','6.3','6.6','6.8','6.7','6.7','6.8','6.7','6.5','6.8','6','6.9','6','6.9','6.8','6.7','6','6.9')
>
> test <- data.frame(dist1,z1,dist2,z2,dist3,z3,dist4,z4,dist5,z5)
> tibble(test)
# A tibble: 21 x 10
dist1 z1 dist2 z2 dist3 z3 dist4 z4 dist5 z5
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 -10 5 -5 5 0 5 5 6 10 6
2 -10 5.1 -5 5.1 0 5.1 5 6.1 10 6.1
3 -10 5.1 -5 5.1 0 5.1 5 6.1 10 6.1
4 -10 5.4 -4 5.4 0 5.4 5 6.4 10 6.4
5 -10 5.3 -4 5.3 0 5.3 6 6.3 10 6.3
6 -10 5.5 -5 5.5 1 5.5 5 6.6 9 6.6
7 -9 5.8 -5 5.8 1 5.8 5 6.8 9 6.8
8 -9 5.7 -5 5.7 0 5.5 6 6.7 10 6.7
9 -9 5.7 -4 5.7 0 5.7 6 6.7 10 6.7
10 -9 5.8 -4 5.8 0 5.8 6 6.8 10 6.8
# ... with 11 more rows
I have tried a loop to try and pair columns but I don't get an output nor error. I also thought about using pivot_longer
, but I am stumped on how to move forward.
Below is an example of the loop I tried:
for (i in 1:ncol(test)){
combn(ncol(test),2, FUN=function(i)
apply(test[,i], 1, paste0, collapse=""))
}
Any help greatly appreciated :)