0

My data is:

df <- data.frame(a   = 1:2,
                 x   = 1:2,
                 b   = 1:2,
                 y   = 3:4,
                 x_2 = 1:2,
                 y_2 = 3:4,
                 c   = 1:2,
                 x_3 = 5:6,
                 y_3 = 1:2)

I now want to put together the x vars, and the y vars so that the order of columns would be:

a, x, x_2, x_3, b, y, y_2, y_3, c

I thought, I could use tidyverse's relocate function in combination with lapply or map or reduce (?), but it doesn't work out.

E.g. if I do:

move_names <- c("x", "y")

library(tidyverse)
moved_data <- lapply(as.list(move_names), function(x)
{
  df <- df |> 
    relocate(!!!syms(paste0(x, "_", 2:3)),
             .after = all_of(x))
}
)

It does the moving for x and y separately, but it creates separate list, but I want to have just my original df with relocated columns.

Update:

I should have been clear that my real data frame has ~500 columns where the to-be-moved columns are all over the place. So providing the full vector of desired column name order won't be feasible.

What I instead have: I have the names of my original columns, i.e. x and y, and I have the names of the to-be-moved columns, i.e. x_2, x_3, y_2, y_3.

deschen
  • 10,012
  • 3
  • 27
  • 50

4 Answers4

1

Not sure if it's what you want.

Vector with order of column names

Let's say you have a vector relocate_name that contains the order of your columns:

library(tidyverse)

relocate_name <- c("a", "x", "x_2", "x_3", "b", "y", "y_2", "y_3", "c")

df %>% relocate(any_of(relocate_name))

Vector with prefix of column names

Or if you only have the prefix of the order, let's call it relocate_name2:

relocate_name2 <- c("a", "x", "b", "y", "c")

df %>% relocate(starts_with(relocate_name2))

Group x and y together

Or if you only want to "group" x and y together:

df %>% 
  relocate(starts_with("x"), .after = "x") %>% 
  relocate(starts_with("y"), .after = "y")

Output

All of the above output is the same.

  a x x_2 x_3 b y y_2 y_3 c
1 1 1   1   5 1 3   3   1 1
2 2 2   2   6 2 4   4   2 2
benson23
  • 16,369
  • 9
  • 19
  • 38
  • Thanks for the suggestions, which would nicely work if in my real-life use case I coudl easily create a vector of desired column order, but I have ~500 columns all over the place, which makes this approach not really feasible, unless I could somehow dynamically create such a column order vector. Let me think about it... – deschen Mar 24 '22 at 12:25
  • @deschen See if my updated answer helps. – benson23 Mar 24 '22 at 12:29
  • Theoretically yes, and if I only had 2 such column pairs, I would do it manually like you suggested, but I probably have 10+ such relocations and then it isn't fun anymore to write it out 10 times. – deschen Mar 24 '22 at 12:34
1
library(rlist)
# split based in colname-part before _
L <- split.default(df, f = gsub("(.*)_.*", "\\1", names(df)))

# remove names with an underscore
# this is the new order, it should match the names of list L !!
neworder <- names(df)[!grepl("_", names(df))]
# [1] "a" "x" "b" "y" "c"

# cbind list elements together
ans <- rlist::list.cbind(L[neworder])
# a x.x x.x_2 x.x_3 b y.y y.y_2 y.y_3 c
# 1 1   1     1     5 1   3     3     1 1
# 2 2   2     2     6 2   4     4     2 2

# create tidy names again
names(ans) <- gsub(".*\\.(.*)", "\\1", names(ans))
#   a x x_2 x_3 b y y_2 y_3 c
# 1 1 1   1   5 1 3   3   1 1
# 2 2 2   2   6 2 4   4   2 2
Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • Thanks for the suggestions, which would nicely work if in my real-life use case I coudl easily create a vector of desired column order, but I have ~500 columns all over the place, which makes this approach not really feasible, unless I could somehow dynamically create such a column order vector. Let me think about it... – deschen Mar 24 '22 at 12:25
  • But what logic could you use to tell a computer to sort the column names on? If not provides by a vector, and not alphabetically.. then what? – Wimpel Mar 24 '22 at 12:38
  • The logic is: I know which names I need to look for, i.e. x_2, x_3 and y_2 and y_3. And then I kind of simply want to do: "search for the first element in `move_names` and put all x_2 and x_3 vars after taht, then search for the second element in move_names and put y_2 and y_3 after that" and so on. – deschen Mar 24 '22 at 12:43
  • see updated answer – Wimpel Mar 24 '22 at 12:46
1

In base R:

df[match(c('a', 'x', 'x_2', 'x_3', 'b', 'y', 'y_2', 'y_3', 'c'), names(df))]
#>   a x x_2 x_3 b y y_2 y_3 c
#> 1 1 1   1   5 1 3   3   1 1
#> 2 2 2   2   6 2 4   4   2 2
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Thanks for the suggestions, which would nicely work if in my real-life use case I coudl easily create a vector of desired column order, but I have ~500 columns all over the place, which makes this approach not really feasible, unless I could somehow dynamically create such a column order vector. Let me think about it... – deschen Mar 24 '22 at 12:25
  • Yes, the problem is that your final ordering is somewhat arbitrary. It's easy to get the data frame into the correct order once you can put the names in the correct order, so you should focus on being able to rearrange _names_ rather than _columns_ – Allan Cameron Mar 24 '22 at 12:28
0

Ok, this is probably the worst workaround ever and I don't really understand what exactly I'm doing (especially with the <<-), but it is does the trick.

My general idea after realizing the problem a bit more with the help of you guys here was to "loop" through both of my x and y names, remove these new _2 and _3 columns from the vector of column names and re-append them after their "base" x and y columns.

search_names <- c("x", "y")

df_names <- names(df)

new_names <- lapply(search_names, function(x)
{
  start <- which(df_names == x)

  without_new_names <- setdiff(df_names, paste0(x, "_", 2:3))
  
  df_names <<- append(without_new_names, values = paste0(x, "_", 2:3), after = start)
})[[length(search_names)]]

df |> 
  relocate(any_of(new_names))

  a x x_2 x_3 b y y_2 y_3 c
1 1 1   1   5 1 3   3   1 1
2 2 2   2   6 2 4   4   2 2
deschen
  • 10,012
  • 3
  • 27
  • 50