I have a dataset and need to do some work on it using a for loop.

Here is my fake data:

# fake data: 100 draws fill the 10 x 10 matrix without recycling values
L <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
names(L) <- c("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8", "P9","P10")

Now I want to apply a function to all of the columns, then remove the column "P1". Then I want to run the function again, remove "P5", and so on.

Here is the order of removing.

# order of removing column
R <- c("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7")

What can I try next?

halfer
Yun Tae Hwang
  • I assume you want to do something with the new data frame at each step? Otherwise you'd just remove all those columns in one go? – neilfws Feb 07 '18 at 22:32
  • Possible duplicate of [R dplyr: Drop multiple columns](https://stackoverflow.com/questions/35839408/r-dplyr-drop-multiple-columns) `library(tidyverse); L %>% select(-one_of("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7"))` – InfiniteFlash Feb 07 '18 at 22:33
  • I want to fit a multiple linear regression model at each step. So, basically, I am removing one variable at a time. – Yun Tae Hwang Feb 07 '18 at 22:36
  • Well, store the variable selection process in a character vector (with the columns you want to keep after each iteration) and use it to subset the dataset after each iteration. I'd advise you to look at how the `step()` function works in R. – InfiniteFlash Feb 07 '18 at 22:38
  • I think there are packages that will do stepwise linear regression. For example, look at `step()` in the `stats` package or `regsubsets` in the `leaps` package. – neilfws Feb 07 '18 at 23:01
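
For reference, here is a minimal sketch of what the `step()` suggestion in the comments could look like. The question does not say which column is the response, so treating `P10` as the response is an assumption, as are the 50 rows and the name `d`. Note that `step()` picks the variable to drop by AIC, so it will not follow a removal order you fix in advance.

```r
set.seed(42)

# Hypothetical data: 50 rows, P1..P9 as predictors, P10 as the response (assumption)
d <- data.frame(matrix(rnorm(500), nrow = 50, ncol = 10))
names(d) <- paste0("P", 1:10)

# Start from the full model with every predictor
full <- lm(P10 ~ ., data = d)

# Backward elimination by AIC; trace = 0 suppresses the per-step log
reduced <- step(full, direction = "backward", trace = 0)

# Predictors that survived the elimination
attr(terms(reduced), "term.labels")
```

If the removal order has to be exactly the one in `R`, a manual loop or `lapply` as in the answers is the better fit.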

3 Answers

It depends on the output you want, but I would use `lapply` so that each data frame subset is saved as a list element:

R <- c("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7")
lapply(seq_along(R), function(i) L[-which(names(L) %in% R[1L:i])])

Julien Navarre
  • But I don't want P1 back when I remove P5. – Yun Tae Hwang Feb 07 '18 at 22:40
  • So `function(i)` will be the function that I define? – Yun Tae Hwang Feb 07 '18 at 22:46
  • Because I want to fit a multiple linear regression model at each step. So, basically, I am removing one variable at a time. – Yun Tae Hwang Feb 07 '18 at 22:53
  • Yes, you can do more computation inside the function if needed, for example: `lapply(seq_along(R), function(i) { df <- L[-which(names(L) %in% R[1L:i])] summary(df) })` – Julien Navarre Feb 07 '18 at 23:18
  • The comment removed the line breaks; you should write `df <- L[-which(names(L) %in% R[1L:i])]` and `summary(df)` on separate lines. – Julien Navarre Feb 08 '18 at 00:11
  • I figured it out right before you added the comment, haha. My last question: your code gives me the first result without P1, but is it possible to get the result for all the columns first and then start removing? – Yun Tae Hwang Feb 08 '18 at 00:15
  • You can just use `c()` to put a list containing your original data frame `L` in front of the result: `c(list(L), lapply(seq_along(R), function(i) L[-which(names(L) %in% R[1L:i])]))` – Matt W. Feb 08 '18 at 01:02
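
Putting the comments above together, a sketch of the whole workflow might look like this: fit a regression on all columns first, then refit after each cumulative removal. The response variable is not named in the question, so using `P10` as the response is an assumption, and `dat`, `predictors_all` and `fits` are illustrative names. The example uses 50 rows so the full model is not saturated.

```r
set.seed(1)

# Hypothetical data: P1..P9 are predictors, P10 is the response (assumption)
dat <- data.frame(matrix(rnorm(500), nrow = 50, ncol = 10))
names(dat) <- paste0("P", 1:10)

# Order in which predictors are removed
R <- c("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7")
predictors_all <- paste0("P", 1:9)

# k = 0 is the full model; k = i drops the first i entries of R cumulatively
fits <- lapply(0:length(R), function(k) {
  dropped <- R[seq_len(k)]
  predictors <- setdiff(predictors_all, dropped)
  # fall back to an intercept-only model once every predictor is gone
  rhs <- if (length(predictors) > 0) predictors else "1"
  lm(reformulate(rhs, response = "P10"), data = dat)
})

# Number of coefficients (intercept included) at each step
sapply(fits, function(f) length(coef(f)))
```

Each element of `fits` is an ordinary `lm` object, so `summary(fits[[3]])` and similar calls work as usual.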

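Since your column names match their positions (P1 is column 1, and so on), you can index by position:

i <- c(1, 5, 2, 8, 9, 4, 3, 6, 7)
# drop the first k indices cumulatively, so removed columns stay removed
lapply(seq_along(i), function(k) L[, -i[seq_len(k)], drop = FALSE])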
Brian Davis

Using `dplyr::select` and `purrr::map`, you could probably do something like this, provided you extend `R` to include the final column:

library(dplyr)
library(purrr)

# example data
L <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
names(L) <- c("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8", "P9", "P10")
R <- c("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7", "P10")

# element k keeps columns k..10 of L[R], so the first k - 1 entries of R are dropped
res_list <- 1:ncol(L) %>%
  map(~ select(L[R], all_of(.x:ncol(L))))

`L[R]` reorders the columns into the order in which you want them removed. The result is a list of data frames you can iterate over; the first element still has every column.
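
If you would rather avoid the extra packages, the same reorder-then-slice idea can be sketched in base R. This is only an illustration of the approach above, not a replacement for it; `permuted` and `subsets` are names made up for the example.

```r
# example data, same shape as in the question
L <- data.frame(matrix(rnorm(100), nrow = 10, ncol = 10))
names(L) <- paste0("P", 1:10)

# removal order, extended with the final column as in the answer above
R <- c("P1", "P5", "P2", "P8", "P9", "P4", "P3", "P6", "P7", "P10")

# put the columns in removal order, then keep columns k..10 for each k
permuted <- L[R]
subsets <- lapply(seq_along(R), function(k) permuted[k:length(R)])

# column names that remain at each step
lapply(subsets, names)
```

Single-bracket indexing keeps each element a data frame, even the last one, which contains only `P10`.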

anant