1

I have the following dataframe, which is the output of read_excel with missing column names in excel:

t <- tibble(A=rnorm(3), B=rnorm(3), "x"=rnorm(3), "y"=rnorm(3), Z=rnorm(3))
colnames(t)[3:4] <-  c("..3", "..4")

How can I select columns ..3 to Z in a flexible dynamic way (not depending on number or table width). I was thinking in the direction of something like:

t %>% select(-starts_with(".."):-last_col())

But this gives a warning, since starts_with returns two values.

zx8754
  • 52,746
  • 12
  • 114
  • 209
c0bra
  • 1,031
  • 5
  • 22
  • Not an answer - but it may be easier to rename the columns using ``dplyr::rename_at()``, since col names in the form of "..j" are invalid. – user438383 Jan 11 '21 at 10:35
  • 1
    Maybe `t %>% select(-starts_with("..")[1]:-last_col())` ? – zx8754 Jan 11 '21 at 10:40
  • Just to clarify, your expected output is columns A and B, right? We are wanting to exclude all cols starting from .. column to the last column? – zx8754 Jan 11 '21 at 10:43
  • @zx8754 yes, thank you, dumb me. Do you want to post it as an answer? – c0bra Jan 11 '21 at 10:47

2 Answers2

1

We could force to select the first one:

t %>% select(-c(starts_with("..")[ 1 ]:last_col()))
# # A tibble: 3 x 2
#       A      B
#   <dbl>  <dbl>
# 1 0.889  0.505
# 2 0.655 -2.15 
# 3 1.34  -0.290

Or "tidier" way use first:

select(-first(starts_with("..")):-last_col())
zx8754
  • 52,746
  • 12
  • 114
  • 209
0

You can do it using base R:

t[cumsum(startsWith(names(t), "..")) == 0]

# # A tibble: 3 x 2
#       A       B
#   <dbl>   <dbl>
# 1 -1.56 -0.0747
# 2 -1.68 -0.847 
# 3 -1.23 -1.20

Which you could also use with select():

t %>% 
  select(which(cumsum(startsWith(names(t), "..")) == 0))

PS. Don't use t as a variable name in R since it is a function name.

s_baldur
  • 29,441
  • 4
  • 36
  • 69