
Question: Why does the colnames function generate a tibble, when used with the pipe operator %>% and the [ selector?

Example: Given the following tibble:

library(tidyverse)

x <- tribble(
  ~a, ~b,
  1, 0,
  0, 1
)

Problem: The colnames function generates a tibble when used with the pipe operator and the [ selector:

x %>% colnames(.)[1]
#> # A tibble: 2 x 1
#>       a
#>   <dbl>
#> 1    NA
#> 2    NA

Whereas I would expect it to generate a vector, as if no pipe operator was used:

colnames(x)[1]
#> [1] "a"
thando

3 Answers


The pipe inserts the left-hand side as the first argument of the main function on the right-hand side, which in this case is [. That is, these three are the same:

x %>% colnames(.)[1]

x %>% `[`(colnames(.), 1)

x %>% `[`(., colnames(.), 1)

The last one is the same because the pipe does not insert the left-hand side if one of the arguments of the main function is the dot.

To prevent it from inserting the dot, use {...}:
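The equivalence can be checked directly. This is a minimal sketch, assuming the magrittr and tibble packages are installed; the suppressWarnings() calls are an addition here, on the assumption that some tibble versions warn about row names that match nothing.

```r
library(magrittr)
library(tibble)

x <- tribble(
  ~a, ~b,
  1, 0,
  0, 1
)

# All three spellings end up calling `[` with x as an extra first argument,
# so each of them evaluates x[colnames(x), 1].
r1 <- suppressWarnings(x %>% colnames(.)[1])
r2 <- suppressWarnings(x %>% `[`(colnames(.), 1))
r3 <- suppressWarnings(x %>% `[`(., colnames(.), 1))

# The same call written without the pipe.
direct <- suppressWarnings(x[colnames(x), 1])

identical(r1, direct)
identical(r2, direct)
identical(r3, direct)
```

Each comparison should come out TRUE, since the pipe only changes how the call is written, not which call is made.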

x %>% { colnames(.)[1] }

or use:

x %>% colnames %>% .[1]

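or, after colnames, use dplyr::first, dplyr::nth(1), purrr::pluck(1), magrittr::extract(1) or head(1). (dplyr::slice_head and dplyr::pull operate on data frames, so they do not apply to the character vector that colnames returns.)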

x %>% colnames %>% first
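
Some of the alternatives above are listed without code. As a rough sketch, assuming only magrittr and tibble are loaded (magrittr::extract is written with its namespace because other packages, such as tidyr, also export a function called extract):

```r
library(magrittr)
library(tibble)

x <- tribble(
  ~a, ~b,
  1, 0,
  0, 1
)

# Braces: the pipe evaluates the block as-is, with . bound to x.
via_braces <- x %>% { colnames(.)[1] }

# Two-step pipes: colnames() runs first, then the selection is applied
# to the resulting character vector.
via_dot     <- x %>% colnames %>% .[1]
via_extract <- x %>% colnames %>% magrittr::extract(1)
via_head    <- x %>% colnames %>% head(1)

via_braces   # "a"
via_dot      # "a"
via_extract  # "a"
via_head     # "a"
```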
G. Grothendieck

Because %>% auto-inserts . as the first parameter of the right-hand side, unless . is already used as a top-level parameter of the right-hand side.

Your right-hand side is colnames(.)[1] — and . is not a top-level parameter here. In fact, the expression colnames(.)[1] is exactly equivalent to

`[`(colnames(.), 1)

So the top-level parameters of this expression are colnames(.) and 1. Neither of them is . itself, so %>% transforms your expression to

`[`(x, colnames(x), 1)

Which is once again equivalent to

x[colnames(x), 1]

And that expression doesn’t really make sense: it asks for the rows named "a" and "b" of the first column. No such rows exist, so both come back as NA, which is what you’ve observed.

To avoid . auto-insertion, you can wrap the expression into {…} as shown by akrun. Or you can split up the compound expression into its constituent parts:

x %>% colnames() %>% `[`(1L)
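
The insertion rule can also be observed in isolation. The sketch below assumes only magrittr; count_args is a hypothetical helper written for this illustration, not part of any package, and it simply reports how many arguments it received.

```r
library(magrittr)

# Hypothetical helper: returns the number of arguments it was called with.
count_args <- function(...) length(list(...))

v <- c(a = 1, b = 2)

# Dot appears only inside a nested call, so v is also inserted as first argument.
n_nested <- v %>% count_args(names(.))

# Dot is a top-level argument, so nothing extra is inserted.
n_toplevel <- v %>% count_args(., names(.))
n_only_dot <- v %>% count_args(.)

# Braces switch off insertion entirely.
n_braces <- v %>% { count_args(names(.)) }

n_nested    # 2
n_toplevel  # 2
n_only_dot  # 1
n_braces    # 1
```

The nested case receiving two arguments instead of one is the same effect that turns colnames(.)[1] into a three-argument call to `[`.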
Konrad Rudolph

We can either use first (from dplyr) after colnames

x %>% 
   colnames %>% 
   first
#[1] "a"

Or wrap the whole expression in {} so that it is evaluated as a single expression and the pipe does not insert x as an extra first argument

x %>%
    {colnames(.)[1]}
#[1] "a"
akrun
  • Thank you akrun, the `{}` does the job in my original use-case. Do you have an idea, how to explain this behavior? – thando Dec 28 '20 at 21:21
  • @thando I think it is because of the order of operators, the extraction of column happens, creating a vector, and thus when you do the `colnames`, it returns `NA`, i.e. `x[[1]][colnames(x)[1]]` gives `[1] NA` – akrun Dec 28 '20 at 21:23