1

I'm using dplyr and I want to keep all the columns of the table but return only the rows where one specific column (sample_id) ends with '006'.

select(sample_id, ends_with("006"), everything())

The code above doesn't work. When I run it, it returns all rows (or at least more than I need; it's a huge dataset).

I've tried using:

filter(sample_id == ends_with('006')) 

but ends_with() has to be used inside a selecting function such as select().

Antonio
  • `filter(stringr::str_sub(sample_id, -3, -1) == "006")`. `ends_with()` selects columns. You want to filter rows. – Limey Aug 02 '22 at 04:39
  • oh, interesting! So ends_with selects columns only. makes complete sense now. Thank you, it worked! – Antonio Aug 02 '22 at 04:51

3 Answers

2

Use str_ends from package stringr:

df %>% filter(str_ends(sample_id, "006"))

By default the pattern is a regular expression. You can match a fixed string with:

df %>% filter(str_ends(sample_id, fixed("006")))

Of course it's also possible to use a more general regular expression. It's useful if you have a more complex pattern to check, but it also works here:

df %>% filter(str_detect(sample_id, "006$")) 

See also the stringr documentation for str_starts() and str_ends(): "Detect the presence or absence of a pattern at the beginning or end of a string."
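
As a quick check, here is a minimal sketch of the three calls above on a toy data frame. The data frame `df` and its values are made up for illustration, and the sketch assumes dplyr and a stringr version recent enough to provide `str_ends()` are installed:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; only sample_id matters for the filter.
df <- data.frame(
  sample_id = c("A-001006", "B-006001", "C-000006", "D-0060"),
  value     = c(1.2, 3.4, 5.6, 7.8)
)

# Regex pattern anchored at the end of the string by str_ends().
by_regex <- df %>% filter(str_ends(sample_id, "006"))

# Same test, but "006" is treated as a literal string.
by_fixed <- df %>% filter(str_ends(sample_id, fixed("006")))

# Explicit regular expression with the end-of-string anchor "$".
by_detect <- df %>% filter(str_detect(sample_id, "006$"))

# Each result keeps all columns and only the rows for
# "A-001006" and "C-000006".
by_regex
```

For a plain suffix like "006" the regex and fixed versions select the same rows; `fixed()` starts to matter once the suffix contains regex metacharacters such as `.` or `+`.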

1

ends_with() is for subsetting columns. You should use endsWith() from base R:

filter(endsWith(sample_id, "006"))

It's equivalent to

filter(grepl("006$", sample_id))
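
One caveat, sketched below with made-up data: the two calls agree when sample_id is a character column, but `endsWith()` only accepts character vectors, while `grepl()` coerces its input to character. If the IDs happen to be stored as numbers (an assumption for this example, not something stated in the question), convert them first:

```r
# Hypothetical toy data: ids stored as numbers rather than text.
df <- data.frame(sample_id = c(1001006, 2006001, 3000006))

# grepl() coerces its input to character, so this works as-is.
keep_grepl <- grepl("006$", df$sample_id)

# endsWith() requires character input, so convert explicitly first.
keep_ends <- endsWith(as.character(df$sample_id), "006")

# Both logical vectors select the first and third rows.
df[keep_ends, , drop = FALSE]
```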
Darren Tsai
0

For a base R approach, we could use grepl here along with a data frame subset operation:

df_out <- df[grepl("006$", df$sample_id), ]
Tim Biegeleisen