Questions tagged [dplyr]

Use this tag for questions relating to functions from the dplyr package, such as group_by, summarize, filter, and select.

The dplyr package is the next iteration of the plyr package. It has three main goals:

  1. Identify the most important data manipulation tools needed for data analysis and make them easy to use from R.
  2. Provide fast performance for in-memory data by writing key pieces in C++.
  3. Use the same interface to work with data no matter where it's stored, whether in a data.frame, a data.table or a database.

36,044 questions
6 votes, 1 answer

How to group_by variable and cut time into 10s bins starting at 13:24:00 exactly and average for group_by variable

I have CO2 measurement data from 30 sensors that don't all measure at the same time, nor do they all start at exactly the same time. I would like to align them as best as possible, so I thought that taking 10s averages might be a good solution. In a…
HCAI
6 votes, 2 answers

Get title for plots when using purrr and ggplot with group_by and nest()

I have the following example: df <- mtcars plot <- df %>% mutate(carb=as.character(carb)) %>% group_by(carb) %>% nest() %>% mutate(plot=map(data, function(.x){ .x %>% ggplot() + geom_bar(aes(mpg)) })) print(plot) # A…
xhr489
6 votes, 2 answers

dplyr summarise keep NA if all summarised values are NA

I want to use dplyr summarise to sum counts by groups. Specifically I want to remove NA values if not all summed values are NA, but if all summed values are NA, I want to display NA. For example: name <- c("jack", "jack", "mary", "mary", "ellen",…
Tristan Bakx
6 votes, 0 answers

dplyr equivalent of sql row_number() over (partition by group order by value)

Initial situation: I have a data set of the following form: library(dplyr) dat <- tribble( ~name, ~iq, "ben", 100, "alex", 98, "mia", 110, "paco", 124, "mia", 112, "mia", 120, "paco", 112, "ben", 90, "alex", 107 ) I'd…
piptoma
6 votes, 1 answer

Could not find function "%>%" during CMD check

I wrote an R package, which is based on dplyr. When I run the CMD check, an error pops up when evaluating the @examples. could not find function "%>%" Calls: Rresult Execution halted I have added dplyr in the description file, and the package works…
Wang
6 votes, 3 answers

dplyr: case_when() over multiple columns with multiple conditions

I have made this minimal reproducible example to exemplify my question. I have already managed to solve the problem, but I am sure there are more elegant ways of coding it. The issue is about binary classification based on multiple criteria. In…
Claudiu Papasteri
6 votes, 2 answers

Replace all underscores in feature names with a space

I'd like to replace all underscores in a data frame's feature names with a space: library(tidyverse) names <- c("a_nice_day", "quick_brown_fox", "blah_ha_ha") example_df <- data.frame( x = 1:3, y = LETTERS[1:3], z = 4:6 ) names(example_df) <-…
Doug Fir
6 votes, 1 answer

join data frames and replace one column with another

I have two data frames, one with all my data, and another with a corrected ID number for some of the data. When I attempt to join these values with either a left, inner or full join, I end up with two ID columns (ID.x and ID.y). Is there any way to…
tnt
6 votes, 1 answer

dplyr filter on value only if another value exists in the group in the same column

I fully anticipate getting slammed for a duplicate question, but I just couldn't find a similar question. Apologies in advance. I am trying to clean some data that sometimes contains a summary row and sometimes does not. Here is a small…
jkgrain
6 votes, 1 answer

Using dplyr to group_by and conditionally mutate only with if (without else) statement

I have a dataframe that I need to group by a combination of column entries in order to conditionally mutate several columns using only an if statement (without an else condition). More specifically, I want to sum up the column values of a certain…
atreju
6 votes, 2 answers

dplyr: divide all values in group by group's first value

My df looks something like this: ID Obs Value 1 1 26 1 2 13 1 3 52 2 1 1,5 2 2 30 Using dplyr, I want to add the additional column Col, which is the result of a division of all values in the column…
TIm Haus
6 votes, 1 answer

Filtering on a Stringr Match (str_detect) EXCEPT For a Particular Similar Value in R?

I'm trying to create a dplyr pipeline to filter on a str_detect match. Imagine a data frame jobs, where I want to filter out the most-senior positions from the titles column: titles Chief Executive Officer Chief Financial Officer Chief Technical…
alxlvt
6 votes, 3 answers

changing all values in one column in a filtered data.frame in R

I have a very messy data frame, with one column with values that are understandable to humans but not to computers, a bit like the one below. df<-data.frame("id"=c(1:10), "colour"=c("re d", ", red", "re-d","green", "gre, en", ",…
Mactilda
6 votes, 5 answers

How to calculate common values across different groups?

I am trying to create a data frame for creating network charts using the igraph package. I have sample data "mydata_data" and I want to create "expected_data". I can easily calculate the number of customers who visited a particular store, but how do I calculate…
Yogesh Kumar
6 votes, 1 answer

Postgres ARRAY column type to tbl list column in R and vice versa

Let's say that I'm working with the starwars dataset from the dplyr package, which contains list columns (for films, vehicles...). To simplify, let's work with only the name and the films data: library(dplyr) ex_data <- starwars %>% select(name,…
MalditoBarbudo