Questions tagged [dplyr]

Use this tag for questions relating to functions from the dplyr package, such as group_by, summarize, filter, and select.

The dplyr package for R is the next iteration of the plyr package. It has three main goals:

Identify the most important data manipulation tools needed for data analysis and make them easy to use from R.

Provide fast performance for in-memory data by writing key pieces in C++.

Use the same interface to work with data no matter where it's stored, whether in a data.frame, a data.table or a database.
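As a rough illustration of the verbs named in the tag description (filter, select, group_by, summarize), here is a minimal sketch on the built-in mtcars data set. The threshold and the column choices are assumptions made for the example, not part of the tag description.

```r
library(dplyr)

# mtcars ships with base R; the columns and the hp threshold are illustrative only.
result <- mtcars %>%
  filter(hp > 100) %>%                        # keep cars with more than 100 horsepower
  select(cyl, mpg, hp) %>%                    # keep only the columns used below
  group_by(cyl) %>%                           # one group per cylinder count
  summarize(mean_mpg = mean(mpg), n = n())    # mean mpg and row count per group

print(result)
```

Each verb takes a data frame as its first argument and returns a data frame, which is what lets the steps chain with `%>%`.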

Vignettes

Some vignettes have been moved to other related packages.

Tibbles (from tibble package)
Databases (from dbplyr package)
Introduction to dplyr
Adding a new SQL backend (from dbplyr package)
Programming with dplyr
Two-table verbs
Window functions and grouped mutate/filter

Related tags

R's plyr, magrittr, tidyr, tidyverse and data.table packages
Python's pandas library

36044 questions

1 answer

How to group_by variable and cut time into 10s bins starting at 13:24:00 exactly and average for group_by variable

I have CO2 measurement data by 30 sensors that don't all measure at the same time, nor do they all start at exactly the same time. I would like to align them as best as possible, so I thought that taking 10s averages might be a good solution. In a…

r time dplyr

asked Feb 26 '19 at 11:27

HCAI

2 answers

Get title for plots when using purrr and ggplot with group_by and nest()

I have the following example: df <- mtcars plot <- df %>% mutate(carb=as.character(carb)) %>% group_by(carb) %>% nest() %>% mutate(plot=map(data, function(.x){ .x %>% ggplot() + geom_bar(aes(mpg)) })) print(plot) # A…

r ggplot2 dplyr purrr

asked Feb 22 '19 at 14:57

xhr489

2 answers

dplyr summarise keep NA if all summarised values are NA

I want to use dplyr summarise to sum counts by groups. Specifically I want to remove NA values if not all summed values are NA, but if all summed values are NA, I want to display NA. For example: name <- c("jack", "jack", "mary", "mary", "ellen",…

r dplyr

asked Feb 21 '19 at 11:03

Tristan Bakx

0 answers

dplyr equivalent of sql row_number() over (partition by group order by value)

Initial situation I have a data set of the following form: library(dplyr) dat <- tribble( ~name, ~iq, "ben", 100, "alex", 98, "mia", 110, "paco", 124, "mia", 112, "mia", 120, "paco", 112, "ben", 90, "alex", 107 ) I'd…

r dplyr window-functions

asked Feb 14 '19 at 16:16

piptoma

1 answer

Could not find function "%>%" during CMD check

I wrote an R package, which is based on dplyr. When I run the CMD check, an error pops up when evaluating the @examples. could not find function "%>%" Calls: Rresult Execution halted I have added dplyr in the description file, and the package works…

r dplyr package

asked Feb 07 '19 at 17:29

Wang

3 answers

dplyr: case_when() over multiple columns with multiple conditions

I have made this minimal reproducible example to exemplify my question. I have already managed to solve the problem, but I am sure there are more elegant ways of coding it. The issue is about binary classification based on multiple criteria. In…

r dplyr

asked Feb 06 '19 at 16:05

Claudiu Papasteri

2 answers

Replace all underscores in feature names with a space

I'd like to replace all underscores in a dataframe's feature names with a space: library(tidyverse) names <- c("a_nice_day", "quick_brown_fox", "blah_ha_ha") example_df <- data.frame( x = 1:3, y = LETTERS[1:3], z = 4:6 ) names(example_df) <-…

r dplyr

asked Feb 05 '19 at 22:29

Doug Fir

1 answer

join data frames and replace one column with another

I have two data frames, one with all my data, and another with a corrected ID number for some of the data. When I attempt to join these values with either a left, inner or full join, I end up with two ID columns (ID.x and ID.y). Is there any way to…

r dplyr

asked Feb 05 '19 at 15:34

tnt

1 answer

dplyr filter on value only if another value exists in the group in the same column

I fully anticipate getting slammed for a duplicate question, but I just couldn't find a similar question. Apologies in advance. I am trying to clean some data that sometimes contains a summary row and sometimes does not. Here is a small…

r dplyr

asked Feb 03 '19 at 19:37

jkgrain

1 answer

Using dplyr to group_by and conditionally mutate only with if (without else) statement

I have a dataframe that I need to group by a combination of column entries in order to conditionally mutate several columns using only an if statement (without an else condition). More specifically, I want to sum up the column values of a certain…

r dplyr

asked Jan 28 '19 at 14:25

atreju

2 answers

dplyr: divide all values in group by group's first value

My df looks something like this: ID Obs Value 1 1 26 1 2 13 1 3 52 2 1 1,5 2 2 30 Using dplyr, I want to add the additional column Col, which is the result of a division of all values in the column…

r dplyr

asked Jan 24 '19 at 10:33

TIm Haus

1 answer

Filtering on a Stringr Match (str_detect) EXCEPT For a Particular Similar Value in R?

I'm trying to create a dplyr pipeline to filter on Imagine a data frame jobs, where I want to filter out the most-senior positions from the titles column: titles Chief Executive Officer Chief Financial Officer Chief Technical…

r dplyr stringr

asked Jan 13 '19 at 00:35

alxlvt

3 answers

changing all values in one column in a filtered data.frame in R

I have a very messy data frame, with one column with values that are understandable to humans but not to computers, a bit like the one below. df<-data.frame("id"=c(1:10), "colour"=c("re d", ", red", "re-d","green", "gre, en", ",…

r dplyr stringr

asked Dec 21 '18 at 11:10

Mactilda

5 answers

How to calculate common values across different groups?

I am trying to create a data frame for creating network charts using the igraph package. I have sample data "mydata_data" and I want to create "expected_data". I can easily calculate the number of customers who visited a particular store, but how do I calculate…

r dplyr igraph

asked Dec 13 '18 at 06:20

Yogesh Kumar

1 answer

Postgres ARRAY column type to tbl list column in R and vice versa

Let's say that I'm working with the starwars dataset from dplyr package, which contains list columns (for films, vehicles...). To simplify, let's work with only the name and the films data: library(dplyr) ex_data <- starwars %>% select(name,…

r postgresql dplyr

asked Dec 11 '18 at 08:56

MalditoBarbudo
