Questions tagged [dplyr]

Use this tag for questions relating to functions from the dplyr package, such as group_by, summarize, filter, and select.

The dplyr package for R is the next iteration of the plyr package. It has three main goals:

Identify the most important data manipulation tools needed for data analysis and make them easy to use from R.

Provide fast performance for in-memory data by writing key pieces in C++.

Use the same interface to work with data no matter where it's stored, whether in a data.frame, a data.table or a database.
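As a rough illustration of the verbs this tag covers, here is a minimal sketch that chains filter, select, group_by and summarise on R's built-in mtcars data set. The threshold of 100 horsepower and the output column names n and mean_mpg are arbitrary choices for this example, not anything prescribed by the package.

```r
library(dplyr)

# Built-in example data; any data frame with similar columns would do.
result <- mtcars %>%
  filter(hp > 100) %>%          # keep rows with more than 100 horsepower (arbitrary cutoff)
  select(cyl, mpg, hp) %>%      # keep only the columns needed below
  group_by(cyl) %>%             # one group per cylinder count
  summarise(
    n = n(),                    # number of cars in each group
    mean_mpg = mean(mpg)        # average fuel economy in each group
  )

print(result)
```

Each verb takes a data frame as its first argument and returns a data frame, which is what lets the steps be chained with the pipe.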

Vignettes

Some vignettes have been moved to other related packages.

Tibbles (from tibble package)
Databases (from dbplyr package)
Introduction to dplyr
Adding a new SQL backend (from dbplyr package)
Programming with dplyr
Two-table verbs
Window functions and grouped mutate/filter

Related tags

R's plyr, magrittr, tidyr, tidyverse and data.table packages
Python's pandas library

36044 questions

4 answers

Using dplyr filter() in programming

I am writing my function and want to use dplyr's filter() function to select rows of my data frame that satisfy a condition. This is my code: library(tidyverse) df <-data.frame(x = sample(1:100, 50), y = rnorm(50), z = sample(1:100,50), w =…

r dplyr tidyverse rlang

asked Jul 21 '17 at 01:59

Kay

2 answers

R dplyr summarise bug?

library(tidyverse) stats <- read_csv('stats.csv') ## Warning: Installed Rcpp (0.12.12) different from Rcpp used to build dplyr (0.12.11). ## Please reinstall dplyr to avoid random crashes or undefined behavior. I am pretty sure that I got the same…

r dplyr tidyverse

asked Jul 18 '17 at 03:53

Y.Y

2 answers

How to transform a string into a factor and set contrasts using dplyr/magrittr piping

I have a rather specific question: how can I make a string into a factor and set its contrasts within a pipe? Let's say that I have a tibble like the following tib <- data_frame (a = rep(c("a","b","c"),3, each = T), val = rnorm(9)) Now, I could…

r dplyr magrittr

asked Jul 17 '17 at 09:40

Federico Nemmi

1 answer

Combining multiple columns into one in R

How can I combine all of a data frame's columns into just one column in an efficient way? I mean without using the column names, using dplyr or tidyr in R, because I have too many columns (10,000+). For example, converting this data frame >…

r dplyr tidyr

asked Jul 15 '17 at 03:47

Forever

4 answers

Remove columns the tidyeval way

I would like to remove a vector of columns using dplyr >= 0.7 library(dplyr) data(mtcars) rem_cols <- c("wt", "qsec", "vs", "am", "gear", "carb") head(select(mtcars, !!paste0("-", rem_cols))) Error: Strings must match column names. Unknown…

r dplyr rlang tidyeval

asked Jul 14 '17 at 10:18

Scott

1 answer

dplyr 0.5.0 mutate using column index

I've updated dplyr (now 0.7.1) and a lot of my old code does not work because mutate_each has been deprecated. I used to do something like this (code below) with mutate_each using the column index. I'd do this on hundreds of columns. And I just can't…

r dplyr

asked Jul 13 '17 at 16:58

Kevin

2 answers

Using n() at the same time as calculating other summary statistics

I am having trouble preparing a summary table using dplyr based on the data set below: set.seed(1) df <- data.frame(rep(sample(c(2012,2016),10, replace = T)), sample(c('Treat','Control'),10,replace = T), …

r dplyr summary

asked Jul 11 '17 at 01:56

Arthur Carvalho Brito

2 answers

Need help speeding up a dplyr aggregation

tl;dr: I have an aggregation problem that I haven't seen in documentation before. I manage to get it done, but it is way too slow for the intended application. The data I usually work with have around 500 lines (my gut feeling tells me this isn't…

r dplyr

asked Jul 04 '17 at 11:34

bdecaf

1 answer

Setting column names when using bind_cols (r, dplyr)

I have a data.frame (df) which contains another data.frame called url_variables. url_variables = df$url_variables url_variables contains many other data.frames such as source, campaign, page and many others. Each of these data frames has the 3…

r dplyr

asked Jun 28 '17 at 21:42

Nick5a1

2 answers

Looping with dplyr on each row of dataframe

I have a dataframe df <- data.frame(var1=c(10,20,30,40,50), var2=c(rep(0.3,5)), BYGROUP_OBSNUM=c(0:4)) var1 var2 BYGROUP_OBSNUM 10 0.3 0 20 0.3 1 30 0.3 2 40 0.3 3 50 0.3 4 I need to perform…

r loops dataframe dplyr

asked Jun 27 '17 at 09:55

Riya

2 answers

Match in lagged group in data.table

I'm trying to create a new column that indicates if an ID was present in a previous group. Here's my data: data <- data.table(ID = c(1:3, c(9,2,3,4),c(5,1)), groups = c(rep(c("a", "b", "c"), c(3, 4,2)))) ID groups 1: 1 …

r data.table dplyr match matching

asked Jun 22 '17 at 21:40

Pierre Lapointe

2 answers

Merge two lists of dataframes

I have two big lists of dataframes that I want to merge. Here is a sample of the data. list1 = list(data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), Global = c(".9923+00",".01245+00", ".0005+00", ".33421E+00", ".74361+00",…

r list dataframe dplyr

asked Jun 22 '17 at 15:31

ale19

2 answers

R: How to spread, group_by, summarise and mutate at the same time

I want to spread this data below (first 12 rows shown here only) by the column 'Year', returning the sum of 'Orders' grouped by 'CountryName'. Then calculate the % change in 'Orders' for each 'CountryName' from 2014 to 2015. CountryName Days …

r dplyr tidyr

asked Jun 21 '17 at 22:53

RDJ

1 answer

What is the purpose of dtplyr and the reason for the warning 'Please library(dtplyr)!'?

On loading the latest version of data.table (1.10.4) I get this message: > library(data.table) data.table…

r data.table dplyr

asked Jun 20 '17 at 00:15

Alex

2 answers

Filter all days between a time range

I have a data frame like below: entry_no id time _________ ___ _____ 1 1 2016-09-01 09:30:09 2 2 2016-09-02 10:36:18 3 1 2016-09-01 12:27:27 4 3 …

r dataframe dplyr filter subset

asked May 30 '17 at 00:40

Ricky
