Questions tagged [dplyr]

Use this tag for questions relating to functions from the dplyr package, such as group_by, summarize, filter, and select.

The dplyr package is the next iteration of the plyr package. It has three main goals:

  1. Identify the most important data manipulation tools needed for data analysis and make them easy to use from R.
  2. Provide fast performance for in-memory data by writing key pieces in C++.
  3. Use the same interface to work with data no matter where it's stored, whether in a data.frame, a data.table or a database.
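
As a minimal sketch of the verbs named in the tag description (select, filter, group_by, summarize), the snippet below chains them on the built-in mtcars data set. The choice of columns and the name by_cyl are illustrative assumptions, not part of the tag wiki.

```r
library(dplyr)

# mtcars ships with base R; the columns picked here are only for illustration.
by_cyl <- mtcars %>%
  select(mpg, cyl, am) %>%   # keep three columns
  filter(am == 1) %>%        # keep manual-transmission cars only
  group_by(cyl) %>%          # one group per cylinder count
  summarize(
    n = n(),                 # number of cars in each group
    mean_mpg = mean(mpg)     # average fuel economy in each group
  )

print(by_cyl)
```

Each verb takes a data frame as its first argument and returns a data frame, which is why the steps compose with the pipe.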

36044 questions
6
votes
1 answer

Joining data frames by lubridate date %within% intervals

I've been practicing and learning wrangling R data frames with columns that contain lubridate data types, such as an example problem in my other question. Now, I am trying to do the equivalent of joining two data frames, but joining them by whether…
hpy
6
votes
3 answers

Summarizing data by name separated across multiple variables

I'm trying to count totals for goals, primary assists, and secondary assists for each player. My problem is that I can't get my head around the logic to do that, as the data I want to summarize by (player name) is listed across three variables…
Evan O.
6
votes
2 answers

mutate by group in R

I have data with the following columns:

Date      CID      FID       rank
31/01/17  abc0001  rx180x01  0
31/01/17  abc0001  rx180x02  0
31/01/17  abc0001  rx180x03  2
28/02/17  abc0001  rx180x32  1
…
Dom Jo
6
votes
2 answers

Dplyr : use mutate with columns that contain lists

I have the following dataframe (sorry for not providing an example with dput, it doesn't seem to work with lists when I paste it here): Now I am trying to create a new column y that takes the difference between mnt_ope and ref_amount for each element…
Vincent
6
votes
2 answers

Calling prop.test function in R with dplyr

I am trying to calculate several binomial proportion confidence intervals. My data are in a data frame, and though I can successfully extract the estimate from the object returned by prop.test, the conf.int variable seems to be null when run on the…
PBB
6
votes
1 answer

R- how to conditionally remove first row of group_by

I need to conditionally remove the first row of a group. I want to group by column gr, then remove the first row of each group only if the first row of the group has value a, e.g.

gr  value
1   b
1   c
1   a
2   a
2   d
3   b
3   a
3   h
3   a
4   …
Isjitar
6
votes
2 answers

Creating dplyr function that can tell if variable input is a string or a symbol

I've been studying the "Programming with dplyr" vignette because I want to create functions that use dplyr functions. I would like to use the functions I make in both shiny applications and interactive R work. For use in shiny, I would like these…
Dave Rosenman
6
votes
2 answers

R: counting distinct combinations found in a data frame where columns are interchangeable

I'm not sure what this problem is even called. Let's say I'm counting distinct combinations of 2 columns, but I want distinct across the order of the two columns. Here's what I mean: df = data.frame(fruit1 = c("apple", "orange", "orange",…
Joy
6
votes
2 answers

confusing behavior of purrr::pmap with rlang; "to quote" or not to quote argument that is the Q

I have a custom function where I am reading entered variables from a dataframe using rlang. This function works just fine irrespective of whether the arguments entered are quoted or unquoted. But, strangely enough, when this function is used with…
Indrajeet Patil
6
votes
2 answers

summarizing temperature data based on a vector of temperature thresholds

I have a data frame with daily average temperature data in it, structured like so: 'data.frame': 4666 obs. of 6 variables: $ Site : chr "EB" "FFCE" "IB" "FFCE" ... $ Date : Date, format: "2013-01-01" "2013-01-01" "2013-01-01" "2014-01-01" ...…
K.west
6
votes
2 answers

Calculate the mean of some columns using dplyr::mutate

I want to calculate the mean of some columns using dplyr::mutate. library(dplyr) test <- data.frame(replicate(12, sample(1:12, 12, rep = T))) %>% `colnames<-`(seq(1:12) %>% paste("BL", ., sep = "")) The columns I want to include to calculate the…
analeigh
6
votes
3 answers

How to add only missing Dates in Dataframe

I have the data frame below:

Date        Val1  Val2
2018-04-01  125   0.05
2018-04-03  458   2.99
2018-04-05  354   1.25

I want to add only the missing dates, considering Sys.Date() (here, for example, Sys.Date() is 2018-04-06) in…
Roy1245
6
votes
1 answer

Supplying multiple groups of variables to a function for dplyr arguments in the body

Here is the data: library(tidyverse) data <- tibble::tribble( ~var1, ~var2, ~var3, ~var4, ~var5, "a", "d", "g", "hello", 1L, "a", "d", "h", "hello", 2L, "b", "e", "h", "k", 4L, "b", "e", "h", …
Geet
6
votes
1 answer

Sequential evaluation of named arguments in R

I am trying to understand how to succinctly implement something like the argument capture/parsing/evaluation mechanism that enables the following behavior with dplyr::tibble() (FKA dplyr::data_frame()): # `b` finds `a` in previous…
lefft
6
votes
1 answer

How to pass an expression in a string to a verb in dplyr 0.7.2

I am trying to implement advice I am finding in the web but I am halfway where I want to go. Here is a reproducible example: library(tidyverse) library(dplyr) library(rlang) data(mtcars) filter_expr = "am == 1" mutate_expr = "gear_carb =…
user8270077