Questions tagged [tidyverse]

ONLY use this tag if your question relates to installing the tidyverse package, integrating it with your system, or loading the tidyverse as a whole. DO NOT USE if your question relates to one or two components of the tidyverse, such as dplyr or ggplot2. Use *those* tags, and tag with `r` as well for a better response.

tidyverse is an R meta-package that installs and loads a number of other packages for data processing and graphics.

Unless your question is about the tidyverse package as a whole, its installation, or its integration with your system, use tags for the packages you are actually using. A script that calls `library(tidyverse)` is rarely a minimal reproducible example when only `library(dplyr)` is required.
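As a rough illustration of the point above, here is a minimal sketch of a reproducible example that loads only dplyr. The data frame `df`, its columns `g` and `x`, and the result name `totals` are hypothetical and exist only for this sketch.

```r
# Load only the package the example actually needs,
# rather than attaching everything with library(tidyverse).
library(dplyr)

# Hypothetical toy data, small enough to paste into a question.
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Sum x within each level of g.
totals <- df %>%
  group_by(g) %>%
  summarise(total = sum(x))

print(totals)
```

A question built around a snippet like this would be tagged `dplyr` and `r` rather than `tidyverse`.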

See https://www.tidyverse.org/packages/ for a breakdown of the packages contained in tidyverse and their respective functions.

9739 questions
9 votes
6 answers

Summarize data at different aggregate levels - R and tidyverse

I'm creating a bunch of basic status reports and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the Tidyverse approach and this is an example of my current code. What I'm looking for is an option to…
Reeza
  • 20,510
  • 4
  • 21
  • 38
9 votes
3 answers

Group_by and mutate slow on large dataframe

I am working with large (min 8 mil rows) dataframes and want to do some basic calculations based on a couple grouping variables and rmultinom. As my code stands it takes at least ~1 sec to complete the calculation, which wouldn't be a problem but I…
flee
  • 1,253
  • 3
  • 17
  • 34
9 votes
2 answers

Giving the list returned by purrr::map names

Is there a way to automatically give names to the list returned by purrr::map? For example, I run code like this very often. fn <- function(x) { paste0(x, "_") } l <- map(LETTERS, fn) names(l) <- LETTERS I'd like for the vector that is being…
user1775655
  • 317
  • 2
  • 8
9 votes
1 answer

How to convert column types in R tidyverse

I'm trying to get comfortable with using the Tidyverse, but data type conversions are proving to be a barrier. I understand that automatically converting strings to factors is not ideal, but sometimes I would like to use factors, so some approach to…
tef2128
  • 740
  • 1
  • 8
  • 19
9 votes
4 answers

Creating models and augmenting data without losing additional columns in dplyr/broom

Consider the following data / example. Each dataset contains a number of samples with one observation and one estimate: library(tidyverse) library(broom) data = read.table(text = ' dataset sample_id observation estimate A A1 4.8 4.7 A A2 …
slhck
  • 36,575
  • 28
  • 148
  • 201
9 votes
3 answers

Joining two data frames with intervals misbehaves?

Edit (2019-06): This problem does not exist anymore, as this issue has been closed and a related feature implemented. If you now run the code with updated packages, it will work. I'm trying to find overlapping intervals and decided to join the…
pasipasi
  • 1,176
  • 10
  • 8
9 votes
1 answer

Replacement for parallel plyr with doMC

Consider a standard grouped operation on a data.frame: library(plyr) library(doMC) library(MASS) # for example nc <- 12 registerDoMC(nc) d <- data.frame(x = c("data", "more data"), g = c("group1", "group2")) y <- "some global object" res <-…
Devin
  • 851
  • 12
  • 32
9 votes
1 answer

dplyr : how-to programmatically full_join dataframes contained in a list of lists?

Context and data structure I'll share with you a simplified version of my huge dataset. This simplified version fully respects the structure of my original dataset but contains fewer list elements, dataframes, variables and observations than the…
pokyah
  • 163
  • 1
  • 9
9 votes
1 answer

Correct usage of dplyr::select in dplyr 0.7.0+, selecting columns using character vector

Suppose we have a character vector cols_to_select containing some columns we want to select from a dataframe df, e.g. df <- tibble::data_frame(a=1:3, b=1:3, c=1:3, d=1:3, e=1:3) cols_to_select <- c("b", "d") Suppose also we want to use…
RobinL
  • 11,009
  • 8
  • 48
  • 68
9 votes
1 answer

Spread with duplicate identifiers (using tidyverse and %>%)

My data looks like this: I am trying to make it look like this: I would like to do this in tidyverse using %>%-chaining. df <- structure(list(id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L), start_end = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 1L), .Label =…
Rasmus Larsen
  • 5,721
  • 8
  • 47
  • 79
9 votes
2 answers

using purrr to affect single columns of each dataframe in a list

Still getting used to purrr, and I have one of those questions that I think should be easy, but I don't know how to do it. All I want to do is convert the datetimes in the below to dates with as.Date(). It's a list of dataframes. Been playing around…
jsg51483
  • 193
  • 1
  • 10
9 votes
2 answers

What is the **tidyverse** method for splitting a df by multiple columns?

I would like to split a dataframe by multiple columns so that I can see the summary() output for each subset of the data. Here's a way to do that using split() from base: library(tidyverse) #> Loading tidyverse: ggplot2 #> Loading tidyverse:…
Tiernan
  • 828
  • 8
  • 20
8 votes
3 answers

How to add polygons to your data for a voronoi treemap in R?

I have a data frame that looks like this. It contains the sunflower seed productivity of each country. I want to add next to this data polygon data so I can plot it with ggplot2. I was told to use this site:…
LDT
  • 2,856
  • 2
  • 15
  • 32
8 votes
3 answers

Sophisticated formula inside arrange

I would like to obtain a generic formula to arrange dataframes with a varying number of columns. For example, in this case the dataframe contains "categ_1, categ_2, points_1, points_2": library(tidyverse) set.seed(1) nrows <- 20 df <-…
crestor
  • 1,388
  • 8
  • 21
8 votes
1 answer

Dynamic `case_when` that allows for different number of conditions and conditions itself

I'm looking for a dynamic way to specify some "condition parameters" and then feed that to a case_when operation or something else if better suited for that problem. My goal is to separate the specification of the conditions from the case_when call,…
deschen
  • 10,012
  • 3
  • 27
  • 50