Questions tagged [tidyverse]

ONLY use this tag if your question relates to the installation, integration with your system, or inclusion of the entire tidyverse library. DO NOT USE if your question relates to one or two components of the tidyverse, such as dplyr or ggplot2. Use *those* tags, and tag with `r` as well for a better response.

tidyverse is an R meta-package: installing it installs a number of other packages for data processing and graphics, and library(tidyverse) attaches the core ones.

Unless your question is about the entirety of the tidyverse package, its installation or its integration with your system, use tags for the packages you are actually using. Using library(tidyverse) is rarely a minimal reproducible example when only library(dplyr) is required.
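As a rough illustration of the point about minimal examples, the sketch below attaches only dplyr rather than the whole tidyverse. The built-in mtcars data set and the names by_cyl and mean_mpg are illustrative choices, not taken from any particular question.

```r
# Attach only the package the example actually needs,
# instead of library(tidyverse)
library(dplyr)

# Mean miles per gallon for each cylinder count in the built-in mtcars data
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

print(by_cyl)
```

A question built around an example like this would be tagged with dplyr and r rather than tidyverse.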

See https://www.tidyverse.org/packages/ for a breakdown of the packages contained in tidyverse and their respective functions.

9739 questions
10
votes
2 answers

Dangers of mixing [tidyverse] and [data.table] syntax in R?

I'm getting some very weird behavior from mixing tidyverse and data.table syntax. For context, I often find myself using tidyverse syntax, and then adding a pipe back to data.table when I need speed vs. when I need code readability. I know Hadley's…
Daycent
10
votes
1 answer

Adding an additional legend manually for different data.frames used in the same ggplot

In my plot below, I have two separate sources of data (dat and dat2) used in two different geom_smooth() calls producing the black and the red regression lines (see pic below). Is it possible to manually add another legend that shows the black line…
rnorouzian
10
votes
2 answers

R: Which is the optimal way to compute functions over time with 3D arrays (latitude, longitude, and time)?

I work a lot with large 3D arrays (latitude, longitude, and time), for example of size 720x1440x480. Usually, I need to perform operations over time for each latitude and longitude, for example, getting the average (resulting in a 2D array) or…
Santiago I. Hurtado
10
votes
3 answers

pivot_wider when there's no value column

I'm trying to reshape a dataset from long to wide. The following code works, but I'm curious if there's a way not to provide a value column and still use pivot_wider. In the following example, I have to create a temporary column "val" to use…
qnp1521
10
votes
3 answers

What's a tidyverse approach to iterating over rows in a data frame when vectorisation is not feasible?

I want to know the best way to iterate over rows of a data frame when the value of a variable at row n depends on the value of variable(s) at row n-1 and/or n-2. Ideally I would like to do this in a "tidyverse" way, perhaps with purrr::pmap(). For…
Matt Cowgill
10
votes
2 answers

tidy eval vs base or get() vs sym() vs as.symbol()

I have been trying to understand tidy eval, or how to use variables within the tidyverse, for a while, but I never seem to fully grasp it. For example, I am trying to use ggplot with variable mappings. This would be the base R version: library(ggplot2) var1…
burger
10
votes
1 answer

What distinguishes dplyr::pull from purrr::pluck and magrittr::extract2?

In the past, when working with a data frame and wanting to get a single column as a vector, I would use magrittr::extract2() like this: mtcars %>% mutate(wt_to_hp = wt/hp) %>% extract2('wt_to_hp') But I've seen that dplyr::pull() and…
crazybilly
10
votes
2 answers

Count data divided by year and by region in R

I have a very large (too big to open in Excel) biological dataset that looks something like this year <- c(1990, 1980, 1985, 1980, 1990, 1990, 1980, 1985, 1985,1990, 1980, 1985, 1980, 1990, 1990, 1980, 1985, 1985, …
colebrookson
10
votes
6 answers

Save a data frame with list-columns as csv file

I have the following data frame, in which 3 of the columns are list-columns. It looks like this: A tibble: 14 x 4 clinic_name drop_in_hours appointment_hours services …
Ann
10
votes
3 answers

How to loop over a tidy eval function using purrr?

I have the following data set (sample): train <- data.frame(ps_ind_06_bin = c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE), ps_ind_07_bin = c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE), ps_ind_08_bin = c(TRUE,…
Ramiro Bentes
10
votes
3 answers

How to feed a list of unquoted column names into `lapply` (so that I can use it with a `dplyr` function)

I am trying to write a function in tidyverse/dplyr that I want to eventually use with lapply (or map). (I had been working on it to answer this question, but came upon an interesting result/dead-end. Please don't mark this as a duplicate - this…
leerssej
10
votes
2 answers

Opposite of unnest_tokens

This is most likely a stupid question, but I've googled and googled and can't find a solution. I think it's because I don't know the right way to word my question to search. I have a data frame that I have converted to tidy text format in R to get…
Kate
10
votes
3 answers

Include "All other functions" in a pkgdown reference yaml

I have a pkgdown site in which I group a number of functions into categories in the reference .yml file. I'm wondering if there is a way to put all of the functions which I didn't explicitly categorize into their own category. The only thought I had…
Shorpy
10
votes
2 answers

Difference between dplyr::rename and dplyr::rename_all

I have reviewed the documentation for dplyr multiple times and it indicates that dplyr::rename_all is a "scoped" variant of dplyr::rename. Can someone explain what this entails with regard to syntax and functionality? Why use one versus the other?…
socialscientist
10
votes
6 answers

How to convert list of list into a tibble (dataframe)

I have the following list of lists. It contains two variables: pair and genes. The content of pair is always a vector with two strings, and genes is a vector which can contain more than one value. lol <- list(structure(list(pair =…
littleworth