Questions tagged [tidyverse]

ONLY use this tag if your question relates to the installation, integration with your system, or inclusion of the entire tidyverse library. DO NOT USE if your question relates to one or two components of the tidyverse, such as dplyr or ggplot2. Use *those* tags, and tag with `r` as well for a better response.

tidyverse is an R package that installs a number of other packages for data processing and graphics.

Unless your question is about the entirety of the tidyverse package, its installation or its integration with your system, use tags for the packages you are actually using. Using library(tidyverse) is rarely a minimal reproducible example when only library(dplyr) is required.
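As a rough illustration of the point above, a reproducible example that only needs dplyr can load dplyr alone. The data frame and column names below are hypothetical and exist only for this sketch:

```r
# Load only the package the example actually uses, not the whole tidyverse.
library(dplyr)

# Hypothetical toy data, invented for illustration.
df <- data.frame(group = c("a", "a", "b"), value = c(1, 2, 3))

# A grouped sum needs nothing beyond dplyr (which also re-exports %>%).
result <- df %>%
  group_by(group) %>%
  summarise(total = sum(value))

print(result)
```

A question built around an example like this would normally be tagged `dplyr` and `r` rather than `tidyverse`.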

See https://www.tidyverse.org/packages/ for a breakdown of the packages contained in tidyverse and their respective functions.

9739 questions
20
votes
2 answers

Replace NA on numeric columns with mutate_if and replace_na

I would like to replace NAs in numeric columns using some variation of mutate_if and replace_na if possible, but can't figure out the syntax. df <-tibble( first = c("a", NA, "b"), second = c(NA, 2, NA), third = c(10, NA, NA) ) #> # A…
Nettle
  • 3,193
  • 2
  • 22
  • 26
20
votes
6 answers

Remove an element of a list by name

I'm working with a long named list and I'm trying to keep/remove elements that match a certain name, within a tidyverse context, similar to dplyr::select(contains("pattern")) However, I'm having issues figuring it out. library(tidyverse) a_list…
kputschko
  • 766
  • 1
  • 7
  • 21
19
votes
2 answers

Referring to column names inside dplyr's across()

Is it possible to refer to column names in a lambda function inside across()? df <- tibble(age = c(12, 45), sex = c('f', 'f')) allowed_values <- list(age = 18:100, sex = c("f", "m")) df %>% mutate(across(c(age, sex), c(valid = ~…
severin
  • 2,106
  • 1
  • 17
  • 25
19
votes
4 answers

Filling missing dates in a grouped time series - a tidyverse-way?

Given a data.frame that contains a time series and one or more grouping fields. So we have several time series - one for each grouping combination. But some dates are missing. So, what's the easiest (in terms of the most "tidyverse way") of adding…
JerryWho
  • 3,060
  • 6
  • 27
  • 49
19
votes
1 answer

How to rename a column to a variable name "in a tidyverse way"

I've created a simple data frame (dput below): date ticker value ------------------------------ 2016-06-30 A2M.ASX 0.0686 2016-07-29 A2M.ASX -0.0134 2016-08-31 A2M.ASX -0.0650 2016-09-30 A2M.ASX 0.0145 2016-10-31 …
lebelinoz
  • 4,890
  • 10
  • 33
  • 56
18
votes
4 answers

Does a multi-value purrr::pluck exist?

Seems like a basic question and perhaps I'm just missing something obvious ... but is there any way to pluck a sublist (with purrr)? More specifically, here's an initial list: l <- list(a = "foo", b = "bar", c = "baz") And I want to return a new…
mmuurr
  • 1,310
  • 1
  • 11
  • 21
18
votes
8 answers

Removing suffix from column names using rename_all?

I have a data frame with a number of columns in a form var1.mean, var2.mean. I would like to strip the suffix ".mean" from all columns that contain it. I tried using rename_all in conjunction with regex in a pipe but could not come up with a correct…
linda
  • 191
  • 1
  • 1
  • 6
18
votes
1 answer

ggplot 'non-finite values' error

I have an R dataframe (df) that looks like this: blogger; word; n; total joe; dorothy; 17; 718 paul; sheriff; 10; 354 joe; gray; 9; 718 joe; toto; 9; 718 mick; robin; 9; 607 paul; robin; 9; 354 ... I want to use ggplot2 to plot n divided by total…
Simon Lindgren
  • 2,011
  • 12
  • 32
  • 46
17
votes
5 answers

Conditional replacement of column name in tibble using dplyr

I have the following tibble: df <- structure(list(gene_symbol = c("0610005C13Rik", "0610007P14Rik", "0610009B22Rik", "0610009L18Rik", "0610009O20Rik", "0610010B08Rik" ), foo.control.cv = c(1.16204038288333, 0.120508045270669, 0.205712615954009,…
neversaint
  • 60,904
  • 137
  • 310
  • 477
17
votes
6 answers

set missing values for multiple labelled variables

How do I set missing values for multiple labelled vectors in a data frame? I am working with a survey dataset from SPSS. I am dealing with about 20 different variables, with the same missing values. So would like to find a way to use lapply() to…
spindoctor
  • 1,719
  • 1
  • 18
  • 42
16
votes
7 answers

tidyverse: binding list elements of same dimension

Using reduce(bind_cols), list elements of the same dimension may be combined. However, I would like to know how to combine only the elements of the same dimension (perhaps a dimension specified in some way) from a list which may have elements of different…
MYaseen208
  • 22,666
  • 37
  • 165
  • 309
16
votes
3 answers

What are helpful optimizations in R for big data sets?

I built a script that works great with small data sets (<1 M rows) and performs very poorly with large datasets. I've heard of data.table as being more performant than tibbles. I'm interested to know about other speed optimizations in addition to…
Cauder
  • 2,157
  • 4
  • 30
  • 69
16
votes
2 answers

Controlling decimal places displayed in a tibble. Understanding what pillar.sigfig does

I have a csv file weight.csv with the following…
mjandrews
  • 2,392
  • 4
  • 22
  • 39
16
votes
3 answers

Pass multiple functions to purrr::map

I would like to pass multiple functions at once to one purrr::map call, where the functions need some arguments. As pseudo code: funs <- c(median, mean) mtcars %>% purrr::map(funs, na.rm = TRUE) This code does not run, but is intended to show…
Sebastian Sauer
  • 1,555
  • 15
  • 24
16
votes
2 answers

summarise_at using different functions for different variables

When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance: library(tidyverse) df <- tribble( ~category, ~x, ~y, ~z, #---------------------- …
David Pepper
  • 593
  • 1
  • 4
  • 14