Questions tagged [plyr]

plyr is an R package with tools for solving a variety of problems using the split-apply-combine strategy.

plyr is an R package written by Hadley Wickham which contains tools to solve a variety of problems using the strategy of split, apply and combine:

Split a data structure (data frame, list, array) into smaller pieces;
Apply a function to each piece; then
Combine the results into a data structure.

It partially replaces the apply family of functions (lapply, tapply, Map, etc.) in base R, and has itself been partially succeeded by dplyr.
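The three steps above can be sketched as follows. This is a minimal illustration rather than a canonical recipe: it assumes plyr is installed, uses the built-in mtcars data set, and the output column names n and mean_mpg are arbitrary choices made for this example. The base R version is included only for comparison.

```r
# Assumes the plyr package is installed: install.packages("plyr")
library(plyr)

# plyr: split mtcars by cyl, apply a summary to each piece,
# and combine the per-group results into one data frame.
by_cyl <- ddply(mtcars, .(cyl), summarise,
                n        = length(mpg),   # rows in the group
                mean_mpg = mean(mpg))     # group mean of mpg
print(by_cyl)

# The same three steps written out by hand in base R.
pieces  <- split(mtcars, mtcars$cyl)                  # split
results <- lapply(pieces, function(d) {               # apply
  data.frame(cyl = d$cyl[1], n = nrow(d), mean_mpg = mean(d$mpg))
})
by_cyl_base <- do.call(rbind, results)                # combine
print(by_cyl_base)
```

In ddply, the first "d" means the input is a data frame and the second "d" means the output is one too; other plyr functions such as ldply or dlply follow the same input/output naming pattern.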

Other resources

The Split-Apply-Combine Strategy for Data Analysis by Hadley Wickham in the Journal of Statistical Software
Data visualisation in R with ggplot2 and plyr course
Tutorial from useR2009 conference
manipulatr Google Group
Posts on R-bloggers

Related tags

r's dplyr and data.table packages

2465 questions

8 answers

quick/elegant way to construct mean/variance summary table

I can achieve this task, but I feel like there must be a "best" (slickest, most compact, clearest-code, fastest?) way of doing it and have not figured it out so far ... For a specified set of categorical factors I want to construct a table of means…

r aggregate plyr reshape2

asked Sep 16 '11 at 18:58

Ben Bolker

1 answer

round_any equivalent for dplyr?

I am trying to switch to the "new" tidyverse ecosystem and avoid loading the old packages from Wickham et al. that I previously relied on in my code. I found the round_any function from plyr useful in many cases where I needed custom rounding…

r dplyr rounding plyr tidyverse

asked Apr 26 '17 at 07:29

Mikko

4 answers

dplyr: apply function table() to each column of a data.frame

I often apply the table() function to each column of a data frame using plyr, like this: library(plyr) ldply(mtcars, function(x) data.frame(table(x), prop.table(table(x)))) Is…

r plyr dplyr

asked Dec 26 '14 at 17:09

Rasmus Larsen

3 answers

Learning to understand plyr, ddply

I've been attempting to understand what plyr does and how it works by trying different variables and functions and seeing what results. So I'm looking more for an explanation of how plyr works than for specific fix-it answers. I've read the documentation…

r plyr

asked Jul 06 '12 at 22:11

rsgmon

2 answers

ddply + summarize for repeating same statistical function across large number of columns

Ok, second R question in quick succession. My data:

            Timestamp  St_01  St_02 ...
1 2008-02-08 00:00:00 26.020 25.840 ...
2 2008-02-08 00:10:00 25.985 25.790 ...
3 2008-02-08 00:20:00 25.930 25.765 ...
4 2008-02-08 00:30:00 25.925…

r multiple-columns plyr idioms split-apply-combine

asked May 28 '12 at 16:19

Reuben L.

4 answers

Simple working example of ddply() in parallel on Windows

I've been searching around for a simple working example of using ddply() in parallel. I've installed the "foreach" package, but when I call ddply(.parallel = TRUE) I get a warning that "No parallel backend registered". Can someone provide a simple…

r foreach plyr

asked Jul 21 '11 at 17:21

Suraj

3 answers

Using plyr::mapvalues with dplyr

plyr::mapvalues can be used like this: mapvalues(mtcars$cyl, c(4, 6, 8), c("a", "b", "c")) But this doesn't work: mtcars %>% dplyr::select(cyl) %>% mapvalues(c(4, 6, 8), c("a", "b", "c")) %>% as.data.frame() How can I use plyr::mapvalues with…

r dataframe plyr dplyr

asked Jan 18 '15 at 19:09

luciano

7 answers

plyr or dplyr in Python

This is more of a conceptual question; I do not have a specific problem. I am learning Python for data analysis, but I am very familiar with R. One of the great things about R is plyr (and of course ggplot2) and, even better, dplyr. Pandas of course…

python r pandas plyr dplyr

asked Nov 12 '14 at 02:55

user1617979

2 answers

Convert R list to dataframe with missing/NULL elements

Given a list: alist = list( list(name="Foo",age=22), list(name="Bar"), list(name="Baz",age=NULL) ) what's the best way to convert this into a dataframe with name and age columns, with missing values (I'll accept NA or "" in that order of…

r list dataframe plyr

asked Apr 03 '13 at 17:10

Spacedman

2 answers

R: converting each row of a data frame into a list item

I have a number of operations on data frames which I would like to speed up using mclapply() or other lapply()-like functions. One of the easiest ways for me to wrestle with this is to make each row of the data frame a small data frame in a list. I…

r parallel-processing multicore dataframe plyr

asked Feb 24 '11 at 21:32

JD Long

5 answers

Joining aggregated values back to the original data frame

One of the design patterns I use over and over is performing a "group by" or "split, apply, combine (SAC)" on a data frame and then joining the aggregated data back to the original data. This is useful, for example, when calculating each county's…

r plyr

asked Feb 17 '11 at 15:40

JD Long

2 answers

dplyr rename - Error: `new_name` = old_name must be a symbol or a string, not formula

I am trying to rename a column with dplyr::rename() and R is returning this error that I am unable to find anywhere online. Error: `new_name` = old_name must be a symbol or a string, not formula Reproducible example with a 2 column data…

r dplyr rename plyr rlang

asked Dec 11 '17 at 14:53

Rassakhatsky Dmitry

3 answers

Idiomatic R code for partitioning a vector by an index and performing an operation on that partition

I'm trying to find the idiomatic way in R to partition a numerical vector by some index vector, find the sum of all numbers in that partition and then divide each individual entry by that partition sum. In other words, if I start with this: df <-…

r functional-programming plyr

asked May 25 '12 at 03:51

John Horton

3 answers

Sending in Column Name to ddply from Function

I'd like to be able to send in a column name to a call that I am making to ddply. An example ddply call: ddply(myData, .(MyGrouping), summarise, count=sum(myColumnName)) If I have ddply wrapped within another function is it possible to wrap this so…

r plyr

asked Apr 16 '12 at 16:43

Dave

2 answers

Correlation between two dataframes by row

I have 2 data frames w/ 5 columns and 100 rows each.

id price1 price2 price3 price4 price5
 1  11.22  25.33  66.47  53.76  77.42
 2  33.56  33.77  44.77  34.55  57.42
...

I…

r dataframe correlation plyr

asked Feb 03 '12 at 22:02

screechOwl
