Questions tagged [plyr]

plyr is an R package with tools to solve a variety of problems using the split-apply-combine strategy.

plyr is an R package written by Hadley Wickham. It contains tools for solving a variety of problems with the split-apply-combine strategy:

Split a data structure (data frame, list, array) into smaller pieces;
Apply a function to each piece; then
Combine the results into a data structure.
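
The three steps above can be sketched with a single ddply call. This is only a minimal illustration: the data frame `scores` and its columns `group` and `value` are made up for the example and do not come from any question on this page.

```r
library(plyr)

# Hypothetical toy data: a grouping column and a numeric column
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split `scores` by `group`, apply summarise() to each piece,
# and combine the per-group results into one data frame
group_means <- ddply(scores, "group", summarise, mean_value = mean(value))

print(group_means)
```

Here group a averages 2 and group b averages 4, so the result has one row per group. The first letter of ddply names the input type (d for data frame) and the second names the output type.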

It partially replaces the apply family of functions (lapply, tapply, Map, etc.) in base R, and has itself been partially succeeded by dplyr.
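
For comparison, here is a rough sketch of the same split-apply-combine pattern written with base R only (split, lapply, do.call). The data frame `scores` and the helper names are hypothetical, chosen just for illustration; this is one possible way to spell out the steps, not the only one.

```r
# Hypothetical toy data: a grouping column and a numeric column
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split: one data frame per level of `group`
pieces <- split(scores, scores$group)

# Apply: summarise each piece as a one-row data frame
summaries <- lapply(pieces, function(piece) {
  data.frame(group = piece$group[1], mean_value = mean(piece$value))
})

# Combine: bind the one-row data frames back together
base_means <- do.call(rbind, summaries)
rownames(base_means) <- NULL

print(base_means)
```

plyr functions such as ddply wrap these three explicit steps in a single call.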

Other resources

The Split-Apply-Combine Strategy for Data Analysis by Hadley Wickham in the Journal of Statistical Software
Data visualisation in R with ggplot2 and plyr course
Tutorial from useR2009 conference
manipulatr Google Group
Posts on R-bloggers

Related tags

R's dplyr and data.table packages

2465 questions

2 answers

How can I speed up this sapply for cross checking samples?

I'm trying to speed up a QC function for checking similarity between samples. I wanted to know if there is a faster way to compare the way I am doing below? I know there have been answers to this kind of question that are pretty definitive (on SO…

r plyr

asked Jul 03 '13 at 21:39

cylondude

1 answer

Merging files (and file names) in R

I'm trying to merge a directory full of comma delimited text files using R, while also incorporating the file name of each file as a new variable in the data set. I've been using the following: library(plyr) file_list <- list.files() dataset <-…

r file plyr

asked Jun 28 '13 at 16:28

Vadaar

1 answer

daply: Correct results, but confusing structure

I have a data.frame mydf, that contains data from 27 subjects. There are two predictors, congruent (2 levels) and offset (5 levels), so overall there are 10 conditions. Each of the 27 subjects was tested 20 times under each condition, resulting in a…

r plyr

asked Jun 17 '13 at 23:11

vincentqu

2 answers

ddply multiple function arguments + naming

Browsing other questions I have almost solved my problem but failing at the last hurdle... using R I have a dataframe (d) of which I pass through a function (fd) with ddply from the plyr package, this returns a dataframe as expected. In my actual…

r function plyr

asked Jun 09 '13 at 17:25

Salmo salar

1 answer

Colwise eats column names within ddply

I'm trying to chunk through a data frame, find instances where the sub-data frames are unbalanced, and add 0 values for certain levels of a factor that are missing. To do this, within ddply, I did a quick comparison to a set vector of what levels…

r plyr

asked May 31 '13 at 01:11

jebyrnes

1 answer

Sequentially numbering repetitive interactions in R

I have a data frame in R that has been previously sorted with data that looks like the following: id creatorid responderid 1 1 2 2 1 2 3 1 3 4 1 3 5 1 3 …

r plyr

asked May 30 '13 at 16:49

Pridkett

1 answer

finding the last reading from a data.frame for ggplot2 using R

I'm trying to plot the price of vehicles over time. I'd like to include the reg. no of the vehicle as a marker for a sparkline. My data looks like this: > head (x[c(1,2,3,4)]) samp.date idx price reg.date 1 2012-11-15 xxxxxxb 27490 …

r plyr

asked May 29 '13 at 20:05

user676952

2 answers

Read multiple files and save data into one dataframe in R

I am trying to read multiple files and then combine them into one data frame. The code that I am using is as follows: library(plyr) mydata = ldply(list.files(path="Data load for stations/data/Predicted",pattern = "txt"), function(filename) { dum =…

r dataframe plyr

asked May 28 '13 at 05:46

Jd Baba

1 answer

Use plyr to summarize a data.frame and get counts of each unique item

I have a data.frame with task assignments from a ticket tracking system. Assignments <- data.frame('Task'=c(1, 1, 2, 3, 2, 2, 1), 'Assignee'=c('Alice', 'Bob', 'Alice', 'Alice', 'Bob', 'Chuck', 'Alice')) I need to summarize the data for some monthly…

r grouping plyr run-length-encoding

asked May 23 '13 at 14:27

Keith Twombley

3 answers

Finding proportions based on data.frame subsets

I have a set of counts from data with three dimensions: df <- data.frame(type = c("A", "B", "B", "A", "A", "C", "B", "C"), group = c("Tp", "Tp", "Tp", "Tp", "Fc", "Fc", "Fc", "Fc"), size = c(10,20,30,40,10,20,30,40), count = c(1, 4, 2, 3, 2, 10, 2,…

r dataframe plyr apply

asked May 16 '13 at 13:19

MattLBeck

1 answer

Converting ddply syntax into data.table

I have a 1.3 million row data frame which I need to aggregate into regional and temporal summaries. Plyr's syntax is straightforward, but it's just much too slow to be practical (I've left ddply to run for an hour, and it's completed less than 25%).…

r data.table plyr

asked May 11 '13 at 06:52

tomw

1 answer

Must ddply use all possible combinations of the splitting variable(s), or only observed?

I have a data frame called thetas containing about 2.7 million observations. > str(thetas) 'data.frame': 2700000 obs. of 8 variables: $ rho_cnd : num 0 0 0 0 0 0 0 0 0 0 ... $ pct_cnd : num 0 0 0 0 0 0 0 0 0 0 ... $ sx : num 1 2…

r plyr cardinality

asked May 03 '13 at 16:42

Jon

2 answers

R how to transform part of list into a data.frame?

Suppose I have a dataset as list object. Here is a way to quickly generate some random data: a <- list(x1=rnorm(10),x2=rnorm(10)) b <- list(y1=rnorm(10),y2=rnorm(10),y3=rnorm(10)) c <- list(x1=rnorm(10),x2=rnorm(10)) d <-…

r plyr data-manipulation

asked Apr 29 '13 at 21:35

Boxuan

1 answer

Different results when using ddply and summarize. Due to different R and plyr versions?

I'm looking to summarize data similar to the ToothGrowth data in the datasets package. The output I want looks like this: supp len half one two 1 OJ 619.9 132.3 227.0 260.6 2 VC 508.9 79.8 167.7 261.4 That is the sum of lengths split…

r plyr

asked Apr 26 '13 at 14:54

BuckyOH

1 answer

Change data.frame in *_ply function

Let's say I have a <- data.frame( z = rep( c("A", "B", "C"), 2 ), p = 1:6, stringsAsFactors=FALSE ) b <- data.frame( z = c( rep( "A", 5), rep( "B", 5 ) ), q = 1:10, stringsAsFactors=FALSE ) and want to manipulate a while iterating over b using…

r plyr

asked Apr 25 '13 at 16:17

Beasterfield
