Questions tagged [plyr]

plyr is an R package with tools to solve a variety of problems using the split-apply-combine strategy

plyr is an R package written by Hadley Wickham that provides tools for solving a variety of problems with the split-apply-combine strategy:

Split a data structure (data frame, list, array) into smaller pieces;
Apply a function to each piece; then
Combine the results into a data structure.

It partially replaces the apply family of functions (lapply, tapply, Map, etc.) in base R, and has itself been partially succeeded by dplyr.
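
The three steps above can be sketched in base R and then compared with a single plyr call. This is only an illustrative sketch: the `scores` data frame and its column names are made up for the example, and the plyr call is guarded so the snippet still runs when the package is not installed.

```r
# Toy data (hypothetical, for illustration only)
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# 1. Split the data frame into one piece per group
pieces <- split(scores, scores$group)

# 2. Apply a function to each piece
summaries <- lapply(pieces, function(piece) {
  data.frame(group = piece$group[1], mean_value = mean(piece$value))
})

# 3. Combine the per-piece results into one data frame
base_result <- do.call(rbind, summaries)
rownames(base_result) <- NULL

# The same computation as one plyr call, if the package is available
if (requireNamespace("plyr", quietly = TRUE)) {
  plyr_result <- plyr::ddply(scores, "group", plyr::summarise,
                             mean_value = mean(value))
}
```

Both `base_result` and, when plyr is installed, `plyr_result` should hold one row per group with the group mean (2 for "a" and 4 for "b" in this toy data). The plyr version folds the split, apply and combine steps into the single `ddply` call.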

Other resources

The Split-Apply-Combine Strategy for Data Analysis by Hadley Wickham in the Journal of Statistical Software
Data visualisation in R with ggplot2 and plyr course
Tutorial from useR2009 conference
manipulatr Google Group
Posts on R-bloggers

Related tags

R's dplyr and data.table packages

2465 questions

2 answers

R Create One Hot Vector From List Elements

I am trying to process some character strings for an input file. First I convert the strings from a vector to a list, then I reduce to only unique values. Next I would like to convert the words in each list element into a string with a separator of…

r plyr

asked Mar 04 '13 at 18:45

screechOwl

1 answer

Optimising by Group of own function in r

I would like to apply an optimization by group on my own function. Here is a reproducible data set: data <- data.frame(ID=c(1,1,1,2,2,3,3),C=c(1,1,1,2,2,3,4), Lambda=c(0.5),s=c(1:7), …

r optimization plyr

asked Mar 01 '13 at 23:20

New2R

2 answers

Passing a character vector as arguments to a function in plyr

I suspect I'm Doing It Wrong, but I'd like to pass a character vector as an argument to a function in ddply. There's a lot of Q&A on removing quotes, etc., but none of it seems to work for me (e.g. Remove quotes from a character vector in R and…

r function vector plyr argument-passing

asked Feb 27 '13 at 04:09

Ben

2 answers

Ddply and summary of categorical variables

I have a dataframe x like this: Id Group Var1 001 A yes 002 A no 003 A yes 004 B no 005 B yes 006 C no. I want to create a data frame like this: Group yes no A 2 1 B 1 1 C …

r aggregate plyr

asked Feb 17 '13 at 15:38

corrado

2 answers

How to calculate percentage change from different rows over different spans

I am trying to calculate the percentage change in price for quarterly data of companies identified by a gvkey (1001, 1384, etc.) and its corresponding quarterly stock price, PRCCQ. gvkey PRCCQ 1 1004 23.750 2 1004 13.875 3 1004…

r statistics plyr quantmod

asked Feb 15 '13 at 18:16

user2076502

2 answers

Sum duplicates then remove all but first occurrence

I have a data frame (~5000 rows, 6 columns) that contains some duplicate values for an id variable. I have another continuous variable x, whose values I would like to sum for each duplicate id. The observations are time dependent, there are year and…

r plyr

asked Feb 08 '13 at 01:00

Chris

2 answers

Summary data tables from wide data.frames

I am trying to find lazy/easy ways of creating summary tables/data.frames from wide data.frames. Assume the following data.frame, but with many more columns so that specifying the column names takes a long time: set.seed(2) x <- data.frame(Rep =…

r dataframe plyr summary

asked Feb 07 '13 at 10:47

Mikko

1 answer

Scaling / mean center / demean variable in sqldf / SQLite?

I am trying to mean center (aka demean, scale) a variable by 3 dimensions: year, month, and region using the sqldf package in R. Here is exactly what I want to do using the plyr package: ## create example data set.seed(145) v =…

sql r sqlite plyr sqldf

asked Feb 04 '13 at 19:27

baha-kev

2 answers

Use ddply() to aggregate relative histogram counts

Related to a previous question I asked (ggplot2 how to get 2 histograms with the y value = to count of one / sum of the count of both), I tried to write a function which would take a data.frame as input with the response times (RT) and accuracy…

r ggplot2 histogram plyr reshape2

asked Jan 31 '13 at 10:10

shora

2 answers

Split input of apply function using a continuous classifier

I have the example data frame test.df<-data.frame(classifier=runif(n=1000), x1=rnorm(1000), x2=rnorm(1000), x3=rnorm(1000)) with x1,x2,...,x10000. I would like to use the apply function to perform a large number of tests (let's say t.test) and…

r plyr apply reshape

asked Jan 31 '13 at 09:52

ECII

1 answer

Summarize dataframe by day from timestamp

I have a dataset data that contains a timestamp and a suite of other variables with values at each timestamp. I am trying to use ddply within plyr to create a new dataframe that is the summary (e.g. mean) of a variable by the group day. How can I…

r timestamp dataframe plyr

asked Jan 30 '13 at 23:33

nofunsally

1 answer

d_ply and dist() together

I'm having trouble with some R code that I wrote. In particular, it looks like this: n<- nrow(aa) for (i in 1:n) { A<- aa[i,] d_ply(A, 1, function(row){ cu<- dist(A) write.table(cu, file = paste(row$header, "txt", sep = "."), sep = "\t") },…

r distance plyr correlation

asked Jan 30 '13 at 16:28

Gabelins

1 answer

summarise() - calculating percentages and counts of factor

I'm trying to use summarise() from the plyr package to calculate percentages of occurrences of each level in a factor. EDIT: The Puromycin data is in the base R installation. My data look like this: library(plyr) data.p <-…

r dataframe plyr summary

asked Jan 23 '13 at 08:37

Rene Bern

2 answers

Use plyr to compute margins

I have a data frame with something like the following structure: Trial Index Condition1 Condition2 Measures 1 A Y ... 2 A Y ... 3 B Y…

r plyr

asked Jan 09 '13 at 17:00

Nathan

1 answer

count shared occurrences and remove duplicates

I have this data.frame: df <- read.table(text= " section to from time a 1 5 9 a 2 5 9 a 1 5 10 …

r plyr

asked Dec 29 '12 at 11:59

user1317221_G
