Questions tagged [plyr]

plyr is an R package with tools to solve a variety of problems using the split-apply-combine strategy

plyr is an R package written by Hadley Wickham. It provides tools for solving a variety of problems with the split-apply-combine strategy:

  • Split a data structure (data frame, list, array) into smaller pieces;
  • Apply a function to each piece; then
  • Combine the results into a data structure.

It partially replaces the apply family of functions (lapply, tapply, Map, etc.) in base R, and is partially succeeded by dplyr.
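As a rough illustration of the split-apply-combine steps described above, here is a minimal sketch. It assumes the plyr package is installed and uses R's built-in `mtcars` dataset; the dataset, the grouping column, and the variable names are choices made for this example, not part of the tag description. The second half spells out the same computation in base R so the three steps are visible separately.

```r
library(plyr)

# plyr: split mtcars by cylinder count, summarise each piece,
# and combine the per-group results into one data frame.
mpg_by_cyl <- ddply(mtcars, "cyl", summarise,
                    n = length(mpg),
                    mean_mpg = mean(mpg))

# Base R equivalent, one step at a time.
pieces <- split(mtcars, mtcars$cyl)                  # split
results <- lapply(pieces, function(d) {              # apply
  data.frame(cyl = d$cyl[1], n = nrow(d), mean_mpg = mean(d$mpg))
})
base_out <- do.call(rbind, results)                  # combine

print(mpg_by_cyl)
```

In plyr's naming scheme the first two letters of a function give the input and output types, so `ddply` takes a data frame and returns a data frame, while `dlply` would return a list of per-group results.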

2465 questions

1 vote, 2 answers

Manipulating data frames with Date format columns - R

For a data frame populated from a SQL query which looks like this: Company Month Total_Count ABC 2012-03 10 ABC 2009-01 1 DEF 2011-01 29 GHI 2001-09 10 GHI …
name_masked

1 vote, 1 answer

Extract intervals inside groups in a dataframe, using information from another dataframe

As I said in the title, my goal is to extract intervals from subsets of my dataframe using information from another dataframe. My input: df1: subject x y 7G001-0024-10 0,00 15 7G001-0024-10 97,29 18 7G001-0024-10 197,34 …
mat

1 vote, 3 answers

plyr ddply and summarise use in R

Hi, I want to avoid loops, so I would like to use something from plyr to solve my problem. I would like to create a function that returns the sum of a chosen column for each factor level in a data frame. So if we have the following example…
h.l.m

1 vote, 4 answers

Counting instances within a group (subset)

I made a small example of my data: mth <- c(rep(1,10)) day <- c(rep(10,5),rep(11,5)) hr <- c(3,4,5,6,7,3,4,5,6,7) v <- c(3,4,5,4,3,3,4,5,4,3) A <- data.frame(cbind(mth,day,hr,v)) What I need to do is to get how many values < 4 on a daily…
Rosa

1 vote, 2 answers

Combining frequencies and summary statistics in one table?

I just discovered the power of plyr frequency table with several variables in R and I am still struggling to understand how it works. I hope someone here can help me. I would like to create a table (data frame) in which I can combine frequencies…
user1043144

1 vote, 2 answers

How to use ldply over the permutations of several vectors?

I'm building an R script that's intended to query a database multiple times (once for every permutation of the elements of 3 vectors), but I'm having a hard time figuring out how to use ldply to achieve this. tags <- c("tag1","tag2","tag3") times…
Tommy O'Dell

1 vote, 2 answers

DDPLY Grouping Error

I'm running a ddply function and keep getting an error. Structure of data.frame: str(visits.by.user) 'data.frame': 80317 obs. of 5 variables: $ ClientID : Factor w/ 147792 levels "50912733","50098716",..: 1 3 4 5 6 7 8 10 11 12 ... $…
mikebmassey

1 vote, 1 answer

Lagged variables in panels with ddply

I'm trying to compute the change in precision (based on estimated confidence intervals) in what is in essence a panel data set. As a simple example, here's the function I've written, applied to a nonsensical example: precision.gain <-…
slackline

1 vote, 2 answers

Split data frame, apply function, and return results in a nested list

My question's title almost matches the dlply (plyr package) description, except for the "nested" part. Let me explain with an example: library(plyr) res <- dlply(mtcars, c("gear", "carb"), identity) head(res, 2) # $`3.1` # mpg cyl …
flodel

1 vote, 1 answer

Using a BY variable in coxph( ) or survreg( )

I've got the output of some simulations that look something like this: Run,ID,Time,Var1,Outcome 1,1,6,0.5,1 1,2,4,0.25,1 1,3,2,0.9,1 2,1,5,0.07,1 ... 10,3,9,0.08,1 Basically a series of M studies of N individuals (in actuality M = 1000 and N =…
Fomite

1 vote, 1 answer

plyr equivalent of statement done using mapply

I'm guessing mlply should be used here for the equivalent of what I'm doing in mapply, but I'm not able to figure out how. I really would like to understand the plyr package better. df <-…
JimmyT

1 vote, 1 answer

Spline on multiple factors in data frame

This question arises in a context where I have a lot of Model types, each of the same class, but the amount of data for each Model is small and I want to spline to get a fuller dataset. I'm hoping to find a way to do this without having to individually…
rsgmon

1 vote, 1 answer

Transformations of sparse dataframe subsets

I often find myself needing to apply a small number of rule-based transformations to dataframes based on certain conditions, typically a fixed number of fields having certain values. The transformations can modify any number of columns, usually one…
Sim

1 vote, 1 answer

Why is there no progress bar in dlply (in the R plyr package)?

I am using the plyr package to process lists and data frames. I have noticed the following behaviour: Example 1 - list_2 <- llply(list_1, function_1, .progress='text') this works as expected. It generates list_2 from list_1 with function_1 applied…
John

1 vote, 1 answer

r - dlply with smooth.spline

dlply() gives me an error: "Object '...' not found" when I try the smooth.spline() function with it. The example below creates some data and shows how "lm" will work but "smooth.spline" won't. Note that I am doing some arithmetic in the function…
lambu0815