I've been trying to understand how plyr works by experimenting with different variables and functions and seeing what results. So I'm looking for an explanation of how plyr works rather than specific fix-it answers. I've read the documentation, but my newbie brain is still not getting it.
Some data and names:
library(plyr)
# Small example data set: three models, two rows each
mydf <- data.frame(
  Model  = c("a", "a", "b", "b", "c", "c"),
  Class  = c("e", "e", "e", "e", "e", "e"),
  Length = c(1, 2, 3, 10, 20, 30),
  Speed  = c(5, 10, 20, 20, 15, 10)
)
mydf
Question 1: Summarise versus Transform Syntax
If I enter: ddply(mydf, .(Model), summarise, sum = Length+Length)
I get:
Model ..1
1 a 2
2 a 4
3 b 6
4 b 20
5 c 40
6 c 60
And if I enter: ddply(mydf, .(Model), summarise, Length+Length)
I get the same result.
Now if I use transform: ddply(mydf, .(Model), transform, sum = (Length+Length))
I get:
Model Class Length Speed sum
1 a e 1 5 2
2 a e 2 10 4
3 b e 3 20 6
4 b e 10 20 20
5 c e 20 15 40
6 c e 30 10 60
But if I state it like the second summarise call, without a name:
ddply(mydf, .(Model), transform, (Length+Length))
I get:
Model Class Length Speed
1 a e 1 5
2 a e 2 10
3 b e 3 20
4 b e 10 20
5 c e 20 15
6 c e 30 10
So why does adding "sum =" make a difference with transform but not with summarise?
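For reference, the transform half of this can be reproduced without plyr. The sketch below is only a check under assumptions, not a statement about plyr's internals: it assumes ddply hands each piece to base R's transform() as-is, and `piece` is a hypothetical stand-in for the rows of Model "a".

```r
# Hypothetical stand-in for one piece that ddply would pass to transform()
piece <- data.frame(Length = c(1, 2), Speed = c(5, 10))

# Named argument: the result is stored as a new column called "sum"
named <- transform(piece, sum = Length + Length)

# Unnamed argument: the expression is evaluated, but there is no name to
# store it under, so base transform() returns the data unchanged
unnamed <- transform(piece, Length + Length)

print(names(named))
print(names(unnamed))
```

summarise, by contrast, appears to build a new data frame out of the expressions it is given, which would explain why an unnamed expression still shows up there under a generated column name.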
Question 2: Why don't these work?
ddply(mydf, .(Model), sum, Length+Length)
# Error in function (i) : object 'Length' not found
ddply(mydf, .(Model), length, mydf$Length)
# Error in .fun(piece, ...) : 2 arguments passed to 'length' which requires 1
These examples are mainly meant to show that I'm misunderstanding something fundamental about how to use plyr.
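The second error message, `.fun(piece, ...)`, suggests that ddply calls the function with each sub-data-frame as its first argument and passes anything extra along unevaluated by the data. Below is a minimal base R sketch of that split-apply-combine idea, assuming that reading is right. The names `pieces`, `per_piece` and `total` are made up for illustration, and the plyr call at the end only runs if the package is installed.

```r
mydf <- data.frame(
  Model  = c("a", "a", "b", "b", "c", "c"),
  Length = c(1, 2, 3, 10, 20, 30)
)

# Split: one data frame per Model
pieces <- split(mydf, mydf$Model)

# Apply: the function receives a whole data frame, so columns have to be
# pulled out of it explicitly (sum(piece, Length + Length) cannot see Length)
per_piece <- lapply(pieces, function(piece) {
  data.frame(Model = piece$Model[1], total = sum(piece$Length))
})

# Combine: stack the per-piece results back into one data frame
result <- do.call(rbind, per_piece)
print(result)

# The same idea with plyr, if it is available; "Model" is used here in
# place of .(Model) so the call works without attaching the package
if (requireNamespace("plyr", quietly = TRUE)) {
  via_plyr <- plyr::ddply(mydf, "Model", function(piece) {
    data.frame(total = sum(piece$Length))
  })
  print(via_plyr)
}
```

Under this reading, summarise and transform are just functions that take a data frame as their first argument and evaluate the remaining expressions inside it, which is why column names like Length work with them but not with sum or length.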
Any answers or explanations are appreciated.