I'm trying to split a data frame by date and calculate the mean of the size column for each subset.

http://i.gyazo.com/27df6d87ca9222c7c982983661b770f9.png

For this purpose I'm using the function ddply from the plyr package:

library('plyr')
ddply(df, .(date), summarize, mean = mean(size))

but I'm getting the following error:

Error: 'names' attribute [11] must be the same length as the vector [1]

Can anyone help me?

Thanks!

Edit: yes, sorry! I'm quite new to R.

               date                 host petition                                        resource protocol result size      date2
1995-07-01 06:00:01         199.72.81.55      GET                                /history/apollo/ HTTP/1.0    200 6245 1995-07-01
1995-07-01 06:00:06 unicomp6.unicomp.net      GET                             /shuttle/countdown/ HTTP/1.0    200 3985 1995-07-01
1995-07-01 06:00:09       199.120.110.21      GET    /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0    200 4085 1995-07-01
1995-07-01 06:00:11   burger.letters.com      GET                 /shuttle/countdown/liftoff.html HTTP/1.0    304    0 1995-07-01  
1995-07-01 06:00:11       199.120.110.21      GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0    200 4179 1995-07-01
1995-07-01 06:00:12   burger.letters.com      GET                      /images/NASA-logosmall.gif HTTP/1.0    304    0 1995-07-01

df <- read.table(header = TRUE, text = "date                 host petition                                        resource protocol result size      date2
'1995-07-01 06:00:01'         199.72.81.55      GET                                /history/apollo/ HTTP/1.0    200 6245 1995-07-01
'1995-07-01 06:00:06' unicomp6.unicomp.net      GET                             /shuttle/countdown/ HTTP/1.0    200 3985 1995-07-01
'1995-07-01 06:00:09'       199.120.110.21      GET    /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0    200 4085 1995-07-01
'1995-07-01 06:00:11'   burger.letters.com      GET                 /shuttle/countdown/liftoff.html HTTP/1.0    304    0 1995-07-01  
'1995-07-01 06:00:11'       199.120.110.21      GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0    200 4179 1995-07-01
'1995-07-01 06:00:12'   burger.letters.com      GET                      /images/NASA-logosmall.gif HTTP/1.0    304    0 1995-07-01",
                 colClasses = c('POSIXct','character','character','character',
                                'character','numeric','numeric','Date'))
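
As a cross-check that does not depend on plyr, the per-day mean for the six sample rows can be sketched with base R's `aggregate`. This is only an illustration: the toy data frame `df_check` below is a hypothetical name, its `size` and `date2` values are re-typed by hand from the sample, and `aggregate` is a swapped-in alternative rather than the `ddply` call from the question.

```r
# Toy copy of the sample rows (values re-typed from the sample above;
# df_check is a name made up for this sketch).
df_check <- data.frame(
  size  = c(6245, 3985, 4085, 0, 4179, 0),
  date2 = as.Date(rep("1995-07-01", 6))
)

# Mean of size for each distinct day in date2.
daily_mean <- aggregate(size ~ date2, data = df_check, FUN = mean)
print(daily_mean)
```

All six sample rows fall on the same day, so the result has a single row whose mean is 18494 / 6.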

I've also tried converting date2 to character and then applying ddply:

tmp <- df$date2
df$date2 <- as.character(df$date2)
class(df$date2)
 [1] "character"
mean_con <- ddply(df, .(date2), summarize,  mean = mean(size))
 Error: 'names' attribute [11] must be the same length as the vector [1]

This is the str(df):

> str(df)
'data.frame':   1891715 obs. of  8 variables:
 $ date    : POSIXlt, format: "1995-07-01 06:00:01" "1995-07-01 06:00:06"         "1995-07-01 06:00:09" ...
 $ host    : chr  "199.72.81.55" "unicomp6.unicomp.net" "199.120.110.21"    "burger.letters.com" ...
 $ petition: chr  "GET" "GET" "GET" "GET" ...   
 $ resource: chr  "/history/apollo/" "/shuttle/countdown/"  "/shuttle/missions/sts-73/mission-sts-73.html" "/shuttle/countdown/liftoff.html"    ...
 $ protocol: chr  "HTTP/1.0" "HTTP/1.0" "HTTP/1.0" "HTTP/1.0" ...
 $ result  : Factor w/ 9 levels "","200","302",..: 2 2 2 4 2 4 2 2 2 2 ...
 $ size    : int  6245 3985 4085 0 4179 0 0 3985 3985 7074 ...
 $ date2   : chr  "1995-07-01" "1995-07-01" "1995-07-01" "1995-07-01" ...  

So date2 is chr...

OK, I restarted RStudio and now it's working. Thanks to all!
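
For reference, the str(df) output above shows the date column stored as POSIXlt, which is a list of components rather than a plain vector. The discussion linked in the comments is about this same 'names' attribute error, so the POSIXlt column may have been related to it. A minimal sketch, assuming that is the cause, is to convert the column to POSIXct before calling ddply. The toy data frame below re-types a few sample values by hand, and the "UTC" time zone is an assumption made only for this example.

```r
library(plyr)

# Toy data mirroring the sample rows; values re-typed by hand.
df <- data.frame(
  size  = c(6245, 3985, 4085, 0, 4179, 0),
  date2 = rep("1995-07-01", 6),
  stringsAsFactors = FALSE
)

# Attach the timestamps as POSIXlt, as in the str() output.
# The "UTC" time zone is an assumption for this sketch.
df$date <- as.POSIXlt(
  c("1995-07-01 06:00:01", "1995-07-01 06:00:06", "1995-07-01 06:00:09",
    "1995-07-01 06:00:11", "1995-07-01 06:00:11", "1995-07-01 06:00:12"),
  tz = "UTC"
)

# Convert the list-based POSIXlt column to POSIXct, a plain numeric vector.
df$date <- as.POSIXct(df$date)

# Mean of size per day, grouping on the date2 column.
mean_by_day <- ddply(df, .(date2), summarize, mean = mean(size))
print(mean_by_day)
```

POSIXct keeps the same instants in time, so nothing is lost by the conversion for grouping purposes.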

z4k4
  • Rather than including a picture of your data, can you include `head(df)` or the output from `dput()` if it isn't too large? – Alex A. Apr 21 '15 at 17:34
  • You may have to convert the dates to character, as discussed here: http://stackoverflow.com/questions/14153092/meaning-of-ddply-error-names-attribute-9-must-be-the-same-length-as-the-vec – MichaelT Apr 21 '15 at 18:02
  • Your `ddply` code works for me when I read in the data as in my edit. Check that the `str(df)` you were using matches my edit, that is, that the class of date is actually a date and not a character, a factor, or something else. Actually, this shouldn't really matter either. – rawr Apr 21 '15 at 18:58

0 Answers