I have a data frame that looks like this, with two groups (L and R):

    Group   Value
1     L   0.04058678
2     L   0.11657916
3     L   0.08382576
4     L   0.17477007
5     L   0.08214530
6     L   0.15685707
7     L   0.08237982
8     R   0.06680679
9     R   0.05153584
10    R   0.08919266

How do I reshape it to look like this, where each group is its own column and all of that group's values fall under it:

      L          R
  0.04058678 0.06680679
  0.11657916 0.05153584
  0.08382576 0.08919266
  0.17477007
  0.08214530
  0.15685707
  0.08237982

*edit: I would like to be able to compute something like the mean or sum of each group.

Chris
  • What's the point of getting a data frame like that? There is no correspondence between two columns in each row. Why not just split them into lists? – Psidom Jun 15 '16 at 17:58
  • Splitting them into lists would be fine. I just want to get simple statistics for each group, like the mean (or sum, max, etc.) – Chris Jun 15 '16 at 18:40

1 Answer


As suggested by @Psidom in the comments, a list is better suited to data in this format, because the groups have different numbers of values.

You can try to create a list from your dataframe df1 with split():

lst <- split(df1, df1$Group)
lst
#$L
#  Group      Value
#1     L 0.04058678
#2     L 0.11657916
#3     L 0.08382576
#4     L 0.17477007
#5     L 0.08214530
#6     L 0.15685707
#7     L 0.08237982
#
#$R
#   Group      Value
#8      R 0.06680679
#9      R 0.05153584
#10     R 0.08919266

From this list, individual data.frames can be extracted, either by indexing (lst[[1]] and lst[[2]]) or by name (lst$L and lst$R), which can be saved and treated separately if required.
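
If the goal is a statistic per group, the list returned by split() can also be summarised in one step instead of extracting each element by hand. The following is a minimal sketch, assuming the same values as df1 in this answer; the names group_means, group_max, vals and group_sums are only illustrative.

```r
# Recreate the example data (same values as df1 in this answer).
df1 <- data.frame(
  Group = factor(c(rep("L", 7), rep("R", 3))),
  Value = c(0.04058678, 0.11657916, 0.08382576, 0.17477007, 0.08214530,
            0.15685707, 0.08237982, 0.06680679, 0.05153584, 0.08919266)
)

# Split the data frame into a list of data frames, one per group.
lst <- split(df1, df1$Group)

# Apply a summary function to the Value column of each list element.
group_means <- sapply(lst, function(d) mean(d$Value))
group_max   <- sapply(lst, function(d) max(d$Value))

# Splitting only the Value vector gives a list of plain numeric vectors,
# so the summary function can be applied to the elements directly.
vals <- split(df1$Value, df1$Group)
group_sums <- sapply(vals, sum)

print(group_means)
print(group_max)
print(group_sums)
```

Each result is a named numeric vector with one entry per group, so a single value can be picked out by name, for example group_means["L"].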


It has become clear in the comments that separating the data into a list of data.frames is not necessary in this case. If the sole purpose is to compute statistics on the groups, aggregate() is simpler than preprocessing the data with split().

Here are two examples:

aggregate(Value~Group, df1, mean)
#  Group      Value
#1     L 0.10530628
#2     R 0.06917843

or

aggregate(Value~Group, df1, sum)
#  Group     Value
#1     L 0.7371440
#2     R 0.2075353
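
Several statistics can also be computed in a single call. This is a sketch rather than the only way to do it, again assuming the same values as df1; the names means and stats are illustrative. It uses tapply() for a single statistic and aggregate() with a function that returns a named vector for several at once.

```r
# Recreate the example data (same values as df1 in this answer).
df1 <- data.frame(
  Group = factor(c(rep("L", 7), rep("R", 3))),
  Value = c(0.04058678, 0.11657916, 0.08382576, 0.17477007, 0.08214530,
            0.15685707, 0.08237982, 0.06680679, 0.05153584, 0.08919266)
)

# tapply() applies a function to Value within each level of Group
# and returns one named entry per group.
means <- tapply(df1$Value, df1$Group, mean)

# aggregate() accepts a function that returns a named vector; the result
# then holds a matrix column named Value.
stats <- aggregate(Value ~ Group, df1,
                   function(x) c(mean = mean(x), sum = sum(x), max = max(x)))

# Flatten the matrix column into ordinary columns
# (Value.mean, Value.sum, Value.max).
stats <- do.call(data.frame, stats)

print(means)
print(stats)
```

The do.call(data.frame, ...) step is optional; it only turns the matrix column into separate columns so that they can be accessed as stats$Value.mean and so on.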

data:

df1 <- structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L), .Label = c("L", "R"), class = "factor"), Value = c(0.04058678, 
0.11657916, 0.08382576, 0.17477007, 0.0821453, 0.15685707, 0.08237982, 
0.06680679, 0.05153584, 0.08919266)), .Names = c("Group", "Value"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10"))
RHertel
  • Thanks! Can you explain a little about what this is doing? – Chris Jun 15 '16 at 18:03
  • Type this into the console: `?split` – Zelazny7 Jun 15 '16 at 18:05
  • @Chris I would like to help, but I really think that I could not describe it any better than in the documentation of `?split`. The documentation contains, among other things: "split(x,f) divides the data in the vector (or data.frame) x into the groups defined by f." and "The value returned from split is a list of vectors containing the values for the groups.". There are also several examples. Please let me know if there is anything specific that you don't understand. – RHertel Jun 15 '16 at 18:10
  • Split gives me a list of data frames. How do I do something like get the means of each group? – Chris Jun 15 '16 at 18:16
  • Search Stack Overflow: http://stackoverflow.com/questions/7651539/mean-of-elements-in-a-list-of-data-frames – Zelazny7 Jun 15 '16 at 18:18
  • You can extract individual data.frames from the list; in this case with `lst[[1]]` and `lst[[2]]` or `lst$L` and `lst$R`. This should allow you to handle the data in the usual fashion. For example: `mean(lst$L$Value)` yields `[1] 0.1053063`. – RHertel Jun 15 '16 at 18:20
  • I think this is becoming more complicated than I need it to be. In the end, I just need some simple statistics on each group from the original data frame. – Chris Jun 15 '16 at 18:38
  • @RHertel aggregate() is perfect, thank you! – Chris Jun 15 '16 at 18:57
  • I don't see any advantage to splitting into a list here. The OP's stated goal (summary stats) certainly doesn't demand it. – Frank Jun 15 '16 at 19:07
  • @Frank I agree, but the OP has edited the post and further clarified the desired output in the comments. In its original form (and if you look at the title), the creation of a new data.frame was described as the desired output. The transformed data, however, had a form that was not suitable for a data.frame, which is why Psidom and I suggested a list instead. – RHertel Jun 15 '16 at 19:10
  • Yes, I naively stated that I wanted a new data frame because it looked more manageable to me. – Chris Jun 15 '16 at 19:57