0

I am an R and Stack Overflow newbie so hope I'm following protocol here.

How do I transform a data frame like:

FRUIT   NUMBER  NAME    AGE
apples  5       Joe     13
oranges 6       Joe     13
apples  2       Mary    10
oranges 4       Mary    10

into a grouped data fame like:

NAME    AGE  APPLES ORANGES
Joe     13   5       6
Mary    10   2       4

Thanks in advance.

musically_ut
  • 34,028
  • 8
  • 94
  • 106
kc2819
  • 1,491
  • 2
  • 10
  • 5
  • 1
    Hi and welcome to SO! Thanks for posting a minimal example data set and your expected answer, Since you are mentioning an SO protocol, you may read [about Stackoverflow](http://stackoverflow.com/about) and [what to ask](http://stackoverflow.com/help/on-topic). People are generally much more willing to help if you show that you have tried yourself, i.e. the code you have tried so far and why it didn't work. Cheers. – Henrik Nov 09 '13 at 20:53

2 Answers2

2

You can use the function dcast from package reshape2.

> library('reshape2')
> d <- read.table(header=T, text="FRUIT   NUMBER  NAME    AGE
+ apples  5       Joe     13
+ oranges 6       Joe     13
+ apples  2       Mary    10
+ oranges 4       Mary    10");


> dcast(d, NAME + AGE ~ FRUIT, value.var='NUMBER')
  NAME AGE apples oranges
1  Joe  13      5       6
2 Mary  10      2       4

This is changing data from a (somewhat) long format to deep format, you can find more details of how to approach this task here.

musically_ut
  • 34,028
  • 8
  • 94
  • 106
0

Using sqldf:

library(sqldf)
sqldf('SELECT NAME, AGE,
      MAX(CASE WHEN FRUIT = "apples" THEN NUMBER ELSE NULL END) apples, 
      MAX(CASE WHEN FRUIT = "oranges" THEN NUMBER ELSE NULL END) oranges
      FROM d 
      GROUP BY NAME 
      ORDER BY NAME')

Output:

  NAME AGE apples oranges
1  Joe  13      5       6
2 Mary  10      2       4
mpalanco
  • 12,960
  • 2
  • 59
  • 67