I am fairly new to R, so I hope this doesn't sound silly. I have a dataset that I am working on in R. One of the variables (x1) takes three categorical values (countrya, countryb, countryc). The dataset has many variables and observations, but I want to analyze it separately for each country. Should I prepare a separate data frame for each country, and if so, how can I do this? Say the dataset is called data, the variable is called x1, and the values I want to analyze separately are countrya, countryb, and countryc. I hope this helps with the code. Thanks.

Paul Hiemstra
  • See http://4dpiecharts.com/2011/12/16/a-quick-primer-on-split-apply-combine-problems/ – Richie Cotton Apr 10 '12 at 10:29
  • Typically you would use `tapply`. For instance, to calculate the mean of variable x2 for each country: `tapply(x2, x1, mean)`. – Ernest A Apr 10 '12 at 10:44
  • There are a few similar questions around. For example, see http://stackoverflow.com/questions/10047124/grouping-ecological-data-in-r/10048629 – csgillespie Apr 10 '12 at 13:04
  • possible duplicate of [R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate vs](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) – Ferdinand.kraft Aug 13 '13 at 19:21

1 Answer

This sounds like a good fit for `ddply` from the plyr package. Assuming your data is in a data.frame that looks something like this:

value  country
21897  A
213903 A
6322   B
3567   B

you can use ddply:

library(plyr)
ddply(df, .(country), summarise, mn = mean(value))

to calculate the mean of value for each level of country.
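
If you would rather have a separate data frame per country, as the question asks, base R's `split` is one option. The following is only a minimal sketch: the data frame `df` is a toy reconstruction of the example table above, and the names `by_country`, `means`, and `means_tapply` are made up for illustration.

```r
# Toy data with the same shape as the example table (an assumption for illustration)
df <- data.frame(
  value   = c(21897, 213903, 6322, 3567),
  country = c("A", "A", "B", "B")
)

# split() returns a named list holding one data frame per level of country
by_country <- split(df, df$country)

# Apply an analysis to each piece; here, the mean of value
means <- sapply(by_country, function(d) mean(d$value))
print(means)

# tapply() computes the same per-group means without splitting first
means_tapply <- tapply(df$value, df$country, mean)
```

Each element of `by_country` (for example `by_country[["A"]]`) is an ordinary data frame, so any per-country analysis can be run on it directly. With the names from the question, the split would presumably be `split(data, data$x1)`.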

Paul Hiemstra