-3

year   name   sex numberOfBirth
1950   mark   M   25
1950   jill   F   60
1950   jesy   F   26
1950   john   M   50
1950   ana    F   78
.
.
.
2010   tom    M   67
2010   jack   M   25
2010   lia    F   45
2010   jesse  F   36

and so on, for 2000 rows.

jpw
agnus
  • Have a look at `?prop.table` – user20650 Oct 18 '14 at 23:28
  • You can't just dump data and expect us to know what you're asking. You did the same thing in [your other question](http://stackoverflow.com/questions/26442313/how-to-find-the-frequency-of-different-first-letters-in-a-name-for-2-different-y). At this rate, both questions will be closed and your account could get suspended for asking bad questions. – Rich Scriven Oct 18 '14 at 23:30
  • I suggest you start to write some code yourself – Rich Scriven Oct 18 '14 at 23:39
  • There are many duplicates of this question. Here is one: http://stackoverflow.com/questions/9623763/in-r-how-can-i-compute-percentage-statistics-on-a-column-in-a-dataframe-tabl – user20650 Oct 18 '14 at 23:39

2 Answers

3
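library(dplyr)
# Percentage of rows of each sex within each year.
# Use bare column names: df$sex would bypass the grouping
# and return the same overall percentage for every year.
df %>% group_by(year) %>%
  summarize(pct.males = sum(sex == 'M') / length(sex) * 100,
            pct.female = sum(sex == 'F') / length(sex) * 100)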
petew
  • Why have you commented on your own post? Are you asking yourself a question? – Neeku Oct 19 '14 at 00:08
  • No @Neeku I am not asking myself a question. There were other comments to this but apparently they were deleted. Thank you for your contribution, however. – petew Oct 19 '14 at 00:11
  • Fair enough! I noticed this in the review queue and I thought I should comment on it. – Neeku Oct 19 '14 at 00:26
  • @epwalsh +1 I guess you don't need `df$`, just `sum(sex=='M')/length(sex) *100` would be sufficient. – akrun Oct 19 '14 at 03:46
  • Oh, yea. Good call, @akrun – petew Oct 19 '14 at 16:26
3

Or using data.table (this returns proportions rather than percentages):

library(data.table)
setDT(df)[, list(Males = sum(sex == "M")/.N, 
                 Females = sum(sex == "F")/.N), by = year]

Or the base R solution proposed by @user20650:

prop.table(with(df, table(year, sex)), 1)
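To see what the `1` in `prop.table(..., 1)` does, here is a minimal sketch on the nine sample rows shown in the question. The data frame name `df` is assumed, as in the answers. Margin 1 divides each row of the table by its row total, so the proportions for each year sum to 1. Note that this counts rows (names), not births.

```r
# Sample rows copied from the question; `df` is an assumed name.
df <- data.frame(
  year = c(1950, 1950, 1950, 1950, 1950, 2010, 2010, 2010, 2010),
  name = c("mark", "jill", "jesy", "john", "ana",
           "tom", "jack", "lia", "jesse"),
  sex  = c("M", "F", "F", "M", "F", "M", "M", "F", "F"),
  numberOfBirth = c(25, 60, 26, 50, 78, 67, 25, 45, 36)
)

# Count rows per (year, sex) combination.
counts <- with(df, table(year, sex))

# Divide each row (year) by its row total: margin = 1.
props <- prop.table(counts, 1)
print(props)
```

On these rows, 1950 has two male names out of five, so its male proportion is 0.4.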
David Arenburg
  • Sorry, I mean the proportion of males by numberOfBirth for each year. For example, in row 1 there are 25 births with the name mark. – agnus Oct 19 '14 at 07:13
  • Sorry, I don't understand the question. Can you please modify your original question with the desired output? – David Arenburg Oct 19 '14 at 07:47
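
One possible reading of the comment above is that the share should be computed over births rather than over rows, that is, by summing `numberOfBirth` per year and sex before taking proportions. The desired output was never posted, so this interpretation is an assumption, as is the data frame name `df`. A minimal base R sketch on the sample rows from the question:

```r
# Sample rows copied from the question; `df` is an assumed name.
df <- data.frame(
  year = c(1950, 1950, 1950, 1950, 1950, 2010, 2010, 2010, 2010),
  name = c("mark", "jill", "jesy", "john", "ana",
           "tom", "jack", "lia", "jesse"),
  sex  = c("M", "F", "F", "M", "F", "M", "M", "F", "F"),
  numberOfBirth = c(25, 60, 26, 50, 78, 67, 25, 45, 36)
)

# Sum births per (year, sex) instead of counting rows.
births <- xtabs(numberOfBirth ~ year + sex, data = df)

# Divide each year's row by that year's total births.
birth_props <- prop.table(births, 1)
print(round(birth_props, 3))
```

On these rows, 1950 has 25 + 50 = 75 male births out of 239 in total, so the male share is 75/239, roughly 0.31.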