Can anyone explain the difference between these four statements in R?

data
 [1] 18 22 18 20 20 20 20 17 17 16 20 17 21 19 18 19 13 21 19 14 22 19 20 20 16 19 21 19 17 20 15 20 18 19 26 21 19 22 20 24 25
 [42] 14 20 17 20 21 19 20 16 18 18 16 18 16 15 20 15 17 20 16 16 17 21 19 17 21 19 21 19 19 18 16 17 15 21 22 18 19 18 22 23 20
 [83] 21 17 17 15 12 23 18 19 18 21 18 17 18 22 16 20 21 18

table(data)
 data
 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
 1  1  2  5  9 12 15 15 17 12  6  2  1  1  1 

hist(data)
hist(data, probability=TRUE)
hist(table(data))
hist(table(data), probability=TRUE)

The result is: [image: the four resulting histograms]

Thank you!

Mark116
  • Have a look at [this discussion](http://stackoverflow.com/questions/17416453/force-r-to-plot-histogram-as-probability-relative-frequency) as it corresponds to your question. – Konrad Feb 13 '16 at 19:39
  • Have you read the help for these functions? Using `table` means you're plotting different information than the original data, and the `probability` argument is well described there. What exactly isn't clear? – Molx Feb 13 '16 at 19:41
  • Yes, I've seen the help, but I'm not able to assign a name to each graph. The first represents the frequency of the different values, right? For example: `length(data[data>=18 & data<20])`. But the third? The `table` function returns the frequency of each distinct value, right? – Mark116 Feb 14 '16 at 08:50

1 Answer

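`hist(data)` is a histogram of the data, while `hist(table(data))` is a histogram of the counts (i.e. a histogram of a histogram). What the latter plot tells you is that most of the distinct values (8 of the 15) occur 5 times or fewer in the data. The argument `probability=TRUE` only rescales the y axis: instead of counts it shows densities (= counts / (total × bin width)), so that the bar areas sum to 1. The bar heights equal the relative frequencies (= counts / total) only when the bins have width 1.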
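To check this without reading the plots, the histogram objects can be inspected directly. The following is only a sketch: it rebuilds the vector from the `table(data)` output in the question (so the order of the observations differs from the original, which makes no difference to `hist`), and the names `values`, `counts`, `h_data`, `h_table` and `width` are made up for illustration.

```r
# Rebuild the data from the table() output shown in the question.
values <- 12:26
counts <- c(1, 1, 2, 5, 9, 12, 15, 15, 17, 12, 6, 2, 1, 1, 1)
data <- rep(values, times = counts)

# plot = FALSE returns the histogram object instead of drawing it.
h_data  <- hist(data, plot = FALSE)
h_table <- hist(table(data), plot = FALSE)

# hist(data): how many observations fall into each range of values.
print(h_data$breaks)
print(h_data$counts)

# hist(table(data)): how many distinct values fall into each range of frequencies.
print(h_table$breaks)
print(h_table$counts)

# probability = TRUE plots the density component:
# counts / (number of observations * bin width), so the bar areas sum to 1.
width <- diff(h_data$breaks)
print(all.equal(h_data$density, h_data$counts / (length(data) * width)))
print(sum(h_data$density * width))
```

The `counts` component is what `hist(x)` draws and the `density` component is what `hist(x, probability=TRUE)` draws, so comparing the two shows exactly how the y axis is rescaled.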

AlexR
  • 2,412
  • 16
  • 26