
I am new to ggplot2 and I am trying to obtain the same histogram that I would with hist(results, breaks = 30).

How do I replicate the same histogram with ggplot2? I am playing with the binwidth parameter of geom_histogram, but I am having a hard time making the two histograms look identical.

Quality Catalyst
Vido

2 Answers


If you store the result of hist() in a variable, you can see how R decided to break up your data:

data(mtcars)
histinfo <- hist(mtcars$mpg)

Printing histinfo gives you the necessary information about the breaks:

$breaks
[1] 10 15 20 25 30 35

$counts
[1]  6 12  8  2  4

$density
[1] 0.0375 0.0750 0.0500 0.0125 0.0250

$mids
[1] 12.5 17.5 22.5 27.5 32.5

$xname
[1] "mtcars$mpg"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
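
The same object can be computed without drawing anything, which is convenient for inspecting the bins behind a call such as hist(results, breaks = 30). This is only a minimal sketch: mtcars$mpg stands in for the asker's results vector (an assumption), and the names x, h and bin_width are made up for illustration.

```r
# mtcars$mpg stands in for the asker's `results` vector (an assumption).
x <- mtcars$mpg

# plot = FALSE returns the histogram object without drawing it.
h <- hist(x, breaks = 30, plot = FALSE)

# A single number passed as `breaks` is only a suggestion;
# hist() rounds it to "pretty" breakpoints.
h$breaks

# With equidistant bins, the differences collapse to one bin width.
bin_width <- unique(round(diff(h$breaks), 10))
bin_width
```

The resulting bin width, together with the first break, is what you would pass to ggplot2 if you prefer binwidth over explicit breaks.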

Now you can tweak the code below to make your ggplot2 histogram look more like the base one. You will have to change the axis labels, scale and colours; theme_bw() helps to get some of the settings in order.

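data(mtcars)
library(ggplot2)

# binwidth = 5 matches the spacing of the base R breaks;
# boundary = 10 aligns the bin edges with them (10, 15, 20, ...)
ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(binwidth = 5, boundary = 10) +
    theme_bw()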

Change the binwidth value to whatever suits you.
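
If the goal is identical bins rather than similar ones, one option is to skip binwidth and hand the base R breakpoints to geom_histogram() through its breaks argument. The sketch below assumes the mtcars example from above; the fill and colour values are arbitrary choices, not part of the original answer.

```r
library(ggplot2)

# Compute base R's bins without drawing the base plot.
histinfo <- hist(mtcars$mpg, plot = FALSE)

# Pass the same breakpoints to geom_histogram(); when `breaks` is
# supplied it takes precedence over binwidth and bins.
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(breaks = histinfo$breaks,
                 fill = "grey", colour = "black") +
  theme_bw()
p
```

Both hist() and geom_histogram() close bins on the right by default, so the counts per bin should agree as long as neither default is changed.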

Konrad

Adding to @Konrad's answer, instead of using hist you can use one of the nclass.* functions directly (see the nclass documentation). There are three functions that are used by hist:

nclass.Sturges uses Sturges' formula, implicitly basing bin sizes on the range of the data.

nclass.scott uses Scott's choice for a normal distribution based on the estimate of the standard error, unless that is zero where it returns 1.

nclass.FD uses the Freedman-Diaconis choice based on the inter-quartile range (IQR) unless that's zero where it reverts to mad(x, constant = 2) and when that is 0 as well, returns 1.

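By default, hist uses nclass.Sturges. Note that the number of classes is only a suggestion: hist passes it to pretty(), so the final number of bins can differ.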
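
As a rough illustration, the three rules can be compared on the mtcars example, and the default breakpoints rebuilt by hand. The names x, n_bins and breaks are illustrative only; the pretty() call mirrors what hist does by default.

```r
x <- mtcars$mpg

# Suggested number of bins under each rule.
n_bins <- c(Sturges = nclass.Sturges(x),
            Scott   = nclass.scott(x),
            FD      = nclass.FD(x))
n_bins

# Turn the Sturges suggestion into "pretty" breakpoints,
# as hist() does by default.
breaks <- pretty(range(x), n = nclass.Sturges(x), min.n = 1)
breaks
```

For mtcars$mpg this reproduces the $breaks vector reported by hist(mtcars$mpg), so the result can be passed to geom_histogram(breaks = ...) without calling hist at all.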

Tim