I am new to Julia, and I am exploring the ways in which I can visualize distributions. Ultimately, I may fall back on the more robust matplotlib code base, but I really enjoy the dynamic visualization element that IJulia offers in the Notebook.
My issue concerns plotting histograms of proportional values with Gadfly. I am able to plot the kernel density, with Gadfly automatically selecting a reasonable support (i.e., one consistent with the underlying data: [-0.1,0.5]).
#Visualization
using Gadfly
#(Python) pandas analogue
require("DataFrames")
#Practice sets
require("RDatasets")
#Extract the Hedonic set
hedonic=RDatasets.dataset("plm","Hedonic")
#Define density plot layer for black population proportion
dens_layer=layer(hedonic,x=:Blacks,Geom.density,Theme(default_color=color("#de2d26")))
#Plot kernel density
dens_plot=plot(dens_layer, Guide.title("Distribution of Black Proportion"))
The histogram plot, however, is fit to a support that is far too large ([0,4]). All of the relevant data are captured by a single bar that spans the entire [0,1] interval.
#Define histogram layer
hist_layer=layer(hedonic,x=:Blacks,Geom.histogram,Theme(default_color=color("#de2d26")))
#Plot histogram
hist_plot_default=plot(hist_layer, Guide.title("Distribution of Black Proportion"))
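For reference, here is a rough sketch of the binning I would expect for proportions on [0,1]. It is plain Julia and does not use Gadfly at all. The helper name `bin_counts`, its keyword arguments, and the sample values are all made up for illustration and are not taken from the Hedonic data.

```julia
# Hypothetical helper (not part of Gadfly): count how many values fall into
# each of `nbins` equal-width bins on the interval [lo, hi].
function bin_counts(values, nbins; lo=0.0, hi=1.0)
    counts = zeros(Int, nbins)
    width = (hi - lo) / nbins
    for v in values
        # Skip anything outside the requested range
        (v < lo || v > hi) && continue
        # The right edge (v == hi) is folded into the last bin
        idx = min(nbins, floor(Int, (v - lo) / width) + 1)
        counts[idx] += 1
    end
    return counts
end

# Made-up proportions, only to illustrate the idea
sample = [0.0, 0.05, 0.12, 0.31, 0.33, 0.5, 0.99, 1.0]
counts = bin_counts(sample, 4)
println(counts)
```

With bins defined this way, the values are spread over several bars within [0,1] rather than collapsed into one, which is the behavior I was hoping to get from Geom.histogram.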
When I increase the bincount, the support just grows. For example, with bincount=100, the support grows to [0,150], with all of the data still represented by a single bar.
#Plot histogram again, this time with 100 bins
hist_plot_bin100=plot(hedonic,x=:Blacks,Geom.histogram(bincount=100),Theme(default_color=color("#de2d26")))
If anyone can tell me what I am doing wrong, it would certainly be appreciated. Alternatively, would restricting the range force a sensible allocation of histogram bars? If so, how do I restrict the range so that I can view the distribution on the [0,1] interval?