Exploring a new data set: What is the easiest, quickest way to visualise many (all) variables?
Ideally, the output shows the histograms next to each other with minimal clutter and maximum information. The key requirements are flexibility and stability when dealing with large and varied data sets. I'm using RStudio and usually deal with large and messy survey data.
One example that comes out of the box with Hmisc and works quite well here is:
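library(ggplot2)      # provides the mpg example data set
str(mpg)              # inspect the structure of the data
library(Hmisc)
hist.data.frame(mpg)  # draws a matrix of histograms for the variables in mpg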
Unfortunately, with other data I run into problems with data labels (Error in plot.new() : figure margins too large). It also crashed for a data set larger than mpg, and I haven't figured out how to control binning. Moreover, I'd prefer a flexible solution in ggplot2. Note that I just started learning R and am used to the comfortable solutions provided by commercial software.
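To make the goal concrete, here is a minimal sketch of the kind of ggplot2 output I have in mind. It is only an illustration under some assumptions: that the tidyr package is installed for reshaping, that only numeric columns are of interest, and that 30 bins is a reasonable starting point. The object names num_cols, long and p are made up for this example.

```r
library(ggplot2)
library(tidyr)   # assumption: tidyr is available for reshaping

# Keep only the numeric columns; survey data often mixes types
num_cols <- mpg[, sapply(mpg, is.numeric), drop = FALSE]

# Reshape to long format: one row per (variable, value) pair
long <- pivot_longer(num_cols, cols = everything(),
                     names_to = "variable", values_to = "value")

# One histogram panel per variable, each with its own axis scales;
# bins = 30 is an arbitrary choice and can be changed to control binning
p <- ggplot(long, aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ variable, scales = "free")
print(p)
```

Something along these lines gives one panel per variable and lets me set the binning explicitly, but I don't know whether it stays readable or stable for a survey with hundreds of columns.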
More questions on this topic:
R histogram - too many variables
...?