
I have a DataFrame called df with 18 columns, and I have plotted a histogram of each column with the pandas hist() method to check the shape of each variable's distribution:

df.hist(figsize=(30,30))

What I now want to do is add a boxplot above each histogram so I can see at a glance which variables contain outliers. I want each plot to look as follows:

[Image: example of the desired plot, a histogram with a boxplot drawn above it]

I can plot the boxplot using the following code, but it displays all of the boxplots on a single plot:

df.boxplot(figsize=(30,30))

I can also add a group-by, but that isn't what I need. I just want each histogram in my df.hist plot to be overlaid with the boxplot derived from the same column of data. I suspect I could write a function to do this, but as the hist function seems quite intuitive, I suspect there is a straightforward way that I'm missing.

JGW
  • Not that I know of. Pandas provides convenience wrappers for commonly used matplotlib functions. What you gain in convenience, you lose in adaptability. You can modify this answer, though: [Histogram with Boxplot above in Python](https://stackoverflow.com/questions/33381330/histogram-with-boxplot-above-in-python) – Mr. T Dec 10 '20 at 15:18
  • Thanks for the advice. I can see that the function in the linked example only handles a 1-d array, but it should be pretty straightforward to adapt with a for loop that iterates through each of my columns. Thanks again – JGW Dec 10 '20 at 15:52
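
The loop-over-columns idea from the comments could look roughly like the sketch below. It is not a built-in pandas feature: `hist_with_box` is a hypothetical helper name, and the grid layout (4 plots per row, a 1:4 height ratio between boxplot and histogram) and the default of 10 bins are assumptions chosen for illustration. It uses a nested matplotlib GridSpec so that every numeric column gets a thin boxplot axis stacked on a histogram axis sharing the same x-axis.

```python
import math

import matplotlib.pyplot as plt


def hist_with_box(df, ncols=4, bins=10, figsize=(30, 30)):
    """Plot each numeric column as a histogram with a boxplot strip above it.

    Hypothetical helper, not part of pandas or matplotlib.
    Returns the figure and a dict mapping column name -> (ax_box, ax_hist).
    """
    cols = list(df.select_dtypes(include="number").columns)
    if not cols:
        raise ValueError("df has no numeric columns to plot")

    nrows = math.ceil(len(cols) / ncols)
    fig = plt.figure(figsize=figsize)
    # Outer grid: one cell per column of the DataFrame.
    outer = fig.add_gridspec(nrows, ncols, hspace=0.4, wspace=0.3)

    axes = {}
    for i, col in enumerate(cols):
        # Inner grid: a short boxplot row on top of a taller histogram row.
        inner = outer[i // ncols, i % ncols].subgridspec(
            2, 1, height_ratios=[1, 4], hspace=0.05
        )
        ax_box = fig.add_subplot(inner[0])
        ax_hist = fig.add_subplot(inner[1], sharex=ax_box)

        values = df[col].dropna().to_numpy()
        # vert=False draws the box horizontally; recent matplotlib releases
        # also accept orientation="horizontal" for the same purpose.
        ax_box.boxplot(values, vert=False)
        ax_hist.hist(values, bins=bins)

        ax_box.set_title(str(col))
        ax_box.set_yticks([])                  # the box row needs no y ticks
        ax_box.tick_params(labelbottom=False)  # x labels only on the histogram
        axes[col] = (ax_box, ax_hist)

    return fig, axes
```

With the DataFrame from the question, calling `fig, axes = hist_with_box(df)` followed by `plt.show()` should produce one boxplot-over-histogram pair per numeric column. Non-numeric columns are skipped and missing values are dropped before plotting.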
