I would like to draw a boxplot figure using matplotlib.

This is my current figure (image not included).

And this is the code to generate the figure:

import matplotlib.pyplot as plt

# all_data: one dataset per metric, defined elsewhere
pt = plt.boxplot(all_data, sym='+')
plt.yticks([0, 0.2, 0.4, 0.6, 0.8, 1.0], ['0', '20%', '40%', '60%', '80%', '100%'])
plt.xticks([i + 1 for i in range(len(all_data))], ['WMC', 'DIT', 'CBO', 'RFC', 'LCOM', 'Ca', 'NPM'])

for line in pt['medians']:
    x, y = line.get_xydata()[1]  # right end of the median line
    plt.text(x, y, '%.1f' % y,   # label with the median value, not the x position
             horizontalalignment='center')  # centered on that point

plt.savefig("boxplot1.pdf")

A box in a boxplot shows the first, second, and third quartiles (Q1, the median, and Q3) of a dataset. Each box also has two lines, called whiskers, which by default extend at most 1.5*IQR (interquartile range) beyond Q1 and Q3. Instead of using this default, I want to explicitly set the lower and upper whisker limits (or the whisker length) to values I specify.

Could anyone shed some light on this?

Trenton McKinney
lllllllllllll
  • What is Q1 and wouldn't that give you 7 different ranges? Or, if you choose the first bar, the second bar would be cut out. Is this what you want? You may want to update the question and replace `Q1 +/- 1.5 * IQR` with something understandable. – ImportanceOfBeingErnest May 04 '17 at 23:17
  • In any case, by not separating the question into its two parts, (1) get the quartiles, (2) set the ylimits, you're making it harder for everyone. Duplicate of [this question](http://stackoverflow.com/questions/23461713/obtaining-values-used-in-boxplot-using-python-and-matplotlib), combined with [this question](http://stackoverflow.com/questions/23349626/getting-data-of-a-box-plot-matplotlib). Also see [this question](http://stackoverflow.com/questions/32415838/matplotlib-how-do-i-set-ylim-for-a-series-of-plots) – ImportanceOfBeingErnest May 04 '17 at 23:31
  • @ImportanceOfBeingErnest Hi there, thank you for the information. For each box, there is a line (which should extend to `Q3+1.5*IQR` by default?) on the top and another line on the bottom. So what I am looking for is, instead of using the default value, to explicitly *set* the top and bottom lines to certain values. Is that clear? – lllllllllllll May 04 '17 at 23:52
  • @ImportanceOfBeingErnest I checked the question, and it does not seem to be a duplicate. – lllllllllllll May 04 '17 at 23:52

1 Answer

To change the whiskers of the boxplot, use the whis argument of boxplot.

whis : float, sequence, or string (default = 1.5)
As a float, determines the reach of the whiskers beyond the first and third quartiles. In other words, where IQR is the interquartile range (Q3-Q1), the upper whisker will extend to the last datum less than Q3 + whis*IQR. Similarly, the lower whisker will extend to the first datum greater than Q1 - whis*IQR. Beyond the whiskers, data are considered outliers and are plotted as individual points. Set this to an unreasonably high value to force the whiskers to show the min and max values. Alternatively, set this to an ascending sequence of percentiles (e.g., [5, 95]) to set the whiskers at specific percentiles of the data. Finally, whis can be the string 'range' to force the whiskers to the min and max of the data.
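
As a rough illustration of the options described above, here is a minimal sketch that draws the same data with the default whiskers, a smaller float, and a pair of percentiles. The random `all_data`, the `whisker_ends` helper, and the variable names are placeholders I made up, not part of the question's code. As far as I know, recent Matplotlib versions (3.3 and later) no longer accept the string 'range', and `whis=(0, 100)` gives the same result.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data standing in for all_data from the question.
rng = np.random.default_rng(0)
all_data = [rng.uniform(0, 1, 200) for _ in range(3)]

fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)

# Default: whiskers reach the last datum within 1.5 * IQR of the box.
bp_default = axes[0].boxplot(all_data, sym='+')
axes[0].set_title('whis=1.5 (default)')

# A different float changes the reach, still as a multiple of the IQR.
bp_narrow = axes[1].boxplot(all_data, sym='+', whis=0.5)
axes[1].set_title('whis=0.5')

# A pair of percentiles bounds the whiskers by those percentiles instead.
bp_pct = axes[2].boxplot(all_data, sym='+', whis=[5, 95])
axes[2].set_title('whis=[5, 95]')


def whisker_ends(bp, i):
    """Return (lower, upper) whisker end values for the i-th box.

    boxplot() stores two whisker lines per box, lower first; each line
    runs from the box edge to the whisker end.
    """
    lower = bp['whiskers'][2 * i].get_ydata()[1]
    upper = bp['whiskers'][2 * i + 1].get_ydata()[1]
    return lower, upper


print(whisker_ends(bp_pct, 0))
```

Note that `whis` only accepts an IQR multiple or percentiles, not absolute y values. If you need whiskers at fixed values, one option is to compute which percentiles those values correspond to, but that goes beyond what the quoted documentation covers.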

ImportanceOfBeingErnest