I have a frequency distribution plot that looks like this:

[image: the desired point]

I need the x value at which the y value peaks. How can I get it from this plotting code?

seaborn.distplot('TheSeries',bins = 30, ax=axes[0][1])

Can someone explain how I can get that value for this and similar cases?

1 Answer

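You can extract the coordinates of the kde curve from ax.lines[-1] (the line most recently added to the axes, which is the same as ax.lines[0] when the kde is the only line) and use np.argmax() to find the mode of the curve.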

Note that distplot has been deprecated in the latest seaborn version. histplot with kde=True is its replacement.

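from matplotlib import pyplot as plt
import numpy as np
import seaborn as sns

# Skewed test data: squared standard-normal samples, scaled
samples = np.random.randn(300) ** 2 * 50
ax = sns.histplot(samples, bins=30, kde=True, color='skyblue')
# The kde curve is the only line on this freshly created axes
kdeline = ax.lines[0]
xs = kdeline.get_xdata()
ys = kdeline.get_ydata()
# Index of the highest point of the curve, i.e. the mode
mode_idx = np.argmax(ys)
# Mark the peak with a dashed vertical line
ax.vlines(xs[mode_idx], 0, ys[mode_idx], color='tomato', ls='--', lw=2)
plt.show()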

[image: example plot]
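If the plot itself is not needed, the same idea (evaluate a kde on a grid, then take np.argmax()) can be sketched directly with scipy. This is only a minimal sketch: the helper name `kde_mode` and the `gridsize` default are made up for illustration, and the bandwidth and grid are scipy's defaults rather than seaborn's exact settings, so the result may differ slightly from the curve seaborn draws.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_mode(values, gridsize=200):
    """Return (x, density) at the highest point of a Gaussian kde.

    Hypothetical helper for illustration; not part of seaborn or scipy.
    """
    values = np.asarray(values, dtype=float)
    kde = gaussian_kde(values)  # default bandwidth (Scott's rule)
    # Evaluate the kde on an evenly spaced grid spanning the data range
    xs = np.linspace(values.min(), values.max(), gridsize)
    ys = kde(xs)
    idx = np.argmax(ys)  # index of the peak on the grid
    return xs[idx], ys[idx]


# Same kind of skewed test data as above, with a fixed seed
rng = np.random.default_rng(0)
samples = rng.standard_normal(300) ** 2 * 50
mode_x, mode_y = kde_mode(samples)
print(f"kde peak at x = {mode_x:.2f} (density {mode_y:.4f})")
```

The peak position is only as precise as the grid spacing, so a larger `gridsize` gives a finer estimate.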

JohanC
  • I have. Once again, thanks. Also, is there any way to get the peak value for the same kde, but with the outliers beyond about +3 z-scores removed? – confused banana Nov 24 '20 at 12:06
  • If you remove the outliers, the peak should stay at the same position, I guess. Maybe you want to restrict the range of the x-axis, e.g. `ax.set_xlim(-1, 300)`? – JohanC Nov 24 '20 at 12:09
  • Somewhat, yes. I am trying to find the x value corresponding to the peak value (in terms of frequency or occurrence) of a continuous distribution. The data is moderately to highly skewed, and I have multiple such distributions (their ranges vary greatly, and their total sample sizes also differ). So I am trying to model a 95% confidence interval for the peak value. – confused banana Nov 24 '20 at 12:15
  • The outliers change the standard deviation considerably. By the way, will the outliers have a large effect on the lean of the kde? – confused banana Nov 24 '20 at 12:17
  • The `x` of the peak value shouldn't change at all; the `y` would be a tiny bit higher (the kde integrates to 1, so fewer values give a bit more weight to each of the remaining values). The mean and the median would change a bit, depending on how far away the outliers are and how many there are. To calculate the confidence interval, you probably shouldn't remove the outliers (except e.g. when you're convinced they relate to measurement errors). – JohanC Nov 24 '20 at 12:39
  • Thanks again for the tip! You are incredible! – confused banana Nov 24 '20 at 15:04
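For the 95% confidence interval on the peak position raised in the comments, one possible approach (an assumption here, not something proposed in the thread) is a percentile bootstrap: resample the data with replacement, recompute the kde peak each time, and take the 2.5% and 97.5% quantiles of the resulting peak positions. The helper names `kde_mode_x` and `bootstrap_mode_ci` and all default values are hypothetical, and outliers are kept, in line with the advice above.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_mode_x(values, grid):
    """x position on `grid` where a Gaussian kde of `values` is highest."""
    return grid[np.argmax(gaussian_kde(values)(grid))]


def bootstrap_mode_ci(values, n_boot=200, level=0.95, gridsize=200, seed=0):
    """Percentile-bootstrap interval for the kde peak position (a sketch)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # One fixed grid so every resample is evaluated at the same x values
    grid = np.linspace(values.min(), values.max(), gridsize)
    modes = np.empty(n_boot)
    for i in range(n_boot):
        # Resample with replacement, same size as the original data
        resample = rng.choice(values, size=values.size, replace=True)
        modes[i] = kde_mode_x(resample, grid)
    alpha = (1 - level) / 2
    lo, hi = np.quantile(modes, [alpha, 1 - alpha])
    return kde_mode_x(values, grid), lo, hi


rng = np.random.default_rng(0)
samples = rng.standard_normal(300) ** 2 * 50
peak, lo, hi = bootstrap_mode_ci(samples)
print(f"peak at x = {peak:.2f}, {95}% bootstrap interval [{lo:.2f}, {hi:.2f}]")
```

Because each distribution gets its own grid and its own resamples, the same function can be applied to series with very different ranges and sample sizes. More bootstrap rounds (`n_boot`) give a more stable interval at the cost of run time.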