2

How do I fit a curve on a barplot?

I have an equation, the diffusion equation, which has some unknown parameters, these parameters make the curve larger, taller, etc. On the other hand I have a barplot coming from a simulation. I would like to fit the curve on the barplot, and find the best parameters for the curve, how can I do that?

This is what I obtained by 'manual fitting', so basically I changed manually all the parameters for hours. However is there a way to do this with python?

Attempt of curve fitting

To make it simple, imagine I have the following code:

import matplotlib.pyplot as plt


list1 = []
for i in range(-5,6):
    list1.append(i)
    
width = 1/1.5

list2 = [0,0.2,0.6,3.5,8,10,8,3.5,0.6,0.2,0]

plt.bar(list1,list2,width)

plt.show()
T = 0.13
xx = np.arange(-6,6,0.01)
yy = 5*np.sqrt(np.pi)*np.exp(-((xx)**2)/(4*T))*scipy.special.erfc((xx)/(2*np.sqrt(T))) + np.exp(-((xx)**2)/(4*T)) 

plt.plot(xx,yy)

plt.show()

Clearly the fitting here would be pretty hard, but anyway, is there any function or such that allows me to find the best coefficients for the equation: (where T is known)

y = A*np.sqrt(np.pi*D)*np.exp(-((x-E)**2)/(4*D*T))*scipy.special.erfc((x-E)/(2*np.sqrt(D*T))) + 300*np.exp(-((x-E)**2)/(4*D*T)) 

EDIT: This is different from the already asked question and from the scipy documentation example. In the latter the 'xdata' is the same, while in my case it might and might not be. Furthermore I would also be able to plot this curve fitting, which isn't shown on the documentation. The height of the bars is not a function of the x's! So my xdata is not a function of my ydata, this is different from what is in the documentation. To see what I mean try to change the code in the documentation a little bit, to fall into my example, try this:

def func(x,a,b,c):
    return a * np.exp(-b * x) + c
    
xdata = np.linspace(0,4,50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = [1,6,3,4,6,7,8,5,7,0,9,8,2,3,4,5]

popt, pcov = curve_fit(func,xdata,ydata)

if you run this, it doesn't work. The reason is that I have 16 elements for the ydata and 50 for the function. This happens because y takes values from xdata, while ydata takes values from another set of x values, which is here unknown.

Thank you

Community
  • 1
  • 1
Euler_Salter
  • 3,271
  • 8
  • 33
  • 74
  • http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html – reptilicus Jun 17 '16 at 17:30
  • Possible duplicate of [curve fitting with python](http://stackoverflow.com/questions/8280871/curve-fitting-with-python) – lanery Jun 17 '16 at 17:39
  • @lanery No it's different, I don't have data, I have a barplot which doesn't take values from the same x-range! – Euler_Salter Jun 17 '16 at 17:43
  • @reptilicus, thank you for the link, however it doesn't work in my example, my function doesn't take x values from the same range of x's of the bargraph – Euler_Salter Jun 17 '16 at 17:44
  • @lanery, furthermore I want to find the parameters, so the question is totally different – Euler_Salter Jun 17 '16 at 17:58
  • The main difference here is that in order to make a bargraph I need certain x values, say integers from -10 to 10. While when I plot the function, I used different x values, say np.arrange(-10,10,0.01) or something like that! Hence I can't do as in the question or in the example. – Euler_Salter Jun 17 '16 at 18:19

1 Answers1

3

I stand by my thinking that this question is a duplicate. Here is a brief example of the typical workflow using curve_fit. Let me know if you still think that your situation is different.

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# bar plot data
list1 = range(-5, 6)
list2 = [0, 0.2, 0.6, 3.5, 8, 10,
         8, 3.5, 0.6, 0.2, 0]
width = 1/1.5

plt.bar(list1, list2, width, alpha=0.75)

# fit bar plot data using curve_fit
def func(x, a, b, c):
    # a Gaussian distribution
    return a * np.exp(-(x-b)**2/(2*c**2))

popt, pcov = curve_fit(func, list1, list2)

x = np.linspace(-5, 5, 100)
y = func(x, *popt)

plt.plot(x + width/2, y, c='g')

enter image description here

lanery
  • 5,222
  • 3
  • 29
  • 43
  • @Ianery, I don't understand why you used `*popt` inside `y = func(x, *popt)` – Euler_Salter Jun 20 '16 at 01:19
  • 1
    @Euler_Salter as you may know from the documentation, `popt` is a list of the parameters (`a`, `b`, and `c`) that best fit the data. So I do `func(x, *popt)` as a shorthand of writing `func(x, popt[0], popt[1], popt[2])` where `popt[0]` is the best guess for `a`, `popt[1]` is the best guess for `b`, and `popt[3]` is the best guess for `c`. – lanery Jun 20 '16 at 16:48