0

I'm trying to fit a double broken profile function which consists of the arctangent function. My code doesn't seem to be working:

XX=np.linspace(7.5,9.5,16)
YY=np.asarray([7,7,7,7.1,7.3,7.5,8.4,9,9.3,9.6,10.3,10.2,10.4,10.5,10.5,10.5])

def func_arc(x,a1,a2,a3,b1,b2,b3,H1,H2):
    beta=0.001
    w1=np.zeros(len(x))
    w2=np.zeros(len(x))
    for i in np.arange(0,len(x)):
        w1[i]=(((math.pi/2)+atan((x[i]-H1)/beta))/math.pi)
        w2[i]=(((math.pi/2)+atan((x[i]-H2)/beta))/math.pi)
    y=(a1*x[i]+b1)*(1-w1[i])+(a2*x[i]+b2)*w1[i]*(1-w2[i])+(a3*x+b3)*w2[i]
    return(y)

Where the a and b terms are slope and zero-point values of the linear regressions. The w terms are used to switch the domain.

I take into account the following restrictions for continuity (H1 y H2) and restrict parameters:

mask=(XX<=8.2)
mask2=(XX>8.2) & (XX<9)
mask3=(XX>=9)

l1=np.polyfit(XX[mask], YY[mask], 1)
l2=np.polyfit(XX[mask2], YY[mask2], 1)
l3=np.polyfit(XX[mask3], YY[mask3], 1)
H1=(l2[1]-l1[1])/(l1[0]-l2[0])
H2=(l3[1]-l2[1])/(l2[0]-l3[0])

p0=[l1[0],l2[0],l3[0],l1[1],l2[1],l3[1],H1,H2]

popt_arc1, pcov_arc1 =curve_fit(func_arc, XX, YY,p0)

I obtain a single line instead of a broken profile (S-shape).
What I obtain:

image show current results

Calandra
  • 11
  • 1
  • First I'd like to say that you are probably completely over-fitting your data. It does not look like data with two domains of linear bnehavior. It is more like an `arctan` itself. So what is the motivation here? Can you provide some insight? – mikuszefski Sep 16 '20 at 08:29
  • Thank you @mikuszefski; actually, I am working with a big database and I intended to exemplify the general behavior with only a dozen, trying to reproduce two plateaus, and I used this mathematical expression to reproduce a scientific paper. – Calandra Oct 05 '20 at 18:55
  • Hi. While technically speaking SO is not to give advice on your model-function, I think you made it by far too complicated. If you can state more precisely what you actually need, I am sure that this part can be improved as well. – mikuszefski Oct 06 '20 at 06:25

1 Answers1

0

Here is my version. Due to the fact that the linear functions should connect continuously, the parameters are actually less. The offsets b2 and b3, hence, are not fitted, but are a result of reacquiring the linear function to meet at the transitions. Moreover, one might argue that the data does not justify a slope in the beginning or end. This could be justified / checked via the reduced chi-square or other statistical methods.

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

XX=np.linspace( 7.5, 9.5, 16 )
YY=np.asarray( [
    7, 7, 7, 7.1, 7.3, 7.5, 8.4, 9, 9.3, 9.6,
    10.3, 10.2, 10.4, 10.5, 10.5, 10.5
])


yy0 = np.median( YY )
xx0 = np.median( XX )
h0 = 0.8 * min( XX ) + 0.2 * max( XX )
h1 = 0.8 * max( XX ) + 0.2 * min( XX )

def transition( x, x0, s ):
    return ( 0.5 * np.pi + np.arctan( ( x - x0 ) * s ) ) / np.pi

def box( x, x0, x1, s ):
    return transition( x, x0, s ) * transition( -x, -x1, s )

def my_piecewise( x, a1, a2, a3, b1, x0,  x1 ):
    S = 100
    b2 = ( a1 - a2 ) * x0 + b1 ### due to continuity
    b3 = ( a2 - a3 ) * x1 + b2 ### due to continuity
    out = transition( -x , -x0, S ) * ( a1 * x + b1 )
    out += box( x , x0, x1 , S ) * ( a2 * x + b2 )
    out += transition( x , x1, S ) * ( a3 * x + b3 )
    return out

def parameter_reduced( x, a2, b1, x0,  x1 ):
    return my_piecewise(x, 0, a2, 0, b1, x0,  x1 )

def alternative( x, x0, a, s, p, y0 ):
    out = np.arctan( np.abs( ( s * ( x - x0 ) ) )**p )**( 1.0 / p )
    out *= a * (x - x0 ) / np.abs( x - x0 )
    out += y0
    return out

xl=np.linspace( 7.2, 10, 150 )

sol, _ = curve_fit(
    my_piecewise, XX, YY, p0=[ 0, 1, 0, min( YY ), h0,  h1 ] 
)
fl = np.fromiter( ( my_piecewise(x, *sol ) for x in xl ), np.float )

rcp =  np.fromiter( ( y - my_piecewise(x, *sol ) for x,y in zip( XX, YY ) ), np.float )
rcp = sum( rcp**2 )

solr, _ = curve_fit(
    parameter_reduced, XX, YY, p0=[ 1, min(YY), h0,  h1 ]
)
rl = np.fromiter( ( parameter_reduced( x, *solr ) for x in xl ), np.float )

rcpr =  np.fromiter( ( y - parameter_reduced(x, *solr ) for x, y in zip( XX, YY ) ), np.float )
rcpr = sum( rcpr**2 )

guessa = [ xx0, max(YY)-min(YY), 1, 1, yy0 ]
sola, _ = curve_fit(  alternative, XX, YY, p0=guessa)
al = np.fromiter( ( alternative( x, *sola ) for x in xl ), np.float )

rca =  np.fromiter( ( y - alternative(x, *sola ) for x, y in zip( XX, YY ) ), np.float )
rca = sum( rca**2 )

print rcp, rcp / ( len(XX) - 6 )
print rcpr, rcpr / ( len(XX) - 4 )
print rca, rca / ( len(XX) - 5 )

fig = plt.figure( figsize=( 12, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( XX, YY , ls='', marker='o')
ax.plot( xl, fl )
ax.plot( xl, rl )
ax.plot( xl, al )
plt.show()

Results look okey for me.

fits

mikuszefski
  • 3,943
  • 1
  • 25
  • 38