1

How can I run sympy Solve for timeseries?

Lets say I want to Solve for X in the following equation (1+df['LTEPSG']) * df['TTMEPS']-X=0 and I want my solution for X to be a timeseries.

Dates       LTEPSG  TTMEPS  
2022-09-01  0.13434 219.603 
2022-10-01  0.13241 218.116 
2022-11-01  0.11709 217.181 
2022-12-01  0.10718 218.751 
2023-01-01  0.10084 218.109 

So the solution will look like:

Dates       LTEPSG  TTMEPS  Solution
2022-09-01  0.13434 219.603 249.104467
2022-10-01  0.13241 218.116 246.996740
2022-11-01  0.11709 217.181 242.610723
2022-12-01  0.10718 218.751 242.196732
2023-01-01  0.10084 218.109 240.103112

The equation I gave above is a simplified version of the problem I have, my main question is how to run Solve for timeseries.

EDIT: Per @no_hex request posting the actual equation I want to solve below:

Dates        SPX    US30    DIV1        DIV2        DIV3        DIV4        DIV5        DIVT
2022-09-01  3966.85 0.0337  74.695960   84.730615   96.113326   109.025190  123.671634  123.713312
2022-10-01  3678.43 0.0373  74.978836   84.906784   96.149292   108.880419  123.297276  123.343265
2022-11-01  3856.10 0.0414  73.660144   82.285010   91.919762   102.682647  114.705758  114.753246
2022-12-01  4076.57 0.0364  74.472697   82.454680   91.292173   101.076868  111.910287  111.951022
2023-01-01  3839.50 0.0397  73.966816   81.425629   89.636590   98.675544   108.625985  108.669110
 A=df['DIV1']/(1+df['US30']+X)**1
 B=df['DIV2']/(1+df['US30']+X)**2
 C=df['DIV3']/(1+df['US30']+X)**3
 D=df['DIV4']/(1+df['US30']+X)**4
 E=(df['DIV5'] + (df['DIVT']/X))/(1+df['US30']+X)**5

 df['SPX']-(A+B+C+D+E)=0

The equation is the last line of the code, need to solve for X. ABCDE are just here to simplify the viewing

  • If the equation is `(1+df['LTEPSG']) * df['TTMEPS']-X=0` then `X = (1+df['LTEPSG']) * df['TTMEPS']`, and you can add it as a new column, is this what you're looking for? why do you need sympy? – CodeCop Jan 13 '23 at 15:04
  • As I mentioned in the question, the equation I gave is a simplified version of the problem I have, my main question is how to run Solve for timeseries. – user20856754 Jan 13 '23 at 15:11
  • `df['Solution'] = (df['LTEPSG'] + 1) * df['TIMEPS']` will add a new column with the equation you supported, if the equation is more complicated, then provide it in the question.. – CodeCop Jan 13 '23 at 15:12
  • added the actual equation for you – user20856754 Jan 13 '23 at 15:31

1 Answers1

0

This is your equation:

In [24]: from sympy import *

In [25]: x = symbols('x')

In [26]: a, b, c, d, e, u, t, s = symbols('a, b, c, d, e, u, t, s')

In [27]: A = a/(1 + u + x)**1

In [28]: B = b/(1 + u + x)**2

In [29]: C = c/(1 + u + x)**3

In [30]: D = d/(1 + u + x)**4

In [31]: E = (e + t/x)/(1 + u + x)**5

In [32]: eq = s - (A+B+C+D+E)

In [33]: eq
Out[33]: 
                                                                      t    
                                                                  e + ─    
      a            b              c              d                    x    
- ───────── - ──────────── - ──────────── - ──────────── + s - ────────────
  u + x + 1              2              3              4                  5
              (u + x + 1)    (u + x + 1)    (u + x + 1)        (u + x + 1) 

If we could solve this symbolically then we could get a formula that you could use for a new column. There is no hope of solving this symbolically though as this will lead to a fully arbitrary symbolic polynomial equation of degree 6 and the Abel-Ruffini theorem prevents symbolic solutions to those in ordinary functions:

https://en.wikipedia.org/wiki/Abel%E2%80%93Ruffini_theorem

However given numerical values for all of the symbols you can easily compute the roots of the polynomial numerically. For this you will need to loop through all rows calling a numerical routine but we can try to do it more efficiently than just calling SymPy's symbolic solve function for each row. Let's turn it into a polynomial equation:

In [49]: together(eq)
Out[49]: 
                 4                  3                  2                                          5
- a⋅x⋅(u + x + 1)  - b⋅x⋅(u + x + 1)  - c⋅x⋅(u + x + 1)  - d⋅x⋅(u + x + 1) - e⋅x + s⋅x⋅(u + x + 1) - t
───────────────────────────────────────────────────────────────────────────────────────────────────────
                                                          5                                        
                                             x⋅(u + x + 1)                                         

In [50]: numer(together(eq))
Out[50]: 
                 4                  3                  2                                          5
- a⋅x⋅(u + x + 1)  - b⋅x⋅(u + x + 1)  - c⋅x⋅(u + x + 1)  - d⋅x⋅(u + x + 1) - e⋅x + s⋅x⋅(u + x + 1)  - t

In [51]: numer(together(eq)).as_poly(x)
Out[51]: 
Poly(s*x**6 + (-a + 5*s*u + 5*s)*x**5 + (-4*a*u - 4*a - b + 10*s*u**2 + 20*s*u + 10*s)*x**4 + (-6*a
*u**2 - 12*a*u - 6*a - 3*b*u - 3*b - c + 10*s*u**3 + 30*s*u**2 + 30*s*u + 10*s)*x**3 + (-4*a*u**3 -
 12*a*u**2 - 12*a*u - 4*a - 3*b*u**2 - 6*b*u - 3*b - 2*c*u - 2*c - d + 5*s*u**4 + 20*s*u**3 + 30*s*
u**2 + 20*s*u + 5*s)*x**2 + (-a*u**4 - 4*a*u**3 - 6*a*u**2 - 4*a*u - a - b*u**3 - 3*b*u**2 - 3*b*u 
- b - c*u**2 - 2*c*u - c - d*u - d - e + s*u**5 + 5*s*u**4 + 10*s*u**3 + 10*s*u**2 + 5*s*u + s)*x -
 t, x, domain='ZZ[s,t,u,a,b,c,d,e]')

Now we can get the coefficients of the polynomial and use lambdify to evaluate them efficiently:

In [52]: numer(together(eq)).as_poly(x).all_coeffs()
Out[52]: 
⎡                                              2                         2                         
⎣s, -a + 5⋅s⋅u + 5⋅s, -4⋅a⋅u - 4⋅a - b + 10⋅s⋅u  + 20⋅s⋅u + 10⋅s, - 6⋅a⋅u  - 12⋅a⋅u - 6⋅a - 3⋅b⋅u -

                 3         2                         3         2                       2           
 3⋅b - c + 10⋅s⋅u  + 30⋅s⋅u  + 30⋅s⋅u + 10⋅s, - 4⋅a⋅u  - 12⋅a⋅u  - 12⋅a⋅u - 4⋅a - 3⋅b⋅u  - 6⋅b⋅u - 

                             4         3         2                      4        3        2        
3⋅b - 2⋅c⋅u - 2⋅c - d + 5⋅s⋅u  + 20⋅s⋅u  + 30⋅s⋅u  + 20⋅s⋅u + 5⋅s, - a⋅u  - 4⋅a⋅u  - 6⋅a⋅u  - 4⋅a⋅u

          3        2                  2                                5        4         3        
 - a - b⋅u  - 3⋅b⋅u  - 3⋅b⋅u - b - c⋅u  - 2⋅c⋅u - c - d⋅u - d - e + s⋅u  + 5⋅s⋅u  + 10⋅s⋅u  + 10⋅s⋅

 2                ⎤
u  + 5⋅s⋅u + s, -t⎦

In [53]: coeffs = numer(together(eq)).as_poly(x).all_coeffs()

In [54]: coeffs_lam = lambdify([a, b, c, d, e, s, t, u], coeffs)

In [56]: coeffs_lam(1, 2, 3, 4, 5, 6, 7, 8)
Out[56]: [6, 269, 4822, 43197, 193370, 345991, -7]

Now that you have numeric polynomial coefficients numpy's np.roots function is the fastest way to get the roots (more accurate but slower methods could alternatively be used):

In [57]: np.roots(coeffs_lam(1, 2, 3, 4, 5, 6, 7, 8))
Out[57]: 
array([-9.69478098e+00+0.44599886j, -9.69478098e+00-0.44599886j,
       -8.86569067e+00+0.88529552j, -8.86569067e+00-0.88529552j,
       -7.71241026e+00+0.j        ,  2.02315114e-05+0.j        ])

The question doesn't acknowledge the possibility of there being multiple solutions but for almost all values of the parameters there will be 6 solutions. Some might be non-real but the number of real solutions will always be even so if there are any real solutions there will be at least 2 of them. You will need to decide for yourself how to choose which is one that you want and write some code to select it.

Oscar Benjamin
  • 12,649
  • 1
  • 12
  • 14
  • You can get a bit of vectorisation by calling `coeffs_lam` with whole columns as arguments. You will still need a loop to call `np.roots` though. – Oscar Benjamin Jan 13 '23 at 17:51
  • there is a unique solution. I can solve it in excel using Goal Seek function no problem, i am looking for a way to do it in python instead – user20856754 Jan 13 '23 at 17:54
  • No idea how goal seek works but the equation does not have a unique real solution (for almost all possible values of the parameters). – Oscar Benjamin Jan 13 '23 at 19:30