Solving 3 linear equations based on DataFrame - unclear error code

Question

I have the following dataframe:

 station       xf        yf  ...        yp        xm        ym
         1  0.386532  0.269953  ... -0.427596  0.501989  0.545583
         2  0.329727  0.240086  ... -0.350937  0.556123  0.539533
         3  0.560896  0.241310  ... -0.438103  0.600259  0.566153

with 8 variables in total (xf, yf, xs, ys, xp, yp, xm, ym).

I want to solve a set of three equations (unknowns are as, ap, am) based on these variables, iterating over the rows:

(1) ap*xp + am*xm + as*xs = xf 

(2) ap*yp + am*ym + as*ys = yf 

(3) ap + am + as = 1

Here's the relevant part of my script:

from sympy import *
a_p, a_m, a_s = symbols('a_p a_m a_s')
for s in range (len(station)):
  eq1 = Eq(a_p*xp + a_m*xm + a_s*xs, xf)
  eq2 = Eq(a_p*yp + a_m*ym + a_s*ys, yf)
  eq3 = Eq(a_p+a_m+a_s, 1)
  solve((eq1, eq2, eq3), (a_p, a_m, a_s))

Doing this results in a very long error that I don't understand:

Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/user/Desktop/PyCharmProject/relative_importance.py", line 43, in <module>
    eq1 = Eq(a_p*xp + a_m*xm + a_s*xs, xf)
  File "/Users/user/Desktop/PyCharmProject/venv/lib/python3.10/site-packages/sympy/core/relational.py", line 626, in __new__
    lhs = _sympify(lhs)
  File "/Users/user/Desktop/PyCharmProject/venv/lib/python3.10/site-packages/sympy/core/sympify.py", line 529, in _sympify
    return sympify(a, strict=True)
  File "/Users/user/Desktop/PyCharmProject/venv/lib/python3.10/site-packages/sympy/core/sympify.py", line 450, in sympify
    raise SympifyError(a)
sympy.core.sympify.SympifyError: SympifyError: 0       0.501989171514667*a_m - 0.491164753801456*a_p ...
1       0.556122642496865*a_m - 0.536273571001629*a_p ...
2       0.600259383842623*a_m - 0.463743647539074*a_p ...
3       0.527434386544111*a_m - 0.55189941080158*a_p +...
4       0.614672129443805*a_m - 0.502872339578653*a_p ...
                              ...                        
4376    0.657505926330861*a_m - 0.571314782222311*a_p ...
4377    0.514830107248954*a_m - 0.624890307991312*a_p ...
4378    0.417734347565796*a_m - 0.552231558373007*a_p ...
4379    0.619699526378333*a_m - 0.564445066398518*a_p ...
4380    0.615655457259689*a_m - 0.445460476959232*a_p ...
Name: x, Length: 4381, dtype: object
/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydevd_bundle/pydevd_utils.py:606: FutureWarning: iteritems is deprecated and will be removed in a future version. Use .items instead.
  for item in s.iteritems():

It would seem that you have not split up your ```symbols``` values. Change this to: ```['a_p','a_m','a_s']``` It would seem that everything else is done correctly. — thmasXX, Nov 10 '22 at 12:01
The pertinent part of the script is missing -- how did you assign values to your x and y values? Also, since the system is linear and easy to solve, you might solve it once and then use substitution to substitute in the values of interest. — smichr, Nov 10 '22 at 12:25
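The "solve once, then substitute" idea from the comment above could look roughly like the sketch below. The numbers in `values` are made-up placeholders, not rows from the question's dataframe, and the name `values` itself is only illustrative.

```python
from sympy import Eq, solve, symbols

a_p, a_m, a_s = symbols("a_p a_m a_s")
xf, yf, xs, ys, xp, yp, xm, ym = symbols("xf yf xs ys xp yp xm ym")

# Solve the linear system symbolically, once.
sol = solve(
    (
        Eq(a_p*xp + a_m*xm + a_s*xs, xf),
        Eq(a_p*yp + a_m*ym + a_s*ys, yf),
        Eq(a_p + a_m + a_s, 1),
    ),
    (a_p, a_m, a_s),
)

# Hypothetical values for a single station (assumption, not real data).
values = {xf: 0.4, yf: 0.3, xs: 0.1, ys: 0.9,
          xp: -0.5, yp: -0.4, xm: 0.5, ym: 0.5}

# Substitute the numbers into each symbolic solution.
numeric = {unknown: expr.subs(values) for unknown, expr in sol.items()}
print(numeric)
```

Substitution is convenient for a handful of rows; for thousands of rows a numerical function is usually faster than calling `subs` repeatedly.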

score 2 · Answer 1 · answered Nov 10 '22 at 12:38

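I think you were using entire columns instead of single values: inside your loop, xp, xm, xs and so on are whole pandas Series, so a_p*xp + a_m*xm + a_s*xs is itself a Series (the 4381-row object printed in the error), and Eq cannot convert that into a SymPy expression. Here is how I would do it: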

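import pandas as pd
import numpy as np
from sympy import symbols, Eq, solve, lambdify

# random test data (10 rows) standing in for the real dataframe
d = {k.strip(): np.random.uniform(0, 1, 10) for k in "xf, yf, xs, ys, xp, yp, xm, ym".split(",")}
df = pd.DataFrame(d)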

# solve your system of linear equations only once
a_p, a_m, a_s = symbols('a_p a_m a_s')
xp, xm, xs, xf = symbols('xp xm xs xf')
yp, ym, ys, yf = symbols('yp ym ys yf')
eq1 = Eq(a_p*xp + a_m*xm + a_s*xs, xf)
eq2 = Eq(a_p*yp + a_m*ym + a_s*ys, yf)
eq3 = Eq(a_p+a_m+a_s, 1)
sol = solve((eq1, eq2, eq3), (a_p, a_m, a_s))

# convert symbolic solutions to numerical functions
# note the order of symbols is the same as the columns
# of the dataframe
fap = lambdify([xf, yf, xs, ys, xp, yp, xm, ym], sol[a_p])
fam = lambdify([xf, yf, xs, ys, xp, yp, xm, ym], sol[a_m])
fas = lambdify([xf, yf, xs, ys, xp, yp, xm, ym], sol[a_s])

# evaluate the functions over the dataframe
for index, row in df.iterrows():
    # unpack the row positionally; this relies on the column order
    # matching the lambdify argument order, and avoids overwriting
    # the symbols xf, yf, ... with numbers
    print("a_p: {}\ta_m: {}\ta_s: {}".format(
        fap(*row),
        fam(*row),
        fas(*row),
    ))
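For a large dataframe, the row loop can probably be avoided: functions created by `lambdify` with the NumPy backend accept whole arrays, so each column can be passed in one call. The sketch below is a self-contained variant under that assumption; the random test data, the seed, and the helper names `cols` and `f` are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd
from sympy import Eq, lambdify, solve, symbols

cols = ["xf", "yf", "xs", "ys", "xp", "yp", "xm", "ym"]
rng = np.random.default_rng(0)
# Random stand-in data; replace with the real dataframe.
df = pd.DataFrame(rng.uniform(0, 1, (10, len(cols))), columns=cols)

a_p, a_m, a_s = symbols("a_p a_m a_s")
xf, yf, xs, ys, xp, yp, xm, ym = symbols(" ".join(cols))

# Solve the system symbolically, once.
sol = solve(
    (
        Eq(a_p*xp + a_m*xm + a_s*xs, xf),
        Eq(a_p*yp + a_m*ym + a_s*ys, yf),
        Eq(a_p + a_m + a_s, 1),
    ),
    (a_p, a_m, a_s),
)

# One numerical function returning all three unknowns.
f = lambdify([xf, yf, xs, ys, xp, yp, xm, ym],
             [sol[a_p], sol[a_m], sol[a_s]], "numpy")

# Pass each column as an array; the results are arrays with one value per row.
arrays = [df[c].to_numpy() for c in cols]
df["a_p"], df["a_m"], df["a_s"] = f(*arrays)
print(df[["a_p", "a_m", "a_s"]])
```

Selecting the columns by name also removes the dependence on the dataframe's column order, which matters if the real data has extra columns such as station.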

I am having a hard time understanding what this line does: ```d = {k.strip(): np.random.uniform(0, 1, 10) for k in "xf, yf, xs, ys, xp, yp, xm, ym".split(",")}```. Also, why does it return 10 rows, when my original dataframe has 4381? — Cec35, Nov 11 '22 at 18:54
With that line I'm just creating a dictionary containing random data that will be used to create a test dataframe. — Davide_sd, Nov 11 '22 at 18:57
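To make the exchange above concrete, the dictionary comprehension can be unrolled into an ordinary loop. This is only an equivalent rewrite for illustration; `names` and `n_rows` are names introduced here. The 10 rows come from the third argument of `np.random.uniform`, not from anything in the original data.

```python
import numpy as np
import pandas as pd

# Splitting on "," leaves a leading space on most names: ['xf', ' yf', ' xs', ...]
names = "xf, yf, xs, ys, xp, yp, xm, ym".split(",")
n_rows = 10  # number of random samples per column

d = {}
for k in names:
    # strip() removes the stray space; each column gets n_rows values in [0, 1)
    d[k.strip()] = np.random.uniform(0, 1, n_rows)

df = pd.DataFrame(d)
print(df.shape)  # (10, 8)
```

With a real dataframe this step is skipped entirely, and the rest of the answer runs over however many rows that dataframe has.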
