I am trying to use scipy.optimize.newton to solve for a value on each row of a pandas dataframe.

First, I create the dataframe (below). Second, I define the function Px. Third, I define another function, YieldCalc, which uses scipy.optimize.newton to find the value of Rate for which Px = 0. I then try to add that value to a new column 'Yield' but get the following error. Any help would be much appreciated. Thanks in advance.

from pandas import *
import pandas as pd
from scipy import *
import scipy
import timeit   
#In:
#Creating Dataframe
df = DataFrame(list([100,2,34.1556,9,100]))
df = DataFrame.transpose(df)
df = df.rename(columns={0:'Face',1:'Freq',2:'N',3:'C',4:'Mkt_Price'})
df2= df
df = concat([df, df2])
df

#Out:
Face  Freq    N          C  Mkt_Price
100    2     34.1556     9    100
100    2     34.1556     9    100


#In:
Face = df['Face']
Freq = df['Freq']
N = df['N']
C = df['C']
Mkt_Price = df['Mkt_Price']


def Px(Rate):
    return Mkt_Price - (Face * ( 1 + Rate / Freq ) ** ( - N ) + ( C / Rate ) * ( 1 - (1 + ( Rate / Freq )) ** -N ) )

def YieldCalc():
    return scipy.optimize.newton(Px, .1, tol=.0001, maxiter=100)
df['Yield'] = YieldCalc()

Error/Output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-89-f4961d3f817b> in <module>()
     12 def YieldCalc(Rate):
     13     return scipy.optimize.newton(Px, .1, tol=.0001, maxiter=100)
---> 14 df['Yield'] = YieldCalc(.05)

<ipython-input-89-f4961d3f817b> in YieldCalc(Rate)
     11 
     12 def YieldCalc(Rate):
---> 13     return scipy.optimize.newton(Px, .1, tol=.0001, maxiter=100)
     14 df['Yield'] = YieldCalc(.05)

C:\Users\rebortz\Anaconda\lib\site-packages\scipy\optimize\zeros.pyc in newton(func, x0, fprime, args, tol, maxiter, fprime2)
    145         q1 = func(*((p1,) + args))
    146         for iter in range(maxiter):
--> 147             if q1 == q0:
    148                 if p1 != p0:
    149                     msg = "Tolerance of %s reached" % (p1 - p0)

C:\Users\rebortz\Anaconda\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
    674         raise ValueError("The truth value of a {0} is ambiguous. "
    675                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 676                          .format(self.__class__.__name__))
    677 
    678     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
FooBar
Wade Bratz
  • Does adding `.values` to your columns (as in `Mkt_price.values`) fix the error? Iirc, standard boolean operations with pandas dataframes are not the same as with numpy matrices. – FooBar Jul 15 '14 at 14:19
  • Just tried it. Did not work. – Wade Bratz Jul 15 '14 at 14:22
  • Also, if you insert 'df['Yield'] = Px(.05)', it creates the new column with the Px() for that Rate. So i think it is something to do with the YieldCalc formula. – Wade Bratz Jul 15 '14 at 14:24
  • First, "Did not work" is not a useful statement for further investigation. Second, if your method doesn't work with `numpy` matrices either, it's not a `pandas`-specific problem. Try stripping it down until it works and figure out what exactly causes the problem. Perhaps restating the simpler problem (in the context of `numpy`/`scipy`) will get you additional responses under those tags. – FooBar Jul 15 '14 at 14:25
  • This was also rather a preemptive comment. You may still get useful feedback the way this is stated, but this is not a very typical use case within `pandas`. Within `numpy`, surely. – FooBar Jul 15 '14 at 14:26
  • Noted - I am a new poster so I thank you for your patience. Will restate and try to make it clearer. Thanks again. – Wade Bratz Jul 15 '14 at 14:28

1 Answer

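Part of the trick here is that what you get back from df['Face'] is not a single value or even a plain array: it is a pandas Series, still tied to pandas. Px therefore returns a Series, and when newton evaluates `if q1 == q0:` (the line flagged in your traceback), pandas refuses to reduce that element-wise comparison to a single True or False.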
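The `ValueError` from the question can be reproduced in isolation. This is a minimal sketch using two made-up Series (`s` and `t` are hypothetical and do not appear in the question); it only shows that an `if` on a Series comparison raises the same kind of error.

```python
import pandas as pd

# Two hypothetical Series, standing in for the q0 and q1 values inside newton.
s = pd.Series([1.0, 1.0])
t = pd.Series([1.0, 2.0])

comparison = s == t   # element-wise: a Series of booleans, not a single bool

try:
    if comparison:    # same kind of test as `if q1 == q0:` in the traceback
        pass
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```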

You can, as suggested, get at the raw data by way of .values and feed that into a function.

Alternatively, pandas data frames have an .apply method that lets you take a function and run it over every row or column.

I put the following at the end of the code that you posted (commenting out the offending line first):

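def Foo(thing, Rate):
    # thing is one row of df, passed in as a Series; 'Face' is its first entry
    return thing['Face'] * Rate

# axis=1 applies Foo to each row; args supplies the extra Rate argument
df['Yield'] = df.apply(Foo, axis=1, args=(0.1,))
df.head()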

Here the .apply method passes the function Foo all of the entries in a given row of df as a series, along with the argument 0.1. The axis specification is what makes this run row by row (axis=0 would go column by column instead).
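To see the difference between the two axis settings, here is a small sketch on a made-up two-column frame (`frame`, `per_row` and `per_col` are illustrative names, not part of the question's code).

```python
import pandas as pd

# Hypothetical frame with two of the question's columns.
frame = pd.DataFrame({'Face': [100.0, 100.0], 'C': [9.0, 9.0]})

# axis=1: the function sees one row at a time, indexed by column name.
per_row = frame.apply(lambda row: row['Face'] + row['C'], axis=1)

# axis=0: the function sees one column at a time, indexed by row label.
per_col = frame.apply(lambda col: col.sum(), axis=0)

print(per_row.tolist())
print(per_col.to_dict())
```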

Just reorganize Px to accept Rate and the series of values from df (in that order). Then have YieldCalc accept that series as well. You'll also need an args= entry in the newton call so that the series is forwarded to Px while newton hunts for the zero.

The flow should be:

`.apply` makes a series `thing` out of each row of df and passes it to YieldCalc. YieldCalc runs newton on `Px(Rate, thing)` to find the `Rate` at which Px returns 0. Then all of those results get put into your new Yield column.
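Put together, that flow might look like the sketch below. It makes a few assumptions: the frame is built directly from a dict rather than by transpose and concat (so the row labels are 0 and 1 instead of 0 and 0), the row argument is named `row` rather than `thing`, and the repeated discount factor is pulled into a local variable. The formula itself is the one from the question.

```python
import pandas as pd
from scipy import optimize

# Same values as the two-row frame in the question, built directly (assumption).
df = pd.DataFrame({
    'Face': [100.0, 100.0],
    'Freq': [2.0, 2.0],
    'N': [34.1556, 34.1556],
    'C': [9.0, 9.0],
    'Mkt_Price': [100.0, 100.0],
})

def Px(Rate, row):
    # Formula from the question, evaluated on one row's scalar values.
    discount = (1 + Rate / row['Freq']) ** (-row['N'])
    model_price = row['Face'] * discount + (row['C'] / Rate) * (1 - discount)
    return row['Mkt_Price'] - model_price

def YieldCalc(row):
    # newton calls Px(Rate, row): args=(row,) is appended after the Rate guess.
    return optimize.newton(Px, 0.1, args=(row,), tol=1e-4, maxiter=100)

# axis=1 hands each row to YieldCalc; the returned roots fill the new column.
df['Yield'] = df.apply(YieldCalc, axis=1)
print(df)
```

Because each call now works on scalars, the comparison inside newton yields a plain boolean and the ambiguity error should not arise.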

Dan