I want to pass a list as an argument to optimize.newton. I have imported a csv file and stored each row in an array. The code for this looks like:

with open('rand1.csv','rb') as f:
    array=[]
    for line in f:
        array.append(line)

Now, if I look at array[1], it looks like: '2,6,76,45,78,1\r\n'

I have defined one function as:

def func(a,b,c,d,e,f):
    return a*b*c-d*e-f

And I am running the Newton method as:

res=[optimize.newton(func,5102,args=(x)) for x in array[0]]

But it gives me this error: TypeError: can only concatenate tuple (not "str") to tuple

Can someone help me here? I know that tuple elements have to be comma separated and I have tried writing args=(x,) too, but it didn't work.

ali_m

2 Answers

First, bear in mind that in your code, array is not actually a numpy array - it is a normal Python list of strings. It's possible to work with this list by splitting the strings and converting the elements to integers, as in Anmol_uppal's answer, but it's much simpler to convert the contents of the csv file directly to an nrows x 6 numpy array, e.g. using np.loadtxt:

import numpy as np

data = np.loadtxt('rand1.csv', delimiter=',', dtype=int)
print(repr(data[0]))
# array([ 2,  6, 76, 45, 78,  1])

Now when you call optimize.newton, the args= parameter should get a sequence of 6 parameter values. Your original code was not working because each row in array contained a single string, rather than 6 numerical values. Now that data* is an nrows x 6 array, each row will contain 6 numerical values, so you can now just do:

res = [optimize.newton(func, 5102, args=row) for row in data]

*Note that I've renamed your variable array to data to avoid confusion with the np.array class
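To see where the TypeError in the question comes from, the failing step can be reproduced without scipy. This is only a minimal sketch: the name first_row is hypothetical, and the concatenation imitates the way newton joins (x0,) with whatever it receives in args=.

```python
# One raw line from the csv file, as stored in the question's list
first_row = '2,6,76,45,78,1\r\n'

# Iterating over a string yields single characters, not numbers
chars = [x for x in first_row]
print(chars[:3])  # ['2', ',', '6']

# args=(x) is just x (parentheses alone do not make a tuple), so the
# concatenation below is effectively what fails inside newton
try:
    (5102,) + chars[0]
except TypeError as err:
    message = str(err)
    print(message)
```

So the list comprehension in the question was looping over the characters of the first line rather than over the rows of the file.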


Update

There was another error in your original code that I didn't spot initially. Take a look at the documentation for scipy.optimize.newton:

func : function

The function whose zero is wanted. It must be a function of a single variable of the form f(x,a,b,c...), where a,b,c... are extra arguments that can be passed in the args parameter.

x0 : float

An initial estimate of the zero that should be somewhere near the actual zero.

Now look at your function definition:

def func(a,b,c,d,e,f):
    return a*b*c-d*e-f

The first argument to func() (which you've called a) should correspond to the x parameter, then there are only 5 extra arguments (b ... f according to your definition) that need to be passed using args=. When you try to call

optimize.newton(func, 5102, args=(422, 858, 129, 312, 79, 371))

what happens is that 5102 is interpreted as the x0 parameter, and is passed as the first argument to func(). The 6 values in the args= tuple are treated as extra arguments, so your function actually gets 7 arguments in total:

func(5102, 422, 858, 129, 312, 79, 371)

Obviously, func() is defined as taking 6 arguments, so you get an error. The correct way to fix this depends on how you interpret the parameters of your function. The goal of newton is to find a value of x such that f(x, a, b, c, ...) = 0.

Which of your 6 parameters do you want newton to solve for?
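As a sketch of one possible fix, suppose the unknown is a and the other five parameters are fixed. That choice is an assumption made purely for illustration, and the five values in extras are made up rather than taken from your file.

```python
from scipy import optimize

def func(a, b, c, d, e, f):
    # a is the unknown that newton solves for; b..f are fixed extras
    return a * b * c - d * e - f

# Hypothetical values for b, c, d, e, f (exactly five extra arguments)
extras = (858, 129, 312, 79, 371)

root = optimize.newton(func, 1.0, args=extras)

# func is linear in a, so the root can also be computed by hand
expected = (312 * 79 + 371) / (858.0 * 129)
print(root, expected)
```

If a different parameter is the unknown, move it to the first position in the function signature and pass the remaining five in args=.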


Full explanation

A slightly more interesting question is why you don't get the error when you pass the extra arguments as an array (e.g. args=data[0]) instead of a tuple. The answer is a bit more complicated, but read on if you're interested.

If you take a look at the source code for scipy.optimize.newton you can find the line where your function gets called for the first time:

q0 = func(*((p0,) + args))

Here p0 is the x0 argument to newton(), and args is the tuple of extra arguments:

q0 = func(*((5102,) + (422, 858, 129, 312, 79, 371)))

(p0,) is a tuple, and if args is also a tuple then the + operator would just join these two tuples together:

q0 = func(*(5102, 422, 858, 129, 312, 79, 371))

Finally, the * unpacks the tuple to pass the arguments to func. The final call would look like this:

q0 = func(5102, 422, 858, 129, 312, 79, 371)

This will raise an error, since there are 7 arguments to a 6-argument function. However, when args is an np.array:

q0 = func(*((5102,) + array([422, 858, 129, 312, 79, 371])))

the + will add value p0 to each element in args:

q0 = func(*(5524, 5960, 5231, 5414, 5181, 5473))

Since there are now only 6 arguments going to func() the call will succeed, but newton will converge on the wrong answer!
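The difference between the two cases is easy to check interactively. A minimal sketch using the same numbers; the names x0_part, as_tuple and as_array are only for illustration.

```python
import numpy as np

x0_part = (5102,)
row = np.array([422, 858, 129, 312, 79, 371])

# tuple + tuple concatenates: 1 + 6 = 7 values
as_tuple = x0_part + tuple(row)
print(len(as_tuple))  # 7

# tuple + ndarray is handled by numpy as an element-wise addition,
# so the result still has only 6 values
as_array = x0_part + row
print(as_array.tolist())  # [5524, 5960, 5231, 5414, 5181, 5473]
```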

I think this is not particularly good design in scipy - it caught me out because in most other cases any array-like input will do, including lists, tuples, arrays etc. To be fair, it does say in the documentation for newton that args= should be a tuple, but I would still either do type-checking or cast it explicitly to a tuple for safety. I may try and fix this issue in scipy.
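Until then, the cast can be done on the caller's side. The helper below is only a sketch of that idea: as_args_tuple is a hypothetical name, not part of scipy, and treating a string as a single argument is my own assumption about what a caller would want.

```python
def as_args_tuple(args):
    """Coerce extra arguments to a plain tuple before passing them on."""
    if isinstance(args, tuple):
        return args
    if isinstance(args, str):
        # A string is iterable, but is almost certainly one argument
        return (args,)
    try:
        # Lists, numpy arrays and other iterables become tuples
        return tuple(args)
    except TypeError:
        # A bare scalar becomes a one-element tuple
        return (args,)

print(as_args_tuple([422, 858, 129]))  # (422, 858, 129)
print(as_args_tuple(5))                # (5,)
```

It would be used as optimize.newton(func, x0, args=as_args_tuple(row)), so that a numpy row is concatenated rather than added.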

ali_m
  • Thanks, this solves my problem a bit. But I have few more doubts here. If i run your code on a csv having 8 rows and 6 columns, it works fine. If I am correct, the array formed will be 8x6. The 'res' variable will have 8 values. But what if I want to run optimize code only for 1st row? I am not able to run this: res = [optimize.newton(func, 5102, args=row) for row in data[0]]. Also, if I run the code by specifying values manually, it is giving different value for res. please help. – Prateek Saxena Jan 27 '15 at 05:43
  • If you just want to run the optimization for the first row, you could do `res0 = optimize.newton(func, 5102, args=data[0])`, where `res0` will be a single scalar rather than a list. If you want to check that your array has the expected dimensions, check the value of the `data.shape` attribute. I don't know what you mean when you say that you are getting different results when you specify values manually. Perhaps you are indexing the wrong row in your array? Remember that in Python, indexing starts at 0 rather than 1. – ali_m Jan 27 '15 at 10:05
  • I am aware that index in Python starts from 0. Here is the situation. data[0] looks like: 'array([422, 858, 129, 312, 79, 371])'. When I run the code "optimize.newton(func, 5102, args=data[0])", the answer is -129 whereas if i run the same code as "optimize.newton(func, 5102, args=(422, 858, 129, 312, 79, 371))", it is giving me an error. – Prateek Saxena Jan 27 '15 at 11:02
  • I am waiting for your reply on this. I would be extremely thankful for your help. – Prateek Saxena Jan 27 '15 at 14:22
  • @PrateekSaxena I've amended my answer - there was another bug in your code that was not initially obvious to me because of (what I consider to be) a flaw in `optimize.newton`. The full explanation is a bit long, but the important points are that 1) you are passing too many arguments to `func()`, and 2) the `args=` parameter should actually be a tuple rather than an array (e.g. `args=tuple(data[0])`). – ali_m Jan 27 '15 at 22:16
  • This `newton` function looks like it could be run with plain Python, without `numpy`. It doesn't do anything with arrays. – hpaulj Jan 27 '15 at 23:38
  • @hpaulj yes, my intent was only to simplify reading the csv file – ali_m Jan 27 '15 at 23:58
  • Loading parameters directly from a `csv` file is a specialized case, and shouldn't have much influence on how `newton` (and similar functions like ODE solvers) handle arguments. Loading a `csv` and running `newton` are distinct tasks that need to be understood independently. – hpaulj Jan 28 '15 at 19:17

First of all, you need to remove the trailing '\r\n', which you can do with .strip(). That leaves a string in which the desired elements are separated by commas, so you can use the .split() method and pass it the character to split on. Finally, we use the map() function, which takes a function as its first argument (int in this case) and a list or tuple as its second, and applies that function to each element.

line = '2,6,76,45,78,1\r\n'
line_stripped = line.strip()
print(repr(line_stripped))
# '2,6,76,45,78,1'

line_splitted = line_stripped.split(",")
print(line_splitted)
# ['2', '6', '76', '45', '78', '1']

# list() is needed on Python 3, where map() returns an iterator
line_integers = list(map(int, line_splitted))
print(line_integers)
# [2, 6, 76, 45, 78, 1]

Combining all the above steps we can cleanly write it as :

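# Text mode, so that each line is a str rather than bytes on Python 3
with open('rand1.csv') as f:
    array = []
    for line in f:
        # list() is needed on Python 3, where map() returns an iterator
        array.append(list(map(int, line.strip().split(','))))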
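The standard library's csv module can also do the splitting for you. This is a sketch that assumes every field in the file is an integer. io.StringIO stands in for the real file so that the example is self-contained, and its contents are made up.

```python
import csv
import io

# Stand-in for open('rand1.csv', newline=''); the contents are made up
fake_file = io.StringIO('2,6,76,45,78,1\r\n422,858,129,312,79,371\r\n')

rows = []
for fields in csv.reader(fake_file):
    if fields:  # skip blank lines
        # Store each row as a tuple of ints
        rows.append(tuple(int(value) for value in fields))

print(rows)
# [(2, 6, 76, 45, 78, 1), (422, 858, 129, 312, 79, 371)]
```

Storing each row as a tuple means it can be passed to args= as it is.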
ZdaR
  • 22,343
  • 7
  • 66
  • 87
  • This helped me get rid of the '\r\n', but I am still not able to run my optimize.newton function. Any suggestions here? – Prateek Saxena Jan 22 '15 at 14:53