I'm used to using Excel for this kind of problem but I'm trying my hand at Python for now.
Basically I have two arrays: one is constant, and the other's values come from a user-defined function.
This is the function, simple enough.
import scipy.stats as sp

def calculate_probability(spread, std_dev):
    # Survival function: P(X > 0.5) for X ~ Normal(mean=spread, sd=std_dev)
    return sp.norm.sf(0.5, spread, std_dev)
I have two arrays of data: one whose entries are run through the calculate_probability function (these are the spreads), and the other a set of constants called expected_probabilities.
spreads = [10.5, 9.5, 10, 8.5]
expected_probabilities = [0.8091, 0.7785, 0.7708, 0.7692]
The function below is the one I am seeking to minimise.
import numpy as np

def calculate_mse(std_dev):
    spread_inputs = np.array(spreads)
    model_probabilities = calculate_probability(spread_inputs, std_dev)
    subtracted_vector = np.subtract(model_probabilities, expected_probabilities)
    vector_powered = np.power(subtracted_vector, 2)
    mse_sum = np.sum(vector_powered)
    return mse_sum / len(spreads)
I would like to find a value of std_dev such that calculate_mse returns a value as close to zero as possible. This is very easy in Excel using Solver, but I am not sure how to do it in Python. What is the best way?
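For reference, the Solver-style search described above might be sketched with scipy.optimize.minimize_scalar. This is only a sketch under stated assumptions: the choice of minimize_scalar, the "bounded" method, and the search interval of 1 to 50 are my own guesses and not part of the original setup. The data and the error function are the ones from the post.

```python
import numpy as np
import scipy.stats as sp
from scipy.optimize import minimize_scalar

spreads = np.array([10.5, 9.5, 10, 8.5])
expected_probabilities = np.array([0.8091, 0.7785, 0.7708, 0.7692])

def calculate_mse(std_dev):
    # Model probability P(X > 0.5) for each spread, then mean squared error
    model_probabilities = sp.norm.sf(0.5, spreads, std_dev)
    return np.mean((model_probabilities - expected_probabilities) ** 2)

# Assumption: a plausible std_dev lies somewhere between 1 and 50
result = minimize_scalar(calculate_mse, bounds=(1.0, 50.0), method="bounded")

best_std_dev = float(result.x)   # std_dev that minimises the MSE within the bounds
best_mse = float(result.fun)     # MSE at that std_dev
print(best_std_dev, best_mse)
```

Because only one parameter is being optimised, a scalar minimiser seems like a natural fit; scipy.optimize.minimize with an initial guess should also work if more parameters are added later.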
EDIT: I've changed my calculate_mse function so that it takes only a standard deviation as the parameter to be optimised. I've tried to expose Andrew's answer through an API using Flask, but I've run into some issues:
class Minimize(Resource):
    std_dev_guess = 12.0  # might have a better guess than zeros
    result = minimize(calculate_mse, std_dev_guess)

    def get(self):
        return {'data': result}, 200

api.add_resource(Minimize, '/minimize')
This is the error:
NameError: name 'result' is not defined
I guess something is wrong with the input?
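One possible explanation, sketched without Flask so it runs on its own: a name assigned in a class body becomes a class attribute, and a method cannot see it as a bare name, so it would have to be read as self.result. The plain class below is a hypothetical stand-in for the Resource subclass, and the stored number is a placeholder rather than a real optimisation result.

```python
class Minimize:  # hypothetical stand-in for the flask_restful Resource subclass
    # Placeholder value; in the post this would come from minimize(...)
    result = 11.7

    def get_broken(self):
        # Bare `result` is looked up as a local or global name, never in the
        # class body, so this raises NameError unless a global `result` exists
        return {'data': result}, 200

    def get(self):
        # Class attributes are reachable through the instance (or the class)
        return {'data': self.result}, 200


print(Minimize().get())
```

If this is indeed the cause, a separate thing to watch for is that the object returned by minimize is probably not JSON-serialisable as-is, so returning plain floats taken from it may be safer.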