I am trying to code an optimizer that finds the optimal constant parameters so as to minimize the MSE between an array y and a generic function over X. The generic function is given in prefix (pre-order) notation, so, for example, the function x1 + c*x2 over X would be represented as [+, x1, *, c, x2]. For that example, the objective would be to minimize:

sum_for_all_x (y - (x1 + c*x2))^2
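
For concreteness, here is a self-contained sketch (not the actual code from my class, just made-up names) of how such a prefix list maps to a value for one row of X:

import operator

def eval_prefix(tokens, x, c):
    # Evaluate a prefix expression such as ['+', 'x1', '*', 'c', 'x2'] for one
    # row x and one constant value c (illustrative toy version).
    ops = {'+': operator.add, '*': operator.mul}
    it = iter(tokens)

    def parse():
        tok = next(it)
        if tok in ops:
            return ops[tok](parse(), parse())
        if tok == 'c':
            return c
        return x[int(tok[1:]) - 1]   # 'x1' -> x[0], 'x2' -> x[1]

    return parse()

# eval_prefix(['+', 'x1', '*', 'c', 'x2'], x=[2.0, 5.0], c=3.0) == 17.0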

Below I show what I have done to solve the problem. Some things that should be known:

  1. X and y are torch tensors.
  2. constants is the list of values to be optimized.

    def loss(self, constants, X, y):

        stack = []   # Stack to save the partial results
        const = 0    # Index of constant to be used
        for idx in self.traversal[::-1]:   # Reverse the prefix notation
            if idx > Language.max_variables:   # If we are dealing with an operator
                function = Language.idx_to_token[idx]  # Get its associated function
                first_operand = stack.pop()    # Get first operand
                if function.arity == 1:   # If the arity of the operator is one (e.g. sin)
                    stack.append(function.function(first_operand))   # Append result
                else:   # Same but if arity is 2
                    second_operand = stack.pop()  # Need a second operand
                    stack.append(function.function(first_operand, second_operand))
                
            elif idx == 0:  # If it is a constant -> idx 0 indicates a constant
                stack.append(constants[const]*torch.ones(X.shape[0]))  # Append constant
                const += 1   # Update
            else:
                stack.append(X[:, idx - 1])   # Else append the associated column of X
        
        prediction = stack[0]
        return (y - prediction).pow(2).mean().cpu().numpy()


    def optimize_constants(self, X, y):
        '''
        Optimize the constants of the expression tree.
        '''
        if 0 not in self.traversal:  # If there are no constants to be optimized return
            return self.traversal
        
        x0 = [0 for i in range(len(self.constants))]  # Initial guess
        ini = time.time()
        res = minimize(self.loss, x0, args=(X, y), method='BFGS', options={'disp': True})
        print(res)
        print('Time:', time.time() - ini)

The problem is that the optimizer reports successful termination but does not iterate at all. The output res looks something like this:

Optimization terminated successfully.
         Current function value: 2.920725
         Iterations: 0
         Function evaluations: 2
         Gradient evaluations: 1
      fun: 2.9207253456115723
 hess_inv: array([[1]])
      jac: array([0.])
  message: 'Optimization terminated successfully.'
     nfev: 2
      nit: 0
     njev: 1
   status: 0
  success: True
        x: array([0.])

So far I have tried to:

  1. Change the method in the minimizer (e.g. Nelder-Mead, SLSQP, ...), but the same thing happens with all of them.
  2. Change the way I return the result (e.g. (y - prediction).pow(2).mean().item()).
  • try adding a couple of print statements to `loss` to see what's going on, one printing out `constants` and one printing out the value that will be returned. I'd guess that your loss function is constant, hence the optimiser says you're already at the minimum – Sam Mason Jun 22 '22 at 14:20
  • Const: [0. 0.] Loss: 32353817000000.0 Const: [1.49011612e-08 0.00000000e+00] Loss: 32353817000000.0 Const: [0.00000000e+00 1.49011612e-08] Loss: 32353817000000.0 Optimization terminated successfully. Current function value: 32353816674304 Iterations: 0 Function evaluations: 3 Gradient evaluations: 1 fun: 32353816674304.0 hess_inv: array([[1, 0], [0, 1]]) jac: array([0., 0.]) message: 'Optimization terminated successfully.' nfev: 3 nit: 0 njev: 1 status: 0 success: True x: array([0., 0.]) – Alex Ferrando Jun 22 '22 at 14:51
  • so yes, your function is constant everywhere it tried, so it gave up. `minimize` is doing what it's supposed to be doing. maybe simplify your `loss` function so it's more obvious what's actually being calculated – Sam Mason Jun 22 '22 at 15:06
  • Don't think that there is any way to simplify my loss function. Is there any other way to find the optimal parameters in this problem? – Alex Ferrando Jun 22 '22 at 15:54
  • by simplify, I mean take out all the "generic function" stuff, and just code it up directly. the aim is to help you understand what's going on inside the calculation, and why it's coming out with a constant value – Sam Mason Jun 22 '22 at 16:02
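
Following the suggestion in the last comment, a stripped-down version with the loss hard-coded for x1 + c*x2 and a print of every evaluation (toy data and made-up names, only meant to check whether the loss actually changes with the constant) would look roughly like this:

import torch
from scipy.optimize import minimize

X = torch.rand(100, 2)             # toy inputs
y = X[:, 0] + 3.0 * X[:, 1]        # target built with a "true" constant of 3.0

def direct_loss(constants, X, y):
    # Same objective as the generic loss, but hard-coded for x1 + c*x2
    prediction = X[:, 0] + constants[0] * X[:, 1]
    value = (y - prediction).pow(2).mean().item()
    print('Const:', constants, 'Loss:', value)
    return value

res = minimize(direct_loss, [0.0], args=(X, y), method='BFGS', options={'disp': True})

This takes the traversal logic out of the picture, so whatever remains wrong has to be in how the loss itself interacts with minimize.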

1 Answer


It seems that scipy.optimize.minimize does not work well with PyTorch tensors. Changing the code to use numpy ndarrays solved the problem.
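
A minimal sketch of the change, assuming a single constant and the example expression x1 + c*x2 (the conversion is done once, before calling minimize):

import numpy as np
from scipy.optimize import minimize

# Convert the tensors once, outside of the loss function
X_np = X.detach().cpu().numpy().astype(np.float64)
y_np = y.detach().cpu().numpy().astype(np.float64)

def loss_np(constants, X, y):
    # Same idea as the original loss, hard-coded here for x1 + c*x2
    prediction = X[:, 0] + constants[0] * X[:, 1]
    return np.mean((y - prediction) ** 2)

res = minimize(loss_np, [0.0], args=(X_np, y_np), method='BFGS')

A plausible reason this helps: numpy arrays are float64 by default while torch tensors are usually float32, and the tiny finite-difference steps BFGS takes (the ~1.49e-08 perturbations visible in the comments above) are too small to change a float32 computation, which would explain the constant loss and the zero iterations.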
