I am trying to solve a nonlinear optimization problem very similar to the one described here: https://aws-ml-blog.s3.amazonaws.com/artifacts/prevent_churn_by_optimizing_incentives/preventing_customer_churn_by_optimizing_incentives.html. It has a nonlinear objective function and a linear constraint. My problem has about 1.5 million variables (one per customer). Apologies in advance: I cannot share a sample of the data because it contains personal information. However, my data is quite similar to the example, apart from the problem size. (I tried to linearize the objective but have not been successful yet.)
I first used Gekko (with remote=False and IPOPT), as suggested in the article, but after running for 10-12 hours on an AWS ml-m4-16xlarge instance it did not return a solution. I then tried a smaller problem (about 100K variables) and the result was the same. I also tried CyIPOPT, but it fails with memory errors. To try a simpler case, I took the example recommended here (GEKKO Exception: @error: Max Equation Length (Number of variables greater than 100k)) and increased the problem size to 100K variables, but the kernel became unresponsive with Gekko. What steps could I try next to fix this issue?
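For reference, here is the problem written out from the sample code further down in this post. The symbols and the bounds of 0 and 50 are taken from that sample code (with x_i the incentive offered to customer i), so treat this as a sketch of the structure rather than the exact production model:

```latex
\max_{x}\ \sum_{i=1}^{N} \bigl(1 - e^{-\gamma_i x_i}\bigr)\bigl(\alpha_i P_i - x_i\bigr)
\quad \text{s.t.} \quad \sum_{i=1}^{N} x_i \le C, \qquad 0 \le x_i \le 50
```

The objective is a sum of terms that each depend on a single x_i, and the only coupling between variables is the budget constraint.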
Update based on first comment:
Thank you.
I first tried a small example with 500 variables (the code for my problem, with sample data, is attached below) and it ran successfully. However, I noticed that despite specifying IPOPT, the model used APOPT (images below).
When I increased the size N to 5000, I got the Max Equation Length exception described in the other thread. I then tried to modify the problem as you suggested there.
But at the bottom of the solution output the solver is reported as IPOPT:
from gekko import GEKKO
import numpy as np
import pandas as pd
N = 500  # number of customers (decision variables)
alpha = np.random.uniform(low=0.01, high=0.99, size=(N,))
P = np.random.uniform(low=2, high=250, size=(N,))
gamma = np.ones(N)
# gamma = 10 where alpha > 0.80 or alpha < 0.40; gamma = 1 elsewhere
indices_gamma_eq_zero = np.union1d(np.where(alpha > 0.80)[0], np.where(alpha < 0.40)[0])
gamma[indices_gamma_eq_zero] = 10
m = GEKKO(remote=False)
m.options.SOLVER = 3 #IPOPT Solver
m.options.IMODE = 3
C = 100
# variable array dimension
#create array
#x = m.Array(m.Var,N)
#for i in range(N):
# x[i].value = C / N
# x[i].lower = 0
# x[i].upper = 50
#initialize variable
x = np.array([m.Var(lb=0,ub=50) for i in range(N)])
#constraints
m.Equation(m.sum(list(x))<=C)
beta = [1 - m.exp(-gamma[i] * x[i]) for i in range(N)]
ival = [m.Intermediate(beta[i] * (alpha[i] * P[i] - x[i])) for i in range(N)]
m.Obj(-sum(ival))
# minimize expected cost /maximize expected profit
m.solve()
print(x)
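As a sanity check that does not depend on any solver, the objective and constraints above can be evaluated directly for a candidate allocation. The following is a minimal plain-Python sketch; the helper names `expected_profit` and `is_feasible` are hypothetical, the default bounds of 0 and 50 mirror the sample code, and the input values are made up for illustration:

```python
import math

def expected_profit(x, alpha, P, gamma):
    """Sum of (1 - exp(-gamma_i * x_i)) * (alpha_i * P_i - x_i) over all customers."""
    return sum(
        (1.0 - math.exp(-g * xi)) * (a * p - xi)
        for xi, a, p, g in zip(x, alpha, P, gamma)
    )

def is_feasible(x, C, lower=0.0, upper=50.0, tol=1e-8):
    """Check the bounds lower <= x_i <= upper and the budget sum(x) <= C."""
    in_bounds = all(lower - tol <= xi <= upper + tol for xi in x)
    return in_bounds and sum(x) <= C + tol

# Tiny hand-made example (illustrative values, not real data)
alpha = [0.9, 0.5]
P = [100.0, 10.0]
gamma = [10.0, 1.0]
x = [1.0, 0.0]
print(expected_profit(x, alpha, P, gamma), is_feasible(x, C=100))
```

Values returned by a solver could be passed in as plain lists in the same way, which makes it possible to compare runs at different problem sizes independently of the solver's own report.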
This is the slightly revised code:
from gekko import GEKKO
import numpy as np
import pandas as pd
N = 5000  # number of customers (decision variables)
alpha = np.random.uniform(low=0.01, high=0.99, size=(N,))
P = np.random.uniform(low=2, high=250, size=(N,))
gamma = np.ones(N)
# gamma = 10 where alpha > 0.80 or alpha < 0.40; gamma = 1 elsewhere
indices_gamma_eq_zero = np.union1d(np.where(alpha > 0.80)[0], np.where(alpha < 0.40)[0])
gamma[indices_gamma_eq_zero] = 10
m = GEKKO(remote=False)
m.options.SOLVER = 3 #IPOPT Solver
m.options.IMODE = 3
C = 500
#initialize variable
x = np.array([m.Var(lb=0,ub=50) for i in range(N)])
#constraints
m.Equation(m.sum(list(x))<=C)
#objective
alpha=pd.DataFrame(alpha,columns=['alpha'])
P=pd.DataFrame(P,columns=['P'])
gamma=pd.DataFrame(gamma,columns=['gamma'])
dataset= pd.concat([alpha,P,gamma],axis=1)
P = dataset['P'].values
alpha = dataset['alpha'].values
gamma = dataset['gamma'].values
beta = [1 - m.exp(-gamma[i] * x[i]) for i in range(N)]
[m.Maximize(beta * (alpha * P - x))]
#optimization
m.solve(disp=True)
But now I get this error: