
I have a noisy optimization task: I am trying to find the parameters of a function (which assets to choose and their weights) that minimize its result (the tracking error, i.e. the difference between portfolio and index returns). The financial terms are not really crucial to finding a solution, IMO.

When I use optimization from mlrMBO, the result (y) is always far from both the theoretical value and what intuition suggests. The y reported by mbo even differs from the value I get when I pass the reported optimal parameters back into the same function.

I tried changing the learners, changing the settings of the parameter set, and recoding the function, but none of it helped.

Here is a minimal reproducible example (the results behave the same way every time and the problem persists, so I guess it is reproducible):

library(mlrMBO) # also attaches smoof and ParamHelpers

max_stocks <- 10 # maximum number of assets to choose

# provisional return data for reproducibility: each column of quarter.pool is one asset; stock_index holds the index returns
quarter.pool <- matrix(rnorm(1830, 0, 0.01), nrow = 61, ncol = 300) 
stock_index <- matrix(rnorm(61,0,0.02))

# function with 2 inputs of same length vectors and tracking error as output
obj_func <- function(stock_weights, which_stocks){

  quarter.pool_ <- quarter.pool[,which_stocks] #choosing which assets to invest in
  portfolio <- matrix(rep(0,nrow(quarter.pool_))) # empty index-tracking portfolio

  for (i in 1:nrow(quarter.pool_)) {
    portfolio[i] <- sum(quarter.pool_[i,]*stock_weights, na.rm = TRUE)}

  tracking_error <- sqrt(sum(portfolio-stock_index)^2/length(portfolio-1))
  return(tracking_error)}

obj_f <- makeSingleObjectiveFunction(name = "index_tracking",
                                     fn = obj_func, 
                                     par.set = makeParamSet( makeNumericVectorParam("stock_weights", len = max_stocks, lower = 0, upper = 0.25),
                                                             makeIntegerVectorParam("which_stocks", len = max_stocks, lower = 1, upper = ncol(quarter.pool))), 
                                     noisy = TRUE, 
                                     vectorized = TRUE)

ctrl <-  makeMBOControl(final.method = "best.true.y", propose.points = 1)
ctrl <- setMBOControlTermination(ctrl, iters = 20, target.fun.value = 0.0000001)

run <- mbo(fun = obj_f, control = ctrl)

obj_func(run$x$stock_weights, run$x$which_stocks) == run$y 
# the two values differ, even though the function itself works as usual

The results vary a lot whenever I rerun the script, but the main problem is twofold:

  1. The result should be close to 0, but it usually doesn't even go below 0.2. I don't even know whether that is possible.
  2. The "optimal" y from mbo differs from the result of plugging the "optimal" parameters into the same function. Is that because the y is predicted rather than real? Even so, the difference is large.

EDIT[1]: Apparently, when I removed the parameter "which_stocks", everything worked, so it must be the cause. Still, is there any way to keep it? Or can the algorithm just not handle such a hard task?

SquintRook
  • Remark regarding the setup: The dimensionality might be a bit too high for this MBO setup. Also, the function is deterministic, not noisy. If it were noisy, you should change `best.true.y` to `best.predicted.y` so that you do not select a point that merely performed well by chance. – jakob-r Oct 15 '19 at 08:29

1 Answer


Please see my comments regarding the setup. Technically, it works if you pass the parameters as a list and add has.simple.signature = FALSE to the smoof wrapper.

library(mlrMBO) # also attaches smoof and ParamHelpers

max_stocks <- 10 # maximum number of assets to choose

# provisional return data for reproducibility: each column of quarter.pool is one asset; stock_index holds the index returns
quarter.pool <- matrix(rnorm(61 * 300, 0, 0.01), nrow = 61, ncol = 300) # one draw per cell, so no columns are recycled
stock_index <- matrix(rnorm(61, 0, 0.02))

# function taking one named list with two vectors of the same length; returns the tracking error
obj_func <- function(x) {
  stock_weights <- x$stock_weights
  which_stocks <- x$which_stocks
  quarter.pool_ <- quarter.pool[, which_stocks] # choosing which assets to invest in
  portfolio <- matrix(rep(0, nrow(quarter.pool_))) # empty index-tracking portfolio

  # weighted portfolio return for each period
  for (i in 1:nrow(quarter.pool_)) {
    portfolio[i] <- sum(quarter.pool_[i, ] * stock_weights, na.rm = TRUE)
  }

  # root of the summed squared deviations from the index, divided by n - 1
  tracking_error <- sqrt(sum((portfolio - stock_index)^2) / (length(portfolio) - 1))
  return(tracking_error)
}

obj_f <- makeSingleObjectiveFunction(name = "index_tracking",
                                     fn = obj_func, 
                                     par.set = makeParamSet( makeNumericVectorParam("stock_weights", len = max_stocks, lower = 0, upper = 0.25),
                                                             makeIntegerVectorParam("which_stocks", len = max_stocks, lower = 1, upper = ncol(quarter.pool))), 
                                     noisy = FALSE, 
                                     vectorized = FALSE,
                                     has.simple.signature = FALSE
                                     )

ctrl <-  makeMBOControl(final.method = "best.true.y", propose.points = 1)
ctrl <- setMBOControlTermination(ctrl, iters = 10, target.fun.value = 0.0000001)

run <- mbo(fun = obj_f, control = ctrl)

obj_func(list(stock_weights = run$x$stock_weights, which_stocks = run$x$which_stocks)) == run$y 

What happened is that stock_weights and which_stocks were combined (by unlist()) into one vector and passed to your obj_func, which unfortunately still ran even though the second argument (which_stocks) was never supplied and the first vector was too long. This should be a reminder to always add feasibility checks on function arguments once a function gets a bit more complex, for example with the checkmate package. In short, you were not optimizing the true objective function.
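To make the failure mode concrete, here is a minimal sketch of how such a call can run silently in base R and how checkmate assertions would catch it. The toy matrix `pool`, the simplified objective `two_arg`, the wrapper `guarded`, and `n_assets` are all hypothetical names made up for illustration, and the `unlist()` call only imitates the flattening described above; it is not a reproduction of what mlrMBO does internally.

```r
library(checkmate)

# Toy stand-in for quarter.pool: 2 periods x 6 assets (illustrative values)
pool <- matrix(seq_len(12), nrow = 2, ncol = 6)

# Two-argument objective in the style of the question (simplified body)
two_arg <- function(stock_weights, which_stocks) {
  sum(pool[, which_stocks] * stock_weights)
}

params <- list(stock_weights = c(0.2, 0.2, 0.2), which_stocks = c(1L, 2L, 3L))

# Intended call: three selected columns, three weights
intended <- two_arg(params$stock_weights, params$which_stocks)

# Flattened call: which_stocks is missing, so pool[, ] keeps all six columns,
# and the 6-element vector is recycled over the 12 cells without a warning
silent_result <- two_arg(unlist(params))

# Hypothetical guarded variant that takes one named list and validates it
guarded <- function(x, n_assets = 3) {
  assert_list(x, names = "unique")
  assert_names(names(x), must.include = c("stock_weights", "which_stocks"))
  assert_numeric(x$stock_weights, lower = 0, upper = 0.25,
                 any.missing = FALSE, len = n_assets)
  assert_integerish(x$which_stocks, lower = 1, upper = ncol(pool),
                    any.missing = FALSE, len = n_assets)
  two_arg(x$stock_weights, x$which_stocks)
}

checked <- guarded(params) # same value as `intended`

# The flattened vector now fails loudly instead of returning a number
rejected <- tryCatch({ guarded(unlist(params)); FALSE }, error = function(e) TRUE)
```

Here `silent_result` differs from `intended` although neither call raises an error or a warning, which is the same kind of mismatch you saw between run$y and your manual evaluation. With the assertions at the top of the function, the malformed input stops the run at the first evaluation.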

jakob-r