I'm currently running Bayesian optimization code written in C++. I use a toolbox called BayesOpt by Ruben Martinez-Cantin (http://rmcantin.bitbucket.org/html/). I'm writing my thesis on Bayesian optimization (https://en.wikipedia.org/wiki/Bayesian_optimization).

I had previously experimented with this toolbox, and this week I noticed that the code runs a lot slower than I remembered. It's worth mentioning that I have written some code of my own that works with this toolbox.

I investigated and confirmed that the code really is running much slower than it should.

To find out whether my own code was at fault, I tried an example that doesn't use any of it.

Consider the following example:

#include <cmath>      // M_PI, std::sin, std::pow
#include <iostream>
#include <bayesopt.hpp>


class ExampleMichalewicz: public bayesopt::ContinuousModel
{
public:
  ExampleMichalewicz(bopt_params par);

  double evaluateSample(const vectord& x);
  bool checkReachability(const vectord &query) {return true;}

  void printOptimal();

private:
  double mExp;
};


ExampleMichalewicz::ExampleMichalewicz(bopt_params par):
  ContinuousModel(10,par) 
{
  mExp = 10;
}

double ExampleMichalewicz::evaluateSample(const vectord& x)
{
  size_t dim = x.size();
  double sum = 0.0;

  for(size_t i = 0; i<dim; ++i)
    {
      double frac = x(i)*x(i)*(i+1);
      frac /= M_PI;
      sum += std::sin(x(i)) * std::pow(std::sin(frac),2*mExp);
    }
  return -sum;
}

void ExampleMichalewicz::printOptimal()
{
  std::cout << "Solutions: " << std::endl;
  std::cout << "f(x)=-1.8013 (n=2)"<< std::endl;
  std::cout << "f(x)=-4.687658 (n=5)"<< std::endl;
  std::cout << "f(x)=-9.66015 (n=10);" << std::endl;
}

int main(int nargs, char *args[])
{
  bopt_params par = initialize_parameters_to_default();
  par.n_iterations = 20;
  par.n_init_samples = 30;
  par.random_seed = 0;
  par.verbose_level = 1;
  par.noise = 1e-10;
  par.kernel.name        = "kMaternARD5";
  par.crit_name          = "cBEI";
  par.crit_params[0] =   1;
  par.crit_params[1] =   0.1;
  par.n_crit_params  =   2;
  par.epsilon        =   0.0;
  par.force_jump     =   0.000;
  par.verbose_level  =   1;
  par.n_iter_relearn     =     1; // Number of samples before relearn kernel
  par.init_method        =     1; // Sampling method for initial set 1-LHS, 2-Sobol (if available),
  par.l_type             = L_MCMC; // Type of learning for the kernel params


  ExampleMichalewicz michalewicz(par);
  vectord result(10);

  michalewicz.optimize(result);
  std::cout << "Result: " << result << "->" 
      << michalewicz.evaluateSample(result) << std::endl;
  michalewicz.printOptimal();

  return 0;
}

If I compile this example alone, the run time is about 23 seconds.

This is the CMake file I use:

PROJECT ( myDemo )

ADD_EXECUTABLE(myDemo ./main.cpp)

find_package( Boost REQUIRED )
if(Boost_FOUND)
   include_directories(${Boost_INCLUDE_DIRS})
else(Boost_FOUND)
   find_library(Boost boost PATHS /opt/local/lib)
   include_directories(${Boost_LIBRARY_PATH})
endif()

include_directories(${PROJECT_SOURCE_DIR}/include)
include_directories("../bayesopt/include")
include_directories("../bayesopt/utils")

set(CMAKE_CXX_FLAGS " -Wall -std=c++11 -lpthread -Wno-unused-local-typedefs -DNDEBUG -DBOOST_UBLAS_NDEBUG")

target_link_libraries(myDemo libbayesopt.a libnlopt.a)

Now consider the same main example, but with three additional files added to my CMake project (without including them in main.cpp). These three files are a subset of my own code.

PROJECT ( myDemo )

ADD_EXECUTABLE(myDemo ./iCubSimulator.cpp ./src/DatasetDist.cpp ./src/MeanModelDist.cpp ./src/TGPNode.cpp)

find_package( Boost REQUIRED )
if(Boost_FOUND)
   include_directories(${Boost_INCLUDE_DIRS})
else(Boost_FOUND)
   find_library(Boost boost PATHS /opt/local/lib)
   include_directories(${Boost_LIBRARY_PATH})
endif()

include_directories(${PROJECT_SOURCE_DIR}/include)
include_directories("../bayesopt/include")
include_directories("../bayesopt/utils")

set(CMAKE_CXX_FLAGS " -Wall -std=c++11 -lpthread -Wno-unused-local-typedefs -DNDEBUG -DBOOST_UBLAS_NDEBUG")

target_link_libraries(myDemo libbayesopt.a libnlopt.a)

This time, the run time is about 3 minutes. This is critical for my work, because the slowdown gets much worse as I increase par.n_iterations.
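To compare the two builds on equal terms, the timing can be taken around the optimize call itself rather than around the whole process. A minimal sketch, assuming C++11 and using a hypothetical helper named `timed_seconds` (it is not part of BayesOpt):

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of BayesOpt): runs `work` once and
// returns the elapsed wall-clock time in seconds.
template <typename F>
double timed_seconds(F&& work)
{
  const auto start = std::chrono::steady_clock::now();
  std::forward<F>(work)();   // invoke the callable exactly once
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

In the example above it could wrap the call as `timed_seconds([&]{ michalewicz.optimize(result); })`, so that both CMake configurations report the time of the same piece of work.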

I then found that if I comment out one line in TGPNode.cpp,

utils::cholesky_decompose(K,L);

(note that this line is never executed), the run time goes back to 23 seconds. This function is defined in ublas_cholesky.hpp, a file from the BayesOpt toolbox.

It is also important to note that the same function is called inside the toolbox code itself. That call is not commented out, and it runs during michalewicz.optimize(result);.

Does anyone have any idea why this is happening? Any insight would be a great help.

Greatly appreciated.

Kindly, José Nogueira

Ze Nog
  • Did you use a debugger (`gdb`)? Did you compile (`g++`) with all warnings & debug info (`-Wall -Wextra -g`)? With optimizations (`-O2 -march=native`)? With profiling (`-pg`) then `gprof` and/or `oprofile`? – Basile Starynkevitch Nov 05 '15 at 16:41
  • Also, I suspect that one of those files has some kind of global static variable whose initialization is taking some time. But that's just a guess. – Anon Mail Nov 05 '15 at 16:52
  • _"I have noticed this week that the code is running a lot slower than I remembered"_ - Can you go back to the version that you used previously? – sehe Nov 06 '15 at 20:57

1 Answer

It's not gonna return.

It's going to infinitely recurse (to a stack overflow).

Here's what the code reads like:

bopt_params initialize_parameters_to_default(void)
{
  bayesopt::Parameters par;
  return par.generate_bopt_params();

And generate_bopt_params:

bopt_params Parameters::generate_bopt_params(){
    bopt_params c_params = initialize_parameters_to_default();

It looks like someone tried to remove code duplication without actually testing things. At all. You could reinstate the commented-out body of the first function.
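A minimal sketch of how such a cycle can be broken, assuming the defaults live in exactly one place and the other function only copies them. Every name here (`CParams`, `Parameters`, `to_c_params`, `default_c_params`) and every value is a hypothetical stand-in, not BayesOpt's actual code:

```cpp
// Hypothetical stand-in for the C-style parameter struct.
struct CParams {
  int    n_iterations;
  int    n_init_samples;
  double noise;
};

// Hypothetical C++ wrapper: the defaults are stored here and nowhere else.
// The values are arbitrary placeholders.
class Parameters {
public:
  int    n_iterations   = 10;
  int    n_init_samples = 5;
  double noise          = 1e-6;

  CParams to_c_params() const;
};

// Copies the wrapper's own fields into the C struct. It does NOT call
// default_c_params(), so the call chain terminates.
CParams Parameters::to_c_params() const
{
  CParams c{};
  c.n_iterations   = n_iterations;
  c.n_init_samples = n_init_samples;
  c.noise          = noise;
  return c;
}

// The C-style entry point delegates in one direction only.
CParams default_c_params()
{
  Parameters par;
  return par.to_c_params();
}
```

The point is only the direction of the dependency: one function owns the values and the other forwards to it, never both ways.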

sehe
  • Mmm. The revision that broke this was [4561d213e4 oct 28](https://bitbucket.org/rmcantin/bayesopt/commits/4561d213e4dd2b6aeb82dfd8153c46b59a421ba2). Looking further since you seem to have had an older version. – sehe Nov 06 '15 at 21:06