I recently started using the pybrain library for classification problems with neural networks, and with some struggle and help from the documentation I got it working.
Now I would like to use the black-box optimization algorithms from the same library, but not applied to classification.
Basically, I am trying to reproduce a result from Randy Olson's blog post: http://www.randalolson.com/2015/02/03/heres-waldo-computing-the-optimal-search-strategy-for-finding-waldo/.
So, as a first step, I constructed a supervised dataset with the following snippet:
def build_dataset(waldo_df):
    # build a 2-input / 2-target dataset: (Book, Page) -> (X, Y)
    ds = SupervisedDataSet(2, 2)
    for row in range(len(waldo_df)):
        ds.addSample(inp=waldo_df.iloc[row][['Book', 'Page']],
                     target=waldo_df.iloc[row][['X', 'Y']])
    return ds
Now, one sample from the dataset looks like:
ds.getSample()
[array([ 5., 8.]), array([ 3.51388889, 4.31944444])]
As the next step, I would like to use the HillClimber algorithm to find the optimal path:
ef = ds.evaluateModuleMSE
init_value = ds.getSample()
learner = HillClimber(evaluator=ef, initEvaluable=init_value, minimize=True)
learner.learn()
What I get back is an exception:
/Users/maestro/anaconda/lib/python2.7/site-packages/pybrain/datasets/supervised.pyc in evaluateModuleMSE(self, module, averageOver, **args)
96 res = 0.
97 for dummy in range(averageOver):
---> 98 module.reset()
99 res += self.evaluateMSE(module.activate, **args)
100 return res/averageOver
AttributeError: 'numpy.ndarray' object has no attribute 'reset'
Can someone help me figure out what I am doing wrong? The documentation on this is very sparse, and even searching through the codebase did not help.
Thanks
P.S. If I am reading the API correctly:
class pybrain.optimization.HillClimber(evaluator=None, initEvaluable=None, **kwargs)
The simplest kind of stochastic search: hill-climbing in the fitness landscape.
the optimization algorithm only needs to take an evaluator, which in my case would be ds.evaluateModuleMSE.
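Judging from the traceback, evaluateModuleMSE calls module.reset() and module.activate, so it seems to expect a network module as initEvaluable rather than a raw sample from the dataset. A minimal sketch of what I think that pairing would look like, reusing ds from above (the buildNetwork topology is just my guess):

from pybrain.tools.shortcuts import buildNetwork
from pybrain.optimization import HillClimber

# Guessing a network with 2 inputs and 2 outputs to match the dataset;
# the hidden layer size here is arbitrary.
net = buildNetwork(2, 4, 2)

# evaluateModuleMSE would then score the network's parameters,
# not an (input, target) pair drawn from the dataset.
learner = HillClimber(evaluator=ds.evaluateModuleMSE, initEvaluable=net, minimize=True)
best_net, best_mse = learner.learn()

Is that the intended usage, and if so, how would it apply to the path optimization I am actually after?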
Update
The whole code snippet is:
import pandas as pd
from pybrain.optimization import HillClimber
from pybrain.datasets import SupervisedDataSet
waldo_df = pd.read_csv('whereis-waldo-locations.csv')
ds = SupervisedDataSet(2, 2)
for row in range(len(waldo_df)):
    ds.addSample(inp=waldo_df.iloc[row][['Book', 'Page']],
                 target=waldo_df.iloc[row][['X', 'Y']])
learner = HillClimber(evaluator=ds.evaluateModuleMSE, initEvaluable=ds.getSample(), minimize=True)