
I am trying to tune some params and the search space is very large. I have 5 dimensions so far and it will probably increase to about 10. The issue is that I think I can get a significant speedup if I can figure out how to multi-process it, but I can't find any good ways to do it. I am using hyperopt and I can't figure out how to make it use more than 1 core. Here is the code that I have without all the irrelevant stuff:

from numpy    import random
from pandas   import DataFrame
from hyperopt import fmin, tpe, hp, Trials





def calc_result(x):

    huge_df = DataFrame(random.randn(100000, 5), columns=['A', 'B', 'C', 'D', 'E'])

    total = 0

    # Assume that I MUST iterate
    for idx, row in huge_df.iterrows():


        # Assume there is no way to optimize here
        curr_sum = row['A'] * x['adjustment_1'] + \
                   row['B'] * x['adjustment_2'] + \
                   row['C'] * x['adjustment_3'] + \
                   row['D'] * x['adjustment_4'] + \
                   row['E'] * x['adjustment_5']


        total += curr_sum

    # In real life I want the total as high as possible, but for the minimizer, it has to be a negative value
    total_as_neg = total * -1

    print(total_as_neg)

    return total_as_neg


space = {'adjustment_1': hp.quniform('adjustment_1', 0, 1, 0.001),
         'adjustment_2': hp.quniform('adjustment_2', 0, 1, 0.001),
         'adjustment_3': hp.quniform('adjustment_3', 0, 1, 0.001),
         'adjustment_4': hp.quniform('adjustment_4', 0, 1, 0.001),
         'adjustment_5': hp.quniform('adjustment_5', 0, 1, 0.001)}

trials = Trials()

best = fmin(fn        = calc_result,
            space     = space,
            algo      = tpe.suggest,
            max_evals = 20000,
            trials    = trials)

As of now, I have 4 cores but I can basically get as many as I need. How can I get hyperopt to use more than 1 core, or is there a library that can multiprocess?

user1367204

4 Answers


If you have a Mac or Linux machine (or the Windows Subsystem for Linux), you can add about 10 lines of code to do this in parallel with ray. If you install ray via the latest wheels here, then you can run your script with minimal modifications, shown below, to do parallel/distributed hyperparameter search with HyperOpt. At a high level, it runs fmin with tpe.suggest while creating a Trials object internally in a parallel fashion.

from numpy    import random
from pandas   import DataFrame
from hyperopt import fmin, tpe, hp, Trials


def calc_result(x, reporter):  # add a reporter param here

    huge_df = DataFrame(random.randn(100000, 5), columns=['A', 'B', 'C', 'D', 'E'])

    total = 0

    # Assume that I MUST iterate
    for idx, row in huge_df.iterrows():


        # Assume there is no way to optimize here
        curr_sum = row['A'] * x['adjustment_1'] + \
                   row['B'] * x['adjustment_2'] + \
                   row['C'] * x['adjustment_3'] + \
                   row['D'] * x['adjustment_4'] + \
                   row['E'] * x['adjustment_5']


        total += curr_sum

    # In real life I want the total as high as possible, but for the minimizer, it has to be a negative value
    # total_as_neg = total * -1

    # print(total_as_neg)

    # Ray will negate this by itself to feed into HyperOpt,
    # so report the raw total instead of returning the negated value
    reporter(timesteps_total=1, episode_reward_mean=total)


space = {'adjustment_1': hp.quniform('adjustment_1', 0, 1, 0.001),
         'adjustment_2': hp.quniform('adjustment_2', 0, 1, 0.001),
         'adjustment_3': hp.quniform('adjustment_3', 0, 1, 0.001),
         'adjustment_4': hp.quniform('adjustment_4', 0, 1, 0.001),
         'adjustment_5': hp.quniform('adjustment_5', 0, 1, 0.001)}

import ray
import ray.tune as tune
from ray.tune.hpo_scheduler import HyperOptScheduler

ray.init()
tune.register_trainable("calc_result", calc_result)
tune.run_experiments({"experiment": {
    "run": "calc_result",
    "repeat": 20000,
    "config": {"space": space}}}, scheduler=HyperOptScheduler())
richliaw
  • I just tried ray (0.6.0) and sadly it requires tensorflow. – Wojciech Migda Dec 23 '18 at 22:12
  • What was the error message that you got? It might be benign (the TF logger not starting correctly, but Ray Tune continues to run). – richliaw Dec 26 '18 at 09:13
  • I don't remember exactly (I deleted ray version of my script since then), but it was aborting trying to import tensorflow. – Wojciech Migda Dec 26 '18 at 23:18
  • I see - let me know if you ever revisit it; I'd be interested in resolving your issue. – richliaw Dec 28 '18 at 07:49
  • Is this still the correct way to parallelize `hyperopt` trials with `ray`? I can't get the example to work and in the [documentation](https://docs.ray.io/en/latest/tune/api_docs/suggestion.html#hyperopt-tune-suggest-hyperopt-hyperoptsearch) I do not find a reference to parallelization. – fabian Sep 12 '20 at 12:07
  • Try something like this - https://docs.ray.io/en/latest/tune/tutorials/tune-tutorial.html#search-algorithms-in-tune – richliaw Sep 12 '20 at 17:23

You can use multiprocessing to run tasks in separate processes, each with its own Python interpreter and its own Global Interpreter Lock, so they run truly in parallel on the available cores.

To run a multiprocessing task, instantiate a Pool and have this object execute a map function over an iterable.

The map function simply applies a function to every element of an iterable, such as a list, and returns a list of the results.

As an example, this filters all items larger than five out of a list, splitting the work across processes:

from itertools import chain
from multiprocessing import Pool

def filter_gt_5(chunk):
    # Return every element of the chunk that is larger than 5
    return [i for i in chunk if i > 5]

if __name__ == '__main__':
    p = Pool(4)
    a_list = [6, 5, 4, 3, 7, 8, 10, 9, 2]
    # Find a better way to split your list.
    chunks = [a_list[:3], a_list[3:6], a_list[6:]]
    lists = p.map(filter_gt_5, chunks)
    # This will join the sub-lists into one.
    filtered_list = list(chain(*lists))

In your case, you would have to split your search base.

Gabriel Fernandez
  • The issue is that `hyperopt` keeps track of the best combination of parameters and uses the best combo as suggestions on where to search next, so if I do what you suggested then each thread does not refer to the same best combo but to different ones. It needs to handle the shared object, which your suggestion doesn't consider and I'm not technically competent enough to write. – user1367204 Mar 19 '18 at 19:56
  • It seems like the `fmin` function does the search work. I don't really know how to run it concurrently (since it will use `space` as a shared object, I suspect that it is not possible). The only way would be to get the source of this search function and modify to run the task with multiprocessing. But how much of the actual work does it take? Is the workload generation heavy? It can definitely be divided. – Gabriel Fernandez Mar 19 '18 at 20:04
  • This particular example would easily take days to converge to an answer. – user1367204 Mar 19 '18 at 20:05
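The sequential dependence described in these comments can be sketched in plain Python: each new trial depends on all previous results, so the loop cannot simply be split across processes without changing the algorithm. The `suggest_next` function below is a purely illustrative stand-in for TPE, not hyperopt's actual logic:

```python
def objective(x):
    # Toy objective with its minimum at x = 0.3
    return (x - 0.3) ** 2

def suggest_next(history):
    # Illustrative stand-in for TPE: propose the midpoint of the
    # two best points seen so far (NOT what hyperopt really does)
    if len(history) < 2:
        return len(history) * 0.5
    best_two = sorted(history, key=lambda p: p[1])[:2]
    return (best_two[0][0] + best_two[1][0]) / 2

history = []
for _ in range(10):
    x = suggest_next(history)   # depends on ALL previous trials
    history.append((x, objective(x)))

best_x = min(history, key=lambda p: p[1])[0]
print(best_x)
```

Because every call to `suggest_next` needs the full history, the iterations are inherently serial; parallel schemes work by relaxing exactly this dependence (e.g. evaluating a batch of suggestions at once, at some cost in adaptiveness).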

What you are asking for can be achieved by using SparkTrials() instead of Trials() from hyperopt.

Refer to the documentation here.

SparkTrials API :
SparkTrials may be configured via 3 arguments, all of which are optional:

parallelism

The maximum number of trials to evaluate concurrently. Greater parallelism allows scale-out testing of more hyperparameter settings. Defaults to the number of Spark executors.

Trade-offs: The parallelism parameter can be set in conjunction with the max_evals parameter in fmin(). Hyperopt will test max_evals total settings for your hyperparameters, in batches of size parallelism. If parallelism = max_evals, then Hyperopt will do Random Search: it will select all hyperparameter settings to test independently and then evaluate them in parallel. If parallelism = 1, then Hyperopt can make full use of adaptive algorithms like Tree of Parzen Estimators (TPE) which iteratively explore the hyperparameter space: each new hyperparameter setting tested will be chosen based on previous results. Setting parallelism in between 1 and max_evals allows you to trade off scalability (getting results faster) and adaptiveness (sometimes getting better models).
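To make the trade-off concrete, here is a small plain-Python illustration of how max_evals is consumed in batches of size parallelism (no Spark needed; the numbers are hypothetical):

```python
import math

max_evals = 32  # total hyperparameter settings hyperopt will test
for parallelism in (1, 8, 32):
    batches = math.ceil(max_evals / parallelism)
    if parallelism == 1:
        mode = "fully adaptive (TPE)"
    elif parallelism >= max_evals:
        mode = "effectively random search"
    else:
        mode = "partially adaptive"
    print(f"parallelism={parallelism:>2}: {batches} batch(es) of trials, {mode}")
```

With parallelism=8 the 32 evaluations run as 4 batches, so each batch after the first can still learn from earlier results.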

Limits: There is currently a hard cap on parallelism of 128. SparkTrials will also check the cluster’s configuration to see how many concurrent tasks Spark will allow; if parallelism exceeds this maximum, SparkTrials will reduce parallelism to this maximum.

Code snippet:

from hyperopt import SparkTrials, fmin, hp, tpe, STATUS_OK

spark_trials = SparkTrials(parallelism=4)  # e.g. set this to the number of cores

best_hyperparameters = fmin(
  fn=train,
  space=search_space,
  algo=algo,
  max_evals=32,
  trials=spark_trials)

Another useful reference:

  • For me this rises `Exception: SparkTrials cannot import pyspark classes. Make sure that PySpark is available in your environment. E.g., try running 'import pyspark'` and the suggested `import pyspark` raises `ModuleNotFoundError: No module named 'pyspark'`. Any suggestions how to solve this? – NicoH Nov 09 '21 at 16:12
  • This isn't actually using `spark_trials` – John Stud Sep 10 '22 at 07:50

Just some side-notes on your question. I have recently been working on hyperparameter search too; if you have your own reasons for your setup, please just ignore me.

The thing is, you should prefer random search over grid search.

Here is the paper where they proposed this.

And here is some explanation: a grid tries each hyperparameter at only a few distinct values, while random search samples a different value along every dimension in every trial, so it covers the important dimensions much more densely. That is why random search tends to be the way to go.

(Image from http://cs231n.github.io/neural-networks-3/)
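The intuition can be reproduced with a tiny self-contained experiment: a toy 2-D objective where only one dimension matters. A 3×3 grid only ever tries 3 distinct values of the important dimension, while 9 random samples try 9 distinct values (everything here is hypothetical toy code, not hyperopt):

```python
import random

def objective(x, y):
    # Only x matters; y is an unimportant hyperparameter.
    return -(x - 0.7) ** 2

# Grid search: a 3x3 grid tests only 3 distinct x values (0.0, 0.5, 1.0)
grid_points = [(x / 2, y / 2) for x in range(3) for y in range(3)]
best_grid = max(objective(x, y) for x, y in grid_points)

# Random search: 9 samples test 9 distinct x values
random.seed(0)
random_points = [(random.random(), random.random()) for _ in range(9)]
best_random = max(objective(x, y) for x, y in random_points)

print(best_grid, best_random)
```

With the same budget of 9 evaluations, the random sample almost always lands an x closer to the optimum at 0.7 than the best grid value of 0.5.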

mrk
  • Hyperopt is in most cases better than random search, because it chooses its next combination of parameters based on all scoring results you have at that moment. It just makes smarter choices of parameters. – Sander van den Oord May 22 '19 at 11:27