
I'm learning to use the Python DEAP module and I have created a minimising fitness function and an evaluation function. The code I am using for the fitness function is below:

ct.create("FitnessFunc", base.Fitness, weights=(-0.0001, -100000.0))

Notice the very large difference in weights. This is because the DEAP documentation for Fitness says:

The weights can also be used to vary the importance of each objective one against another. This means that the weights can be any real number and only the sign is used to determine if a maximization or minimization is done.

To me, this says that you can prioritise one weight over another by making it larger.


I'm using algorithms.eaSimple (with a HallOfFame) to evolve and the best individuals in the population are selected with tools.selTournament.

The evaluation function returns abs(sum(input)), len(input). After running, I take the individuals from the HallOfFame and evaluate them. However, the output is something like the following (numbers at end of line added by me):

(154.2830144, 3)            1
(365.6353634, 4)            2
(390.50576340000003, 3)     3
(390.50576340000003, 14)    4
(417.37616340000005, 4)     5

The thing that is confusing me is that I thought that the documentation stated that the larger second weight meant that len(input) would have a larger influence and would result in an output like so:

(154.2830144, 3)            1
(365.6353634, 4)            2
(390.50576340000003, 3)     3
(417.37616340000005, 4)     5
(390.50576340000003, 14)    4

Notice that lines 4 and 5 are swapped. I expected this because the second value of line 4 (14), once multiplied by its large weight, is much worse than the second value of line 5 (4).

It appears that the fitness is actually evaluated based on the first element first, and then the second element is only considered if there is a tie between the first elements. If this is the case, then what is the purpose of setting a weight other than -1 or +1?

JolonB

1 Answer

From a Pareto-optimality standpoint, neither of the two solutions A=(390.50576340000003, 14) and B=(417.37616340000005, 4) is superior to the other, regardless of the weights. In terms of the weighted values (each objective multiplied by its negative weight, so larger is better), f1(A) > f1(B) and f2(A) < f2(B) always hold, and therefore neither dominates the other (source):

Source: Wikipedia
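To make the dominance argument concrete, here is a minimal sketch in plain Python (it does not import DEAP). The helper names `weighted` and `dominates` are my own, and the code assumes that each objective is multiplied by its weight and that larger weighted values are better:

```python
WEIGHTS = (-0.0001, -100000.0)

def weighted(values, weights=WEIGHTS):
    """Multiply each objective by its weight (hypothetical helper)."""
    return tuple(v * w for v, w in zip(values, weights))

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better

A = weighted((390.50576340000003, 14))
B = weighted((417.37616340000005, 4))
C = weighted((390.50576340000003, 3))  # line 3 of the question's output

print(dominates(A, B), dominates(B, A))  # neither dominates the other
print(dominates(C, A))                   # same first objective, better second
```

Changing the magnitudes of the weights rescales both tuples but never changes which element is larger, so the dominance result stays the same as long as the signs are kept.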

If they are on the same frontier, the winner can be selected based on a secondary metric: the density of solutions surrounding each solution in the frontier, which now accounts for the weights (weighted crowding distance). That is what happens if you select an appropriate operator, such as selNSGA2. The selTournament operator you are using selects on the basis of the first objective only:

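from operator import attrgetter
from deap.tools import selRandom

def selTournament(individuals, k, tournsize, fit_attr="fitness"):
    chosen = []
    for _ in range(k):
        # Draw tournsize random aspirants, then keep the one with the best fitness.
        aspirants = selRandom(individuals, tournsize)
        chosen.append(max(aspirants, key=attrgetter(fit_attr)))
    return chosen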
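The ordering you observed can be reproduced without DEAP. The sketch below assumes that fitnesses are compared as tuples of weighted values, so the second objective only matters when the first is tied. The helper `wvalues` is a stand-in of my own, not DEAP's implementation:

```python
hall = [
    (154.2830144, 3),
    (365.6353634, 4),
    (390.50576340000003, 3),
    (390.50576340000003, 14),
    (417.37616340000005, 4),
]

def wvalues(values, weights):
    """Weighted objective tuple; larger compares as better (assumption)."""
    return tuple(v * w for v, w in zip(values, weights))

# Rank best-first with the weights from the question.
ranked = sorted(hall, key=lambda v: wvalues(v, (-0.0001, -100000.0)), reverse=True)

# Rank again with unit weights: tuple comparison gives the same order.
ranked_unit = sorted(hall, key=lambda v: wvalues(v, (-1.0, -1.0)), reverse=True)

print(ranked == hall, ranked_unit == hall)
```

Under this assumption, the magnitude of a weight has no effect on a tournament-style comparison. Only the sign and the position of the objective matter.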

If you still want to use that, you can consider updating your evaluation function to return a single output of the weighted sum of the objectives. This approach would fail in the case of a non-convex objective space though (Page 12 here for details).
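A minimal sketch of that weighted-sum idea follows. The constant `LEN_PENALTY` and the helper `scalarise` are hypothetical, and the value 10.0 is chosen only to illustrate the ordering asked for in the question, not as a recommendation:

```python
LEN_PENALTY = 10.0  # hypothetical trade-off between |sum| and length

def scalarise(total, length, penalty=LEN_PENALTY):
    """Collapse both objectives into one cost to minimise."""
    return total + penalty * length

def evaluate(individual):
    # DEAP expects a tuple, so a single objective is returned as a 1-tuple.
    # It would pair with a fitness created using weights=(-1.0,).
    return (scalarise(abs(sum(individual)), len(individual)),)

hall = [
    (154.2830144, 3),
    (365.6353634, 4),
    (390.50576340000003, 3),
    (390.50576340000003, 14),
    (417.37616340000005, 4),
]

# Lowest cost first: the length-14 individual now ranks last.
ranked = sorted(hall, key=lambda pair: scalarise(*pair))
for pair in ranked:
    print(pair, scalarise(*pair))
```

With a single scalar cost, the relative size of the two terms really does control the trade-off, which is the behaviour the question expected from the weights.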


Reveille