
I am currently working on an NP-complete problem and have implemented my own genetic algorithm for it. The results are better than I expected. With a well-designed fitness function and a carefully tuned population size and mutation rate, I guess a GA can be an excellent tool in certain cases.

Anyway, I am now looking for a metaheuristic (GA, simulated annealing...) capable of producing an optimal shuffling output.

By shuffle, I mean, in this context, an unbiased (à la Fisher-Yates) random permutation of a finite set. Like a card deck. A huge one (~ 500! permutations).

The values of this set are all different. No collisions are to be expected.

Because of this constraint, I am having difficulty implementing a GA solution. Indeed, the shuffled values cannot be used directly as genes. It is easy to see why:

#include <iostream>
#include <vector>

#define SPLICING 50 // 50|50 one-point crossover

// Take the gene from parent A before the splice point, from parent B after it.
int crossover(int gene, int DNA_length, int A, int B)
{
    if (gene < (SPLICING * DNA_length) / 100) return A;
    return B;
}

int main()
{
    std::vector<int> A = { 3, 4, 8, 12, 2, 0, 9, 7, 10, 20 };
    std::vector<int> B = { 8, 10, 3, 4, 20, 0, 7, 9, 2, 12 };
    std::vector<int> C;
    int DNA_length = int(A.size());
    for (int i = 0; i < DNA_length; i++) {
        C.push_back(crossover(i, DNA_length, A[i], B[i]));
        if (i == DNA_length / 2) std::cout << "| "; // mark the splice point
        std::cout << C[i] << " ";
    }
}

Output: 3 4 8 12 2 | 0 7 9 2 12

There are two collisions (2, 12).

My expected output is something like this: 3 4 8 12 2 | 0 7 9 10 20 (no collision, perfect shuffle of the original set).

So I need to encode the order of these values to avoid this kind of difficulty.

A naive way is to identify each value with a unique key. But the set then created is an ordinal one because it refers to the sequencing of the values.

It appears that the crossover function has to deal with the ordinality of the parents' DNA. But I cannot wrap my head around the issue of mixing two nonlinearly ordered ordinal subsets (the parents' DNA slices) of an ordinal set (the whole DNA) without collision!

Maybe I can rely only on mutation for convergence. No selection, no parents/children, and only a swap function in the same set (individual's DNA). In short: not very convincing.
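For reference, a mutation-only move of this kind could look like the sketch below. The name `swap_mutation` and its signature are assumptions made for illustration. The point is only that exchanging two positions within one individual can never create a collision.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Swap mutation (illustrative name): exchange two randomly chosen positions.
// The result is always a permutation of the input, so no value is duplicated
// or lost.
template <typename Rng>
void swap_mutation(std::vector<int>& dna, Rng& rng) {
    assert(dna.size() >= 2 && "need at least two genes to swap");
    std::uniform_int_distribution<std::size_t> pick(0, dna.size() - 1);
    const std::size_t i = pick(rng);
    std::size_t j = pick(rng);
    while (j == i) j = pick(rng);  // force two distinct positions
    std::swap(dna[i], dna[j]);
}
```

Called as `swap_mutation(dna, rng)` with, for example, a `std::mt19937 rng`, it changes exactly two positions of `dna` when all values are distinct.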

It is indeed easy to permute ordinal numbers within a single finite set (e.g., trivially: the first becomes the seventh; the second, the tenth, etc.). But I am not sure it makes sense to say that the first of set A becomes the seventh while the second of set B becomes the tenth of the new set.

Then, my question is:

In your opinion, can the ordinality of a set be shuffled using a crossover function in the context of a genetic algorithm? If not, can you suggest a metaheuristic approach that is more efficient for this purpose than brute force, hill climbing, or a genetic algorithm?

Thank you.

  • Mating of the Chromosomes would generally result in collisions in future generations, which would produce many unfit solutions. Have you considered manipulating the offspring (perhaps before or after mutation) to ensure there are no collisions in the solution? Perhaps the collision genes could be randomly assigned or assigned serially based on the parents with original placements. It's not the ideal for GAs, but it may assist in removing all the unfit solutions. Also, isn't using mutation like saying that some cards in the deck can be replaced with others while shuffling? – Matthew Spencer Sep 09 '14 at 00:21
  • It is a possibility, indeed, but I am afraid it will dramatically increase the randomness factor. Imagine you keep the DNA portion from parent 1 intact: [1, 4, 3, 2 |. You then have to modify the DNA part from parent 2, | 2, 7, 5, 0], to make all the values unique in the child's DNA: [1, 4, 3, 2 | 6*, 7, 5, 0]. Thus, the second part of this DNA does not really reflect the qualities of parent 2's DNA. And if we see this as a mutation, it is too large a mutation, probably > 5% of the whole DNA in my case. So there is more randomness than genetic selection. Thanks anyway for your answer! –  Sep 09 '14 at 00:49

1 Answer


What you are looking for is called an order-based genetic algorithm. There are many order-based crossover and mutation operators designed for this kind of problem. The simplest crossover operator works as follows:

  1. Select two crossover points.
  2. Copy the part of parent1 between the crossover points to the first child.
  3. Make a list of the elements of parent1 that lie outside the crossover points.
  4. Put the elements of that unused list in the same order in which they appear in parent2.
  5. Copy those elements to the first child in the order established in step 4.
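The five steps above can be sketched as follows. This is a minimal illustration, not the exact operator from the book: the name `order_crossover`, the half-open slice `[lo, hi)`, and the choice to fill the free positions from left to right are all assumptions. It also assumes both parents are permutations of the same set of distinct values.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Order-based crossover (illustrative sketch).
// [lo, hi) is the slice between the two crossover points, kept from parent1.
std::vector<int> order_crossover(const std::vector<int>& parent1,
                                 const std::vector<int>& parent2,
                                 std::size_t lo, std::size_t hi) {
    assert(parent1.size() == parent2.size());
    assert(lo <= hi && hi <= parent1.size());

    // Step 2: start from parent1 so the slice [lo, hi) is already in place.
    std::vector<int> child(parent1);
    std::unordered_set<int> kept(parent1.begin() + lo, parent1.begin() + hi);

    // Steps 3-5: walk parent2 in order; every value not in the kept slice
    // goes into the next free position outside [lo, hi).
    std::size_t pos = 0;
    for (int value : parent2) {
        if (kept.count(value)) continue;  // already inherited from parent1
        if (pos == lo) pos = hi;          // jump over the kept slice
        child[pos++] = value;
    }
    return child;
}
```

With the parents A and B from the question and the slice `[0, 5)`, this sketch yields 3 4 8 12 2 | 10 20 0 7 9: the first half comes from A unchanged, and the remaining values follow in the order they have in B, so no collision is possible.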

You can see an example from my book in the figure below (sorry, but the descriptions are in Portuguese; please match them to the list above):

[Figure: example of the order-based crossover from the book; captions in Portuguese]

You can search the web for order-based operators or, if you prefer, check the figures from my book at My Genetic Algorithm book. The figures that interest you are those from chapter 10 (you can use Google Translate to understand the captions).

It does not matter that the book uses sequential numbers: as long as you have no repetition, all the concepts explained are valid for your problem.
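If you would rather work with sequential numbers anyway, one possible way is to map each distinct value to its index in a fixed reference ordering and map back afterwards. The helper names `to_indices` and `from_indices` below are hypothetical, and the linear search is kept only for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Replace each value by its position in `reference` (illustrative helper).
// Assumes `reference` holds distinct values and contains every value of `perm`.
std::vector<int> to_indices(const std::vector<int>& perm,
                            const std::vector<int>& reference) {
    std::vector<int> indices;
    indices.reserve(perm.size());
    for (int value : perm) {
        auto it = std::find(reference.begin(), reference.end(), value);
        assert(it != reference.end() && "value missing from reference set");
        indices.push_back(static_cast<int>(it - reference.begin()));
    }
    return indices;
}

// Inverse mapping: turn indices back into the original values.
std::vector<int> from_indices(const std::vector<int>& indices,
                              const std::vector<int>& reference) {
    std::vector<int> perm;
    perm.reserve(indices.size());
    for (int index : indices) {
        perm.push_back(reference.at(static_cast<std::size_t>(index)));
    }
    return perm;
}
```

Any operator defined on the numbers 0..n-1 can then be applied to the index form, and `from_indices` recovers the actual values.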

I hope it helps.

rlinden
  • Thank you very much. I have just implemented this solution (one-point crossover) and it gives promising results! I hope it will be useful for people in the same situation. –  Sep 09 '14 at 22:25
  • One thing worth mentioning: there are different permutation/order based crossover operators, and they aim to inherit different information. Suppose you have an individual like `1 4 5 0 2 3 6 7`, that is pretty good. What's good about it? For example, is it that 1 is next to 4, wherever they occur in the string, or is it that 4 is in exactly the second position? You can look into different operators that preserve different properties of good parents, and it can affect performance heavily. – deong Sep 12 '14 at 10:38
  • That's why IMO order-based genetic algorithms are very difficult to tune. We can see two kinds of set *ordinality*: optimal ordinal *subsets* (this number must be three positions after that one), and an optimal *global* ordinality (this number must be at the tenth position of the set). If we focus only on the former, then as n! becomes large, the solution is more likely to be suboptimal; if we focus only on the latter, the solution is unlikely to be found in a reasonable time (because of the quasi-randomization of parent 2's DNA). Maybe the key is to mix the two approaches (ordinality of ordinalities). –  Sep 14 '14 at 13:25