6

I have a problem which I am trying to solve with genetic algorithms. The problem is selecting some subset (say 4) of 100 integers (these integers are just ids that represent something else). Order does not matter, the solution to the problem is a SET of integers not an ordered list. I have a good fitness function but am having trouble with the crossover function.

I want to be able to mate the following two chromosomes:

[1 2 3 4] and [3 4 5 6] into something useful. Clearly I cannot use the typical crossover function because I could end up with duplicates in my children which would represent invalid solutions. What is the best crossover method in this case.

mjv
  • 73,152
  • 14
  • 113
  • 156
aloo
  • 5,331
  • 7
  • 55
  • 94

5 Answers5

3

Just ignore any element that occurs in both of the sets (i.e. in their intersection.), that is leave such elements unchanged in both sets.

The rest of the elements form two disjoint sets, to which you can apply pretty much any random transformation (e.g. swapping some pairs randomly) without getting duplicates.

This can be thought of as ordering and aligning both sets so that matching elements face each other and applying one of the standard crossover algorithms.

Rafał Dowgird
  • 43,216
  • 11
  • 77
  • 90
  • I will try this. Why is it best to ignore common elements? If these are two good parents don't you want to keep the common elements because that is what may be making them good solutions? – aloo Dec 14 '10 at 13:24
  • Also, is there a common name for this type of problem such that I can start looking up research papers on it? – aloo Dec 14 '10 at 13:24
2

Sometimes it is beneficial to let your solution go "out of bounds" so that your search will converge more quickly. Rather than making a set of 4 unique integers a requirement for your chromosome, make the number of integers (and their uniqueness) part of the fitness function.

Ben Jackson
  • 90,079
  • 9
  • 98
  • 150
  • Wouldn't this tale longer to coverage because you are testing a bunch of permutations that that will never have a chance of being valid? – aloo Dec 14 '10 at 13:19
  • Genetic algorithms converge best if the fitness function has a nice hill to climb. By allowing the longer solutions you let the algorithm solve the simpler problem of "find N integers with the desired property" followed by a minimization problem of "reduce N to 4". – Ben Jackson Dec 14 '10 at 18:34
2

Since order doesn't matter, just collect all the numbers into an array, sort the array, throw out the duplicates (by disconnecting them from a linked list, or setting them to a negative number, or whatever). Shuffle the array and take the first 4 numbers.

Andy A.
  • 21
  • 2
1

I don't really know what you mean on "typical crossover", but I think you could use a crossover similar to what is often used for permutations:

  • take m ints from the first parent (m < n, where n is the number of ints in your sets)
  • scan the second and fill your subset from it with (n-m) ints that are free (not in the subset already).

This way you will have n ints from the first and n-m ints from the second parent, without duplications.

Sounds like a valid crossover for me :-).

I guess it might be beneficial not to do either steps on ordered sets (or using an iterator where the order of returned elements correlates somehow with the natural ordering of ints), otherwise either smaller or higher numbers will get a higher chance to be in the child making your search biased.

If it is the best method depends on the problem you want to solve...

Sandor Murakozi
  • 4,372
  • 25
  • 27
1

In order to combine sets A and B, you could choose the resulting set S probabilistically so that the probability that x is in S is (number of sets out of A, B, which contain x) / 2. This will be guaranteed to contain the intersection and be contained in the union, and will have expected cardinality 4.

Josephine
  • 1,409
  • 1
  • 9
  • 11
  • This seems to work. However will this produce children so similar to their parents that the entire population will quickly converge to the same solution? Also, is there a common name for this type of problem such that I can start looking up research papers on it? – aloo Dec 14 '10 at 13:22