2

I am currently Coding a Genetic algorithm to solve a problem that could be considered similar to the knapsack problem but the differentiating factor is that I am storing items over multiple "trucks", and their value/importance is based on a 1 = Most important, 3 = Least important.

How i am doing this is with an array of binary integers, 0 = not on the truck, 1 = on the truck.

For the most part, the program seems to be doing what it needs to do. Except when I am comparing the chromosomes to find the best three individual chromosomes for each generation. Individual being that no chromosomes can contain the same item, As the item can only be on one truck at a time.

This is the function I am using to compare the chromosomes:

    private int[] getBestSolution(int count)
    {
        int[] Genes = new int[3];
        int min1 = Global.p.population[0].fitness;
        int min2 = Global.p.population[0].fitness;
        int min3 = Global.p.population[0].fitness;
        int ref1 = 0;
        int ref2 = 0;
        int ref3 = 0;
        bool areSame = true;

        //searches for the first "Fittest" chromosome and stores to ref1
        for (int i = 0; i < Global.population; i++)
        {
                if (min1 > Global.p.population[i].fitness)
                {
                    min1 = Global.p.population[i].fitness;
                    ref1 = i;
                }
        }

        //searches for 2nd fittest, while also checking they are different
        for (int i = 0; i < Global.population; i++)
        {
            areSame = arrayCompare(Global.p.population[ref1].chromosome, Global.p.population[i].chromosome);
            if(areSame == true)
            {
                continue;
            }
            if (min2 > Global.p.population[i].fitness)
            {
                min2 = Global.p.population[i].fitness;
                ref2 = i;
            }
        }

        //Searches for 3rd fittest while checking that it is different from the first two
        for (int i = 0; i < Global.population; i++)
        {
            areSame = arrayCompare(Global.p.population[ref1].chromosome, Global.p.population[i].chromosome);
            if (areSame == true)
            {
                continue;
            }
            areSame = arrayCompare(Global.p.population[ref2].chromosome, Global.p.population[i].chromosome);
            if(areSame == true)
            {
                continue;
            }
            if (min3 > Global.p.population[i].fitness)
            {
                min3 = Global.p.population[i].fitness;
                ref3 = i;
            }
        }
        //stores the reference of each chromosome and return
        Genes[0] = ref1;
        Genes[1] = ref2;
        Genes[2] = ref3;
        return Genes;
    }

This is the function i am using to compare the chromosome to the Previously selected chromosome to ensure they do not containt the same value.

    public bool arrayCompare(int[] a, int[] b)
    {
        for (int i = 0; i < a.Length; i++)
        {
            if ((a[i] == 1) && b[i] == 1)
            {
                return true;
            }
        }
        return false;
    }

I have narrowed it down to these two fuctions causing the problem by using breakpoints at different points of the code and checking what the values are vs what they are expected to be.

Finally, to the problem itself. The getBestSolution() produces the first "fittest" value perfectly. The second and third values stay as what they were initialized as "population[0].fitness" and that is displayed.

I attempted to change the breeding structure, the selection criteria for breeding as I believed that if there were no chromosomes containing different values, it would stay as initialized. I have also tested using larger populations for this same reason, so i believe that their must be a problem with my code.

Any help is much appreciated.

J.Ralph
  • 37
  • 4

1 Answers1

0

1. A note about your code.

You should initialize min1, min2, and min3, to a value higher than the maximum that fitness can be, because in the case Global.p.population[0].fitness is the best fitness in the population, you'll never find the second and third bests, because in this case no individual will have a fitness lower than Global.p.population[0].fitness.

2. Probabilities about the genes your are using.

I may be wrong but here is my theory on the problem you described.

In a random distribution, for 1 gene, there is a total of 4 possible combinations of the genes' values for two individuals. On these 4 combinations, there are 3 where the two individuals don't have in common this gene set to 1.

               Individual 1    Individual 2   Match criterion    Probability
Combination 1:      0               0               Yes            25% | 
Combination 2:      0               1               Yes            25% | 75%
Combination 3:      1               0               Yes            25% |
Combination 4:      1               1               No             25%

It means that for individuals that have only one gene, you have 75% chances to find two individuals that match the criterion. In other words, if you compare the genes of 100 individuals, you will find in average 75 pairs of individuals that match the criterion. But there will be some populations where you won't find them, and the most of populations where you will find them.

As the number of genes grow, the percentage decrease. Each time you add 1 gene to the chromosome, you have to multiply by 75/100 the percentage of individuals matching the criterion. Each time you add 2 genes, you have to multiply the percentage by 56,25/100.

Number of genes         percentage of individuals matching the criterion
      1                                75,00%
      2                                56,25% = 75,00 * 75,00 / 100

      4                                31,64% = 56,25 * 56,25 / 100

      6                                17,79% = 31,64 * 56,25 / 100

      8                                10,01% = 17,79 * 56,25 / 100

    [...]

     16                                 1%

It means that for a chromosome of 16 genes, and a population of 100, you will find in average 1 pair of individuals matching the criterion. The percentage is even smaller if you want 3 individuals matching the criterion. For 6 genes, the probability to find these 3 individuals should be arround 1,5%.

So the problem is that in a population, there isn't enough pairs of individuals that are matching the criterion.

3. Another way to design the algorithm.

In your algorithm, you seem to consider an individual as a truck. I would design the algorithm in another way.

I would consider a gene as the truck number for an item, and if needed, 0 to indicate the item has no assigned truck:

gene[0] = 1 (item 0 is in truck 1)
gene[1] = 0 (item 1 is not in any truck) 
gene[2] = 3 (item 2 is in truck 3)
gene[3] = 2 (item 3 is in truck 2)
gene[4] = 1 (item 4 is in truck 1)

With this coding of the information, the second and third best individuals can have one item in the same truck than the best. For example:

first best individual:
----------------------
truck1: item1 item2
truck2: item3 item4
truck3: item5 item6 item7
not in any truck: item0

second best individual:
-----------------------
truck1: item1 item3
truck2: item2 item5
truck3: item4 item6
not in any truck: item0 item7

In this example, item1 is always in truck1, and item6 is always in truck3.

So it is possible to select the second and third best individuals without comparing the genes, just checking the fitness.

Bertrand125
  • 904
  • 6
  • 13