Genetic Algorithms Introduction

Question

To start, let me clarify that I have seen the "Genetic Algorithm Resource" question, and it does not answer my question.

I am doing a project in bioinformatics. I have to take NMR spectrum data from a cell (E. coli) and work out which molecules (metabolites) are present in it.

To do this I am going to use genetic algorithms in R. I DO NOT have the time to go through huge books on genetic algorithms. Heck! I don't even have time to go through little books. (That is what the linked question does not answer.)

So I need resources that will help me quickly understand what genetic algorithms do and how they do it. I have read the Wikipedia entry, this webpage, and a couple of IEEE papers on the subject.

Any working code in R (or even in C), or pointers to which R packages (if any) to use, would be helpful.

There's also http://biostar.stackexchange.com/ that might be of some help. – Roman Luštrik, Dec 29 '11 at 09:41
It takes some theoretical knowledge to encode your problem in a way that lets a genetic algorithm solve it efficiently. You're being short-sighted by refusing to invest time in reading a book such as Goldberg's. It's not a big book, and you only have to read the first half. – Larry OBrien, Dec 30 '11 at 02:27

score 10 · Accepted Answer · answered Dec 29 '11 at 09:46

A brief (and opinionated) introduction to genetic algorithms is at http://www.burns-stat.com/pages/Tutor/genetic.html

A simple GA written in R is available at http://www.burns-stat.com/pages/Freecode/genopt.R and its "documentation" is in 'S Poetry' (http://www.burns-stat.com/pages/Spoetry/Spoetry.pdf) and in the code itself.

Patrick Burns

Check also packages 'gaoptim', 'GA' and 'genalg'. – Fernando Apr 01 '13 at 14:10

score 4 · Answer 2 · answered Dec 29 '11 at 09:15

I assume from your question that you have some function F(metabolites) which yields a spectrum, but you do not have the inverse function F'(spectrum) to get back the metabolites. The search space of metabolites is large, so rather than brute-forcing it you wish to try an approximate method (such as a genetic algorithm) that searches randomly but more efficiently.

In order to apply any such approximate method, you will have to define a score function that measures the similarity between the target spectrum and a trial spectrum. The smoother this function is, the better the search will work. If it can only yield true/false, the search will be purely random and you'd be better off with brute force.
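As a rough illustration of such a score function, here is a minimal sketch (in Python rather than R, purely for illustration). It assumes both spectra are lists of intensities sampled on the same grid, and it uses the negative sum of squared differences as the score. That is only one possible choice; the name `fitness` and the example numbers are made up.

```python
def fitness(trial, target):
    """Negative sum of squared differences: 0 is a perfect match, lower is worse."""
    if len(trial) != len(target):
        raise ValueError("spectra must be sampled on the same grid")
    return -sum((a - b) ** 2 for a, b in zip(trial, target))

# Made-up intensities, only to show that the score degrades gradually.
target = [0.0, 1.0, 3.0, 1.0, 0.0]
close = [0.0, 1.0, 2.5, 1.0, 0.0]
far = [2.0, 0.0, 0.0, 0.0, 2.0]

print(fitness(close, target))  # small penalty
print(fitness(far, target))    # much larger penalty
```

Because the score changes by a small amount when the trial spectrum changes by a small amount, the search gets a usable signal about which direction is better, which a true/false score cannot give.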

Given F and your score (aka fitness) function, all you need to do is construct a population of possible metabolite combinations, run them all through F, score all the resulting spectra, and then use crossover and mutation to produce a new population that combines the best candidates. Choosing how to do the crossover and mutation is generally domain-specific, because you can speed the process greatly by avoiding the creation of nonsense genomes. The best mutation rate is going to be very small, but it will also require tuning for your domain.
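The loop described above can be sketched roughly as follows (Python rather than R, and only a toy). Everything named here is hypothetical: `LIBRARY` is an invented table of peak patterns, `forward` is a stand-in for F that simply sums the patterns of the metabolites present, and the truncation selection, elitism, one-point crossover, and 0.05 per-bit mutation rate are my own illustrative choices, not recommendations for real NMR data.

```python
import random

# Hypothetical toy library: each metabolite contributes a fixed pattern of
# peak intensities on a 6-point grid. Real reference spectra would go here.
LIBRARY = [
    [1, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 1, 0],
    [0, 0, 3, 0, 0, 0],
    [1, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 2, 1],
    [0, 1, 0, 0, 0, 3],
]

def forward(genome):
    """Toy stand-in for F: sum the patterns of the metabolites switched on."""
    spectrum = [0] * len(LIBRARY[0])
    for present, pattern in zip(genome, LIBRARY):
        if present:
            spectrum = [s + p for s, p in zip(spectrum, pattern)]
    return spectrum

def fitness(genome, target):
    """Negative squared error between the trial and target spectra."""
    return -sum((a - b) ** 2 for a, b in zip(forward(genome), target))

def crossover(mum, dad, rng):
    """One-point crossover of two equal-length genomes."""
    cut = rng.randrange(1, len(mum))
    return mum[:cut] + dad[cut:]

def mutate(genome, rate, rng):
    """Flip each bit independently with a small probability."""
    return [1 - g if rng.random() < rate else g for g in genome]

def run_ga(target, pop_size=30, generations=40, rate=0.05, seed=0):
    rng = random.Random(seed)
    n = len(LIBRARY)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, target), reverse=True)
        parents = pop[: pop_size // 2]   # truncation selection: keep top half
        children = [pop[0]]              # elitism: carry the best over unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            children.append(mutate(crossover(mum, dad, rng), rate, rng))
        pop = children
    return max(pop, key=lambda g: fitness(g, target))

true_mix = [1, 0, 1, 0, 1, 0]            # the "unknown" we pretend to recover
best = run_ga(forward(true_mix))
print(best, fitness(best, forward(true_mix)))
```

A fitness of 0 means the trial spectrum reproduces the target exactly. With a real F the expensive part is usually the call to `forward`, so population size and number of generations are the first things to tune.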

Without knowing about your domain I can't say what a single member of your population should look like, but it could simply be a list of metabolites (which allows for ordering and duplicates, if that's interesting) or a string of boolean values over all possible metabolites (which has the advantage of being order-invariant and yielding obvious possibilities for crossover and mutation). The string has the disadvantage that it may be more costly to filter out nonsense genomes (for example, it may not make sense to have only 1 metabolite or over 1000). It's faster to avoid creating nonsense than to merely assign it low fitness.
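To make the "avoid creating nonsense" point concrete, here is a small sketch (Python, not R) of a boolean genome whose number of switched-on metabolites is kept within bounds by construction. The function names and the bounds of 2 and 5 are hypothetical; what counts as nonsense in your domain may be something else entirely.

```python
import random

def random_genome(n, min_on, max_on, rng):
    """Boolean genome with between min_on and max_on metabolites switched on."""
    k = rng.randint(min_on, max_on)
    on = set(rng.sample(range(n), k))
    return [1 if i in on else 0 for i in range(n)]

def bounded_mutate(genome, min_on, max_on, rng):
    """Flip one bit, choosing only among flips that keep the count in bounds."""
    count = sum(genome)
    candidates = [
        i for i, g in enumerate(genome)
        if (g == 1 and count - 1 >= min_on) or (g == 0 and count + 1 <= max_on)
    ]
    child = list(genome)
    if not candidates:        # no legal flip exists, so return an unchanged copy
        return child
    i = rng.choice(candidates)
    child[i] = 1 - child[i]
    return child

rng = random.Random(1)
genome = random_genome(20, 2, 5, rng)
for _ in range(100):
    genome = bounded_mutate(genome, 2, 5, rng)
print(genome, sum(genome))
```

Every genome produced this way is already valid, so no fitness evaluations are spent on candidates that would be rejected anyway. Crossover would need a similar repair step, which is not shown here.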

There are other approximate methods you can use if you have F and your scoring function. The simplest is probably Simulated Annealing. Another, which I haven't tried, is the Bees Algorithm, which appears to be multi-start simulated annealing with effort weighted by fitness (sort of a cross between SA and GA).
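For comparison, a minimal simulated annealing sketch (again Python, not R) might look like the following. It assumes a score to maximise and a neighbour function that makes one small change. The geometric cooling schedule, the step count, and the toy `score` that counts agreements with a hidden bit string are all illustrative assumptions.

```python
import math
import random

def anneal(score, start, neighbour, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximise score() by simulated annealing with geometric cooling."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for step in range(steps):
        # Temperature falls smoothly from t_start to t_end over the run.
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        candidate = neighbour(current, rng)
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with prob exp(delta/temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score

def flip_one(genome, rng):
    """Neighbour move: flip a single randomly chosen bit."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

# Hypothetical toy score: number of positions agreeing with a hidden target.
hidden = [1, 0, 1, 1, 0, 0, 1, 0]

def score(genome):
    return sum(1 for g, h in zip(genome, hidden) if g == h)

best, best_score = anneal(score, [0] * len(hidden), flip_one)
print(best, best_score)
```

Unlike the GA, this keeps a single candidate rather than a population, so there is no crossover to design. Only the neighbour move and the cooling schedule need tuning.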

score 1 · Answer 3 · edited Aug 22 '13 at 19:00

I've found the article "The science of computing: genetic algorithms" by Peter J. Denning (American Scientist, vol. 80, no. 1, pp. 12-14). It is simple and useful if you want to understand what genetic algorithms do, and it is only 3 pages long!

answered Aug 04 '13 at 19:22

Natalia
