A guide to the three pressures that shape innovation in living and non-living systems.
(a version of this article has also been posted to Medium)
I teach a course on Bio-Inspired AI and Optimization that is meant to be a graduate-level survey of nature-inspired algorithms that also provides a more serious background in the natural-science (primarily biological) fundamentals underlying the inspiration. The first half of the course covers nature-inspired optimization metaheuristics, with a heavy focus on evolutionary algorithms. An evolutionary algorithm is a scheme for automating the process of goal-directed discovery to allow computers to find innovative solutions to complex problems. There is a wide range of evolutionary algorithms, but a common feature of each is that the computer generates a population of random candidate solutions, evaluates the performance of each of these candidates, and then uses the best of these candidates as “parents” to guide the generation of new candidates in the next generation.
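The generate, evaluate, and select-parents loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not any particular published algorithm: the names `evolve` and `toy_fitness`, the real-valued genome, the Gaussian perturbation of children, and all parameter values are hypothetical choices made for this example.

```python
import random

def evolve(fitness, genome_length=10, pop_size=20, n_parents=5,
           generations=50, mutation_scale=0.1, seed=0):
    """Minimal evolutionary loop: random population, evaluate, keep the best as parents."""
    rng = random.Random(seed)
    # Generate a population of random candidate solutions.
    population = [[rng.uniform(-1, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate every candidate and keep the best ones as parents.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        # Each child is a slightly perturbed copy of a randomly chosen parent.
        children = [[g + rng.gauss(0, mutation_scale) for g in rng.choice(parents)]
                    for _ in range(pop_size - n_parents)]
        # Parents survive alongside their children (an assumed, elitist design choice).
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(genome):
    # Hypothetical objective: higher is better, with a maximum of 0 at the all-zero genome.
    return -sum(g * g for g in genome)

best = evolve(toy_fitness)
```

Because the parents are carried over unchanged, the best fitness in the population can never decrease from one generation to the next in this sketch. Real evolutionary algorithms differ widely in how they choose parents and generate children.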
Most of my students come from computer science or engineering backgrounds and, as such, have very little formal education in biology, let alone something as specific as population genetics (“popgen”). However, really understanding the complex process of evolutionary innovation inherent to evolutionary algorithms (and evolutionary computing in general) requires at least some fundamental background in popgen. I think when most people reflect back on their high-school biology courses, they might remember something about natural selection and mutation being important in thinking about the evolution of adaptations in natural populations. However, there is a third evolutionary force that is extremely important, especially when considering small populations like the ones that are artificially generated in an evolutionary algorithm. That force is (genetic) drift. So let’s review all three:
- Natural selection reflects that some individuals in a population will be at a fundamental disadvantage with respect to other individuals. Those individuals (which, in the computational creativity context, are relatively poor solutions to a problem) will be very likely to be “selected out” in large populations because there will be so many other individuals who are relatively “fitter.” “Fitness” is a measure of how many offspring an individual can put into the next generation given the current context. If some individuals can put more individuals into the next generation than others, they are “more fit.” If all individuals have the same fitness, then every parent has the same chance of getting her offspring into the next generation. If some individuals have lower fitness than others, then they have a lower chance of getting their offspring into the next generation.
Some people are taught that natural selection only matters when resources are scarce and thus population sizes are limited (thus making individuals compete for opportunities). This is not the whole story and is why we must discuss (genetic) drift below. Before getting into that, note that even in populations that are not limited, differences in the rates of growth of different strategies will gradually change the relative share a strategy has of a population. So even without resource limitation, differences in “relative fitness” will naturally select for the most fit individuals to have the strongest share of the population.
By itself, selection can only tune the relative proportions that different strategies have in a population. However, many evolutionary processes have a way of blending from different parents to create offspring that somehow interpolate from those parents. In biology, we view “sex” as the primary way in which we see “recombination” of strategies. There are sex-like mechanisms in evolutionary algorithms that do the same. So when natural selection is combined with recombination (“sex”), we get optimization combined with a little bit of goal-directed novelty generation. However, recombining strategies across different parents can be deleterious because breaking up two functional strategies and putting them together does not guarantee that the result will itself be functional. The resulting strategies that are functional might improve upon both parents, but the novelty may be limited because it simply borrows from strategies of the parents.
- Mutation is one way to introduce novelty that can be tuned to be less disruptive than recombination while also producing more novel solutions than recombining solutions from the parent generation. In mutation, random changes in a parent strategy are introduced. In a population of clones of a single strategy, mutation introduces novel variations that generate differences in offspring that hopefully lead to differences in relative fitness. These fitness differences will cause some mutations to grow in representation and others to shrink in representation. So one of the functions of mutation is exploration to find new candidate solutions that might be better than anything in the current population. However, another important function of mutation is to balance the stagnating force of genetic drift.
- (Genetic) Drift is a subtle but extremely important evolutionary pressure that represents what occurs when population sizes eventually meet their limits. As mentioned above, in a world of plentiful resources, natural selection will allow every strategy to survive and produce offspring, but strategies that produce more offspring will grow in their share of the total population. Eventually, if the population is very large and becomes limited in how much it can grow, those strategies that have a lot of representation will have a much higher probability of being represented after the limitation kicks in. In other words, when population sizes are high, resource limitation is a culling effect—strategies that are more fit tend to be selected to continue and strategies that are less fit are “selected out” and removed. However, this culling effect eventually leads to its own demise as the removal of low-fitness individuals also results in the removal of diversity which is required for natural selection to work. As mentioned above, the action of natural selection only optimizes among the diversity of solutions in the parent generation. If the parent generation has no diversity, then there are no improvements that natural selection can make. When a population finds itself full of identical individuals and thus stuck and unable to generate any new novelty, we refer to that population as being “fixed” or having reached “fixation.” Genetic drift represents this gradual march toward fixation. Natural selection, when combined with population limitation, is always being pulled toward fixation where natural selection will fail to be able to act.
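The unlimited-growth case mentioned above, where every strategy survives but faster-growing strategies gain share, comes down to simple arithmetic. The sketch below assumes two hypothetical strategies that start with equal numbers and multiply by fixed per-generation factors of 2.0 and 2.2. The function name and all numbers are illustrative only.

```python
def shares_over_time(counts, growth_rates, generations):
    """Track each strategy's share of an unbounded, growing population."""
    history = []
    for _ in range(generations + 1):
        total = sum(counts)
        # Record the current share of each strategy.
        history.append([c / total for c in counts])
        # No one is culled: every strategy grows by its own factor.
        counts = [c * r for c, r in zip(counts, growth_rates)]
    return history

history = shares_over_time(counts=[500.0, 500.0],
                           growth_rates=[2.0, 2.2],
                           generations=50)
```

Both strategies keep growing in absolute numbers throughout, yet after 50 generations the faster strategy holds a bit over 99% of the population. A 10% difference in growth rate is enough to shift relative share dramatically without any resource limitation.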
Fortunately, mutation (mentioned above) can rescue us from drift. Mutation introduces new variation in a population, and natural selection can choose strategies out of that new variation. So if we want to combat drift, we can just crank up the mutation rate. The downside is that a high mutation rate also quickly corrupts well-performing strategies. So populations that have a high mutation rate will tend to have a diverse set of strategies within them and maintain a diverse set of fitnesses. Some individuals will have very high fitness, but they will co-exist with individuals with very low fitness (due to mutation) that are just a side effect of the stabilizing force of mutation. Reducing the mutation rate helps to ensure all solutions have similar fitness, but there is never any way to know whether the individuals in a population have similar fitness because their shared strategy is good or because they simply reached fixation too soon.
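The tug-of-war between drift and mutation can be seen in a small simulation. The sketch below is a deliberately simplified neutral model, assumed for illustration: every individual has the same fitness, each generation is resampled at a fixed size, and every mutation creates a brand-new strategy. The function name, labels, and parameter values are hypothetical.

```python
import random

def diversity_over_time(pop_size, mutation_rate, generations, seed=0):
    """Neutral resampling with mutation; returns the number of distinct strategies per generation."""
    rng = random.Random(seed)
    population = list(range(pop_size))   # start fully diverse: every individual is unique
    next_label = pop_size                # fresh label for each newly mutated strategy
    history = [len(set(population))]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            child = rng.choice(population)       # all parents equally fit: pure drift
            if rng.random() < mutation_rate:     # mutation creates a brand-new strategy
                child = next_label
                next_label += 1
            offspring.append(child)
        population = offspring
        history.append(len(set(population)))
    return history

no_mutation = diversity_over_time(pop_size=10, mutation_rate=0.0, generations=200)
some_mutation = diversity_over_time(pop_size=10, mutation_rate=0.1, generations=200)
```

Without mutation, the count of distinct strategies can only fall, and the population drifts to fixation on an arbitrary strategy even though no strategy is better than any other. With mutation, new strategies keep entering the population, so diversity can recover after each loss.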
The problem of reaching fixation “too soon” is particularly strong for small population sizes. In a small population, small differences in fitness may fail to generate sufficient selective pressure to dominate the force of genetic drift. For example, in a population that is limited to a size of 10, an individual with a fitness 1/100 that of some other individual may still by “good luck” produce a single offspring in the next generation. That offspring, although only 1/100 as fit as some other strategy in the population, nevertheless takes up 10% of the next generation. So for small population sizes, mutation and drift are essentially the only drivers of evolution.
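To get a feel for how population size changes the balance between selection and drift, here is a rough simulation sketch. It assumes two strategies that start at equal numbers, a hypothetical 5% fitness advantage for one of them, fitness-proportional resampling of a fixed-size population each generation, and no mutation. The function names and all numbers are illustrative assumptions, not values from the example above.

```python
import random

def fitter_strategy_lost(pop_size, advantage, rng, max_generations=5000):
    """Two strategies start at 50/50; return True if the fitter one is lost by chance."""
    fitter = pop_size // 2                      # copies of the fitter strategy
    for _ in range(max_generations):
        if fitter == 0 or fitter == pop_size:   # one strategy has reached fixation
            break
        # Fitness-proportional chance that one offspring descends from the fitter strategy.
        weight = fitter * (1 + advantage)
        p = weight / (weight + (pop_size - fitter))
        # Resample a whole new generation of the same fixed size.
        fitter = sum(rng.random() < p for _ in range(pop_size))
    return fitter == 0

def loss_rate(pop_size, advantage=0.05, trials=100, seed=0):
    """Fraction of independent runs in which the fitter strategy disappears."""
    rng = random.Random(seed)
    return sum(fitter_strategy_lost(pop_size, advantage, rng)
               for _ in range(trials)) / trials

small = loss_rate(pop_size=10)    # drift often overrides a 5% advantage
large = loss_rate(pop_size=200)   # selection wins almost every time
```

In the population of 10, the fitter strategy is lost in a substantial fraction of runs purely through sampling luck, while in the population of 200 it is almost never lost. The same fitness difference that selection can reliably act on in a large population is mostly invisible in a small one.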
So when building an evolutionary algorithm, it is important to start with a diverse population and then build mutation and selection operations that maintain diversity as long as possible (staving off genetic drift). So long as the population is diverse, natural selection will continue to explore large regions of the strategy space. However, if mutation is too strong, then it will limit exploitation and tuning of strategies because adaptations that make small changes in fitness will quickly be lost to mutation. Consequently, if you have the computational budget, it is best to build very large population sizes with very low mutation rates and choose selection operators that moderate selection pressure — giving low-fitness strategies a chance to stay in the large population pool.
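One common way to moderate selection pressure is tournament selection, where the tournament size acts as a dial. It is used here only as an illustrative example of a selection operator, not as a specific recommendation from the text above. In this sketch the population, the identity fitness function, and the tournament sizes are all hypothetical.

```python
import random

def tournament_select(population, fitness, tournament_size, rng):
    """Pick one parent: sample `tournament_size` candidates at random, return the fittest."""
    contestants = [rng.choice(population) for _ in range(tournament_size)]
    return max(contestants, key=fitness)

rng = random.Random(0)
population = list(range(100))   # hypothetical strategies; the value doubles as the fitness

def fitness(strategy):
    return strategy

def mean_parent_fitness(tournament_size, draws=2000):
    """Average fitness of the parents chosen under a given tournament size."""
    picks = [tournament_select(population, fitness, tournament_size, rng)
             for _ in range(draws)]
    return sum(picks) / draws

weak = mean_parent_fitness(tournament_size=1)     # size 1 is a uniform random pick: no selection
strong = mean_parent_fitness(tournament_size=10)  # large tournaments mostly pick top strategies
```

Small tournaments give low-fitness strategies a real chance of becoming parents, which preserves diversity. Large tournaments concentrate reproduction on the current best and push the population toward fixation faster.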
Similarly, when thinking about evolution in natural systems, it is important to remember how large the ancestral populations were. Those that evolved in large-population contexts may tend to show more signs of natural selection (and will likely have evolved mechanisms to reduce the mutation rate). Those that evolved in small-population contexts may tend to have high mutation rates and show diversity patterns more closely related to randomness. This latter case relates to neutral theories of evolution, which are important to consider when trying to understand the source of observed variation in systems we see today.
This story is summarized in the graphic I’ve prepared above, which shows mutation and natural selection as forces re-shaping populations within a drift field that, in the absence of those forces, will eventually homogenize the population on an arbitrary strategy.
So how do we come up with interesting new ideas for mutation and selection operators for evolutionary algorithms? We should continue to look at population genetics. In fact, some theories in population genetics (like Sewall Wright’s shifting-balance theory) are much better descriptors of evolutionary algorithm behavior than the more complex evolutionary trajectories of living systems. For example, distributed genetic algorithms, which create islands of evolutionary algorithms that only exchange population members across island boundaries infrequently, tend to out-perform conventional genetic algorithms on the same computational budgets for reasons that make sense in light of population genetics. This is a more advanced topic, and you’re welcome to read/listen more about this in my CSE/IEE 598 lectures. For now, I hope you look at living and non-living populations around you through the lenses of mutation, drift, and natural selection.
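The island idea mentioned above can be sketched as several small populations that evolve independently and exchange members only occasionally. This is a minimal illustration under assumptions of my own choosing: ring-shaped migration, binary tournaments, Gaussian mutation, one surviving elite per island, and a toy objective. It is not the specific distributed genetic algorithm covered in the lectures, and every name and parameter is hypothetical.

```python
import random

def sphere(genome):
    # Hypothetical objective: higher is better, with a maximum of 0 at the all-zero genome.
    return -sum(g * g for g in genome)

def evolve_islands(fitness, n_islands=4, island_size=10, genome_length=5,
                   generations=100, migration_interval=20,
                   mutation_scale=0.1, seed=0):
    """Evolve several islands separately, with infrequent migration between neighbors."""
    rng = random.Random(seed)
    islands = [[[rng.uniform(-1, 1) for _ in range(genome_length)]
                for _ in range(island_size)]
               for _ in range(n_islands)]
    for generation in range(1, generations + 1):
        for i, island in enumerate(islands):
            elite = max(island, key=fitness)      # the island's best survives unchanged
            children = []
            for _ in range(island_size - 1):
                # Binary tournament: the fitter of two random individuals becomes the parent.
                a, b = rng.choice(island), rng.choice(island)
                parent = a if fitness(a) >= fitness(b) else b
                children.append([g + rng.gauss(0, mutation_scale) for g in parent])
            islands[i] = [elite] + children
        if generation % migration_interval == 0:
            # Ring migration: a copy of each island's best replaces the worst on the next island.
            migrants = [max(island, key=fitness) for island in islands]
            for i, migrant in enumerate(migrants):
                target = islands[(i + 1) % n_islands]
                worst = min(range(island_size), key=lambda j: fitness(target[j]))
                target[worst] = list(migrant)
    return islands

islands = evolve_islands(sphere)
best = max((genome for island in islands for genome in island), key=sphere)
```

Between migrations, each island is a small population subject to strong drift, so different islands can wander toward different strategies. The occasional migrants then let a good strategy found on one island spread to the others.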