Biology's Gift to a Complex World
How studying biological interactions and evolution yields techniques for predicting the outcome of complex interactions.
By John Holland
One advance broke new ground for aircraft turbine efficiency. In 1990, a Santa Fe, NM-based investment firm called The Prediction Company introduced a new strategy, the long-term success of which attracted the attention of some of the largest finance houses. The company was bought by Union Bank of Switzerland in 2005. In 2008, two researchers developed a more efficient way of finding defects in silicon microchips, saving tens of millions of dollars annually and adding more control to the process. These three breakthroughs, and many others besides, share a common foundation: the application of biology to complex engineering problems.
When I started working on the mathematical description of evolution called genetic algorithms, over 40 years ago, I hardly imagined the range of problems that genetic algorithms could solve.2 Today, scientists and engineers from fields as diverse as agriculture, synthetic biology, and mechanical engineering are using genetic algorithms to find efficient solutions to their problems.
I'm no engineer, so I've applied genetic algorithms to the area I know and like best: complex adaptive systems. These are systems with multiple levels of co-evolution, like ecosystems, which, through their complex interactions, become more than the sum of their parts. One example is the immune system, a conglomerate of cells, cytokines, and proteins containing many levels of interaction and adaptation, which together provide the power to protect a human from the majority of pathogens encountered. To capture the essence of these complex adaptive systems is to be able to predict their most powerful abilities. If we can create a mathematical model of complex adaptive systems that matches the usefulness of the genetic algorithm, it could help solve problems that we haven't yet learned how to tackle.
The idea of building a computer algorithm that mimicked evolution's selection and adaptation wouldn't have occurred to me if not for my introduction to the earliest computers. When I was an undergrad at the Massachusetts Institute of Technology in 1949, my mentor Zdenek Kopal, who later became the chair of astronomy at the University of Manchester in England, let me tinker with the Whirlwind computer, the first real-time computer with a screen, for my undergraduate thesis. I was working on Southwell's relaxation method, a way of solving partial differential equations. Although Whirlwind was built for missile defense, it served quite well for a computerized version of Southwell's method.
To my surprise, that experience was enough to give me the qualifications IBM was looking for to help design its first commercial, internally programmable computer, first called the Defense Calculator and later designated the IBM 701. With only an undergraduate degree, I was by far the youngest member of the team. I worked alongside future leaders of artificial intelligence John McCarthy and Art Samuel, testing the programs we fed into the computer on punch cards through the night. At daybreak, the engineers took over the computer, rearranging the hardware that we had built our programs on, much to our frustration. Art was working on a program that could play checkers against a human, and I was working on simulating neural networks. In 1952, I left this exciting work to begin a PhD in mathematics at the University of Michigan, though I remained a consultant to IBM for many years.
Computers and their potential were still on my mind when I arrived in Michigan. I spent hours in the library stacks of the math department, idly picking up and flipping through books. One book caught my interest: a classic on genetics by Ronald A. Fisher, full of mathematical theory, not just statistics. In it was a mathematics much less abstract than the cylindric algebra I was working on for my dissertation, and it was the first time I had seen real math in biology.
I remember being spellbound by Fisher's theory on mutations and the spread of traits. He postulated that the more variance there was in a population, the more rapidly that population would evolve, and formulated a mathematical theorem capturing this pattern. To solve his equations, Fisher concentrated on the effects of variant genes (alleles) at a single locus, adding up the fitness at all the individual loci to determine the overall fitness of an organism. With this approximation, the greater the variance in fitness at individual loci, the more rapidly the average fitness of a population will increase. Fisher also pointed out that as average fitness rises, fitness-increasing variant alleles become rarer, so high variance becomes ever more difficult to maintain.
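In its standard modern formulation (the notation here is the usual population-genetics shorthand, not Fisher's original), the fundamental theorem of natural selection can be written as:

\[
\frac{d\bar{w}}{dt} = \frac{\operatorname{Var}_A(w)}{\bar{w}},
\]

where $\bar{w}$ is the mean fitness of the population and $\operatorname{Var}_A(w)$ is the additive genetic variance in fitness. In words: the rate at which mean fitness increases is proportional to the fitness variance the population currently carries, which is exactly why variance matters so much in Fisher's picture.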
This approximation bothered me because, even in the 1950s, it was well established that alleles at different loci interact in nonadditive ways, forming clusters called co-adapted sets of alleles. The approximation also ignored crossover, the vital component of cross-breeding. In sexual organisms, crossover occurs in every individual in every generation; it is part of why we carry some features of both parents. Mutations at loci, in contrast, occur with very low probabilities, usually less than one in a million. I started to think about how to account for these two missing factors mathematically.
It was around that time that I started talking more frequently with Arthur Burks, a pioneer in computer theory, who was in the process of founding a graduate program in Communication Sciences - a field that was later renamed computer science. He convinced me to switch to the new program, which meant scrapping my dissertation and preparing a new topic on the complexity of logical networks. To enter the program I had to take a number of classes in biology and genetics, which helped me consolidate my thoughts about Fisher's theorem.
The more I read about biology and talked with geneticist Jim Neel, also at the University of Michigan, the more it seemed to me that the major driving force behind evolution was not mutation, but the recombination of genes that occurs through the crossing over of chromosomes during sexual reproduction. To breed a better racehorse, breeders mate a fast horse with a sturdy one, rather than introduce random mutations and look for a horse with superior qualities.
Once I realized the importance of recombination, I saw that the traditional mathematics of population genetics with its dependence on additive approximations would fall short in modeling evolution because it lacked the ability to handle recombination. It seemed obvious to use a computer, which wasn't limited to solving linear, additive problems. I started working on algorithms mimicking crossing over and published a sketch of the basic ideas behind genetic algorithms in a 1962 paper. An extension of Fisher's Fundamental Theorem to genes that interacted in non-additive ways followed - it was called the Schema Theorem. That theorem has a central role in my book Adaptation in Natural and Artificial Systems.3
For more than 20 years, the only people working on genetic algorithms were my students and then their students. At the time, most people thought of evolution as an extraordinarily slow process, and couldn't imagine how an algorithm that mimicked evolution could be better than an engineered solution. Luckily, some of my students had been engineers, and started applying the algorithm to problems that were otherwise unsolvable.
Genetic algorithms are fascinating because they can be applied to almost anything. All you have to do is encode candidate solutions to the problem as chromosomes that carry a set of traits (see graphic on p. 40). Each chromosome represents a completed design made up of multiple genes (e.g., the design for a jet engine). A rating can be assigned to the design by testing it, usually by simulation, which confers on each design a "fitness." As with the racehorses, the building blocks of good designs can be recombined, by crossing chromosomes, to produce better, fitter designs. One can expect that some new offspring designs will improve on previous good designs (see graphic above). Indeed, the schema theorem gives a theoretical basis for this expectation. So it has proved in practice.4
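The whole cycle of encoding, evaluating, selecting, recombining, and mutating fits in a few lines of code. Below is a minimal sketch in Python; it is purely my illustration, not production machinery. The "design" here is a bit string and its fitness is simply the number of 1-bits, a toy stand-in for the simulation that would rate a real design such as a jet engine.

```python
import random

GENES = 20          # loci per chromosome
POP_SIZE = 30       # designs per generation
GENERATIONS = 60
MUTATION_RATE = 0.01  # rare point mutations, as in nature

def fitness(chrom):
    # Toy fitness: count of 1-bits. A real application would run
    # a simulation of the design encoded by the chromosome.
    return sum(chrom)

def select(pop):
    # Tournament selection: fitter chromosomes breed more often.
    return max(random.sample(pop, 3), key=fitness)

def crossover(a, b):
    # Single-point crossover recombines building blocks of two parents.
    point = random.randrange(1, GENES)
    return a[:point] + b[point:]

def mutate(chrom):
    # Flip each bit with low probability.
    return [g ^ 1 if random.random() < MUTATION_RATE else g for g in chrom]

def evolve():
    pop = [[random.randint(0, 1) for _ in range(GENES)]
           for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        pop = [mutate(crossover(select(pop), select(pop)))
               for _ in range(POP_SIZE)]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

Note that nothing in the loop knows what the problem "means"; selection and crossover do all the work, which is why the same skeleton transfers from bit strings to engine parameters or chip-inspection settings.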
Today, genetic algorithms underlie a wide range of engineering algorithms that search for optimal forms or better designs. Every year since 2004, researchers have vied for the "Humies" awards, totaling $10,000, for evolutionary computation techniques that compete with human capabilities. I am not directly involved in this competition; however, many of my former students attend. This year's Bronze award went to Moshe Sipper from Ben-Gurion University and Assaf Glazer from the Israeli company Applied Materials, who used genetic algorithms to improve the accuracy of their silicon microchip defect review process. The modest 3% increase in efficiency they gained translates to tens of millions of dollars in savings. The genetic algorithm also improves the process for future products by adding a richer classification scheme and feedback control. Past contestants used genetic algorithms to improve how cochlear implants are fitted, or to find a more efficient design for fiber optic cables. Genetic algorithms don't necessarily find the simplest or most elegant solution - the mathematician's ideal. Solutions found by genetic algorithms often look messy, but the complexity of their design can confer an advantage.
Despite their versatility, there are situations in which a genetic algorithm is not the problem-solving tool of choice. If the problem is linear - if it can be divided into simple parts and the solution obtained by adding those parts together - then other algorithms will be more effective. And if a problem depends on too many random, unrelated variables, then using a genetic algorithm will be no better than guessing.
I continued working on genetic algorithms through the 1980s, teaching and giving talks. A talk I gave at Los Alamos in 1986 caught the attention of a distinguished-looking man in the audience, who stood up and asked a number of questions about my work. Only later did I learn the inquisitive man was the pre-eminent physicist Murray Gell-Mann, who won the 1969 Nobel Prize in physics for his theory of quarks.
Gell-Mann invited me to join a group of some 15 researchers to discuss the possibilities of forming an institute devoted to the study of complex systems.5 Complex systems are those in which the aggregate of parts exhibits properties that wouldn't be expected from the parts alone. For example, a cell performs numerous functions that require the interaction of multiple components. Taken alone, the components of a cell have very limited functions.
These informal gatherings, organized by George Cowan, were the beginnings of what would eventually become the Santa Fe Institute. Cowan has been something of a hero to me. He was an early member of the Manhattan Project and is a very modest man who initiated many important activities, both cultural and scientific. He had the discernment to pick scientists who were as good at listening as at talking for the Santa Fe Institute, which made it the world-renowned center it is today.
Complexity had caught the attention of these titans of the scientific world. It had all the trappings of the next big problem in science. In order to understand these robust systems we would have to look across many disciplines of science for commonalities and patterns that are present across the board.
In my view, there was one component of complex systems which was missing - the component of adaptation. The complex systems that I was interested in were those whose parts not only interacted to create novel properties, but also co-evolved and adapted new rules to weather the fluctuations in their environment. I called them 'complex adaptive systems' or CAS. Genetic algorithms were the perfect tool to model the adaptation of these systems.
It was at the Santa Fe Institute that I met economist Brian Arthur around 1988. Together we started playing with the idea of applying my genetic algorithms to the complex system of economics. Many people have modeled the stock market using combinations of random variables, statistics and averages, but these models do not capture the dynamics of the market. For example, they do not exhibit the bubbles and crashes that are common features of real markets. We decided that if we could apply genetic algorithms - which not only recombined parts, but modeled adaptation and learning - we'd have a way of predicting the varied behavior of a market. Instead of applying genetic algorithms to objects - like the parts of a jet engine - we applied them to the rules that govern the marketplace, in order to form new rules, cleverer rules, through recombination. We published our ideas in 1997.6
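Rule recombination of this kind can be sketched concretely. The fragment below is my illustration only; the condition strings, the "buy"/"sell" actions, and the function names are assumptions for exposition, not our published market model. In the spirit of classifier systems, each agent rule is a condition/action pair, where the condition is a string over {'0', '1', '#'} matched against a market "message" and '#' is a wildcard; crossing two parent rules can yield a child that inherits useful pieces of both.

```python
import random

def matches(condition, message):
    # A condition matches a message bit string if every position
    # agrees, with '#' acting as a wildcard for either bit.
    return all(c == '#' or c == m for c, m in zip(condition, message))

def crossover_rules(rule1, rule2):
    # Recombine two parent rules at a random cut point in the
    # condition; the child takes its action from one parent.
    (cond1, act1), (cond2, act2) = rule1, rule2
    point = random.randrange(1, len(cond1))
    return (cond1[:point] + cond2[point:], random.choice([act1, act2]))

rule_a = ('1#0#', 'buy')   # hypothetical rule: if market state matches, buy
rule_b = ('01##', 'sell')  # hypothetical rule: if market state matches, sell
print(crossover_rules(rule_a, rule_b))
```

The point of the exercise is that the genetic algorithm now searches the space of behaviors rather than the space of objects: selection keeps the rules that made money, and crossover splices their conditions into new, possibly cleverer, rules.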
As these ideas percolated through the Santa Fe Institute, others became interested in genetic algorithms as a way of modeling the adaptation of rule-based agents. Doyne Farmer and Norm Packard, active members of the Institute who had collaborated in the past, founded the Prediction Company with the purpose of using similar techniques to profit from day to day fluctuations in the market. On the evidence of the nice cars driven by its founders, it was, and is, a successful venture.
These studies and ventures made it clear that we could use genetic algorithms for more than optimization - we could use them as the learning component within CAS, systems which are both pervasive and notoriously difficult to study.
At first glance, it may not seem that complex adaptive systems such as the immune system, the economic function of a city, and tropical rainforests could have much in common. But all CAS involve a diverse array of agents that adapt and learn from their interactions with one another. An individual's immune system is built upon an array of cells, proteins, and chemical factors that are constantly in flux. This vast system of interacting elements responds to all manner of pathogenic threats, protects the body from too much inflammation, and even commits to memory its fighting strategy so it can be efficient the next time it encounters the same pathogen. The buyers and sellers that comprise the economic infrastructure of a city and the diverse species in a rainforest act in similar ways.7
In looking for common elements in these complex adaptive systems, we see that all CAS exhibit lever points - points at which a small effort can produce a desired, directed effect. In the immune system, a vaccine would be an example of a lever point. From an engineering perspective, a vaccine is a solution that, with the relatively small effort of inactivating a virus or bacterium, produces a large effect: protecting an entire population from that pathogen. A virologist might design a vaccine by searching for the most immunogenic portion of the pathogen. From a CAS perspective, the solution or lever point could be something quite different: perhaps a combination of cytokines, or a timing of administration, that provides critical advantage. Just as genetic algorithms find solutions that are messier but more effective than a human would design, so a theory of CAS would find solutions that might not be obvious to the virologist. A theory of CAS would eliminate the trial and error. On a broader scale, a theory of CAS would let us transfer well-known results about one CAS to other CAS where data or understanding is lacking.8
In order to build that theorem, we are characterizing the agents or parts of a system in a way that is as general as possible, and then using genetic algorithms to simulate the learning among our modeled parts. To construct CAS agents we must first characterize the building blocks that will represent the agents in our theorem, as well as the rules that govern their interaction or combination. With our model of economic systems, we saw how genetic algorithms could recombine rules to form new methods of interacting. Our model of CAS would recombine not only the building blocks of the system, but also the rules by which they interact.
My strategy for finding a theorem that can predict lever points is to study how novel properties emerge from aggregates of building blocks.9
Critics of this theoretical approach might say that a great many researchers are already looking for the silver bullets that would cure cancer, create a malaria vaccine, or stop global warming. These experts arguably understand their systems better than anyone, so how could a CAS theorem help? Just as my student-engineers took genetic algorithms and applied them to otherwise unsolvable problems, so I think a complex adaptive systems theorem could help researchers solve problems in ways we haven't yet imagined. A good theorem won't necessarily define the lever point, but it would - if we work it out - show us where to look for an answer.