In an environment that looks random, but isn’t, it becomes particularly important to ask what it would mean behaviorally for a clump of driven matter to exhibit an emergent adaptation behavior in the presence of a subtle influence. If inexorable physical principles are biasing the gradual exploration of shape space, so that things have a specialized matching to a pattern whose nonrandomness is hard to discern, that means the dynamics of the matter in question has had to “figure out” something difficult about how to detect the pattern. If we had been given the task of discerning the pattern ourselves, either we would have had to engage in some form of intuition or abstract mathematical manipulation, or we would have had to write a computer program that could use numerical calculation to substitute for those activities. In any event, whether human or artificial, we’d be inclined to say some kind of intelligence was being brought to bear. Here, the question that suddenly confronts us is whether, in the absence of any brain, equation, or algorithm, an emergently fine-tuned assembly of particles can fairly be said to be doing something called computing—or better yet, learning.
Posing the question might be likened to asking whether a pond that refreezes after being partly melted is healing or repairing itself. Clearly, for such primitive examples, language like this is loaded and misleading, and the same goes for saying that when a ball rolls down a hillside it is “learning” to lower its altitude. What that critique leaves out, though, is the case where optimizing one’s relationship to the environment is a much less generic kind of task, and in which the environment is assumed to have a regularity that is not at all obvious. When a collection of matter we have built with our own ingenuity to be a computer spits out an accurate prediction of a complex signal, we call that computation. If a bunch of particles whose structure we did not design nevertheless behaves in a way that is just as successful at the task of prediction, we may well want to refrain from saying it is computing. But if so, we still have to acknowledge that it is doing something just as useful as what computation does.
Consider, for example, the work of Weishun Zhong, David Schwab, and Arvind Murugan carried out at the University of Chicago, where they simulated the equilibrium self-assembly of many grayscale-shaded particles whose interparticle forces had been designed to arrange the particles as pixels in multiple recognizable photographic images (such as a picture of Einstein). The authors demonstrated that when such a soup of sticky building blocks was exposed to fragmentary remnants of the original image (through some patterned, but static, external forces), the collective was able to reconstruct the whole image by leveraging what little information the fragments contained to select the matching image state and assemble into it. From one perspective, the physics of what went on in this study was always only a very high-dimensional version of a ball rolling downhill (with the fragmentary image clues acting as a way of tilting the energy landscape to help the ball roll the right way). Yet, from another perspective, remembering and reconstructing an image sounds like something useful that we make software do for us. Moving away from thermal equilibrium, in principle, should only expand the playbook available to us in terms of the types of behaviors physically allowed to occur in the system. This possibility immediately raises the question of whether self-organizing, nonequilibrium, many-particle systems might be coaxed into doing various other things that computer programs are good for.
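The "ball rolling downhill toward a stored image" picture can be made concrete with a toy associative memory. The sketch below is not the self-assembly simulation described above; it swaps in a much simpler Hopfield-style model, in which each "pixel" is a +1 or -1 value and the couplings are built by a Hebbian rule from two tiny made-up patterns. The names `hebbian_couplings`, `recall`, `image_a`, and `image_b` are hypothetical, and pinning the known pixels in place is an assumed stand-in for the static external forces.

```python
def hebbian_couplings(patterns):
    """Couplings that make each stored pattern a low-energy state (zero self-coupling)."""
    n = len(patterns[0])
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns) for j in range(n)]
            for i in range(n)]

def recall(couplings, fragment, start=1):
    """Complete a pattern from a fragment.

    `fragment` maps pixel index -> known value; those pixels stay pinned.
    Every other pixel starts at `start` and flips whenever it opposes its local field.
    """
    n = len(couplings)
    state = [fragment.get(i, start) for i in range(n)]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i in fragment:
                continue  # pinned by the fragment
            field = sum(couplings[i][j] * state[j] for j in range(n))
            if field * state[i] < 0:  # flipping this pixel lowers the energy
                state[i] = -state[i]
                changed = True
    return state

# Two made-up 8-pixel "images" (chosen to be orthogonal so they do not interfere).
image_a = [1, 1, 1, 1, -1, -1, -1, -1]
image_b = [1, -1, 1, -1, 1, -1, 1, -1]
couplings = hebbian_couplings([image_a, image_b])

# Show the system only the first four pixels of each image.
from_a = recall(couplings, {i: image_a[i] for i in range(4)})
from_b = recall(couplings, {i: image_b[i] for i in range(4)})
print(from_a == image_a, from_b == image_b)
```

With these two patterns, each four-pixel fragment is enough to pull the free pixels into the matching stored image. Larger or overlapping pattern sets would not necessarily behave so cleanly.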
Musings such as these may sound far-fetched, but there are both theoretical and empirical reasons to give them serious consideration. In simulation, we can study the behavior of “messy” many-body systems that we subject to forcing patterns much subtler than simple oscillations, and the results so far show significant promise. Random spin glasses are materials built out of collections of atoms arrayed together in such a way that they interact with their neighbors like tiny bar magnets. Each atom has a north-south arrow that results from the behavior of its electrons, and any one arrow can feel the influence of the other arrows of neighboring atoms. What makes these materials random is that there can be disorder in how the arrows couple: one pair of neighbors might feel a force trying to align their arrows north to north, whereas another pair might be trying to anti-align and match up north to south. In a big jumble of such atoms, it’s very hard to satisfy all the forces at once, because the same atom might be pointing up as the result of the influence of one neighbor, but this could be against the direction of a force being exerted by another neighbor. Accordingly, the story of when a given arrow flips, as the system tries to lower its energy, can be complicated and unpredictable.
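The difficulty of satisfying all the forces at once already shows up with three arrows. The following minimal sketch assumes the standard convention that a pair with coupling J contributes an energy of -J times the product of its two arrows (each +1 or -1), so that J = +1 favors alignment and J = -1 favors anti-alignment. It enumerates every configuration by brute force; the function names are hypothetical.

```python
from itertools import product

def energy(spins, couplings):
    """Each pair (a, b) with coupling J contributes -J * s_a * s_b."""
    return -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())

def ground_states(n, couplings):
    """Brute-force search over all 2**n arrow configurations."""
    configs = list(product((-1, 1), repeat=n))
    lowest = min(energy(c, couplings) for c in configs)
    return lowest, [c for c in configs if energy(c, couplings) == lowest]

# Three mutually neighboring arrows.
all_align = {(0, 1): 1, (1, 2): 1, (0, 2): 1}          # every pair wants to align
all_antialign = {(0, 1): -1, (1, 2): -1, (0, 2): -1}   # every pair wants to anti-align

e_easy, states_easy = ground_states(3, all_align)
e_hard, states_hard = ground_states(3, all_antialign)
print(e_easy, len(states_easy))  # -3 2
print(e_hard, len(states_hard))  # -1 6
```

When every pair wants to align, all three bonds can be satisfied and the energy reaches -3. When every pair wants to anti-align, one bond is always left unsatisfied, the best achievable energy is only -1, and six different configurations tie for it. A large jumble of such competing bonds is what makes the flipping history of a spin glass so hard to predict.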
Experiments on spin glasses have already shown that they exhibit intriguing memory effects: if you cool one down with an external bath whose gradually dropping temperature plateaus for a while at a few different temperatures you have chosen, then the magnetic properties of the material are detectably different at those same chosen temperatures when you warm things back up, as though an echo of the original treatment is stored in the configurations of all the arrows. MIT mathematics graduate student Jacob Gold and I picked up on this idea and simulated a random spin glass with a “barcoded” environment that varied over time. In addition to the forces acting between atoms, roughly half the atoms were made to feel external forces randomly pushing some arrows up and others down. After a fixed duration, a new set of random forces was generated (i.e., a new barcode), and the configuration of the arrows in the simulated material was allowed to evolve over time.
If the simulation keeps showing the atoms new random barcodes pushing some arrows up and others down, then the behavior is largely indistinguishable from random thermal fluctuations at a higher temperature than the ambient environment; in other words, it just acts as though friction is heating it up. However, things get remarkably different when a limited number of barcodes are shuffled into a deck and then reused over and over again. At each moment, the current barcode is randomly selected without any particular order, but over a longer stretch the same small set of barcodes recurs.
The first—now perhaps unsurprising—thing to notice is that the rate of energy absorption drops over time. Though the barcoded forces initially “look” random to the material, eventually it adapts to the pattern and finds a configuration that absorbs less energy from its environment. What happens, though, if we suddenly introduce a new barcode, never seen before in the history of the system dynamics? Naturally, the rate of energy absorption spikes upward, because the system has not yet adapted to a pattern that includes this novel environmental state as a possibility.
Let us recap: A messy collection of atoms gets shown a bunch of different forces from the environment, each described with its own unique barcode. Its energy absorption adapts and drops, and now will spike if we poke the system in a new way in which it has not yet been poked. Inadvertently, perhaps, we have built what one might call a novelty detector.
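The protocol just recapped can be sketched in a few lines, with heavy caveats. This is a toy under stated assumptions, not the simulation described above: it uses all-to-all couplings of +1 or -1, zero-temperature dynamics (an arrow flips only if that lowers the energy), and the energy released while settling after each barcode switch as a rough proxy for energy absorbed from the drive. All names and parameter values are hypothetical, and the sketch is not guaranteed to reproduce the reported drop and spike, which may depend on thermal noise and timescales that are left out here.

```python
import random

def make_couplings(n, rng):
    """Random symmetric couplings: each pair prefers to align (+1) or anti-align (-1)."""
    J = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.choice((-1, 1))
    return J

def make_barcode(n, rng, strength=3):
    """External push on roughly half the arrows, each randomly up or down."""
    return [rng.choice((-strength, strength)) if rng.random() < 0.5 else 0
            for _ in range(n)]

def local_field(i, spins, J, barcode):
    """Net push on arrow i from its barcode entry and from every other arrow."""
    return barcode[i] + sum(J[i][j] * spins[j] for j in range(len(spins)))

def relax(spins, J, barcode):
    """Flip arrows that oppose their local field until none does.

    Mutates `spins` in place and returns the total energy released.
    """
    released = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(spins)):
            alignment = spins[i] * local_field(i, spins, J, barcode)
            if alignment < 0:
                spins[i] = -spins[i]
                released += -2 * alignment  # energy drop from this single flip
                changed = True
    return released

rng = random.Random(0)
n = 40
J = make_couplings(n, rng)
spins = [rng.choice((-1, 1)) for _ in range(n)]
deck = [make_barcode(n, rng) for _ in range(3)]  # small deck, reused over and over

# Familiar phase: barcodes drawn at random from the deck.
familiar = [relax(spins, J, rng.choice(deck)) for _ in range(200)]

# Novelty probe: one barcode the system has never been shown.
novel_barcode = make_barcode(n, rng)
novel = relax(spins, J, novel_barcode)

print("mean release, last 50 familiar switches:", sum(familiar[-50:]) / 50)
print("release on the never-seen barcode:", novel)
```

Comparing the two printed numbers is the toy analogue of checking for a novelty spike. Whether a gap appears, and how large it is, should be treated as an empirical question about this particular sketch and not as a restatement of the published result.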
The state of the system after it has adapted embodies an implicit prediction about the future, namely: I expect to see this, this, or this, but not that. In essence, then, the mere fact that these atoms have been poked in a patterned way leads them into a state that appears to have learned to tell the precise difference between something familiar and something new and unexpected. And, interestingly, if we were to design a computer program to make that kind of judgment for us, we might be inclined to call it machine learning.
The past decade has seen tremendous growth in the power and diversity of so-called machine-learning technologies that exhibit flexible capability for discovering accurate ways of computationally modeling complex relationships previously thought to be exclusively the province of the human mind. Facial recognition and language processing are two of the most famous examples, but the common principle in many of these applications is that a long list of numerical parameters describing some way of mapping computational inputs to outputs is trained using large data sets (whether pictures of faces, a corpus of text, or whatever else). In practice, this means that a computer algorithm is trying to search through a very high-dimensional parameter space in order to find a special and exceptional choice of parameters that will enable the computational model to exhibit high-quality matching to the data used for training. This search is typically carried out by programming the parameter coordinates to traverse partly random but biased paths that tend to go downhill over time in a “landscape” whose altitude is defined by some measure of error generated by using the current parameters to map inputs to outputs.
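One common instance of such a partly random, downhill-biased search is stochastic gradient descent. The sketch below is a minimal illustration with only two parameters instead of millions: it fits a straight line to made-up data, with the randomness coming from which training example is consulted at each step. The data, the learning rate, and the function names are all assumptions chosen for illustration.

```python
import random

def mean_squared_error(w, b, data):
    """Altitude of the landscape: average squared mismatch over the training data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def sgd(data, steps=2000, lr=0.1, seed=0):
    """Stochastic gradient descent on the two parameters of a line y = w * x + b."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    errors = [mean_squared_error(w, b, data)]
    for _ in range(steps):
        x, y = rng.choice(data)      # the random part: one example picked at a time
        residual = w * x + b - y     # how wrong the current parameters are on it
        w -= lr * residual * x       # the biased part: step downhill for that example
        b -= lr * residual
        errors.append(mean_squared_error(w, b, data))
    return w, b, errors

# Made-up training data lying exactly on the line y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in (i / 10 - 1 for i in range(21))]
w, b, errors = sgd(data)
print(round(w, 3), round(b, 3))
print(errors[0], errors[-1])
```

Each individual step lowers the error only on the one example consulted, so the overall error can briefly rise along the way, but the path is biased downhill and the parameters end up close to the line that generated the data.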
If any of this sounds familiar, it is because the parallels between machine learning of this kind and the mechanism for dissipative adaptation we described in the previous chapter are numerous and significant. An assemblage of many particles has a shape or structure that is described by a high-dimensional space of possible arrangements, much the same way that a machine-learning training algorithm has a high-dimensional space of modeling parameters that determine the way it maps inputs to outputs. A machine learner is trying, over time, to change its parameters so as to reduce an error function that measures how badly it models relationships in the training data; a driven mechanical system that is gradually reducing its rate of energy absorption does so by intricately changing its shape so as to modify the way it responds to and moves with the external forcing. Moreover, just as a machine learner maps observed inputs to calculated outputs, a driven mechanical system maps the input of external forces into the outputs of dynamical behavior. Does this mean we want to say that all driven, many-particle assemblies are doing a form of machine learning? Probably not, unless we expand the definition of the term excessively. However, in a side-by-side comparison, it is apparent that the mathematical structure of what is going on in each case is similar on multiple scores, which suggests there may be an as-yet-unexplored spectrum of possibilities that could lie somewhere in between these two poles, and look like an evolving jumble of building blocks that compute something useful as they disperse and reassemble.
Excerpted from Every Life Is on Fire: How Thermodynamics Explains the Origins of Living Things by Jeremy England. Copyright © 2020 by Jeremy England. Available from Basic Books.