<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" > <channel> <title>Stephen Wolfram Writings</title> <atom:link href="https://writings.stephenwolfram.com/feed/" rel="self" type="application/rss+xml" /> <link>https://writings.stephenwolfram.com</link> <description>Just another wordpress.wolfram.com site</description> <lastBuildDate>Tue, 11 Feb 2025 18:11:41 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>https://wordpress.org/?v=4.7.2</generator> <item> <title>Towards a Computational Formalization for Foundations of Medicine</title> <link>https://writings.stephenwolfram.com/2025/02/towards-a-computational-formalization-for-foundations-of-medicine/</link> <comments>https://writings.stephenwolfram.com/2025/02/towards-a-computational-formalization-for-foundations-of-medicine/#respond</comments> <pubDate>Mon, 03 Feb 2025 23:27:46 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Computational Thinking]]></category> <category><![CDATA[Life Science]]></category> <category><![CDATA[New Kind of Science]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=66885</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2025/02/icon-medicine-v1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>A Theory of Medicine? As it’s practiced today, medicine is almost always about particulars: “this has gone wrong; this is how to fix it”. 
But might it also be possible to talk about medicine in a more general, more abstract way—and perhaps to create a framework in which one can study its essential features without […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2025/02/icon-medicine-v1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><p><img class="aligncenter" title="Towards a Computational Formalization for Foundations of Medicine" src="https://content.wolfram.com/sites/43/2025/02/medicine-hero-v3.png" alt="Towards a Computational Formalization for Foundations of Medicine" width="620" height="909" /></p> <h2 id="a-theory-of-medicine">A Theory of Medicine?</h2> <p>As it’s practiced today, medicine is almost always about particulars: “this has gone wrong; this is how to fix it”. But might it also be possible to talk about medicine in a more general, more abstract way—and perhaps to create a framework in which one can study its essential features without engaging with all of its details?</p> <p>My goal here is to take the first steps towards such a framework. And in a sense my central result is that there are many broad phenomena in medicine that seem at their core to be fundamentally computational—and to be captured by remarkably simple computational models that are readily amenable to study by computer experiment.</p> <p>I should make it clear at the outset that I’m not trying to set up a specific model for any particular aspect or component of biological systems. 
Rather, my goal is to “zoom out” and create what one can think of as a “metamodel” for studying and formalizing the abstract foundations of medicine.<span id="more-66885"></span></p> <p>What I’ll be doing builds on my recent work on using the <a href="https://www.wolframscience.com/nksonline/toc.html">computational paradigm</a> to study the <a href="https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/">foundations of biological evolution</a>. And indeed in constructing idealized organisms we’ll be using the <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">very same class of basic computational models</a>. But now, instead of considering idealized genetic mutations and asking what types of idealized organisms they produce, we’re going to be looking at specific evolved idealized organisms, and seeing what effect perturbations have on them. Roughly, the idea is that an idealized organism operates in its normal “healthy” way if there are no perturbations—but perturbations can “derail” its operation and introduce what we can think of as “disease”. And with this setup we can then think of the “fundamental problem of medicine” as being the identification of additional perturbations that can “treat the disease” and put the organism at least approximately back on its normal “healthy” track.</p> <p>As we’ll see, most perturbations lead to lots of detailed changes in our idealized organism, much as perturbations in biological organisms normally lead to vast numbers of effects, say at a molecular level. But as in medicine, we can imagine that all we can observe (and perhaps all we care about) are certain coarse-grained features or “symptoms”. And the fundamental problem of medicine is then to work out from these symptoms what “treatment” (if any) will end up being useful. 
(By the way, when I say “symptoms” I mean the whole cluster of signs, symptoms, tests, etc. that one might in practice use, say for diagnosis.)</p> <p>It’s worth emphasizing again that I’m not trying here to derive specific, actionable, medical conclusions. Rather, my goal is to build a conceptual framework in which, for example, it becomes conceivable for general phenomena in medicine that in the past have seemed at best vague and anecdotal to begin to be formalized and studied in a systematic way. At some level, what I’m trying to do is a bit like what Darwinism did for biological evolution. But in modern times there’s a critical new element: the computational paradigm, which not only introduces all sorts of new, powerful theoretical concepts, but also leads us to the practical methodology of computer experimentation. And indeed much of what follows is based on the (often surprising) results of computer experiments I’ve recently done that give us raw material to build our intuition—and structure our thinking—about fundamental phenomena in medicine. </p> <h2 id="a-minimal-metamodel">A Minimal Metamodel</h2> <p>How can we make a metamodel of medicine? We need an idealization of biological organisms and their behavior and development. We need an idealization of the concept of disease for such organisms. And we need an idealization of the concept of treatment. </p> <p>For our idealization of biological organisms we’ll use a class of simple computational systems called <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment#sect-2-1--how-do-simple-programs-behave">cellular automata</a> (that I happen to have <a href="https://www.wolframscience.com/nks/chap-1--the-foundations-for-a-new-kind-of-science#sect-1-4--the-personal-story-of-the-science-in-this-book">studied since the early 1980s</a>). 
Here’s a specific example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg1.png' alt='' title='' width='461' height='513'> </div> </p></div> <p>What’s going on here is that we’re progressively constructing the pattern on the left (representing the development and behavior of our organism) by repeatedly applying cases of the rules on the right (representing the idealized genome—and other biochemical, etc. rules—of our organism). Roughly we can think of the pattern on the left as corresponding to the “life history” of our organism—growing, developing and eventually dying as it goes down the page. And even though there’s a rather organic look to the pattern, remember that the system we’ve set up isn’t intended to provide a model for any particular real-world biological system. Rather, the goal is just for it to capture enough of the foundations of biology that it can serve as a successful metamodel to let us explore our questions about the foundations of medicine.</p> <p>Looking at our model in more detail, we see that it involves a grid of squares—or “cells” (computational, not biological)—each having one of 4 possible colors (white and three others). We start from a single red “seed” cell on the top row of the grid, then compute the colors of cells on subsequent steps (i.e. on subsequent rows down the page) by successively applying the rules on the right. The rules here are basically very simple. But we can see that when we run them they lead to a fairly complicated pattern—which in this case happens to “die out” (i.e. all cells become white) after exactly 101 steps. </p> <p>So what happens if we perturb this system? On the left here we’re showing the system as above, without perturbation. 
But on the right we’re introducing a perturbation by changing the color of a particular cell (on step 16)—leading to a rather different (if qualitatively similar) pattern:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg2.png' alt='' title='' width='543' height='446'> </div> </p></div> <p>Here are the results of some other perturbations to our system:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg3.png' alt='' title='' width='666' height='329'> </div> </p></div> <p>Some perturbations (like the one in the second panel here) quickly disappear; in essence the system quickly “heals itself”. But in most cases even single-cell perturbations like the ones here have a long-term effect. Sometimes they can “increase the lifetime” of the organism; often they will decrease it. And sometimes—like in the last case shown here—they will lead to essentially unbounded “tumor-like” growth.</p> <p>In biological or medical terms, the perturbations we’re introducing are minimal idealizations of “things that can happen to an organism” in the course of its life. Sometimes the perturbations will have little or no effect on the organism. Or at least they won’t “really hurt it”—and the organism will “live out its natural life” (or even extend it a bit). 
But in other cases, a perturbation can somehow “destabilize” the organism, in effect “making it develop a disease”, and often making it “die before its time”.</p> <p>But now we can formulate what we can think of as the “fundamental problem of medicine”: given that perturbations have had a deleterious effect on an organism, can we find subsequent perturbations to apply that will serve as a “treatment” to overcome the deleterious effect?</p> <p>The first panel here shows a particular perturbation that makes our idealized organism die after 47 steps. The subsequent panels then show various “treatments” (i.e. additional perturbations) that serve at least to “keep the organism alive”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025minimalimg4.png' alt='' title='' width='637' height='227'> </div> </p></div> <p>In the later panels here the “life history” of the organism gets closer to the “healthy” unperturbed form shown in the final panel. And if our criterion is restoring overall lifetime, we can reasonably say that the “treatment has been successful”. But it’s notable that the detailed “life history” (and perhaps “quality of life”) of the organism will essentially never be the same as before: as we’ll see in more detail later, it’s almost inevitably the case that there’ll be at least some (and often many) long-term effects of the perturbation+treatment even if they’re not considered deleterious.</p> <p>So now that we’ve got an idealized model of the “problem of medicine”, what can we say about solving it? Well, the main thing is that we can get a sense of why it’s fundamentally hard. 
And beyond anything else, the central issue is a fundamentally computational one: the phenomenon of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>. </p> <p>Given any particular cellular automaton rule, with any particular initial condition, one can always explicitly run the rule, step by step, from that initial condition, to see what will happen. But can one do better? Experience with mathematical science might make one imagine that as soon as one knows the underlying rule for a system, one should in principle immediately be able to “solve the equations” and jump ahead to work out everything about what the system does, without explicitly tracing through all the steps. But one of the central things I discovered in studying simple programs back in the early 1980s is that it’s common for such systems to show what I called computational irreducibility, which means that the only way to work out their detailed behavior is essentially just to run their rules step by step and see what happens.</p> <p>So what about biology? One might imagine that with its incremental optimization, biological evolution would produce systems that somehow <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#what-it-means-for-whats-going-on-in-biology">avoid computational irreducibility</a>, and (like simple machinery) have obvious easy-to-understand mechanisms by which they operate. But in fact that’s not what biological evolution typically seems to produce. And instead—<a href="https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/">as I’ve recently argued</a>—what it seems to do is basically just to put together randomly found “lumps of irreducible computation” that happen to satisfy its fitness criterion. 
And the result is that biological systems are full of computational irreducibility, and mostly aren’t straightforwardly “mechanically explainable”. (The presence of computational irreducibility is presumably also why theoretical biology based on mathematical models has always been so challenging.)</p> <p>But, OK, given all this computational irreducibility, how is it that medicine is even possible? How is it that we can know enough about what a biological system will do to be able to determine what treatment to use on it? Well, computational irreducibility makes it hard. But it’s a fundamental feature of computational irreducibility that within any computationally irreducible process there must always be pockets of computational reducibility. And if we’re trying to achieve only some fairly coarse objective (like maximizing overall lifetime) it’s potentially possible to leverage some pocket of computational reducibility to do this. </p> <p>(And indeed pockets of computational reducibility within computational irreducibility are what make many things possible—including having <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">understandable laws of physics</a>, doing <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/">higher mathematics</a>, etc.) 
</p> <h2 id="the-diversity-and-classification-of-disease">The Diversity and Classification of Disease</h2> <p>With our simple idealization of disease as the effect of perturbations on the life history of our idealized organism, we can start asking questions like “What is the distribution of all possible diseases?” </p> <p>And to begin exploring this, here are the patterns generated with a random sample of the 4383 possible single-point perturbations to the idealized organism we’ve discussed above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg1.png' alt='' title='' width='647' height='537'> </div> </p></div> <p>Clearly there’s a lot of variation in these life histories—in effect a lot of different symptomologies. If we average them all together we lose the detail and we just get something close to the original:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg2.png' alt='' title='' width='149' height='430'> </div> </p></div> <p>But if we look at the distribution of lifetimes, we see that while it’s peaked at the original value, it nevertheless extends to both shorter and longer values: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg3.png' alt='' title='' width='365' height='162'> </div> </p></div> <p>In medicine (or at least Western medicine) it’s been traditional to classify “things that 
can go wrong” in terms of discrete diseases. And we can imagine also doing this in our simple model. But it’s already clear from the array of pictures above that this is not going to be a straightforward task. We’ve got a different detailed pattern for every different perturbation. So how should we group them together?</p> <p>Well—much as in medicine—it depends on what we care about. In medicine we might talk about signs and symptoms, which in our idealized model we can basically identify with features of patterns. And as an example, we might decide that the only features that matter are ones associated with the boundary shape of our pattern:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg4.png' alt='' title='' width='175' height='248'> </div> </p></div> <p>So what happens to these boundary shapes with different perturbations? Here are the most frequent shapes found (together with their probabilities):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg5.png' alt='' title='' width='577' height='213'> </div> </p></div> <p>We might think of these as representing “common diseases” of our idealized organism. But what if we look at all possible “diseases”—at least all the ones produced by single-cell perturbations? 
Using boundary shape as our way to distinguish “diseases” we find that if we plot the frequency of diseases against their rank we get roughly a power law distribution (and, yes, it’s not clear <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#probabilistic-approximations">why it’s a power law</a>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg6.png' alt='' title='' width='357' height='214'> </div> </p></div> <p>What are the “rare diseases” (i.e. ones with low frequency) like? Their boundary shapes can be quite diverse:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg7.png' alt='' title='' width='661' height='234'> </div> </p></div> <p>But, OK, can we somehow quantify all these “diseases”? For example, as a kind of “imitation medical test” we might look at how far to the left the boundary of each pattern goes. 
With single-point perturbations, 84% of the time it’s the same as in the unperturbed case—but there’s a distribution of other, “less healthy” results (here plotted on a log scale)</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg8.png' alt='' title='' width='375' height='162'> </div> </p></div> <p>with extreme examples being:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg9.png' alt='' title='' width='553' height='313'> </div> </p></div> <p>And, yes, we could diagnose any pattern that goes further to the left than the unperturbed one as a case of, say, “leftiness syndrome”. And we might imagine that if we set up enough tests, we could begin to discriminate between many discrete “diseases”. But somehow this seems quite ad hoc. </p> <p>So can we perhaps be more systematic by using machine learning? Let’s say we just look at each whole pattern, then try to place it in an image <a href="https://reference.wolfram.com/language/ref/FeatureSpacePlot.html">feature space</a>, say a 2D one. 
Here’s an example of what we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg10.png' alt='' title='' width='619' height='625'> </div> </p></div> <p>The details of this depend on the particulars of the machine learning method we’ve used (here the default <tt><a href="http://reference.wolfram.com/language/ref/FeatureSpacePlot.html">FeatureSpacePlot</a></tt> method in <a href="https://www.wolfram.com/language/">Wolfram Language</a>). But it’s a fairly robust result that “visually different” patterns end up separated—so that in effect the machine learning is successfully automating some kind of “visual diagnosis”. And there’s at least a little evidence that the machine learning will identify separated clusters of patterns that we can reasonably identify as “truly distinct diseases”—even as the more common situation is that between any two patterns, there are intermediate ones that aren’t neatly classified as one disease or the other.</p> <p>Somewhat in the style of the human “International Classification of Diseases” (ICD), we can try arranging all our patterns in a hierarchy—though it’s basically inevitable that we’ll always be able to subdivide further, and there’ll never be a clear point at which we can say “we’ve classified all the diseases”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/02/sw02032025treeclusterAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg11.png' alt='' title='' width='605' height='382'> </div> </p></div> <p>By the way, in addition to talking about possible diseases, we also need to discuss what counts as “healthy”. 
We could say that our organism is only “healthy” if its pattern is exactly what it would be without any perturbation (“the natural state”). But what probably better captures everyday medical thinking is to say that our organism should be considered “healthy” if it doesn’t have symptoms (or features) that we consider bad. And in particular, at least “after the fact” we might be able to say that it must have been healthy if its lifetime turned out to be long. </p> <p>It’s worth noting that even in our simple model, while there are many perturbations that reduce lifetime, there are also perturbations that increase lifetime. In the course of biological evolution, genetic mutations of the overall underlying rules for our idealized organism might have managed to achieve a certain longevity. But the point is that nothing says “longevity perturbations” applied “during the life of the organism” can’t get further—and indeed here are some examples where they do:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg12.png' alt='' title='' width='565' height='267'> </div> </p></div> <p>And, actually, in a feature that’s not (at least yet) reflected in human medicine, there are perturbations that can make the lifetime very significantly longer. 
And for the particular idealized organism we’re studying here, the most extreme examples obtained with single-point perturbations are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg13.png' alt='' title='' width='645' height='256'> </div> </p></div> <p>OK, but what happens if we consider perturbations at multiple points? There are immediately vastly more possibilities. Here are some examples of the 10 million or so possible configurations of two perturbations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg14.png' alt='' title='' width='599' height='265'> </div> </p></div> <p>And here are examples with three perturbations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg15.png' alt='' title='' width='599' height='265'> </div> </p></div> <p>Here are examples if we try to apply five perturbations (though sometimes the organism is “already dead” before we can apply later perturbations):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg16.png' alt='' title='' width='599' height='264'> </div> </p></div> <p>What happens to the overall distribution of lifetimes in these 
cases? Already with two perturbations, the distribution gets much broader, and with three or more, the peak at the original lifetime has all but disappeared, with a new peak appearing for organisms that in effect die almost immediately:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg17.png' alt='' title='' width='625' height='293'> </div> </p></div> <p>In other words, the particular idealized organism that we’re studying is fairly robust against one perturbation, and perhaps even two, but with more perturbations it’s increasingly likely to succumb to “infant mortality”. (And, yes, if one increases the number of perturbations the “life expectancy” progressively decreases.)</p> <p>But what about the other way around? With multiple perturbations, can the organism in effect “live forever”? Here are some examples where it’s still “going strong” after 300 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg18.png' alt='' title='' width='645' height='242'> </div> </p></div> <p>But after 500 steps most of these have died out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg19.png' alt='' title='' width='647' height='363'> </div> </p></div> <p>As is typical in the computational universe (perhaps like in medicine) there are always surprises, courtesy of computational irreducibility. 
Like the sudden appearance of the obviously periodic case (with period 25):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg20.png' alt='' title='' width='183' height='249'> </div> </p></div> <p>As well as the much more complicated cases (where in the final pictures the pattern has been “rectified”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg21.png' alt='' title='' width='668' height='283'> </div> </p></div> <p>So, yes, in these cases the organism does in effect “live forever”—though not in an “interesting” way. And indeed such cases might remind us of tumor-like behavior in biological organisms. But what about a case that not only lives forever, but also grows forever? Well, needless to say, lurking out in the computational universe, one can find an example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diversityimg22.png' alt='' title='' width='535' height='315'> </div> </p></div> <p>The “incidence” of this behavior is about one in a million for 2 perturbations (or, more precisely, 7 out of 9.6 million possibilities), and one in 300,000 for 3 perturbations. And although there presumably are even more complicated behaviors out there to find, they don’t show up with 2 perturbations, and their incidence with 3 perturbations is below about one in 100 million. 
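</p> <p>The perturbation experiments described in this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the actual rule for the idealized organism is shown only as an image above and is not reproduced here, so <tt>RULE</tt> below is a hypothetical toy 4-color nearest-neighbor rule, and the seed color and the names <tt>step</tt>, <tt>lifetime</tt> and <tt>single_point_lifetimes</tt> are likewise invented for this sketch.</p>

```python
from collections import Counter
from itertools import product

K = 4  # number of cell colors; 0 plays the role of "white"

# Hypothetical toy rule (NOT the article's rule): each cell takes the largest
# color in its 3-cell neighborhood, minus one, floored at 0.
RULE = {n: max(max(n) - 1, 0) for n in product(range(K), repeat=3)}


def step(cells):
    """Apply RULE once to a sparse row mapping position to nonzero color."""
    if not cells:
        return {}
    new = {}
    for x in range(min(cells) - 1, max(cells) + 2):
        n = (cells.get(x - 1, 0), cells.get(x, 0), cells.get(x + 1, 0))
        if RULE[n] != 0:
            new[x] = RULE[n]
    return new


def lifetime(perturbations=(), max_steps=1000):
    """Return the step at which all cells are white, or None if still alive.

    Each perturbation is (step, position, color): that cell is overwritten
    right after the given step has been computed.
    """
    cells = {0: 3}  # single seed cell on step 0 (seed color is an assumption)
    for t in range(1, max_steps + 1):
        cells = step(cells)
        for s, x, color in perturbations:
            if s == t:
                if color == 0:
                    cells.pop(x, None)
                else:
                    cells[x] = color
        if not cells:
            return t
    return None


def single_point_lifetimes(healthy_lifetime):
    """Tally lifetimes over single-cell perturbations inside the light cone.

    Recoloring a cell to the color it already has is included as a no-op.
    """
    counts = Counter()
    for t in range(1, healthy_lifetime):
        for x in range(-t, t + 1):
            for color in range(K):
                counts[lifetime([(t, x, color)])] += 1
    return counts
```

<p>With this toy rule the unperturbed organism dies after 3 steps, while <tt>lifetime([(2, 0, 3)])</tt> gives 5, a simple case of a single-cell perturbation that extends the lifetime. Swapping in a different rule table changes the numbers but not the structure of the experiment.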
</p> <h2 id="diagnosis--prognosis">Diagnosis & Prognosis</h2> <p>A fundamental objective in medicine is to predict what will happen from the tests we do or the symptoms and signs we observe. And, yes, we now know that computational irreducibility inevitably makes this hard in general. But we also know from experience that a certain amount of prediction is possible—which we can now interpret as successfully managing to tap into pockets of computational reducibility.</p> <p>So as an example, let’s ask what the prognosis is for our idealized organism based on the width of its pattern, measured at a certain step. So here, for example, is what happens to the original lifetime distribution (in green) if we consider only cases where the width of the measured pattern after 25 steps is less than its unperturbed (“healthy”) value (and where we’re dropping the 1% of cases when the organism was “already dead” before 25 steps):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg1.png' alt='' title='' width='360' height='164'> </div> </p></div> <p>Our “narrow” cases represent about 5% of the total. Their median lifetime is 57, as compared with the overall median of 106. But clearly the median alone does not tell the whole story. 
And nor do the two survival curves:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg2.png' alt='' title='' width='358' height='160'> </div> </p></div> <p>And, for example, here are the actual widths as a function of time for all the narrow cases, compared to the sequence of widths for the unperturbed case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg3.png' alt='' title='' width='617' height='139'> </div> </p></div> <p>These pictures don’t make it look promising that one could predict lifetime from the single test of whether the pattern was narrow at step 25. Like in analogous medical situations, one needs more data. One approach in our case is to look at actual “narrow” patterns (up to step 25)—here sorted by ultimate lifetime—and then to try to identify useful predictive features (though, for example, to attempt any serious machine learning training would require a lot more examples): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg4.png' alt='' title='' width='634' height='227'> </div> </p></div> <p>But perhaps a simpler approach is not just to do a discrete “narrow or not” test, but rather to look at the actual width at step 25. 
So here are the lifetimes as a function of width at step 25</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg5.png' alt='' title='' width='357' height='231'> </div> </p></div> <p>and here’s the distribution of outcomes, together with the median in each case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg6.png' alt='' title='' width='357' height='230'> </div> </p></div> <p>The predictive power of our width measurement is obviously quite weak (though there’s doubtless a way to “hack <em>p</em> values” to get at least something out). And, unsurprisingly, machine learning doesn’t help. Like here’s a <a href="https://reference.wolfram.com/language/ref/Predict.html">machine learning prediction</a> (based on decision tree methods) for lifetime as a function of width (that, yes, is very close to just being the median):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg7.png' alt='' title='' width='311' height='202'> </div> </p></div> <p>Does it help if we use more history? In other words, what happens if we make our prediction not just from the width at a particular step, but from the history of all widths up to that point? 
As one approach, we can make a collection of “training examples” of what lifetimes particular “width histories” (say up to step 25) lead to:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw01312025diagnosisimg8A.png' alt='' title='' width='663' height='121'> </div> </p></div> <p>There’s already something of an issue here, because a given width history—which, in a sense is a “coarse graining” of the detailed “microscopic” history—can lead to multiple different final lifetimes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw02032525diagnosisbracesimg9.png' alt='' title='' width='596' height='49'> </div> </p></div> <p>But we can still go ahead and try to use machine learning to predict lifetimes from width histories based on training on (say, half) of our training data—yielding less than impressive results (with the vertical line being associated with multiple lifetimes from a single width history in the training data):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025diagnosisimg10.png' alt='' title='' width='357' height='160'> </div> </p></div> <p>So how can we do better? Well, given the underlying setup for our system, if we could determine not just the width but the whole precise sequence of values for all cells, even just at step 25, then in principle we could use this as an “initial condition” and run the system forward to see what it does. 
But regardless of it being “medically implausible” to do this, it isn’t much of a prediction anyway; it’s more just “watch and see what happens”. And the point is that insofar as there’s computational irreducibility, one can’t expect—at least in full generality—to do much better. (And, as we’ll argue later, there’s no reason to think that organisms produced by biological evolution will avoid computational irreducibility at this level.)</p> <p>But still, within any computationally irreducible system, there are always pockets of computational reducibility. So we can expect that there will be some predictions that can be made. But the question is whether those predictions will be about things we care about (like lifetime) or even about things we can measure. Or, in other words, will they be predictions that speak to things like symptoms? </p> <p>Our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>, for example, involves all sorts of underlying processes that are computationally irreducible. But the key point there is that what <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">physical observers like us</a> perceive are aggregate constructs (like overall features of space) that show significant computational reducibility. And in a sense there’s an analogous issue here: there’s computational irreducibility underneath, but what do “medical observers” actually perceive, and are there computationally reducible features related to that? If we could find such things, then in a sense we’d have identified “general laws of medicine” much like we now have “general laws of physics”. </p> <h2 id="the-problem-of-finding-treatments">The Problem of Finding Treatments</h2> <p>We’ve talked a bit about giving a prognosis for what will happen to an idealized organism that’s suffered a perturbation. But what about trying to fix it? 
What about trying to intervene with another “treatment perturbation” that can “heal” the system, and give it a life history that’s at least close to what it would have had without the original perturbation?</p> <p>Here’s our original idealized organism, together with how it behaves when it “suffers” a particular perturbation that significantly reduces its lifetime:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg1.png' alt='' title='' width='212' height='299'> </div> </p></div> <p>But what happens if we now try applying a second perturbation? Here are a few random examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg2.png' alt='' title='' width='652' height='220'> </div> </p></div> <p>None of these examples convincingly “heal” the system. But let’s (as we can in our idealized model) just enumerate all possible second perturbations (here 1554 of them). Then it turns out that a few of these do in fact successfully give us patterns that at least exactly reproduce the original lifetime:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg3.png' alt='' title='' width='649' height='224'> </div> </p></div> <p>Do these represent true examples of “healing”? Well, it depends on what we mean. 
Yes, they’ve managed to make the lifetime exactly what it would have been without the original “disease-inducing” perturbation. But in essentially all cases we see here that there are various “long-term side effects”—in the sense that the detailed patterns generated end up having obvious differences from the original unperturbed “healthy” form.</p> <p>The one exception here is the very first case, in which the “disease was caught early enough” that the “treatment perturbation” manages to completely heal the effects of the “disease perturbation”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg4.png' alt='' title='' width='382' height='329'> </div> </p></div> <p>We’ve been talking here about intervening with “treatment perturbations” to “heal” our “disease perturbation”. But actually it turns out that there are plenty of “disease perturbations” which automatically “heal themselves”, without any “treatment” intervention. In fact, of all possible 4383 single perturbations, 380 essentially heal themselves. 
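</p> <p>One way to make “heals itself” concrete is to ask whether the perturbed pattern ever becomes identical, at the same step, to the unperturbed one. The following Python sketch does that under assumptions of my own choosing for illustration: the same <em>k</em> = 4, <em>r</em> = 1 setup with the rule as a list of 64 outcomes, a perturbation modeled as setting one cell at one step to a different color, and hypothetical names <code>run</code> and <code>heals_after</code>. It is not the procedure that produced the count of 380.</p>

```python
def step(rule, state):
    """One step of a k=4, r=1 cellular automaton (rule[0] == 0 assumed).

    rule  : list of 64 outcomes, indexed by 16*left + 4*center + right
    state : dict mapping position -> nonzero color (white cells omitted)
    """
    if not state:
        return {}
    new = {}
    for x in range(min(state) - 1, max(state) + 2):
        case = 16 * state.get(x - 1, 0) + 4 * state.get(x, 0) + state.get(x + 1, 0)
        if rule[case]:
            new[x] = rule[case]
    return new


def run(rule, init, steps):
    """Return the list of states for steps 0..steps."""
    history = [dict(init)]
    for _ in range(steps):
        history.append(step(rule, history[-1]))
    return history


def heals_after(rule, init, t, x, color, max_steps=500):
    """Set cell x at step t to `color` (assumed form of a perturbation).

    Returns how many steps later the perturbed pattern again equals the
    unperturbed one at the same step (after which they agree forever),
    or None if that does not happen within max_steps.
    """
    healthy = run(rule, init, max_steps)
    sick = dict(healthy[t])
    if sick.get(x, 0) == color:
        raise ValueError("perturbation does not change the cell")
    if color:
        sick[x] = color
    else:
        del sick[x]
    for s in range(t + 1, max_steps + 1):
        sick = step(rule, sick)
        if sick == healthy[s]:
            return s - t
    return None


# Toy usage: a rule in which an isolated cell of color 1 simply persists.
persist = [0] * 64
persist[4] = 1  # case (0, 1, 0) -> 1
print(heals_after(persist, {0: 1}, t=2, x=5, color=2, max_steps=20))
```

<p>Because the rule is deterministic, a single step at which the two patterns coincide is enough: from then on the perturbed and unperturbed histories are identical.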
</p> <p>In many cases, the “healing” happens very locally, after one or two steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg5.png' alt='' title='' width='583' height='343'> </div> </p></div> <p>But there are also more complicated cases, where perturbations produce fairly large-scale changes in the pattern—that nevertheless “spontaneously heal themselves”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg6.png' alt='' title='' width='379' height='373'> </div> </p></div> <p>(Needless to say, in cases where a perturbation “spontaneously heals itself”, adding a “treatment perturbation” will almost always lead to a worse outcome.)</p> <p>So how should we think about perturbations that spontaneously heal themselves? They’re like seeds for diseases that never take hold, or like diseases that quickly burn themselves out. But from a theoretical point of view we can think of them as being where the unperturbed life history of our idealized organism is acting as an <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-7--the-notion-of-attractors">attractor</a>, to which certain perturbed states inexorably converge—a bit like how friction can dissipate perturbations to patterns of motion in a mechanical system. </p> <p>But let’s say we have a perturbation that doesn’t “spontaneously heal itself”. Then to remediate it we have to “do the medical thing” and in our idealized model try to find a “treatment perturbation”. So how might we systematically set about doing that? 
Well, in general, computational irreducibility makes it difficult. And as one indication of this, this shows what lifetime is achieved by “treatment perturbations” made at each possible point in the pattern (after the initial perturbation):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg8B.png' alt='' title='' width='380' height='295'> </div> </p></div> <p>We can think of this as providing a map of what the effects of different treatment perturbations will be. Here are some other examples, for different initial perturbations (or, in effect, different “diseases”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg9A.png' alt='' title='' width='600' height='233'> </div> </p></div> <p>There’s some regularity here. But the main observation is that different detailed choices of treatment perturbations will often have very different effects. In other words, even “nearby treatments” will often lead to very different outcomes. Given computational irreducibility, this isn’t surprising. But in a sense it underscores the difficulty of finding and applying “treatments”. By the way, cells indicated in dark red above are ones where treatment leads to a pattern that lives “excessively long”—or in effect shows tumor-like characteristics. And the fact that these are scattered so seemingly randomly reflects the difficulty of predicting whether such effects will occur as a result of treatment.</p> <p>In what we’ve done so far here, our “treatment” has always consisted of just a single additional perturbation. 
But what about applying more perturbations? For example, let’s say we do a series of experiments where after our first “treatment perturbation” we progressively try other treatment perturbations. If a given additional perturbation doesn’t take the lifetime further from the desired value, we keep it. Otherwise we reject it, and try another perturbation. Here’s an example of what happens if we do this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg9.png' alt='' title='' width='661' height='192'> </div> </p></div> <p>The highlighted panels represent perturbations we kept. And here’s how the overall lifetime “converges” over successive iterations in our experiment:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg10.png' alt='' title='' width='357' height='159'> </div> </p></div> <p>In what we just did, we allowed additional treatment perturbations to be added at any subsequent step. But what if we require treatment perturbations to always be added on successive steps—starting right after the “disease perturbation” occurred? 
Here’s an example of what happens in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg11.png' alt='' title='' width='503' height='192'> </div> </p></div> <p>And here’s what we see zooming in at the beginning:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg12.png' alt='' title='' width='523' height='196'> </div> </p></div> <p>In a sense this corresponds to “doing aggressive treatment” as soon as the initial “disease perturbation” has occurred. And a notable feature of the particular example here is that when our succession of treatment perturbations have succeeded in “restoring the lifetime” (which happens fairly quickly), the life history they produce is similar (though not identical) to the original unperturbed case. </p> <p>That definitely doesn’t always happen, as this example illustrates—but it’s fairly common:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025problemimg13.png' alt='' title='' width='464' height='192'> </div> </p></div> <p>It’s worth pointing out that if we allowed ourselves to do many single perturbations at the same time (i.e. on the same row of the pattern) we could effectively just “define new initial conditions” for the pattern, and, for example, perfectly “regenerate” the original unperturbed pattern after this “reset”. 
And in general we can imagine in effect “hot-wiring” the organism by applying large numbers of treatment perturbations that just repeatedly direct it back to its unperturbed form.</p> <p>But such extensive and detailed “intervention”—that in effect replaces the whole state of the organism—seems far from what might be practical in typical (current) medicine (except perhaps in some kind of “regenerative treatment”). And indeed in actual (current) medicine one is normally operating in a situation where one does not have anything close to perfect “cell-by-cell” information on the state of an organism—and instead one has to figure out things like what treatment to give based on much coarser “symptom-level” information. (In some ways, though, the immune system does something closer to cell-by-cell “treatment”.)</p> <p>So what can one do given coarse-grained information? As one example, let’s consider trying to predict what treatment perturbation will be best using the kind of pattern-width information we discussed above. Specifically, let’s say that we have the history of the overall width of a pattern up to a particular point, then from this we want to predict what treatment perturbation will lead to the best lifetime outcome for the system. There are a variety of ways we could approach this, but one is to make predictions of where to apply a treatment perturbation using machine learning trained on examples of optimal such perturbations.</p> <p>This is analogous to what we did in the previous section in applying machine learning to predict lifetime from width history. But now we want to predict from width history what treatment perturbation to apply. To generate our training data we can search for treatment perturbations that lead to the unperturbed lifetime when starting from life histories with a given width history. Now we can use a simple neural net to create a predictor that tries to tell us from a width history what “treatment to give”. 
And here are comparisons between our earlier search results based on looking at complete life histories—and (shown with red arrows) the machine learning predictions based purely on width history before the original disease perturbation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg14.png' alt='' title='' width='637' height='381'> </div> </p></div> <p>It’s clear that the machine learning is doing something—though it’s not as impressive as perhaps it looks, because a wide range of perturbations all in fact give rather similar life histories. So as a slightly more quantitative indication of what’s going on, here’s the distribution of lifetimes achieved by our machine-learning-based therapy:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg15.png' alt='' title='' width='398' height='174'> </div> </p></div> <p>Our “best treatment” was able to give lifetime 101 in all these cases. 
And while the distribution we’ve now achieved looks peaked around the unperturbed value, dividing this distribution by what we’d get without any treatment at all makes it clear that not so much was achieved by the machine learning we were able to do: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw02032025problemnewimg16.png' alt='' title='' width='357' height='160'> </div> </p></div> <p>And in a sense this isn’t surprising; our machine learning—based, as it is, on coarse-grained features—is <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">quite weak compared to the computational irreducibility</a> of the underlying processes at work.</p> <h2 id="the-effect-of-genetic-diversity">The Effect of Genetic Diversity</h2> <p>In what we’ve done so far, we’ve studied just a single idealized organism—with a single set of underlying “genetic rules”. But in analogy to the situation with humans, we can imagine a whole population of genetically slightly different idealized organisms, with different responses to perturbations, etc. </p> <p>Many changes to the underlying rules for our idealized organism will lead to unrecognizably different patterns, that don’t, for example, have the kind of finite-but-long lifetimes we’ve been interested in. But it turns out that in the rules for our particular idealized organism there are some specific changes that actually don’t have any effect at all—at least on the unperturbed pattern of behavior. 
And the reason for this is that in generating the unperturbed pattern these particular cases in the rule happen never to be used:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg1.png' alt='' title='' width='576' height='115'> </div> </p></div> <p>And the result is that any one of the 4<sup>3</sup> = 64 possible choices of outcomes for those cases in the rule will still yield the same unperturbed pattern. If there’s a perturbation, however, different cases in the rule can be sampled—including these ones. It’s as if cases in the rule that are initially “non-coding” end up being “coding” when the path of behavior is changed by a perturbation. (Or, said differently, it’s like different genes being activated when conditions are different.)</p> <p>So to make an idealized model of something like a population with genetic diversity, we can look at what happens with different choices of our (initially) “non-coding” rule outcomes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg3.png' alt='' title='' width='572' height='372'> </div> </p></div> <p>Before the perturbation, all these inevitably show the same behavior, because they’re never sampling “non-coding” rule cases. But as soon as there’s a perturbation, the pattern is changed, and after varying numbers of steps, previously “non-coding” rule cases do get sampled—and can affect the outcome. 
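</p> <p>The idea of “non-coding” rule cases can be sketched in a few lines of Python: record which of the 64 cases are actually evaluated while generating the unperturbed pattern, and then enumerate every reassignment of the rest. This is an illustrative sketch under assumptions (rule stored as a list of 64 outcomes indexed by 16·left + 4·center + right, white coded as 0, hypothetical names <code>used_cases</code> and <code>variants</code>); the toy “countdown” rule in the usage lines is made up for the example and is not our model organism.</p>

```python
from itertools import product

K = 4  # number of colors; 0 is treated as white (assumption)


def step(rule, state):
    """One step of a k=4, r=1 cellular automaton (rule[0] == 0 assumed)."""
    if not state:
        return {}
    new = {}
    for x in range(min(state) - 1, max(state) + 2):
        case = 16 * state.get(x - 1, 0) + 4 * state.get(x, 0) + state.get(x + 1, 0)
        if rule[case]:
            new[x] = rule[case]
    return new


def used_cases(rule, init, max_steps=500):
    """Rule cases (0..63) that are evaluated while running the pattern."""
    used = {0}  # the all-white background case is always in play
    state = dict(init)
    for _ in range(max_steps):
        if not state:
            break
        for x in range(min(state) - 1, max(state) + 2):
            used.add(16 * state.get(x - 1, 0) + 4 * state.get(x, 0) + state.get(x + 1, 0))
        state = step(rule, state)
    return used


def variants(rule, free_cases):
    """Yield every rule obtained by reassigning the outcomes of free_cases."""
    free = sorted(free_cases)
    for outcomes in product(range(K), repeat=len(free)):
        new_rule = list(rule)
        for case, value in zip(free, outcomes):
            new_rule[case] = value
        yield new_rule


# Toy usage: an isolated cell counts down 3 -> 2 -> 1 -> white.
countdown = [0] * 64
countdown[12] = 2  # case (0, 3, 0) -> 2
countdown[8] = 1   # case (0, 2, 0) -> 1
unused = set(range(64)) - used_cases(countdown, {0: 3})
three_free = sorted(unused)[:3]
print(len(list(variants(countdown, three_free))))  # 4**3 variants
```

<p>With three free cases and four possible outcomes each, the enumeration gives 4<sup>3</sup> = 64 rules, all of which generate the same unperturbed pattern by construction, since the reassigned cases are never consulted until a perturbation changes the path of behavior.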
</p> <p>Here are the distinct cases of what happens in all 64 “genetic variants”—with the red arrow in each case indicating where the pattern first differs from what it is with our original idealized organism:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg4.png' alt='' title='' width='593' height='411'> </div> </p></div> <p>And here is then the distribution of lifetimes achieved—in effect showing the differing consequences of this particular “disease perturbation” on all our genetic variants:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg5.png' alt='' title='' width='357' height='161'> </div> </p></div> <p>What happens with other “disease perturbations”? Here’s a sample of distributions of lifetimes achieved (where “__” corresponds to cases where all 64 genetic variants yield the same lifetime):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg6.png' alt='' title='' width='530' height='153'> </div> </p></div> <p>OK, so what about the overall lifetime distribution across all (single) perturbations for each of the genetic variants? The detailed distribution we get is different for each variant. 
But their general shape is always remarkably similar</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg7.png' alt='' title='' width='479' height='297'> </div> </p></div> <p>though taking differences from the case of our original idealized organism reveals some structure:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg8.png' alt='' title='' width='474' height='295'> </div> </p></div> <p>As another indication of the effect of genetic diversity, we can plot the survival curve averaged over all perturbations, and compare the case for our original idealized organism with what happens if we average equally over all 64 genetic variants. The difference is small, but there is a longer tail for the average of the genetic variants than for our specific original idealized organism:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg9.png' alt='' title='' width='513' height='224'> </div> </p></div> <p>We’ve seen how our idealized genetic variation affects “disease”. But how does it affect “treatment”? For the “disease” above, we already saw that there’s a particular “treatment perturbation” that successfully returns our original idealized organism to its “natural lifespan”. So what happens if we apply this same treatment across all the genetic variants? 
In effect this is like doing a very idealized “clinical trial” of our potential treatment. And what we see is that the results are quite diverse—and indeed more diverse than from the disease on its own:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025geneticimg10.png' alt='' title='' width='613' height='535'> </div> </p></div> <p>In essence what we’re seeing is that, yes, there are some genetic variants for which the treatment still works. But there are many for which there are (often fairly dramatic) side effects. </p> <h2 id="biological-evolution-and-our-model-organism">Biological Evolution and Our Model Organism</h2> <p>So where did the particular rule for the “model organism” we’ve been studying come from? Well, we evolved it—using a slight generalization of the idealized model for biological evolution that <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">I recently introduced</a>. The goal of our evolutionary process was to find a rule that generates a pattern that lives as long as possible, but not infinitely long—and that does so robustly even in the presence of perturbations. In essence we used lifetime (or, more accurately, “lifetime under perturbation”) as our “fitness function”, then progressively evolved our rule (or “genome”) by random mutations to try to maximize this fitness function.</p> <p>In more detail, we started from the null (“everything turns white”) rule, then successively made random changes to single cases in the rule (“point mutations”)—keeping the resulting rule whenever the pattern it generated had a lifetime (under perturbation) that was finite and not smaller than before. 
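</p> <p>The adaptive procedure just described can be sketched as a short loop. The Python below is a minimal illustration under assumptions, not the code used for the results here: the rule is a list of 64 outcomes, case 0 is never mutated so the background stays white, the pattern starts from a single cell of color 1, and the names <code>lifetime</code> and <code>evolve</code> are hypothetical. The fitness is passed in as a function, and for brevity the example call uses plain unperturbed lifetime as a stand-in for lifetime under perturbation.</p>

```python
import random

K = 4  # number of colors; 0 is treated as white (assumption)


def step(rule, state):
    """One step of a k=4, r=1 cellular automaton (rule[0] == 0 assumed)."""
    if not state:
        return {}
    new = {}
    for x in range(min(state) - 1, max(state) + 2):
        case = 16 * state.get(x - 1, 0) + 4 * state.get(x, 0) + state.get(x + 1, 0)
        if rule[case]:
            new[x] = rule[case]
    return new


def lifetime(rule, init, max_steps=500):
    """Steps until the pattern is all white; None if it outlives max_steps."""
    state = dict(init)
    for t in range(max_steps + 1):
        if not state:
            return t
        state = step(rule, state)
    return None


def evolve(fitness, n_steps, seed=0):
    """Adaptive evolution by point mutations, starting from the null rule.

    fitness(rule) returns a number, or None for an "infinite" lifetime.
    A mutant is kept when its fitness is finite and not smaller.
    Returns the final rule and the list of accepted fitness values.
    """
    rng = random.Random(seed)
    rule = [0] * 64
    best = fitness(rule)
    accepted = [best]
    for _ in range(n_steps):
        mutant = list(rule)
        # Point mutation: one case gets a random outcome (case 0 stays white).
        mutant[rng.randrange(1, 64)] = rng.randrange(K)
        f = fitness(mutant)
        if f is not None and f >= best:
            rule, best = mutant, f
            accepted.append(f)
    return rule, accepted


# Stand-in fitness: unperturbed lifetime from a single cell of color 1.
final_rule, history = evolve(lambda r: lifetime(r, {0: 1}, 60), n_steps=150)
print(history[-1])
```

<p>Accepting mutants whose fitness is equal, not just larger, matters: it lets the rule drift through neutral changes between “breakthroughs”.</p> <p>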
And with this setup, here’s the particular (random) sequence of rules we got (showing for each rule the outcome for each of its 64 cases):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg1.png' alt='' title='' width='384' height='224'> </div> </p></div> <p>Many of these rules don’t “make progress” in the sense that they increase the lifetime under perturbation. But every so often there’s a “breakthrough”, and a rule with a longer lifetime under perturbation is reached:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg2.png' alt='' title='' width='568' height='473'> </div> </p></div> <p>And, as we see, the rule for the particular model organism we’ve been using is what’s reached at the end.</p> <p>In <a href="https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/">studying my recent idealized model for biological evolution</a>, I considered fitness functions like lifetime that can directly be computed just by running the underlying rule from a certain initial condition. But here I’m generalizing that a bit, and considering as a fitness function not just lifetime, but “lifetime under perturbation”, computed by taking a particular rule, and finding the minimum lifetime of all patterns produced by it with certain random perturbations applied. 
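</p> <p>To make “lifetime under perturbation” concrete, here is a hedged Python sketch of such a fitness function. The details are assumptions made for illustration: each perturbation changes one randomly chosen cell, within the extent of the pattern, to a different color; each perturbation is tried in a separate run; perturbations are applied on roughly a tenth of the rows of the unperturbed pattern; and the names <code>lifetime_from</code> and <code>lifetime_under_perturbation</code> are hypothetical.</p>

```python
import random

K = 4  # number of colors; 0 is treated as white (assumption)


def step(rule, state):
    """One step of a k=4, r=1 cellular automaton (rule[0] == 0 assumed)."""
    if not state:
        return {}
    new = {}
    for x in range(min(state) - 1, max(state) + 2):
        case = 16 * state.get(x - 1, 0) + 4 * state.get(x, 0) + state.get(x + 1, 0)
        if rule[case]:
            new[x] = rule[case]
    return new


def lifetime_from(rule, state, t0, max_steps):
    """Total lifetime if the pattern is in `state` at step t0 (None if it
    outlives max_steps)."""
    state = dict(state)
    for t in range(t0, max_steps + 1):
        if not state:
            return t
        state = step(rule, state)
    return None


def lifetime_under_perturbation(rule, init, seed=0, max_steps=500):
    """Minimum lifetime over the unperturbed run and some perturbed runs."""
    rng = random.Random(seed)
    # Unperturbed history: one state per row, up to death or the cutoff.
    history, state = [], dict(init)
    while state and len(history) <= max_steps:
        history.append(state)
        state = step(rule, state)
    if state:
        return None  # unperturbed pattern outlives the cutoff
    if not history:
        return 0
    lifetimes = [len(history)]
    # Assumed scheme: one random cell on about a tenth of the rows.
    rows = rng.sample(range(len(history)), max(1, len(history) // 10))
    for t in rows:
        row = dict(history[t])
        x = rng.randint(min(row), max(row))
        old = row.get(x, 0)
        new = rng.choice([c for c in range(K) if c != old])
        if new:
            row[x] = new
        else:
            del row[x]
        life = lifetime_from(rule, row, t, max_steps)
        if life is None:
            return None  # a perturbed run lived "forever"
        lifetimes.append(life)
    return min(lifetimes)


# Toy usage: an isolated cell that counts down 3 -> 2 -> 1 -> white.
countdown = [0] * 64
countdown[12] = 2
countdown[8] = 1
print(lifetime_under_perturbation(countdown, {0: 3}, max_steps=50))
```

<p>Taking the minimum means a single fragile response to a perturbation is enough to pull the fitness down, which is what pushes the evolution towards robust rules.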
</p> <p>So, for example, here the “lifetime under perturbation” would be considered to be the minimum of the lifetimes generated with no perturbation, and with certain random perturbations—or in this case 60:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg3.png' alt='' title='' width='618' height='177'> </div> </p></div> <p>This plot then illustrates how the (lifetime-under-perturbation) fitness (indicated by the blue line) behaves in the course of our adaptive evolution process, right around where the fitness-60 “breakthrough” above occurs:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg4.png' alt='' title='' width='638' height='233'> </div> </p></div> <p>What’s happening in this plot? At each adaptive step, we’re considering a new rule, obtained by a point mutation from the previous one. Running this rule we get a certain lifetime. If this lifetime is finite, we indicate it by a green dot. Then we apply a certain set of random perturbations—indicating the lifetimes we get by gray dots. (We could imagine using all sorts of schemes for picking the random perturbations; here what we’re doing is to perturb random points on about a tenth of the rows in the unperturbed pattern.) </p> <p>Then the minimum lifetime for any given rule we indicate by a red dot—and this is the fitness we assign to that rule. 
So now we can see the whole progression of our adaptive evolution process:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg5.png' alt='' title='' width='667' height='242'> </div> </p></div> <p>One thing that’s notable is that the unperturbed lifetimes (green dots) are considerably larger than the final minimum lifetimes (red dots). And what this means is that our requirement of “robustness”, implemented by looking at lifetime under perturbation rather than just unperturbed lifetime, considerably reduces the lifetimes we can reach. In other words, if our idealized organism is going to be robust, it won’t tend to be able to have as long a lifetime as it could if it didn’t have to “worry about” random perturbations.</p> <p>And to illustrate this, here’s a typical example of a much longer lifetime obtained by adaptive evolution with the same kind of rule we’ve been using (<em>k</em> = 4, <em>r</em> = 1 cellular automaton), but now with no perturbations and with fitness being given purely by the unperturbed lifetime (exactly as in my recent work on biological evolution):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg6.png' alt='' title='' width='579' height='622'> </div> </p></div> <p>OK, so given that we’re evolving with a lifetime-under-perturbation fitness function, what are some alternatives to our particular model organism? 
Here are a few examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg7.png' alt='' title='' width='595' height='512'> </div> </p></div> <p>At an overall level, these seem to react to perturbations much like our original model organism: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg8.png' alt='' title='' width='660' height='670'> </div> </p></div> <p>One notable feature here, though, is that there seems to be a tendency for simpler overall behavior to be less disrupted by perturbations. In other words, our idealized “diseases” seem to have less dramatic effects on “simpler” idealized organisms. And we can see a reflection of this phenomenon if we plot the overall (single-perturbation) lifetime distributions for the four rules above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01312025evolutionimg9.png' alt='' title='' width='636' height='72'> </div> </p></div> <p>But despite detailed differences, the main conclusion seems to be that there’s nothing special about the particular model organism we’ve used—and that if we repeated our whole analysis for different model organisms (i.e. “different idealized species”) the results we’d get would be very much the same. </p> <h2 id="what-it-means-and-where-to-go-from-here">What It Means and Where to Go from Here</h2> <p>So what does all this mean? 
At the outset, it wasn’t clear there’d be a way to usefully capture anything about the foundations of medicine in a formalized theoretical way. But in fact what we’ve found is that even the very simple computational model we’ve studied seems to successfully reflect all sorts of features of what we see in medicine. Many of the fundamental effects and phenomena are, it seems, not the result of details of biomedicine, but instead are at their core purely abstract and computational—and therefore accessible to formalized theory and metamodeling. This kind of methodology is very different from what’s been traditional in medicine—and isn’t likely to lead directly to specific practical medicine. But what it can do is to help us develop powerful new general intuition and ways of reasoning—and ultimately an understanding of the conceptual foundations of what’s going on.</p> <p>At the heart of much of what we’ve seen is the very fundamental—and ubiquitous—phenomenon of computational irreducibility. I’ve <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">argued recently</a> that computational irreducibility is central to what makes biological evolution work—and that it’s inevitably imprinted on the core “computational architecture” of biological organisms. And it’s this computational irreducibility that inexorably leads to much of the complexity we see so ubiquitously in medicine. Can we expect to find a simple narrative explanation for the consequences of some perturbation to an organism? In general, no—because of computational irreducibility. There are always pockets of computational reducibility, but in general we can have no expectation that, for example, we’ll be able to describe the effects of different perturbations by neatly classifying them into a certain set of distinct “diseases”. 
</p> <p>To a large extent the core mission of medicine is about “treating diseases”, or in our terms, about remediating or reversing the effects of perturbations. And once again, computational irreducibility implies there’s inevitably a certain fundamental difficulty in doing this. It’s a bit like with the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">Second Law of thermodynamics</a>, where there’s enough computational irreducibility in microscopic molecular dynamics that to seriously reverse—or outpredict—this dynamics is something that’s at least far out of range for computationally bounded observers like us. And in our medical setting the analog of that is that “computationally bounded interventions” can only systematically lead to medical successes insofar as they tap into pockets of computational reducibility. And insofar as they are exposed to overall computational irreducibility they will inevitably seem to show a certain amount of apparent randomness in their outcomes. </p> <p>In traditional approaches to medicine one ultimately tends to “give in to the randomness” and go no further than to assign probabilities to things. But an important feature of what we’ve done here is that in our idealized computational models we can always explicitly see what’s happening inside. Often—largely as a consequence of computational irreducibility—it’s complicated. But the fact that we can see it gives us the opportunity to get much more clarity about the fundamental mechanisms involved. And if we end up summarizing what happens by giving probabilities and doing statistics it’s because this is something we’re choosing to do, not something we’re forced to do because of our lack of knowledge of the systems we’re studying. </p> <p>There’s much to do in our effort to explore the computational foundations of medicine. But already there are some implications that are beginning to emerge. 
Much of the workflow of medicine today is based on classifying things that can go wrong into discrete diseases. But what we’ve seen here (which is hardly surprising given practical experience with medicine) is that when one looks at the details, a huge diversity of things can happen—whose characteristics and outcomes can’t really be binned neatly into discrete “diseases”. </p> <p>And indeed when we try to figure out “treatments” the details matter. As a first approximation, we might base our treatments on coarse graining into discrete diseases. But—as the approach I’ve outlined here can potentially help us analyze—the more we can directly go from detailed measurements to detailed treatments (through computation, machine learning, etc.), the more promising it’s likely to be. Not that it’s easy. Because in a sense we’re trying to <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#the-hard-limit-of-computational-irreducibility">beat computational irreducibility</a>—with computationally bounded measurements and interventions.</p> <p>In principle one can imagine a future in which our efforts at treatment have much more computational sophistication (and indeed the immune system presumably already provides an example in nature). We can imagine things like algorithmic drugs and artificial cells that are capable of amounts of computation that are a closer match for the irreducible computation of an organism. And indeed the kind of formalized theory that I’ve outlined here is likely what one needs to begin to get an idea of how such an approach might work. (In the thermodynamic analogy, what we need to do is a bit like reversing entropy increase by sending in large numbers of “smart molecules”.) 
</p> <p>(By the way, seeing how difficult it potentially is to reverse the effects of a perturbation provides all the more impetus to consider “starting from scratch”—as nature does in successive generations of organisms—and simply wholesale regenerating elements of organisms, rather than trying to “fix what’s there”. And, yes, in our models this is for example like starting to grow again from a new seed, and letting the resulting pattern knit itself into the existing one.)</p> <p>One of the important features of operating at the level of computational foundations is that we can expect conclusions we draw to be very general. And we might wonder whether perhaps the framework we’ve described here could be applied outside of medicine. And to some extent I suspect it can—potentially to areas like robustness of large-scale technological and social systems and specifically things like computer security and computer system failures. (And, yes, much as in medicine one can imagine for example “classifying diseases” for computer systems.) But things likely won’t be quite the same in cases like these—because the underlying systems have much more human-determined mechanisms, and less “blind” adaptive evolution. </p> <p>But when it comes to medicine, the very presence of computational irreducibility introduced by biological evolution is what potentially allows one to develop a robust framework in which one can draw conclusions purely on the basis of abstract computational phenomena. Here I’ve just begun to scratch the surface of what’s possible. But I think we’ve already seen enough that we can be confident that medicine is yet another field whose foundations can be seen as fundamentally rooted in the computational paradigm. 
</p> <h2 id="thanks--notes" style='font-size:1.2rem'>Thanks & Notes</h2> <p style='font-size:90%'>Thanks to <a href="https://wolframinstitute.org/" target="_blank" rel="noopener">Wolfram Institute</a> researcher Willem Nielsen for extensive help.</p> <p style='font-size:90%'>I’ve never written anything substantial about medicine before, though I’ve had many interactions with the medical research and biomedical communities over the years—that have gradually extended my knowledge and intuition about medicine. (Thanks particularly to Beatrice Golomb, who over the course of more than forty years has helped me understand more about medical reasoning, often emphasizing “Beatrice’s Law” that “Everything in medicine is more complicated than you can possibly imagine, even taking account of Beatrice’s Law”…)</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2025/02/towards-a-computational-formalization-for-foundations-of-medicine/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item> <title>Launching Version 14.2 of Wolfram Language & Mathematica: Big Data Meets Computation & AI</title> <link>https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/</link> <comments>https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#respond</comments> <pubDate>Thu, 23 Jan 2025 19:00:09 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Data Science]]></category> <category><![CDATA[Mathematica]]></category> <category><![CDATA[New Technology]]></category> <category><![CDATA[Wolfram Language]]></category> <category><![CDATA[Recent Release]]></category> <category><![CDATA[Version Release]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=66116</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" 
src="https://content.wolfram.com/sites/43/2025/01/icon-14.2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>The Drumbeat of Releases Continues… Notebook Assistant Chat inside Any Notebook Bring Us Your Gigabytes! Introducing Tabular Manipulating Data in Tabular Getting Data into Tabular Cleaning Data for Tabular The Structure of Tabular Tabular Everywhere Algebra with Symbolic Arrays Language Tune-Ups Brightening Our Colors; Spiffing Up for 2025 LLM Streamlining & Streaming Streamlining Parallel Computation: […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2025/01/icon-14.2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><style type="text/css"> #blog a.bottomstripe { max-width: 100%; } #blog .post_content .inline-table-of-contents { background-color: #f6fcff; border: solid 1px #bbdbe8; padding: 24px 20px 10px 20px; } #blog .post_content .inline-table-of-contents div a { color: #19749a; font-size: 14px; line-height: 1.25; margin-bottom: 12px; } #blog .post_content .inline-table-of-contents div a:active, #blog .post_content .inline-table-of-contents div a:hover { color:#D76A00; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1) { width: 0; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1)::before { color: #85b7cb; content: '\25FC'; font-size: 10px; position: relative; top: -2px; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1):after { margin: 0; opacity: 0; } #blog .post_content .inline-table-of-contents div a span:nth-of-type(2) { font-weight: 400; } #blog .post_content .inline-table-of-contents div.left { padding-right: 0.75rem; } #blog .post_content .inline-table-of-contents div.right { padding-left: 0.75rem; } </style> <div class="inline-table-of-contents"> <div class="grid cols-2 heirs-width-1-2 cols-1__600 heirs-width-full__600"> 
<div class="left"> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#the-drumbeat-of-releases-continues"><span></span><span>The Drumbeat of Releases Continues…</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#notebook-assistant-chat-inside-any-notebook"><span></span><span>Notebook Assistant Chat inside Any Notebook</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#bring-us-your-gigabytes-introducing-tabular"><span></span><span>Bring Us Your Gigabytes! Introducing Tabular</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#manipulating-data-in-tabular"><span></span><span>Manipulating Data in Tabular</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#getting-data-into-tabular"><span></span><span>Getting Data into Tabular</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#cleaning-data-for-tabular"><span></span><span>Cleaning Data for Tabular</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#the-structure-of-tabular"><span></span><span>The Structure of Tabular</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#tabular-everywhere"><span></span><span>Tabular Everywhere</a></div> <div><a 
href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#algebra-with-symbolic-arrays"><span></span><span>Algebra with Symbolic Arrays</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#language-tune-ups"><span></span><span>Language Tune-Ups</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#brightening-our-colors-spiffing-up-for-2025"><span></span><span>Brightening Our Colors; Spiffing Up for 2025</a></div> </div> <div class="right"> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#llm-streamlining--streaming"><span></span><span>LLM Streamlining & Streaming</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#streamlining-parallel-computation-launch-all-the-machines"><span></span><span>Streamlining Parallel Computation: Launch All the Machines!</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#follow-that--tracking-in-video"><span></span><span>Follow that ____! 
Tracking in Video</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#game-theory"><span></span><span>Game Theory</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#computing-the-syzygies-and-other-advances-in-astronomy"><span></span><span>Computing the Syzygies, and Other Advances in Astronomy</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#pdes-now-also-for-magnetic-systems"><span></span><span>PDEs Now Also for Magnetic Systems</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#new-features-in-graphics-geometry--graphs"><span></span><span>New Features in Graphics, Geometry & Graphs</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#user-interface-tune-ups"><span></span><span>User Interface Tune-Ups</a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#the-beginnings-of-going-native-on-gpus"><span></span><span>The Beginnings of Going Native on GPUs<br /> </a></div> <div><a href="https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/#and-even-more"><span></span><span>And Even More…</a></div> </div> </div> </div> <h2 id="the-drumbeat-of-releases-continues">The Drumbeat of Releases Continues…</h2> <p>Just under six months ago (176 days ago, to be precise) we <a 
href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/">released Version 14.1</a>. Today I’m pleased to announce that we’re releasing Version 14.2, delivering the latest from our R&D pipeline.</p> <p>This is an exciting time for our technology, both in terms of what we’re now able to implement, and in terms of how our technology is now being used in the world at large. A notable feature of these times is the increasing use of <a href="https://www.wolfram.com/language/">Wolfram Language</a> not only by humans, but also by AIs. And it’s very nice to see that all the effort we’ve put into consistent language design, implementation and documentation over the years is now paying dividends in making Wolfram Language uniquely valuable as a tool for AIs—complementing their own intrinsic capabilities.<span id="more-66116"></span></p> <p>But there’s another angle to AI as well. With our <a href="https://www.wolfram.com/notebook-assistant-llm-kit/">Wolfram Notebook Assistant</a> launched last month we’re using AI technology (plus a lot more) to provide what amounts to a conversational interface to Wolfram Language. As I <a href="https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/">described when we released Wolfram Notebook Assistant</a>, it’s something extremely useful for experts and beginners alike, but ultimately I think its most important consequence will be to accelerate the ability to go from any field X to “computational X”—making use of the whole tower of technology we’ve built around Wolfram Language.</p> <p>So, what’s <a href="https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn142.html">new in 14.2</a>? Under the hood there are changes to make Wolfram Notebook Assistant more efficient and more streamlined. 
But there are also lots of extensions and enhancements to the user-visible parts of the Wolfram Language. In total there are 80 completely new functions—along with 177 functions that have been substantially updated.</p> <p>There are continuations of long-running R&D stories, like additional functionality for video, and additional capabilities around symbolic arrays. Then there are completely new areas of built-in functionality, like game theory. But the largest new development in Version 14.2 is around handling tabular data, and particularly, big tabular data. It’s a whole new subsystem for Wolfram Language, with powerful consequences throughout the system. We’ve been working on it for quite a few years, and we’re excited to be able to release it for the first time in Version 14.2.</p> <p>Talking of working on new functionality: starting more than seven years ago we pioneered the concept of open software design, <a href="https://livestreams.stephenwolfram.com/">livestreaming our software design meetings</a>. And, for example, since the release of Version 14.1, we’ve done 43 software design livestreams, for a total of 46 hours (I’ve also done 73 hours of other livestreams in that time). Some of the functionality that’s now in Version 14.2 we started work on quite a few years ago. But we’ve been livestreaming long enough that pretty much anything that’s now in Version 14.2 we designed live and in public on a livestream at some time or another. It’s hard work doing software design (as you can tell if you watch the livestreams). But it’s always exciting to see those efforts come to fruition in the system we’ve been progressively building for so long. 
And so, today, it’s a pleasure to be able to release Version 14.2 and to let everyone use the things we’ve been working so hard to build.</p> <h2 id="notebook-assistant-chat-inside-any-notebook">Notebook Assistant Chat inside Any Notebook</h2> <p>Last month we released the <a href="https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/">Wolfram Notebook Assistant</a> to “turn words into computation”—and help experts and novices alike make broader and deeper use of <a href="https://www.wolfram.com/language/">Wolfram Language</a> technology. In <a href="https://reference.wolfram.com/legacy/language/v14.1/">Version 14.1</a> the primary way to use Notebook Assistant is through the separate “side chat” Notebook Assistant window. But in Version 14.2 “chat cells” have become a standard feature of any notebook, available to anyone with a <a href="https://www.wolfram.com/notebook-assistant-llm-kit/">Notebook Assistant subscription</a>.</p> <p>Just type <span class="kbd"><kbd>'</kbd></span> as the first character of any cell, and it’ll become a chat cell:</p> <p><img style="margin:6px;" loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225chatimg1a.png' alt='Chat cell' title='Chat cell' width='620' height='42'/></p> <p>Now you can start chatting with the Notebook Assistant:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225chatimg2d2_copy.txt' data-c2c-type='text/html'> <img src='https://content.wolfram.com/sites/43/2025/01/sw012225chatimg2d.png' width='620' height='auto'/> </div> </p></div> <p>With the side chat you have a “separate channel” for communicating with the Notebook Assistant—that won’t, for example, be saved with your notebook. 
With chat cells, your chat becomes an integral part of the notebook.</p> <p>We actually <a href="https://writings.stephenwolfram.com/2023/06/introducing-chat-notebooks-integrating-llms-into-the-notebook-paradigm/">first introduced Chat Notebooks</a> in the middle of 2023—just a few months after the <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">arrival of ChatGPT</a>. Chat Notebooks defined the interface, but at the time, the actual content of chat cells was purely from external LLMs. Now in Version 14.2, chat cells are not limited to separate Chat Notebooks, but are available in any notebook. And by default they make use of the full Notebook Assistant technology stack, which goes far beyond a raw LLM. In addition, once you have a <a href="https://www.wolfram.com/notebook-assistant-llm-kit/">Notebook Assistant + LLM Kit subscription</a>, you can seamlessly use chat cells; no account with external LLM providers is needed. </p> <p>The chat cell functionality in Version 14.2 inherits all the features of Chat Notebooks. For example, typing <span class="kbd"><kbd>~</kbd></span> in a new cell creates a chat break, that lets you start a “new conversation”. And when you use a chat cell, it’s able to see anything in your notebook up to the most recent chat break. (By the way, when you use Notebook Assistant through side chat it can also see what selection you’ve made in your “focus” notebook.)</p> <p>By default, chat cells are “talking” to the Notebook Assistant. But if you want, you can also use them to talk to external LLMs, just like in our original Chat Notebook—and there’s a convenient menu to set that up. Of course, if you’re using an external LLM, you don’t have all the technology that’s now in the Notebook Assistant, and unless you’re doing LLM research, you’ll typically find it much more useful and valuable to use chat cells in their default configuration—talking to the Notebook Assistant. 
</p> <h2 id="bring-us-your-gigabytes-introducing-tabular">Bring Us Your Gigabytes! Introducing Tabular</h2> <p>Lists, associations, datasets. These are very flexible ways to represent structured collections of data in the <a href="https://www.wolfram.com/language/">Wolfram Language</a>. But now in Version 14.2 there’s another: <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. <tt>Tabular</tt> provides a very streamlined and efficient way to handle tables of data laid out in rows and columns. And when we say “efficient” we mean that it can routinely juggle gigabytes of data or more, both in core and out of core. </p> <p>Let’s do an example. Let’s start off by importing some tabular data:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg1.png' alt='' title='' width='734' height='303'> </div> </p></div> <p>This is data on trees in New York City, 683,788 of them, each with 45 properties (sometimes missing). <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> introduces a variety of new ideas. One of them is treating tabular columns much like variables. 
Here we’re using this to make a histogram of the values of the <tt>"tree_dbh"</tt> column in this <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg2.png' alt='' title='' width='403' height='237'> </div> </p></div> <p>You can think of a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> as being like an optimized form of a list of associations, where each row consists of an association whose keys are column names. Functions like <tt><a href="http://reference.wolfram.com/language/ref/Select.html">Select</a></tt> then just work on <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg3.png' alt='' title='' width='737' height='302'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/Length.html">Length</a></tt> gives the number of rows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg4.png' alt='' title='' width='121' height='43'> </div> </p></div> <p><a href="http://reference.wolfram.com/language/ref/CountsBy.html"><tt>CountsBy</tt></a> treats the <a href="http://reference.wolfram.com/language/ref/Tabular.html"><tt>Tabular</tt></a> as a list of associations, extracting the value associated with the key <tt>"spc_latin"</tt> (“Latin species”) in each association, and counting how many times that value occurs 
(<tt>"spc_latin"</tt> here is short for <tt><a href="https://reference.wolfram.com/language/ref/Slot.html">#</a>"spc_latin"<a href="https://reference.wolfram.com/language/ref/Function.html">&</a></tt>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/02/sw01202025gigabytesimg5c.png' alt='' title='' width='655' height='175'> </div> </p></div> <p>To get the names of the columns we can use the new function <tt><a href="http://reference.wolfram.com/language/ref/ColumnKeys.html">ColumnKeys</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg6.png' alt='' title='' width='634' height='94'> </div> </p></div> <p>Viewing <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> as being like a list of associations we can extract parts—giving first a specification of rows, and then a specification of columns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg7.png' alt='' title='' width='434' height='162'> </div> </p></div> <p>There are lots of new operations that we’ve been able to introduce now that we have <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. 
An example is <a href="https://reference.wolfram.com/language/ref/AggregateRows.html"><tt>AggregateRows</tt></a>, which constructs a new <tt>Tabular</tt> from a given <tt>Tabular</tt> by aggregating groups of rows, in this case ones with the same value of <tt>"spc_latin"</tt>, and then applying a function to those rows, in this case finding the mean value of <tt>"tree_dbh"</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg8.png' alt='' title='' width='652' height='302'> </div> </p></div> <p>An operation like <tt><a href="http://reference.wolfram.com/language/ref/ReverseSortBy.html">ReverseSortBy</a></tt> then “just works” on this table, here reverse sorting by the value of <tt>"meandbh"</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg9.png' alt='' title='' width='394' height='302'> </div> </p></div> <p>Here we’re making an ordinary matrix out of a small slice of data from our <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg10.png' alt='' title='' width='600' height='115'> </div> </p></div> <p>And now we can plot the result, giving the positions of Virginia pine trees in New York City:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg11.png' alt='' title='' width='260' height='223'> </div> </p></div> <p>When should you use a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>, rather than, say, a <tt><a href="http://reference.wolfram.com/language/ref/Dataset.html">Dataset</a></tt>? <tt>Tabular</tt> is specifically set up for data that is arranged in rows and columns—and it supports many powerful operations that make sense for data in this “rectangular” form. <tt>Dataset</tt> is more general; it can have an arbitrary hierarchy of data dimensions, and so can’t in general support all the “rectangular” data operations of <tt>Tabular</tt>. In addition, by being specialized for “rectangular” data, <tt>Tabular</tt> can also be much more efficient, and indeed we’re making use of the latest type-specific methods for large-scale data handling. </p> <p>If you use <tt><a href="http://reference.wolfram.com/language/ref/TabularStructure.html">TabularStructure</a></tt> you can see some of what lets <tt>Tabular</tt> be so efficient. Every column is treated as data of a specific type (and, yes, the types are consistent with the ones in the Wolfram Language compiler). And there’s streamlined treatment of missing data (with several new functions added specifically to handle this):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg12.png' alt='' title='' width='668' height='302'> </div> </p></div> <p>What we’ve seen so far is <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> operating with “in-core” data. 
But you can quite transparently also use <tt>Tabular</tt> on out-of-core data, for example data stored in a relational database. </p> <p>Here’s an example of what this looks like: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg13.png' alt='' title='' width='486' height='80'> </div> </p></div> <p>It’s a tabular that points to a table in a relational database. It doesn’t by default explicitly display the data in the <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> (and in fact it doesn’t even get it into memory—because it might be huge and might be changing quickly as well). But you can still specify operations just like on any other <tt>Tabular</tt>. This finds out what columns are there:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg14.png' alt='' title='' width='649' height='68'> </div> </p></div> <p>And this specifies an operation, giving the result as a symbolic out-of-core <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> object:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg15.png' alt='' title='' width='398' height='75'> </div> </p></div> <p>You can “resolve” this, and get an explicit in-memory <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> using <tt><a 
href="http://reference.wolfram.com/language/ref/ToMemory.html">ToMemory</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025gigabytesimg16.png' alt='' title='' width='737' height='177'> </div> </p></div> <h2 id="manipulating-data-in-tabular">Manipulating Data in Tabular</h2> <p>Let’s say you’ve got a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>—like this one based on penguins:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg1.png' alt='' title='' width='737' height='302'> </div> </p></div> <p>There are lots of operations you can do that manipulate the data in this <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> in a structured way—giving you back another <tt>Tabular</tt>. 
For example, you could just take the last 2 rows of the <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg2.png' alt='' title='' width='737' height='153'> </div> </p></div> <p>Or you could sample 3 random rows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg3.png' alt='' title='' width='737' height='187'> </div> </p></div> <p>Other operations depend on the actual content of the <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. And because you can treat each row like an association, you can set up functions that effectively refer to elements by their column names:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg4.png' alt='' title='' width='737' height='187'> </div> </p></div> <p>Note that we can always use <tt><a href="http://reference.wolfram.com/language/ref/Slot.html">#</a></tt><tt>[</tt><em>name</em><tt>]</tt> to refer to elements in a column. If <em>name</em> is an alphanumeric string then we can also use the shorthand <tt>#</tt><em>name</em>. And for other strings, we can use <tt>#"</tt><em>name</em><tt>"</tt>. 
Some functions let you just use <tt>"</tt><em>name</em><tt>"</tt> to indicate the function <tt>#["</tt><em>name</em><tt>"]</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg5.png' alt='' title='' width='737' height='302'> </div> </p></div> <p>So far we’ve talked only about arranging or selecting rows in a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. What about columns? Here’s how we can construct a tabular that has just two of the columns from our original <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg6.png' alt='' title='' width='473' height='303'> </div> </p></div> <p>What if we don’t just want existing columns, but instead want new columns that are functions of these? 
<tt><a href="http://reference.wolfram.com/language/ref/ConstructColumns.html">ConstructColumns</a></tt> lets us define new columns, giving their names and the functions to be used to compute values in them:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg7.png' alt='' title='' width='479' height='350'> </div> </p></div> <p>(Note the trick of writing out <tt><a href="http://reference.wolfram.com/language/ref/Function.html">Function</a></tt> to avoid having to put parentheses, as in <nobr><tt>"species"<img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' >(</tt><tt><a href="http://reference.wolfram.com/language/ref/StringTake.html">StringTake</a></tt><tt>[<a href="https://reference.wolfram.com/language/ref/Slot.html">#</a>species,1]</tt><tt><a href="http://reference.wolfram.com/language/ref/Function.html">&</a></tt><tt>)</tt>.)</nobr></p> <p><tt><a href="http://reference.wolfram.com/language/ref/ConstructColumns.html">ConstructColumns</a></tt> lets you take an existing <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> and construct a new one. 
<tt><a href="http://reference.wolfram.com/language/ref/TransformColumns.html">TransformColumns</a></tt> lets you transform columns in an existing <tt>Tabular</tt>, here replacing species names by their first letters:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg8.png' alt='' title='' width='737' height='302'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/TransformColumns.html">TransformColumns</a></tt> also lets you add new columns, specifying the content of the columns just like in <tt><a href="http://reference.wolfram.com/language/ref/ConstructColumns.html">ConstructColumns</a></tt>. But where does <tt>TransformColumns</tt> put your new columns? By default, they go at the end, after all existing columns. But if you specifically list an existing column, that’ll be used as a marker to determine where to put the new column (<tt>"</tt><em>name</em><tt>"<img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' ></tt><tt><a href="http://reference.wolfram.com/language/ref/Nothing.html">Nothing</a></tt> removes a column):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg9.png' alt='' title='' width='737' height='350'> </div> </p></div> <p>Everything we’ve seen so far operates separately on each row of a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. 
But what if we want to “gulp in” a whole column to use in our computation—say, for example, computing the mean of a whole column, then subtracting it from each value? <tt><a href="http://reference.wolfram.com/language/ref/ColumnwiseValue.html">ColumnwiseValue</a></tt> lets you do this, by supplying to the function (here <tt><a href="http://reference.wolfram.com/language/ref/Mean.html">Mean</a></tt>) a list of all the values in whatever column or columns you specify:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg10.png' alt='' title='' width='668' height='350'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/ColumnwiseValue.html">ColumnwiseValue</a></tt> effectively lets you compute a scalar value by applying a function to a whole column. There’s also <tt><a href="http://reference.wolfram.com/language/ref/ColumnwiseThread.html">ColumnwiseThread</a></tt>, which lets you compute a list of values that will in effect be “threaded” into a column. 
Here we’re creating a column from a list of accumulated values: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg11.png' alt='' title='' width='610' height='350'> </div> </p></div> <p>By the way, as we’ll discuss below, if you’ve externally generated a list of values (of the right length) that you want to use as a column, you can do that directly by using <tt><a href="http://reference.wolfram.com/language/ref/InsertColumns.html">InsertColumns</a></tt>.</p> <p>There’s another concept that’s very useful in practice in working with tabular data, and that’s grouping. In our penguin data, we’ve got an individual row for each penguin of each species. But what if we want instead to aggregate all the penguins of a given species, for example computing their average body mass? Well, we can do this with <tt><a href="http://reference.wolfram.com/language/ref/AggregateRows.html">AggregateRows</a></tt>. <tt>AggregateRows</tt> works like <tt><a href="http://reference.wolfram.com/language/ref/ConstructColumns.html">ConstructColumns</a></tt> in the sense that you specify columns and their contents. But unlike <tt>ConstructColumns</tt> it creates new “aggregated” rows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg12.png' alt='' title='' width='635' height='172'> </div> </p></div> <p>What is that first column here? The gray background of its entries indicates that it’s what we call a “key column”: a column whose entries (perhaps together with other key columns) can be used to reference rows. 
And later, we’ll see how you can use <tt><a href="http://reference.wolfram.com/language/ref/RowKey.html">RowKey</a></tt> to indicate a row by giving a value from a key column: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg13.png' alt='' title='' width='266' height='49'> </div> </p></div> <p>But let’s go on with our aggregation efforts. Let’s say that we want to group not just by species, but also by island. Here’s how we can do that with <tt><a href="http://reference.wolfram.com/language/ref/AggregateRows.html">AggregateRows</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg14.png' alt='' title='' width='558' height='263'> </div> </p></div> <p>In a sense what we have here is a table whose rows are specified by pairs of values (here “species” and “island”). But it’s often convenient to “pivot” things so that these values are used respectively for rows and for columns. 
And you can do that with <tt><a href="http://reference.wolfram.com/language/ref/PivotTable.html">PivotTable</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg15.png' alt='' title='' width='598' height='172'> </div> </p></div> <p>Note the —’s, which indicate missing values; apparently there are no Gentoo penguins on Dream island, etc.</p> <p><tt><a href="http://reference.wolfram.com/language/ref/PivotTable.html">PivotTable</a></tt> normally gives exactly the same data as <tt><a href="http://reference.wolfram.com/language/ref/AggregateRows.html">AggregateRows</a></tt>, but in a rearranged form. One additional feature of <tt>PivotTable</tt> is the option <tt><a href="http://reference.wolfram.com/language/ref/IncludeGroupAggregates.html">IncludeGroupAggregates</a></tt> which includes <tt><a href="http://reference.wolfram.com/language/ref/All.html">All</a></tt> entries that aggregate across each type of group: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg16.png' alt='' title='' width='595' height='229'> </div> </p></div> <p>If you have multiple functions that you’re computing, <tt><a href="http://reference.wolfram.com/language/ref/AggregateRows.html">AggregateRows</a></tt> will just give them as separate columns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg17.png' alt='' title='' width='574' height='263'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/PivotTable.html">PivotTable</a></tt> can also deal with multiple functions—by creating columns with “extended keys”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg18.png' alt='' title='' width='672' height='226'> </div> </p></div> <p>And now you can use <tt><a href="http://reference.wolfram.com/language/ref/RowKey.html">RowKey</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/ExtendedKey.html">ExtendedKey</a></tt> to refer to elements of the resulting <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01202025manipulatingimg19.png' alt='' title='' width='455' height='49'> </div> </p></div> <h2 id="getting-data-into-tabular">Getting Data into Tabular</h2> <p>We’ve seen some of the things you can do when you have data as a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. But how does one get data into a <tt>Tabular</tt>? There are several ways. The first is just to convert from structures like lists and associations. The second is to import from a file, say a CSV or XLSX (or, for larger amounts of data, Parquet)—or from an external data store (S3, Dropbox, etc.). And the third is to connect to a database. 
You can also get data for <tt>Tabular</tt> directly from the <a href="https://www.wolfram.com/language/core-areas/knowledgebase/">Wolfram Knowledgebase</a> or from the <a href="https://datarepository.wolframcloud.com/">Wolfram Data Repository</a>.</p> <p>Here’s how you can convert a list of lists into a <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg1a.png' alt='' title='' width='270' height='129'> </div> </p></div> <p>And here’s how you can convert back:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg2.png' alt='' title='' width='178' height='44'> </div> </p></div> <p>It works with sparse arrays too, here instantly creating a million-row <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg3A.png' alt='' title='' width='604' height='210'> </div> </p></div> <p>that takes 80 MB to store:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg4.png' alt='' title='' width='146' height='42'> </div> </p></div> <p>Here’s what happens with a list of associations:</p> <div> <div 
class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg5.png' alt='' title='' width='434' height='129'> </div> </p></div> <p>You can get the same <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> by entering its data and its column names separately:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg6.png' alt='' title='' width='325' height='129'> </div> </p></div> <p>By the way, you can convert a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> to a <tt><a href="http://reference.wolfram.com/language/ref/Dataset.html">Dataset</a></tt></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg7.png' alt='' title='' width='128' height='109'> </div> </p></div> <p>and in this simple case you can convert it back to a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> too:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg8.png' alt='' title='' width='195' height='129'> </div> </p></div> <p>In general, though, there are all sorts of options for how to convert lists, datasets, etc. 
to <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> objects—and <tt><a href="http://reference.wolfram.com/language/ref/ToTabular.html">ToTabular</a></tt> is set up to let you control these. For example, you can use <tt>ToTabular</tt> to create a <tt>Tabular</tt> from columns rather than rows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg9.png' alt='' title='' width='470' height='162'> </div> </p></div> <p>How about external data? In Version 14.2 <tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> now supports a <tt>"Tabular"</tt> element for tabular data formats. So, for example, given a CSV file</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg10.png' alt='CSV file' title='CSV file' width='464' height='110'/></p> <p><tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> can immediately import it as a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg11.png' alt='' title='' width='810' height='211'> </div> </p></div> <p>This works very efficiently even for huge CSV files with millions of entries. It also does well at automatically identifying column names and headers. The same kind of thing works with more structured files, like ones from spreadsheets and statistical data formats. 
And it also works with modern columnar storage formats like Parquet, ORC and Arrow.</p> <p><tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> transparently handles both ordinary files, and URLs (and URIs), requesting authentication if needed. In Version 14.2 we’re adding the new concept of <tt><a href="http://reference.wolfram.com/language/ref/DataConnectionObject.html">DataConnectionObject</a></tt>, which provides a symbolic representation of remote data, essentially encapsulating all the details of how to get the data. So, for example, here’s a <tt>DataConnectionObject</tt> for an S3 bucket, whose contents we can immediately import:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg12.png' alt='' title='' width='737' height='237'> </div> </p></div> <p>(In Version 14.2 we’re supporting Amazon S3, Azure Blob Storage, Dropbox, IPFS—with many more to come. And we’re also planning support for data warehouse connections, APIs, etc.)</p> <p>But what about data that’s too big—or too fast-changing—to make sense to explicitly import? An important feature of <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> (mentioned above) is that it can transparently handle external data, for example in relational databases. 
</p> <p>Here’s a reference to a large external database:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg13.png' alt='RelationalDatabase' title='RelationalDatabase' width='511' height='80'></p> <p>This defines a <tt>Tabular</tt> that points to a table in the external database:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg14.png' alt='tab = Tabular' title='tab = Tabular' width='416' height='75'></p> <p>We can ask for the dimensions of the <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>—and we see that it has 158 million rows:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg15.png' alt='Dimensions' title='Dimensions' width='164' height='44'></p> <p>The table we’re looking at happens to be all the line-oriented data in <a href="https://www.openstreetmap.org" target="_blank" rel="noopener">OpenStreetMap</a>. Here are the first 3 rows and 10 columns:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg16.png' alt='ToMemory' title='ToMemory' width='737' height='177'> </p> <p>Most operations on the <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> will now actually get done in the external database. Here we’re asking to select rows whose “name” field contains <tt>"Wolfram"</tt>:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg17.png' alt='Select' title='Select' width='424' height='76'> </p> <p>The actual computation is only done when we use <tt><a href="http://reference.wolfram.com/language/ref/ToMemory.html">ToMemory</a></tt>, and in this case (because there’s a lot of data in the database) it takes a little while. 
But soon we get the result, as a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg18.png' alt='ToMemory' title='ToMemory' width='497' height='302'> </p> <p>And we learn that there are 58 Wolfram-named items in the database:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg19.png' alt='Length' title='Length' width='121' height='43'> </p> <p>Another source of data for <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> is the built-in Wolfram Knowledgebase. In Version 14.2 <tt><a href="http://reference.wolfram.com/language/ref/EntityValue.html">EntityValue</a></tt> supports direct output in <tt>Tabular</tt> form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg20.png' alt='' title='' width='577' height='313'> </div> </p></div> <p>The Wolfram Knowledgebase provides lots of good examples of data for <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>. And the same is true of the Wolfram Data Repository—where you can typically just apply <tt>Tabular</tt> to get data in <tt>Tabular</tt> form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025gettingimg21.png' alt='' title='' width='733' height='254'> </div> </p></div> <h2 id="cleaning-data-for-tabular">Cleaning Data for Tabular</h2> <p>In many ways it’s the bane of data science. Yes, data is in digital form. 
But it’s not clean; it’s not computable. The Wolfram Language has long been a uniquely powerful tool for flexibly cleaning data (and, for example, for advancing through the <a href="https://writings.stephenwolfram.com/2017/04/launching-the-wolfram-data-repository-data-publishing-that-really-works/#the-data-curation-hierarchy">ten levels of making data computable</a> that I defined some years ago). </p> <p>But now, in Version 14.2, with <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>, we have a whole new collection of streamlined capabilities for cleaning data. Let’s start by importing some data “from the wild” (and, actually, this example is cleaner than many):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg1.png' alt='' title='' width='737' height='201'> </div> </p></div> <p>(By the way, if there was really crazy stuff in the file, we might have wanted to use the option <tt><a href="http://reference.wolfram.com/language/ref/MissingValuePattern.html">MissingValuePattern</a></tt> to specify a pattern that would just immediately replace the crazy stuff with <tt><a href="http://reference.wolfram.com/language/ref/Missing.html">Missing</a></tt><tt>[</tt>…<tt>]</tt>.)</p> <p>OK, but let’s start by surveying what came in here from our file, using <tt><a href="http://reference.wolfram.com/language/ref/TabularStructure.html">TabularStructure</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg2.png' alt='' title='' width='592' height='242'> </div> </p></div> <p>We see that <tt><a 
href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> successfully managed to identify the basic type of data in most of the columns—though for example it can’t tell if numbers are just numbers or are representing quantities with units, etc. And it also identifies that some number of entries in some columns are “missing”. </p> <p>As a first step in data cleaning, let’s get rid of what seems like an irrelevant <tt>"id"</tt> column:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg3.png' alt='' title='' width='737' height='195'> </div> </p></div> <p>Next, we see that the elements in the first column are being identified as strings—but they’re really dates, and they should be combined with the times in the second column. We can do this with <tt><a href="http://reference.wolfram.com/language/ref/TransformColumns.html">TransformColumns</a></tt>, removing what’s now an “extra column” by replacing it with <tt><a href="http://reference.wolfram.com/language/ref/Nothing.html">Nothing</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg4a.png' alt='' title='' width='732' height='195'> </div> </p></div> <p>Looking at the various numerical columns, we see that they’re really quantities that should have units. 
But first, for convenience, let’s rename the last two columns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg5.png' alt='' title='' width='674' height='147'> </div> </p></div> <p>Now let’s turn the numerical columns into columns of quantities with units, and, while we’re at it, also convert from °C to °F:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg6a.png' alt='' title='' width='684' height='283'> </div> </p></div> <p>Here’s how we can now plot the temperature as a function of time:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg7.png' alt='' title='' width='376' height='233'> </div> </p></div> <p>There’s a lot of wiggling there. And looking at the data we see that we’re getting temperature values from several different weather stations. This selects data from a single station:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg8.png' alt='' title='' width='613' height='252'> </div> </p></div> <p>What’s the break in the curve? 
If we just scroll to that part of the tabular we’ll see that it’s because of missing data:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg9a.png' alt='' title='' width='684' height='176'> </div> </p></div> <p>So what can we do about this? Well, there’s a powerful function <tt><a href="http://reference.wolfram.com/language/ref/TransformMissing.html">TransformMissing</a></tt> that provides many options. Here we’re asking it to interpolate to fill in missing temperature values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg10a.png' alt='' title='' width='684' height='290'> </div> </p></div> <p>And now there are no gaps, but, slightly mysteriously, the whole plot extends further:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg11.png' alt='' title='' width='402' height='250'> </div> </p></div> <p>The reason is that it’s interpolating even in cases where basically nothing was measured. 
We can remove those rows using <tt><a href="http://reference.wolfram.com/language/ref/Discard.html">Discard</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg12a.png' alt='' title='' width='737' height='349'> </div> </p></div> <p>And now we won’t have that “overhang” at the end:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg13.png' alt='' title='' width='618' height='252'> </div> </p></div> <p>Sometimes there’ll explicitly be data that’s missing; sometimes (more insidiously) the data will just be wrong. Let’s look at the histogram of pressure values for our data:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg14.png' alt='' title='' width='296' height='187'> </div> </p></div> <p>Oops. What are those small values? Presumably they’re wrong. (Perhaps they were transcription errors?) We can remove such “anomalous” values by using <tt><a href="http://reference.wolfram.com/language/ref/TransformAnomalies.html">TransformAnomalies</a></tt>. 
Here we’re telling it to just completely trim out any row where the pressure was “anomalous”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg15.png' alt='' title='' width='592' height='170'> </div> </p></div> <p>We can also get <tt><a href="http://reference.wolfram.com/language/ref/TransformAnomalies.html">TransformAnomalies</a></tt> to try to “fix” the data. Here we’re just replacing any anomalous pressure by the previous pressure listed in the tabular:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg16.png' alt='' title='' width='622' height='214'> </div> </p></div> <p>You can also tell <tt><a href="http://reference.wolfram.com/language/ref/TransformAnomalies.html">TransformAnomalies</a></tt> to “flag” any anomalous value and make it “missing”. But if we’ve got missing values, what happens when we try to do computations on them? That’s where <tt><a href="http://reference.wolfram.com/language/ref/MissingFallback.html">MissingFallback</a></tt> comes in. It’s fundamentally a very simple function—that just returns its first non-missing argument:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg17.png' alt='' title='' width='586' height='44'> </div> </p></div> <p>But even though it’s simple, it’s important in making it easy to handle missing values. 
So, for example, this computes a “northspeed”, falling back to 0 if data needed for the computation is missing:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225cleaningdataimg18a.png' alt='' title='' width='583' height='163'> </div> </p></div> <h2 id="the-structure-of-tabular">The Structure of Tabular</h2> <p>We’ve said that a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> is “like” a list of associations. And, indeed, if you apply <tt><a href="http://reference.wolfram.com/language/ref/Normal.html">Normal</a></tt> to it, that’s what you’ll get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg1.png' alt='' title='' width='361' height='116'> </div> </p></div> <p>But internally <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> is stored in a much more compact and efficient way. And it’s useful to know something about this, so you can manipulate <tt>Tabular</tt> objects without having to “take them apart” into things like lists and associations. Here’s our basic sample <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg2.png' alt='' title='' width='280' height='86'> </div> </p></div> <p>What happens if we extract a row? 
Well, we get a <tt><a href="http://reference.wolfram.com/language/ref/TabularRow.html">TabularRow</a></tt> object:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg3.png' alt='' title='' width='260' height='75'> </div> </p></div> <p>If we apply <tt><a href="http://reference.wolfram.com/language/ref/Normal.html">Normal</a></tt>, we get an association:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg4.png' alt='' title='' width='187' height='44'> </div> </p></div> <p>Here’s what happens if we instead extract a column:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg5.png' alt='' title='' width='325' height='75'> </div> </p></div> <p>Now <tt><a href="http://reference.wolfram.com/language/ref/Normal.html">Normal</a></tt> gives a list:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg6.png' alt='' title='' width='125' height='44'> </div> </p></div> <p>We can create a <tt><a href="http://reference.wolfram.com/language/ref/TabularColumn.html">TabularColumn</a></tt> from a list:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg7.png' alt='' title='' width='325' height='75'> </div> </p></div> <p>Now we can use <tt><a href="http://reference.wolfram.com/language/ref/InsertColumns.html">InsertColumns</a></tt> to insert a symbolic column like this into an existing <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> (including the <tt>"b"</tt> tells <tt>InsertColumns</tt> to insert the new column after the “b” column):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg8.png' alt='' title='' width='481' height='129'> </div> </p></div> <p>But what actually is a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> inside? 
Let’s look at the example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureAimg10.png' alt='' title='' width='548' height='142'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/TabularStructure.html">TabularStructure</a></tt> gives us a summary of the internal structure here:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg11.png' alt='' title='' width='621' height='264'> </div> </p></div> <p>The first thing to notice is that everything is stated in terms of columns, reflecting the fact that <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> is a fundamentally column-oriented construct. And part of what makes <tt>Tabular</tt> so efficient is then that within a column everything is uniform, in the sense that all the values are the same type of data. In addition, for things like quantities and dates, we factor the data so that what’s actually stored internally in the column is just a list of numbers, with a single copy of “metadata information” on how to interpret them.</p> <p>And, yes, all this has a big effect. 
Like here’s the size in bytes of our New York trees <tt>Tabular</tt> from above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureZimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg12.png' alt='' title='' width='214' height='48'> </div> </p></div> <p>But if we turn it into a list of associations using <tt><a href="http://reference.wolfram.com/language/ref/Normal.html">Normal</a></tt>, the result is about 14x larger:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureZimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg13.png' alt='' title='' width='281' height='48'> </div> </p></div> <p>OK, but what are those “column types” in the tabular structure? <tt><a href="http://reference.wolfram.com/language/ref/ColumnTypes.html">ColumnTypes</a></tt> gives a list of them:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg14.png' alt='' title='' width='585' height='68'> </div> </p></div> <p>These are low-level types of the kind used in the Wolfram Language compiler. And knowing the type of a column immediately tells us what operations we can do on it. That’s useful both in low-level processing, and in things like knowing what kind of visualization might be possible.</p> <p>When <tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> imports data from something like a CSV file, it tries to infer what type each column is. 
But sometimes (as we mentioned above) you’ll want to “cast” a column to a different type, specifying the “destination type” using a Wolfram Language type description. So, for example, this casts column “b” to a 32-bit real number, and column “c” to units of meters:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureAimg15.png' alt='' title='' width='602' height='172'> </div> </p></div> <p>By the way, when a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> is displayed in a notebook, the column headers indicate the types of data in the corresponding columns. So in this case, there’s a little <img loading='lazy' style="margin-bottom: -4px" src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg16.png' width='20' height='20'/> in the first column to indicate that it contains strings. Numbers and dates basically just “show what they are”. Quantities have their units indicated. And general symbolic expressions (like column “f” here) are indicated with <img loading='lazy' style="margin-bottom: -4px" src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg17.png' width='20' height='20'/>. (If you hover over a column header, it gives you more detail about the types.)</p> <p>The next thing to discuss is missing data. <tt>Tabular</tt> always treats columns as being of a uniform type, but keeps an overall map of where values are missing. 
If you extract the column you’ll see a symbolic <tt><a href="http://reference.wolfram.com/language/ref/Missing.html">Missing</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg18.png' alt='' title='' width='207' height='44'> </div> </p></div> <p>But if you operate on the tabular column directly it’ll just behave as if the missing data is, well, missing:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg19.png' alt='' title='' width='192' height='43'> </div> </p></div> <p>By the way, if you’re bringing in data “from the wild”, <tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> will attempt to automatically infer the right type for each column. It knows how to deal with common anomalies in the input data, like <tt>NaN</tt> or <tt>null</tt> in a column of numbers. But if there are other weird things—like, say, <tt>notfound</tt> in the middle of a column of numbers—you can tell <tt>Import</tt> to turn such things into ordinary missing data by giving them as settings for the option <tt><a href="http://reference.wolfram.com/language/ref/MissingValuePattern.html">MissingValuePattern</a></tt>. </p> <p>There are a couple more subtleties to discuss in connection with the structure of <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> objects. The first is the notion of extended keys. 
Let’s say we have the following <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureAimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg21.png' alt='' title='' width='320' height='145'> </div> </p></div> <p>We can “pivot this to columns” so that the values <em>x</em> and <em>y</em> become column headers, but “under” the overall column header “value”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg22.png' alt='' title='' width='382' height='159'> </div> </p></div> <p>But what is the structure of this <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>? 
We can use <tt><a href="http://reference.wolfram.com/language/ref/ColumnKeys.html">ColumnKeys</a></tt> to find out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg23.png' alt='' title='' width='415' height='44'> </div> </p></div> <p>You can now use these extended keys as indices for the <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg24.png' alt='' title='' width='306' height='43'> </div> </p></div> <p>In this particular case, because the “subkeys” <tt>"x"</tt> and <tt>"y"</tt> are unique, we can just use those, without including the other part of the extended key:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg25.png' alt='' title='' width='132' height='43'> </div> </p></div> <p>Our final subtlety (for now) is somewhat related. It concerns key columns. Normally the way we specify a row in a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> object is just by giving its position. But if the values of a particular column happen to be unique, then we can use those instead to specify a row. 
Consider this <tt>Tabular</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureAimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg27.png' alt='' title='' width='348' height='115'> </div> </p></div> <p>The fruit column has the feature that each entry appears only once—so we can create a <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> that uses this column as a key column:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg28.png' alt='' title='' width='394' height='163'> </div> </p></div> <p>Notice that the numbers for rows have now disappeared, and the key column is indicated with a gray background. 
In this <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>, you can then reference a particular row using for example <tt><a href="http://reference.wolfram.com/language/ref/RowKey.html">RowKey</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg29.png' alt='' title='' width='275' height='43'> </div> </p></div> <p>Equivalently, you can also use an association with the column name:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg30.png' alt='' title='' width='297' height='43'> </div> </p></div> <p>What if the values in a single column are not sufficient to uniquely specify a row, but several columns together are? (In a real-world example, say one column has first names, and another has last names, and another has dates of birth.) 
Well, then you can designate all those columns as key columns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg31.png' alt='' title='' width='484' height='162'> </div> </p></div> <p>And once you’ve done that, you can reference a row by giving the values in all the key columns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025structureimg32.png' alt='' title='' width='283' height='43'> </div> </p></div> <h2 id="tabular-everywhere">Tabular Everywhere</h2> <p><tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> provides an important new way to represent structured data in the Wolfram Language. It’s powerful in its own right, but what makes it even more powerful is how it integrates with all the other capabilities in the Wolfram Language. Many functions just immediately work with <tt>Tabular</tt>. But in Version 14.2 hundreds have been enhanced to make use of the special features of <tt>Tabular</tt>.</p> <p>Most often, it’s to be able to operate directly on columns in a <tt>Tabular</tt>. 
So, for example, given the <tt>Tabular</tt></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg1.png' alt='' title='' width='737' height='194'> </div> </p></div> <p>we can immediately make a visualization based on two of the columns: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg2.png' alt='' title='' width='499' height='299'> </div> </p></div> <p>If one of the columns has categorical data, we’ll recognize that, and plot it accordingly:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg3.png' alt='' title='' width='388' height='208'> </div> </p></div> <p>Another area where <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> can immediately be used is machine learning. 
So, for example, this creates a classifier function that will attempt to determine the species of a penguin from other data about it:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg4.png' alt='' title='' width='434' height='76'> </div> </p></div> <p>Now we can use this classifier function to predict species from other data about a penguin:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg5.png' alt='' title='' width='633' height='89'> </div> </p></div> <p>We can also take the whole <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> and make a feature space plot, labeling with species:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg6.png' alt='' title='' width='579' height='385'> </div> </p></div> <p>Or we could “learn the distribution of possible penguins”</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg7.png' alt='' title='' width='418' height='75'> </div> </p></div> <p>and randomly generate 3 “fictitious penguins” from this distribution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025everywhereimg8.png' alt='' title='' width='737' height='187'> </div> </p></div> <h2 id="algebra-with-symbolic-arrays">Algebra with Symbolic Arrays</h2> <p>One of the major innovations of <a href="https://reference.wolfram.com/legacy/language/v14.1/">Version 14.1</a> was the<a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#symbolic-arrays-and-their-calculus"> introduction of symbolic arrays</a>—and the ability to create expressions involving vector, matrix and array variables, and to take derivatives of them. In Version 14.2 we’re taking the idea of computing with symbolic arrays a step further—for the first time systematically automating what has in the past been the manual process of doing algebra with symbolic arrays, and simplifying expressions involving symbolic arrays.</p> <p>Let’s start by talking about <tt><a href="http://reference.wolfram.com/language/ref/ArrayExpand.html">ArrayExpand</a></tt>. 
Our longstanding function <tt><a href="http://reference.wolfram.com/language/ref/Expand.html">Expand</a></tt> just deals with expanding ordinary multiplication, effectively of scalars—so in this case it does nothing: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg1.png' alt='' title='' width='226' height='44'> </div> </p></div> <p>But in Version 14.2 we also have <tt><a href="http://reference.wolfram.com/language/ref/ArrayExpand.html">ArrayExpand</a></tt> which will do the expansion:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg2.png' alt='' title='' width='263' height='43'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/ArrayExpand.html">ArrayExpand</a></tt> deals with many generalizations of multiplication that aren’t commutative:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg3.png' alt='' title='' width='367' height='44'> </div> </p></div> <p>In an example like this, we really don’t need to know anything about <em>a</em> and <em>b</em>. But sometimes we can’t do the expansion without, for example, knowing their dimensions. 
One way to specify those dimensions is as a condition in <tt><a href="http://reference.wolfram.com/language/ref/ArrayExpand.html">ArrayExpand</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg4.png' alt='' title='' width='487' height='46'> </div> </p></div> <p>An alternative is to use an explicit symbolic array variable:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg5.png' alt='' title='' width='514' height='46'> </div> </p></div> <p>In addition to expanding generalized products using <tt><a href="http://reference.wolfram.com/language/ref/ArrayExpand.html">ArrayExpand</a></tt>, Version 14.2 also supports general simplification of symbolic array expressions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg6.png' alt='' title='' width='480' height='44'> </div> </p></div> <p>The function <tt><a href="http://reference.wolfram.com/language/ref/ArraySimplify.html">ArraySimplify</a></tt> will specifically do simplification on symbolic arrays, while leaving other parts of expressions unchanged. 
Version 14.2 supports many kinds of array simplifications:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg7.png' alt='' title='' width='406' height='46'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg8.png' alt='' title='' width='430' height='43'> </div> </p></div> <p>We could do these simplifications without knowing anything about the dimensions of <em>a</em> and <em>b</em>. But sometimes we can’t go as far without knowing these. For example, if we don’t know the dimensions we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg9.png' alt='' title='' width='252' height='44'> </div> </p></div> <p>But with the dimensions we can explicitly simplify this to an <em>n</em>×<em>n</em> identity matrix:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg10.png' alt='' title='' width='415' height='43'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/ArraySimplify.html">ArraySimplify</a></tt> can also take account of the symmetries of arrays. 
For example, let’s set up a symbolic symmetric matrix:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg11.png' alt='' title='' width='387' height='44'> </div> </p></div> <p>And now <tt><a href="http://reference.wolfram.com/language/ref/ArraySimplify.html">ArraySimplify</a></tt> can immediately resolve this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg12.png' alt='' title='' width='296' height='49'> </div> </p></div> <p>The ability to do algebraic operations on complete arrays in symbolic form is very powerful. But sometimes it’s also important to look at individual components of arrays. And in Version 14.2 we’ve added <tt><a href="http://reference.wolfram.com/language/ref/ComponentExpand.html">ComponentExpand</a></tt> to let you get components of arrays in symbolic form. 
</p> <p>So, for example, this takes a 2-component vector and writes it out as an explicit list with two symbolic components:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg13.png' alt='' title='' width='372' height='47'> </div> </p></div> <p>Underneath, those components are represented using <tt><a href="http://reference.wolfram.com/language/ref/Indexed.html">Indexed</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg14.png' alt='' title='' width='568' height='57'> </div> </p></div> <p>Here’s the determinant of a 3×3 matrix, written out in terms of symbolic components:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg15.png' alt='' title='' width='604' height='44'> </div> </p></div> <p>And here’s a matrix power:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg16.png' alt='' title='' width='568' height='48'> </div> </p></div> <p>Given 3D vectors <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2025/01/sw01212025VectorU.png' alt='' title='' width='13' height='17'> and <img loading='lazy' style="margin-bottom: -2px" 
src='https://content.wolfram.com/sites/43/2025/01/sw01212025VectorV.png' alt='' title='' width='13' height='17'> we can also for example form the cross product</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg17.png' alt='' title='' width='575' height='53'> </div> </p></div> <p>and we can then go ahead and dot it into an inverse matrix:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012125algebraAimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025algebraimg18.png' alt='' title='' width='663' height='240'> </div> </p></div> <h2 id="language-tune-ups">Language Tune-Ups</h2> <p>As a daily user of the <a href="https://www.wolfram.com/language/">Wolfram Language</a> I’m very pleased with how smoothly I find I can translate computational ideas into code. But the more we’ve made it easy to do, the more we can see new places where we can polish the language further. And in Version 14.2—like every version before it—we’ve added a number of “language tune-ups”. </p> <p>A simple one—whose utility becomes particularly clear with <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt>—is <tt><a href="http://reference.wolfram.com/language/ref/Discard.html">Discard</a></tt>. 
You can think of it as a complement to <tt><a href="http://reference.wolfram.com/language/ref/Select.html">Select</a></tt>: it discards elements according to the criterion you specify:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg1.png' alt='' title='' width='484' height='44'> </div> </p></div> <p>And along with adding <tt><a href="http://reference.wolfram.com/language/ref/Discard.html">Discard</a></tt>, we’ve also enhanced <tt><a href="http://reference.wolfram.com/language/ref/Select.html">Select</a></tt>. Normally, <tt>Select</tt> just gives a list of the elements it selects. But in Version 14.2 you can specify other results. Here we’re asking for the “index” (i.e. position) of the elements that <tt><a href="http://reference.wolfram.com/language/ref/NumberQ.html">NumberQ</a></tt> is selecting:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg2.png' alt='' title='' width='378' height='44'> </div> </p></div> <p>Something that can be helpful in dealing with very large amounts of data is getting a bit vector data structure from <tt><a href="http://reference.wolfram.com/language/ref/Select.html">Select</a></tt> (and <tt><a href="http://reference.wolfram.com/language/ref/Discard.html">Discard</a></tt>), that provides a bit mask of which elements are selected or not: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg3.png' alt='' title='' width='397' height='73'> </div> </p></div> <p>By the way, here’s how you can ask for multiple results from <tt><a href="http://reference.wolfram.com/language/ref/Select.html">Select</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/Discard.html">Discard</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg4.png' alt='' title='' width='490' height='44'> </div> </p></div> <p>In talking about <tt><a href="http://reference.wolfram.com/language/ref/Tabular.html">Tabular</a></tt> we already mentioned <tt><a href="http://reference.wolfram.com/language/ref/MissingFallback.html">MissingFallback</a></tt>. Another function related to code robustification and error handling is the new function <tt><a href="http://reference.wolfram.com/language/ref/Failsafe.html">Failsafe</a></tt>. Let’s say you’ve got a list which contains some “failed” elements. If you map a function <tt>f</tt> over that list, it’ll apply itself to the failure elements just as to everything else:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg5.png' alt='' title='' width='385' height='45'> </div> </p></div> <p>But quite possibly <tt>f</tt> wasn’t set up to deal with these kinds of failure inputs. And that’s where <tt><a href="http://reference.wolfram.com/language/ref/Failsafe.html">Failsafe</a></tt> comes in. 
<tt>Failsafe[f][</tt><em>x</em><tt>]</tt> is defined to give <tt>f[</tt><em>x</em><tt>]</tt> if <em>x</em> is not a failure, and to just return the failure if it is. So now we can map <tt>f</tt> across our list with impunity, knowing it’ll never be fed failure input:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg6.png' alt='' title='' width='372' height='45'> </div> </p></div> <p>Talking of tricky error cases, another new function in Version 14.2 is <tt><a href="http://reference.wolfram.com/language/ref/HoldCompleteForm.html">HoldCompleteForm</a></tt>. <tt><a href="http://reference.wolfram.com/language/ref/HoldForm.html">HoldForm</a></tt> lets you display an expression without doing ordinary evaluation of the expression. But—like <tt><a href="http://reference.wolfram.com/language/ref/Hold.html">Hold</a></tt>—it still allows certain transformations to be made. <tt>HoldCompleteForm</tt>—like <tt><a href="http://reference.wolfram.com/language/ref/HoldComplete.html">HoldComplete</a></tt>—prevents all these transformations. 
So while <tt>HoldForm</tt> gets a bit confused here when the sequence “resolves”</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg7.png' alt='' title='' width='259' height='44'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/HoldCompleteForm.html">HoldCompleteForm</a></tt> just completely holds and displays the sequence:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg8.png' alt='' title='' width='325' height='44'> </div> </p></div> <p>Another piece of polish added in Version 14.2 concerns <tt><a href="http://reference.wolfram.com/language/ref/Counts.html">Counts</a></tt>. I often find myself wanting to count elements in a list, including getting 0 when a certain element is missing. 
By default, <tt>Counts</tt> just counts elements that are present:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg9.png' alt='' title='' width='208' height='44'> </div> </p></div> <p>But in Version 14.2 we’ve added a second argument that lets you give a complete list of all the elements you want to count—even if they happen to be absent from the list:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg10.png' alt='' title='' width='317' height='44'> </div> </p></div> <p>As a final example of language tune-up in Version 14.2 I’ll mention <tt><a href="http://reference.wolfram.com/language/ref/AssociationComap.html">AssociationComap</a></tt>. In Version 14.0 we introduced <tt><a href="http://reference.wolfram.com/language/ref/Comap.html">Comap</a></tt> as a “co-” (as in “co-functor”, etc.) 
analog of <tt><a href="http://reference.wolfram.com/language/ref/Map.html">Map</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg11.png' alt='' title='' width='186' height='44'> </div> </p></div> <p>In Version 14.2 we’re introducing <tt><a href="http://reference.wolfram.com/language/ref/AssociationComap.html">AssociationComap</a></tt>—the “co-” version of <tt><a href="http://reference.wolfram.com/language/ref/AssociationMap.html">AssociationMap</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg12.png' alt='' title='' width='265' height='44'> </div> </p></div> <p>Think of it as a nice way to make labeled tables of things, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025languageAimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025languageimg13.png' alt='' title='' width='358' height='69'> </div> </p></div> <h2 id="brightening-our-colors-spiffing-up-for-2025">Brightening Our Colors; Spiffing Up for 2025</h2> <p>In 2014—for <a href="https://reference.wolfram.com/legacy/language/v10/">Version 10.0</a>—we did a <a href="https://writings.stephenwolfram.com/2014/07/launching-mathematica-10-with-700-new-functions-and-a-crazy-amount-of-rd/#:~:text=practical%20multiple%20undo.-,Another%20very,-obvious%20change%20in">major overhaul of the default colors</a> for all our graphics and visualization functions, coming up with what we felt was a good solution. 
(And as we’ve just noticed, somewhat bizarrely, it turned out that in the years that followed, many of the graphics and visualization libraries out there seemed to copy what we did!) Well, a decade has now passed, visual expectations (and display technologies) have changed, and we decided it was time to spiff up our colors for 2025.</p> <p>Here’s what a typical plot looked like in <a href="https://reference.wolfram.com/legacy/language/v10/">Versions 10.0</a> through <a href="https://reference.wolfram.com/legacy/language/v14.1/">14.1</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg1a_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg1.png' alt='' title='' width='318' height='179'> </div> </p></div> <p>And here’s the same plot in Version 14.2:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg2.png' alt='' title='' width='318' height='183'> </div> </p></div> <p>By design, it’s still completely recognizable, but it’s got a little extra zing to it. </p> <p>With more curves, there are more colors. 
Here’s the old version:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg3a_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg3.png' alt='' title='' width='368' height='178'> </div> </p></div> <p>And here’s the new version:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg4.png' alt='' title='' width='368' height='184'> </div> </p></div> <p>Histograms are brighter too. The old:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg5a_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg5.png' alt='' title='' width='458' height='159'> </div> </p></div> <p>And the new:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg6.png' alt='' title='' width='458' height='164'> </div> </p></div> <p>Here’s the comparison between old (“2014”) and new (“2025”) colors:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg7_copy-1.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025colorsimg7a.png' alt='' title='' width='475' height='113'> </div> </p></div> <p>It’s subtle, but it makes a difference. 
I have to say that increasingly over the past few years, I’ve felt I had to tweak the colors in almost every <a href="https://www.wolfram.com/language/">Wolfram Language</a> image I’ve published. But I’m excited to say that with the new colors that urge has gone away—and I can just use our default colors again! </p> <h2 id="llm-streamlining--streaming">LLM Streamlining & Streaming</h2> <p>We <a href="https://writings.stephenwolfram.com/2023/05/the-new-world-of-llm-functions-integrating-llm-technology-into-the-wolfram-language/">first introduced programmatic access to LLMs</a> in <a href="https://www.wolfram.com/language/">Wolfram Language</a> in the middle of 2023, with functions like <tt><a href="http://reference.wolfram.com/language/ref/LLMFunction.html">LLMFunction</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/LLMSynthesize.html">LLMSynthesize</a></tt>. At that time, these functions needed access to external LLM services. But with the release last month of <a href="https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/">LLM Kit (along with Wolfram Notebook Assistant)</a> we’ve made these functions seamlessly available for everyone with a <a href="https://www.wolfram.com/notebook-assistant-llm-kit/">Notebook Assistant + LLM Kit subscription</a>. Once you have your subscription, you can use programmatic LLM functions anywhere and everywhere in Version 14.2 without any further setup. </p> <p>There are also two new functions: <tt><a href="http://reference.wolfram.com/language/ref/LLMSynthesizeSubmit.html">LLMSynthesizeSubmit</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/ChatSubmit.html">ChatSubmit</a></tt>. Both are concerned with letting you get incremental results from LLMs (and, yes, that’s important, at least for now, because LLMs can be quite slow). 
Like <tt><a href="http://reference.wolfram.com/language/ref/CloudSubmit.html">CloudSubmit</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/URLSubmit.html">URLSubmit</a></tt>, <tt>LLMSynthesizeSubmit</tt> and <tt>ChatSubmit</tt> are asynchronous functions: you call them to start a task that will call an appropriate handler function whenever a specified event occurs. </p> <p>Both <tt>LLMSynthesizeSubmit</tt> and <tt>ChatSubmit</tt> support a whole variety of events. An example is <tt>"ContentChunkReceived"</tt>: an event that occurs whenever a chunk of content is received from the LLM. </p> <p>Here’s how one can use that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01252025llmimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01252025llmimg1.png' alt='' title='' width='497' height='159'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/LLMSynthesizeSubmit.html">LLMSynthesizeSubmit</a></tt> returns a <tt><a href="http://reference.wolfram.com/language/ref/TaskObject.html">TaskObject</a></tt>, and then starts to synthesize text in response to the prompt you’ve given, calling the handler function you specified every time a chunk of text comes in. 
After a few moments, the LLM will have finished its process of synthesizing text, and if you ask for the value of <tt>c</tt> you’ll see each of the chunks it produced:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01252025llmimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01252025llmimg2.png' alt='' title='' width='701' height='42'> </div> </p></div> <p>Let’s try this again, but now setting up a dynamic display for a string <tt>s</tt> and then running <tt>LLMSynthesizeSubmit</tt> to accumulate the synthesized text into this string:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01252025llmimg3_copy.txt' data-c2c-type='text/html'> <video width="620" height="300" autoplay muted loop><source src="https://content.wolfram.com/sites/43/2025/01/sw012325llmimg3.mp4" type="video/mp4"></video> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/ChatSubmit.html">ChatSubmit</a></tt> is the analog of <tt><a href="http://reference.wolfram.com/language/ref/ChatEvaluate.html">ChatEvaluate</a></tt>, but asynchronous—and you can use it to create a full chat experience, in which content streams into your notebook as soon as the LLM (or a tool called by the LLM) generates it.</p> <h2 id="streamlining-parallel-computation-launch-all-the-machines">Streamlining Parallel Computation: Launch All the Machines!</h2> <p>For <a href="https://writings.stephenwolfram.com/2008/11/surprise-mathematica-7-0-released-today/">nearly 20 years</a> we’ve had a streamlined capability to do parallel computation in Wolfram Language, using functions like <tt><a href="http://reference.wolfram.com/language/ref/ParallelMap.html">ParallelMap</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/ParallelTable.html">ParallelTable</a></tt> and <tt><a 
href="http://reference.wolfram.com/language/ref/Parallelize.html">Parallelize</a></tt>. The parallel computation can happen on multiple cores on a single machine, or across many machines on a network. (And, for example, in my own current setup I have 7 machines with a total of 204 cores.)</p> <p>In the past few years, partly responding to the increasing number of cores typically available on individual machines, we’ve been progressively streamlining the way that parallel computation is provisioned. And in Version 14.2 we’ve, yes, parallelized the provisioning of parallel computation. Which means, for example, that my 7 machines all start their parallel kernels in parallel—so that the whole process is now finished in a matter of seconds, rather than potentially taking minutes, as it did before:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg1.png' alt='' title='' width='166' height='14'> </div> </p></div> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg2.png' alt='' title='' width='410' height='71'/></p> <p>Another new feature for parallel computation in Version 14.2 is the ability to automatically parallelize across multiple variables in <tt><a href="http://reference.wolfram.com/language/ref/ParallelTable.html">ParallelTable</a></tt>. <tt>ParallelTable</tt> has always had a variety of algorithms for optimizing the way it splits up computations for different kernels. 
Now that’s been extended so that it can deal with multiple variables:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg3.png' alt='' title='' width='598' height='14'> </div> </p></div> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01212025parallelimg4.png' alt='' title='' width='465' height='73'/></p> <p>As someone who very regularly does large-scale computations with the <a href="https://www.wolfram.com/language/">Wolfram Language</a>, it’s hard to overstate how important its seamless parallel computation capabilities have been to me. Usually I’ll first figure out a computation with <tt><a href="http://reference.wolfram.com/language/ref/Map.html">Map</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Table.html">Table</a></tt>, etc. Then when I’m ready to do the full version I’ll swap in <tt><a href="http://reference.wolfram.com/language/ref/ParallelMap.html">ParallelMap</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/ParallelTable.html">ParallelTable</a></tt>, etc. And it’s remarkable how much difference a 200x increase in speed makes (assuming my computation doesn’t have too much communication overhead).</p> <p>(By the way, talking of communication overhead, two new functions in Version 14.2 are <tt><a href="http://reference.wolfram.com/language/ref/ParallelSelect.html">ParallelSelect</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/ParallelCases.html">ParallelCases</a></tt>, which allow you to select and find cases in lists in parallel, saving communication overhead by sending only final results back to the master kernel. 
This functionality has actually been available for a while through <tt><a href="https://reference.wolfram.com/language/ref/Parallelize.html">Parallelize</a>[ </tt>…<tt> <a href="https://reference.wolfram.com/language/ref/Select.html">Select</a>[ </tt>…<tt> ] </tt>…<tt> ]</tt> etc., but it’s streamlined in Version 14.2.)</p> <h2 id="follow-that--tracking-in-video">Follow that ____! Tracking in Video</h2> <p>Let’s say we’ve got a video, for example of people walking through a train station. We’ve had the capability for some time to take a single frame of such a video, and find the people in it. But in Version 14.2 we’ve got something new: the capability to track objects that move around between frames of the video.</p> <p>Let’s start with a video:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg1.png' alt='' title='' width='89' height='auto'><br /> <span><br /> <video controls width="320" height="180"><source src="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg1a.mp4" type="video/mp4"></video> </div> </p></div> <p>We could take an individual frame, and find image bounding boxes. 
But as of Version 14.2 we can just apply <tt><a href="http://reference.wolfram.com/language/ref/ImageBoundingBoxes.html">ImageBoundingBoxes</a></tt> to the whole video at once: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg2.png' alt='' title='' width='468' height='97'> </div> </p></div> <p>Then we can apply the data on bounding boxes to highlight people in the video—using the new <tt><a href="http://reference.wolfram.com/language/ref/HighlightVideo.html">HighlightVideo</a></tt> function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg3b.png' alt='' title='' width='343' height='auto'><br /> <span><br /> <video controls width="640" height="360"><source src="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg3a.mp4" type="video/mp4"></video> </div> </p></div> <p>But this just separately indicates where people are in each frame; it doesn’t connect them from one frame to another. 
In Version 14.2 we’ve added <tt><a href="http://reference.wolfram.com/language/ref/VideoObjectTracking.html">VideoObjectTracking</a></tt> to follow objects between frames:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg5.png' alt='' title='' width='377' height='110'> </div> </p></div> <p>Now if we use <tt><a href="http://reference.wolfram.com/language/ref/HighlightVideo.html">HighlightVideo</a></tt>, different objects will be annotated with different colors:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg6.png' alt='' title='' width='234' height='14'><br /> <span><br /> <video controls width="640" height="360"><source src="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg6a.mp4" type="video/mp4"></video> </div> </p></div> <p>This picks out all the unique objects identified in the course of the video, and counts them:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg8.png' alt='' title='' width='575' height='55'> </div> </p></div> <p>“Where’s the dog?”, you might ask. 
It’s certainly not there for long:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg9.png' alt='' title='' width='390' height='65'> </div> </p></div> <p>And if we find the first frame where it is supposed to appear it does seem as if what’s presumably a person on the lower right has been mistaken for a dog:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg10.png' alt='' title='' width='519' height='208'> </div> </p></div> <p>And, yup, that’s what it thought was a dog:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg11.png' alt='' title='' width='645' height='183'> </div> </p></div> <h2 id="game-theory">Game Theory</h2> <p>“What about game theory?”, people have long asked. And, yes, there has been lots of game theory done with the <a href="https://www.wolfram.com/language/">Wolfram Language</a>, and lots of packages written for particular aspects of it. 
But in Version 14.2 we’re finally introducing built-in system functions for doing game theory (both matrix games and tree games).</p> <p>Here’s how we specify a (zero-sum) 2-player matrix game:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg1.png' alt='' title='' width='347' height='75'> </div> </p></div> <p>This defines payoffs when each player takes each action. We can represent this by a dataset:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg2.png' alt='' title='' width='414' height='113'> </div> </p></div> <p>An alternative is to “plot the game” using <tt><a href="http://reference.wolfram.com/language/ref/MatrixGamePlot.html">MatrixGamePlot</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg3.png' alt='' title='' width='457' height='166'> </div> </p></div> <p>OK, so how can we “solve” this game? In other words, what action should each player take, with what probability, to maximize their average payoff over many instances of the game? (It’s assumed that in each instance the players simultaneously and independently choose their actions.) A “solution” in which no player can increase their expected payoff by unilaterally changing their own strategy is called a Nash equilibrium. 
(As a small footnote to history, <a href="https://www.wolframalpha.com/input?i=john+nash">John Nash</a> was a long-time user of Mathematica and what’s now the Wolfram Language—though many years after he came up with the concept of Nash equilibrium.) Well, now in Version 14.2, <tt><a href="http://reference.wolfram.com/language/ref/FindMatrixGameStrategies.html">FindMatrixGameStrategies</a></tt> computes optimal strategies (AKA Nash equilibria) for matrix games:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg4.png' alt='' title='' width='530' height='63'> </div> </p></div> <p>This result means that for this game player 1 should play action 1 with probability <img style="margin-bottom: -5px" src='https://content.wolfram.com/sites/43/2025/01/sw01162025frac511.png' width= '12' height='22' > and action 2 with probability <img style="margin-bottom: -5px" src='https://content.wolfram.com/sites/43/2025/01/sw01162025frac611.png' width= '12' height='23' >, and player 2 should do these with probabilities <img style="margin-bottom: -5px" src='https://content.wolfram.com/sites/43/2025/01/sw01162025frac411.png' width= '12' height='22' > and <img style="margin-bottom: -5px" src='https://content.wolfram.com/sites/43/2025/01/sw01162025frac711.png' width= '12' height='22' >. But what are their expected payoffs? 
<tt><a href="http://reference.wolfram.com/language/ref/MatrixGamePayoff.html">MatrixGamePayoff</a></tt> computes that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg9.png' alt='' title='' width='495' height='63'> </div> </p></div> <p>It can get pretty hard to keep track of the different cases in a game, so <tt><a href="http://reference.wolfram.com/language/ref/MatrixGame.html">MatrixGame</a></tt> lets you give whatever labels you want for players and actions: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg10.png' alt='' title='' width='589' height='138'> </div> </p></div> <p>These labels are then used in visualizations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg11.png' alt='' title='' width='252' height='166'> </div> </p></div> <p>What we just showed is actually a standard example game—the “<a href="https://www.wolframalpha.com/input?i=prisoner%27s+dilemma">prisoner’s dilemma</a>”. In the Wolfram Language we now have <tt><a href="http://reference.wolfram.com/language/ref/GameTheoryData.html">GameTheoryData</a></tt> as a repository of about 50 standard games. 
Here’s one, specified to have 4 players:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg12.png' alt='' title='' width='700' height='196'> </div> </p></div> <p>And it’s less trivial to solve this game, but here’s the result—with 27 distinct solutions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg13C.png' alt='' title='' width='670' height='233'> </div> </p></div> <p>And, yes, the visualizations keep on working, even when there are more players (here we’re showing the 5-player case, indicating the 50th game solution):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg14.png' alt='' title='' width='647' height='319'> </div> </p></div> <p>It might be worth mentioning that the way we’re solving these kinds of games is by using our latest polynomial equation solving capabilities—and not only are we able to routinely find all possible Nash equilibria (not just a single fixed point), but we’re also able to get exact results:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg15.png' alt='' title='' width='588' height='109'> </div> </p></div> <p>In addition to matrix games, which model games in which players 
simultaneously pick their actions just once, we’re also supporting tree games, in which players take turns, producing a tree of possible outcomes, ending with a specified payoff for each of the players. Here’s an example of a very simple tree game:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg16.png' alt='' title='' width='652' height='198'> </div> </p></div> <p>We can get at least one solution to this game—described by a nested structure that gives the optimal probabilities for each action of each player at each turn:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg17.png' alt='' title='' width='556' height='68'> </div> </p></div> <p>Things with tree games can get more elaborate. Here’s an example—in which other players sometimes don’t know which branches were taken (as indicated by states joined by dashed lines):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025gameimg18.png' alt='' title='' width='388' height='268'> </div> </p></div> <p>What we’ve got in Version 14.2 represents rather complete coverage of the basic concepts in a typical introductory game theory course. But now, in typical Wolfram Language fashion, it’s all computable and extensible—so you can study more realistic games, and quickly do lots of examples to build intuition. 
</p> <p>We’ve so far concentrated on “classic game theory”, notably with the feature (relevant to many current applications) that all action nodes are the result of a different sequence of actions. However, games like tic-tac-toe (that I happened to <a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/">recently study using multiway graphs</a>) can be simplified by merging equivalent action nodes. Multiple sequences of actions may lead to the same game of tic-tac-toe, as is often the case for iterated games. These graph structures don’t fit into the kind of classic game theory trees we’ve introduced in Version 14.2—though (as my own efforts I think demonstrate) they’re uniquely amenable to analysis with the Wolfram Language.</p> <h2 id="computing-the-syzygies-and-other-advances-in-astronomy">Computing the Syzygies, and Other Advances in Astronomy</h2> <p>There are lots of “coincidences” in astronomy—situations where things line up in a particular way. <a href="https://writings.stephenwolfram.com/2024/03/computing-the-eclipse-astronomy-in-the-wolfram-language/">Eclipses are one example</a>. But there are many more. And in Version 14.2 there’s now a general function <tt><a href="http://reference.wolfram.com/language/ref/FindAstroEvent.html">FindAstroEvent</a></tt> for finding these “coincidences”, technically called syzygies (“sizz-ee-gees”), as well as other “special configurations” of astronomical objects.</p> <p>A simple example is the September (autumnal) equinox:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg1.png' alt='' title='' width='313' height='48'> </div> </p></div> <p>Roughly this is when day and night are of equal length. 
More precisely, it’s when the sun is at one of the two positions in the sky where the plane of the ecliptic (i.e. the orbital plane of the earth around the sun) crosses the celestial equator (i.e. the projection of the earth’s equator)—as we can see here (the ecliptic is the yellow line; the celestial equator the blue one):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg2.png' alt='' title='' width='543' height='356'> </div> </p></div> <p>As another example, let’s search the next century for the next time when Jupiter and Saturn will be closest in the sky:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg3.png' alt='' title='' width='619' height='94'> </div> </p></div> <p>They’ll get close enough to see their moons together:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg4.png' alt='' title='' width='567' height='411'> </div> </p></div> <p>There are an incredible number of astronomical configurations that have historically been given special names. There are equinoxes, solstices, equiluxes, culminations, conjunctions, oppositions, quadratures—as well as periapses and apoapses (specialized to perigee, perihelion, periareion, perijove, perikrone, periuranion, periposeideum, etc.). In Version 14.2 we support all these. 
</p> <p>So, for example, this gives the next time Triton will be closest to Neptune:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg5.png' alt='' title='' width='444' height='58'> </div> </p></div> <p>A famous example has to do with the perihelion (closest approach to the Sun) of Mercury. Let’s compute the position of Mercury (as seen from the Sun) at all its perihelia in the first couple of decades of the nineteenth century:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg6.png' alt='' title='' width='651' height='269'> </div> </p></div> <p>We see that there’s a systematic “advance” (along with some wiggling):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg7.png' alt='' title='' width='619' height='317'> </div> </p></div> <p>So now let’s quantitatively compute this advance. 
We start by finding the times for the first perihelia in 1800 and 1900:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg8.png' alt='' title='' width='484' height='94'> </div> </p></div> <p>Now we compute the angular separation between the positions of Mercury at these times:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg9.png' alt='' title='' width='499' height='95'> </div> </p></div> <p>Then divide this by the time difference</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg10.png' alt='' title='' width='270' height='52'> </div> </p></div> <p>and convert units:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025astronomyimg11.png' alt='' title='' width='356' height='49'> </div> </p></div> <p>Famously, 43 arcseconds per century of this is the result of deviations from the inverse square law of gravity introduced by general relativity—and, of course, accounted for by our astronomical computation system. 
(The rest of the advance is the result of traditional gravitational effects from Venus, Jupiter, Earth, etc.)</p> <h2 id="pdes-now-also-for-magnetic-systems">PDEs Now Also for Magnetic Systems</h2> <p>More than a decade and a half ago we made the commitment to make the <a href="https://www.wolfram.com/language/">Wolfram Language</a> a full-strength PDE modeling environment. Of course it helped that we could rely on all the other capabilities of the Wolfram Language—and what we’ve been able to produce is immeasurably more valuable because of its synergy with the rest of the system. But over the years, with great effort, we’ve been steadily building up symbolic PDE modeling capabilities across all the standard domains. And at this point I think it’s fair to say that we can handle—at an industrial scale—a large part of the PDE modeling that arises in real-world situations. </p> <p>But there are always more cases for which we can build in capabilities, and in Version 14.2 we’re adding built-in modeling primitives for static and quasistatic magnetic fields. So, for example, here’s how we can now model an hourglass-shaped magnet. 
This defines boundary conditions—then solves the equations for the magnetic scalar potential:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025pdesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025pdesAimg1.png' alt='' title='' width='583' height='315'> </div> </p></div> <p>We can then take that result, and, for example, immediately plot the magnetic field lines it implies:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025pdesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025pdesimg2.png' alt='' title='' width='673' height='434'> </div> </p></div> <p>Version 14.2 also adds the primitives to deal with slowly varying electric currents, and the magnetic fields they generate. All of this immediately integrates with our other modeling domains like heat transfer, fluid dynamics, acoustics, etc. </p> <p><a href="https://reference.wolfram.com/language/PDEModels/tutorial/PDEModelsOverview.html">There’s much to say about PDE modeling and its applications</a>, and in Version 14.2 we’ve added more than 200 pages of additional textbook-style documentation about PDE modeling, including some research-level examples.</p> <h2 id="new-features-in-graphics-geometry--graphs">New Features in Graphics, Geometry & Graphs</h2> <p><tt><a href="http://reference.wolfram.com/language/ref/Graphics.html">Graphics</a></tt> has always been a strong area for the <a href="https://www.wolfram.com/language/">Wolfram Language</a>, and over the past decade we’ve also built up very strong computational geometry capabilities. 
Version 14.2 adds some more “icing on the cake”, particularly in connecting graphics to geometry, and connecting geometry to other parts of the system.</p> <p>As an example, Version 14.2 adds geometry capabilities for more of what were previously just graphics primitives. For example, this is a geometric region formed by filling a Bezier curve:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg1.png' alt='' title='' width='569' height='207'> </div> </p></div> <p>And we can now do all our usual computational geometry operations on it:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01162025geometryAimg2.png' alt='' title='' width='105' height='42'> </div> </p></div> <p>Something like this now works too:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg3.png' alt='' title='' width='185' height='109'> </div> </p></div> <p>Something else new in Version 14.2 is <tt><a href="http://reference.wolfram.com/language/ref/MoleculeMesh.html">MoleculeMesh</a></tt>, which lets you build computable geometry from molecular structures. 
Here’s a graphical rendering of a molecule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg4.png' alt='' title='' width='382' height='203'> </div> </p></div> <p>And here now is a geometric mesh corresponding to the molecule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg5.png' alt='' title='' width='371' height='173'> </div> </p></div> <p>We can then do computational geometry on this mesh:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025geometryAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg6.png' alt='' title='' width='228' height='98'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01162025geometryAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg7.png' alt='' title='' width='579' height='157'> </div> </p></div> <p>Another new feature in Version 14.2 is an additional method for <a href="https://reference.wolfram.com/language/tutorial/GraphDrawingOverview.html">graph drawing</a> that can make use of symmetries. 
If you make a layered graph from a symmetrical grid, it won’t immediately render in a symmetrical way:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg8.png' alt='' title='' width='298' height='120'> </div> </p></div> <p>But with the new <tt>"SymmetricLayeredEmbedding"</tt> graph layout, it will:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025geometryimg9.png' alt='' title='' width='518' height='134'> </div> </p></div> <h2 id="user-interface-tune-ups">User Interface Tune-Ups</h2> <p>Making a great user interface is always a story of continued polishing, and we’ve now been doing that for the notebook interface for nearly four decades. In Version 14.2 there are several notable pieces of polish that have been added. One concerns <a href="https://reference.wolfram.com/language/workflow/AutocompleteSymbolsWhenTyping.html.en">autocompletion</a> for option values. </p> <p>We’ve long shown completions for options that have a discrete collection of definite common settings (such as <tt><a href="http://reference.wolfram.com/language/ref/All.html">All</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Automatic.html">Automatic</a></tt>, etc.). In Version 14.2 we’re adding “template completions” that give the structure of settings, and then let you tab through to fill in particular values. In all these years, one of the places I pretty much always find myself going to in the documentation is the settings for <tt><a href="http://reference.wolfram.com/language/ref/FrameLabel.html">FrameLabel</a></tt>. 
But now autocompletion immediately shows me the structure of these settings:</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01152025interfaceimg1.png' alt='Interface settings autocompletion' title='Interface settings autocompletion' width='427' height='88'/></p> <p>Also in autocompletion, we’ve added the capability to autocomplete context names, context aliases, and symbols that include contexts. And in all cases, the autocompletion is “fuzzy” in the sense that it’ll trigger not only on characters at the beginning of a name but on ones anywhere in the name—which means that you can just type characters in the name of a symbol, and relevant contexts will appear as autocompletions.</p> <p>Another small convenience added in Version 14.2 is the ability to drag images from one notebook to any other notebook, or, for that matter, to any other application that can accept dragged images. It’s been possible to drag images from other applications into notebooks, but now you can do it the other way too.</p> <p>Something else that’s for now specific to macOS is enhanced support for icon preview (as well as Quick Look). So now if you have a folder full of notebooks and you select <span class="promptformatted">Icon</span> view, you’ll see a little representation of each notebook as an icon of its content:</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01152025interfaceimg2.png' alt='Notebook icon preview' title='Notebook icon preview' width='522' height='85'/></p> <p>Under the hood in Version 14.2 there are also some infrastructural developments that will enable significant new features in subsequent versions. Some of these involve generalized support for dark mode. (Yes, one might initially imagine that dark mode would somehow be trivial, but when you start thinking about all the graphics and interface elements that involve colors, it’s clear it’s not. 
Though, for example, after significant effort we did recently release dark mode for <a href="https://www.wolframalpha.com/">Wolfram|Alpha</a>.)</p> <p>So, for example, in Version 14.2 you’ll find the new symbol <tt><a href="http://reference.wolfram.com/language/ref/LightDarkSwitched.html">LightDarkSwitched</a></tt>, which is part of the mechanism for specifying styles that will automatically switch for light and dark modes. And, yes, there is a style option <tt><a href="http://reference.wolfram.com/language/ref/LightDark.html">LightDark</a></tt> that will switch modes for notebooks—and which is at least experimentally supported.</p> <p>Related to light/dark mode is also the notion of theme colors: colors that are defined symbolically and can be switched together. And, yes, there’s an experimental symbol <tt><a href="http://reference.wolfram.com/language/ref/ThemeColor.html">ThemeColor</a></tt> related to these. But the full deployment of this whole mechanism won’t be there until the next version.</p> <h2 id="the-beginnings-of-going-native-on-gpus">The Beginnings of Going Native on GPUs</h2> <p>Many important pieces of functionality inside the Wolfram Language automatically make use of GPUs when they are available. And already 15 years ago we introduced primitives for low-level GPU programming. But in Version 14.2 we’re beginning the process of <a href="https://reference.wolfram.com/language/guide/GPUComputing.html.en">making GPU capabilities more readily available</a> as a way to optimize general Wolfram Language usage. The key new construct is <tt><a href="http://reference.wolfram.com/language/ref/GPUArray.html">GPUArray</a></tt>, which represents an array of data that will (if possible) be stored so as to be immediately and directly accessible to your GPU. 
(On some systems, it will be stored in separate “GPU memory”; on others, such as modern Macs, it will be stored in shared memory in such a way as to be directly accessible by the GPU.)</p> <p>In Version 14.2 we’re supporting an initial set of operations that can be performed directly on GPU arrays. The operations available vary slightly from one type of GPU to another. Over time, we expect to use or create many additional GPU libraries that will extend the set of operations that can be performed on GPU arrays. </p> <p>Here is a random ten-million-element vector stored as a GPU array: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg1.png' alt='' title='' width='399' height='97'> </div> </p></div> <p>The GPU on the Mac on which I am writing this supports the necessary operations to do this purely in its GPU, giving back a <tt><a href="http://reference.wolfram.com/language/ref/GPUArray.html">GPUArray</a></tt> result:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg2.png' alt='' title='' width='299' height='97'> </div> </p></div> <p>Here’s the timing:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg3.png' alt='' title='' width='308' height='44'> </div> </p></div> <p>And here’s the corresponding ordinary (CPU) result:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01152025nativeimg4.png' alt='' title='' width='664' height='44'> </div> </p></div> <p>In this case, the <tt><a href="http://reference.wolfram.com/language/ref/GPUArray.html">GPUArray</a></tt> result is about a factor of 2 faster. What factor you get will vary with the operations you’re doing, and the particular hardware you’re using. So far, the largest factors I’ve seen are around 10x. But as we build more GPU libraries, I expect this to increase—particularly when what you’re doing involves a lot of compute “inside the GPU”, and not too much memory access. </p> <p>By the way, if you sprinkle <tt>GPUArray</tt> in your code it’ll normally never affect the results you get—because operations always default to running on your CPU if they’re not supported on your GPU. (Usually <tt>GPUArray</tt> will make things faster, but if there are too many “GPU misses” then all the “attempts to move data” may actually slow things down.) It’s worth realizing, though, that GPU computation is still not at all well standardized or uniform. Sometimes there may only be support for vectors, sometimes also matrices—and there may be different data types with different numerical precision supported in different cases. </p> <h2 id="and-even-more">And Even More…</h2> <p>In addition to all the things we’ve discussed here so far, there are also a host of other “little” new features in Version 14.2. 
But even though they may be “little” compared to other things we’ve discussed, they’ll be big if you happen to need just that functionality.</p> <p>For example, there’s <tt><a href="https://reference.wolfram.com/language/ref/MidDate">MidDate</a></tt>—that computes the midpoint of dates:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg1.png' alt='' title='' width='208' height='52'> </div> </p></div> <p>And like almost everything involving dates, <tt><a href="https://reference.wolfram.com/language/ref/MidDate">MidDate</a></tt> is full of subtleties. Here it’s computing the week 2/3 of the way through this year:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg2.png' alt='' title='' width='279' height='52'> </div> </p></div> <p>In math, functions like <tt><a href="https://reference.wolfram.com/language/ref/DSolve">DSolve</a></tt> and <tt><a href="https://reference.wolfram.com/language/ref/SurfaceIntegrate">SurfaceIntegrate</a></tt> can now deal with symbolic array variables:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg3.png' alt='' title='' width='294' height='75'> </div> </p></div> <p><tt><a href="https://reference.wolfram.com/language/ref/SumConvergence">SumConvergence</a></tt> now lets one specify the range of summation, and can give conditions that depend on it:</p> <div> <div 
class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg4.png' alt='' title='' width='273' height='67'> </div> </p></div> <p>A little convenience that, yes, I asked for, is that <tt><a href="https://reference.wolfram.com/language/ref/DigitCount">DigitCount</a></tt> now lets you specify how many digits altogether you want to assume your number has, so that it appropriately counts leading 0s:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg5.png' alt='' title='' width='317' height='44'> </div> </p></div> <p>Talking of conveniences, for functions like <tt><a href="https://reference.wolfram.com/language/ref/MaximalBy">MaximalBy</a></tt> and <tt><a href="https://reference.wolfram.com/language/ref/TakeLargest">TakeLargest</a></tt> we added a new argument that says how to sort elements to determine “the largest”. 
Here’s the default numerical order</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg6.png' alt='' title='' width='478' height='45'> </div> </p></div> <p>and here’s what happens if we use “symbolic order” instead:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg7.png' alt='' title='' width='407' height='45'> </div> </p></div> <p>There are always so many details to polish. Like in Version 14.2 there’s an update to <tt><a href="https://reference.wolfram.com/language/ref/MoonPhase">MoonPhase</a></tt> and related functions, both new things to ask about, and new methods to compute them:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg8.png' alt='' title='' width='293' height='43'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg9.png' alt='' title='' width='508' height='43'> </div> </p></div> <p>In another area, in addition to major new import/export formats (particularly to support Tabular) there’s an update to <tt><a href="https://reference.wolfram.com/language/ref/format/Markdown.html">"Markdown"</a></tt> import that gives results in plaintext, and there’s an 
update to <tt><a href="https://reference.wolfram.com/language/ref/format/PDF.html">"PDF"</a></tt> import that gives a mixed list of text and images.</p> <p>And there are lots of other things too, as you can find in the “<a href="https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn142.html">Summary of New and Improved Features in 14.2</a>”. By the way, it’s worth mentioning that if you’re looking at a particular documentation page for a function, you can always find out what’s new in this version just by pressing <span class="promptformatted">show changes</span>:</p> <p><a href="https://reference.wolfram.com/language/ref/MinimalBy.html"><img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw012225andevenmoreimg10.png' alt='Show changes' title='Show changes' width='620' height='200'></a></p> <p style="font-style: italic; color: #555;"> <style type="text/css"> div.bottomstripe { max-width:620px; margin-bottom:10px; background-color: #fff39a; border: solid 2px #ffd400; padding: 7px 10px 7px 10px; line-height: 1.2;} #blog .post_content .bottomstripe a, #blog .post_content .bottomstripe a:link, #blog .post_content .bottomstripe a:visited { font-family:"Source Sans Pro",Arial,Sans Serif; font-size:11pt; color:#aa0d00;} </style> <div class="bottomstripe"> <a href="https://www.wolfram.com/download-center/"><strong>Download your 14.2 now! 
» </strong> (It’s already live in the Wolfram Cloud!)</a> </div> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2025/01/launching-version-14-2-of-wolfram-language-mathematica-big-data-meets-computation-ai/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <enclosure url="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg1a.mp4" length="836025" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg3a.mp4" length="2438584" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2025/01/sw012225followthatimg6a.mp4" length="3163197" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2025/01/sw012325llmimg3.mp4" length="329726" type="video/mp4" /> </item> <item> <title>Who Can Understand the Proof? A Window on Formalized Mathematics</title> <link>https://writings.stephenwolfram.com/2025/01/who-can-understand-the-proof-a-window-on-formalized-mathematics/</link> <comments>https://writings.stephenwolfram.com/2025/01/who-can-understand-the-proof-a-window-on-formalized-mathematics/#respond</comments> <pubDate>Thu, 09 Jan 2025 22:42:31 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Artificial Intelligence]]></category> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Mathematics]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=65170</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2025/01/proof-icon-1-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>Related writings: “Logic, Explainability and the Future of Understanding” (2018) » “The Physicalization of Metamathematics and Its Implications for the Foundations of Mathematics” (2022) » “Computational Knowledge and the Future of Pure Mathematics” 
(2014) » The Simplest Axiom for Logic Theorem (Wolfram with Mathematica, 2000): The single axiom ((a•b)•c)•(a•((a•c)•a)) = c is a complete axiom system for Boolean algebra (and […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2025/01/proof-icon-1-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><div class="article-links"><span class="links-intro">Related writings:</span><br /> <span><a class="website" href="https://writings.stephenwolfram.com/2018/11/logic-explainability-and-the-future-of-understanding/"><strong><em>“Logic, Explainability and the Future of Understanding”</em> (2018) »</strong></a></span><br /> <span><a class="website" href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/"><strong><em>“The Physicalization of Metamathematics and Its Implications for the Foundations of Mathematics”</em> (2022) »</strong></a></span><br /> <span><a class="website" href="https://writings.stephenwolfram.com/2014/08/computational-knowledge-and-the-future-of-pure-mathematics/"><strong><em>“Computational Knowledge and the Future of Pure Mathematics”</em> (2014) »</strong></a></span> </div> <h2 id="the-simplest-axiom-for-logic">The Simplest Axiom for Logic</h2> <link rel="preconnect" href="https://fonts.googleapis.com"> <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> <link href="https://fonts.googleapis.com/css2?family=Source+Serif+4:ital,opsz,wght@0,8..60,200..900;1,8..60,200..900&display=swap" rel="stylesheet"> <p style="font-size: 1.2rem;background: #e5f2f8b5;padding: 15px 23px;border: 2px solid #91c5dcfa;max-width:620px;margin:25px 0px;line-height: 1.6;color: #0b354a;font-family: 'Source Serif 4',Georgia,serif;"><a href="https://www.wolframscience.com/nks/p808--implications-for-mathematics-and-its-foundations/" style="font-weight: 
700;font-optical-sizing: auto;">Theorem <span style="font-weight: 400;">(Wolfram with Mathematica, 2000)</span></a>: <br />The single axiom <span style="font-weight: 700;color: #24536a;">((<em>a</em>•<em>b</em>)•<em>c</em>)•(<em>a</em>•((<em>a</em>•<em>c</em>)•<em>a</em>))<span class="special-character"> = </span><em>c</em></span> is a complete axiom system for Boolean algebra (and is the simplest possible)</p> <p>For more than a century <a href="https://writings.stephenwolfram.com/2018/11/logic-explainability-and-the-future-of-understanding/#the-history">people had wondered</a> how simple the axioms of logic (Boolean algebra) could be. <a href="https://writings.stephenwolfram.com/2018/11/logic-explainability-and-the-future-of-understanding/#a-discovery-about-basic-logic">On January 29, 2000, I found the answer</a>—and made the surprising discovery that they could be about twice as simple as anyone knew. (I also showed that what I found was <a href="https://www.wolframscience.com/nks/notes-12-9--searching-for-logic-axioms/">the simplest possible</a>.) </p> <p>It was an interesting result—that gave new intuition about just how simple the foundations of things can be, and for example helped inspire my efforts to find a <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">simple underlying theory of physics</a>. </p> <p>But how did I get the result? Well, I used automated theorem proving (specifically, what’s now <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> in <a href="https://www.wolfram.com/language/">Wolfram Language</a>). Automated theorem proving is something that’s <a href="https://www.wolframscience.com/nks/notes-12-9--automated-theorem-proving/">been around since at least the 1950s</a>, and its core methods haven’t changed in a long time. But in the rare cases it’s been used in mathematics it’s typically been to confirm things that were already believed to be true. 
And in fact, to my knowledge, my Boolean algebra axiom is actually the only truly unexpected result that’s ever been found for the first time using automated theorem proving.<span id="more-65170"></span></p> <p>But, OK, so we know it’s true. And that’s interesting. But what about the proof? Does the proof, for example, show us why the result is true? Well, actually, in a quarter of a century, nobody (including me) has ever made much headway at all in understanding the proof (which, at least in the form we currently know it, is long and complicated). So is that basically inevitable—say as a consequence of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>? Or is there some way—perhaps <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">using modern AI</a>—to “humanize” the proof to a point where one can understand it?</p> <p>It is, I think, an interesting challenge—that gets at the heart of what one can (and can’t) expect to achieve with formalized mathematics. In what follows, I’ll discuss what I’ve been able to figure out—and how it relates to foundational questions about what mathematics is and how it can be done. And while I think I’ve been able to clarify some of the issues, the core problem is still out there—and I’d like to issue it here as a challenge:</p> <p style="font-size: 1.1rem;background: #e5f2f8b5;padding: 10px 20px;border: 2px solid #91c5dcfa;max-width:620px;margin: 0px 0px 15px 0px;line-height: 1.6;font-family: 'Source Serif 4',Georgia,serif;color: #383838;font-weight:600;font-optical-sizing:auto;"><span style="color: #555;font-weight: 700;">Challenge:</span> Understand the proof of the Theorem</p> <p>What do I mean by “understand”? Inevitably, “understand” has to be defined in human terms. 
Something like “so a human can follow and reproduce it”—and, with luck, feel like saying “aha!” at some point, the kind of way they might on hearing a proof of the Pythagorean theorem (or, in logic, something like de Morgan’s law <tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt>[<tt><a href="http://reference.wolfram.com/language/ref/And.html">And</a></tt>[<em>p</em>, <em>q</em>]]<span class="special-character"> = </span><tt><a href="http://reference.wolfram.com/language/ref/Or.html">Or</a></tt>[<tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt>[<em>p</em>], <tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt>[<em>q</em>]]). </p> <p>It should be said that it’s certainly not clear that such an understanding would ever be possible. After all, as we’ll discuss, it’s a basic metamathematical fact that out of all possible theorems almost none have short proofs, at least in terms of any particular way of stating the proofs. But what about an “interesting theorem” like the one we’re considering here? Maybe that’s different. Or maybe, at least, there’s some way of building out a “higher-level mathematical narrative” for a theorem like this that will take one through the proof in human-accessible steps.</p> <p>In principle one could always imagine a somewhat bizarre scenario in which people would just rote learn chunks of the proof, perhaps giving each chunk some name (a bit like how people learned bArbArA and cElArEnt syllogisms in the Middle Ages). And in terms of these chunks there’d presumably then be a “human way” to talk about the proof. But learning the chunks—other than as some kind of recreational or devotional activity—doesn’t seem to make much sense unless there’s metamathematical structure that somehow connects the chunks to “general concepts” that are widely useful elsewhere. 
</p> <p>But of course it’s still conceivable that there might be a “big theory” that would lead us to the theorem in an “understandable way”. And that could be a traditional mathematical theory, built up with precise, if potentially very abstract, constructs. But what about something more like a theory in natural science? In which we might treat our automatically generated proof as an object for empirical study—exploring its characteristics, trying to get intuition about it, and ultimately trying to deduce the analog of “natural laws” that give us a “human-level” way of understanding it. </p> <p>Of course, for many purposes it doesn’t really matter why the theorem is true. All that matters is that it is true, and that one can deduce things on the basis of it. But as one thinks about the future of mathematics, and the future of doing mathematics, it’s interesting to explore to what extent it might or might not ultimately be possible to understand in a human-accessible way the kind of seemingly alien result that the theorem represents. 
</p> <h2 id="the-proof-as-we-know-it">The Proof as We Know It</h2> <p>I first presented a version of the proof on <a href="https://www.wolframscience.com/nks/p810--implications-for-mathematics-and-its-foundations/">two pages</a> of my 2002 book <em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em>, printing it in 4-point type to make it fit: </p> <p><a href="https://files.wolframcdn.com/pub/www.wolframscience.com/nks/nks-ch12-sec9.pdf" ><img src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg1.png' alt='Axiom proof' title='Axiom proof' width='619' height='373'/></a></p> <p>Today, generating a very similar proof is a one-liner in Wolfram Language (as we’ll discuss below, the · dot here can be thought of as representing the <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt> operation):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025FEPimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg2.png' alt='' title='' width='455' height='100'> </div> </p></div> <p>The proof involves 307 (mostly rather elaborate) steps. And here’s one page of it (out of about 30)—presented in the form of a computable Wolfram Language dataset:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDXimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg4.png' alt='Example proof steps page' title='Example proof steps page' width='619' height='385'/></div> </p></div> <p>What’s the basic idea of this proof? 
Essentially it’s to perform a sequence of purely structural symbolic operations that go from our axiom to <a href="https://www.wolframscience.com/nks/p808--implications-for-mathematics-and-its-foundations/">known axioms of Boolean algebra</a>. And the proof does this by proving a series of lemmas which can be combined to eventually give what we want: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg5A.png' alt='' title='' width='689' height='1012'> </div> </p></div> <p>The highlighted “targets” here are the standard Sheffer axioms for Boolean algebra from 1913:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg6.png' alt='' title='' width='278' height='60'> </div> </p></div> <p>And, yes, even though these are quite short, the intermediate lemmas involved in the proof get quite long—the longest involving 60 symbols (i.e. having <tt><a href="http://reference.wolfram.com/language/ref/LeafCount.html">LeafCount</a></tt> 60):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg7.png' alt='' title='' width='488' height='40'> </div> </p></div> <p>It’s as if to get to where it’s going, the proof ends up having to go through the wilds of metamathematical space. And indeed one gets a sense of this if one plots the sizes (i.e. 
<tt><a href="http://reference.wolfram.com/language/ref/LeafCount.html">LeafCount</a></tt>) of successive lemmas:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg8.png' alt='' title='' width='674' height='142'> </div> </p></div> <p>Here’s the distribution of these sizes, showing that while they’re often small, there’s a long tail (note, by the way, that if dot · appears <em>n</em> times in a lemma, the <tt><a href="http://reference.wolfram.com/language/ref/LeafCount.html">LeafCount</a></tt> will be 2<em>n</em> + 3):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg9.png' alt='' title='' width='259' height='122'> </div> </p></div> <p>So how are these lemmas related? 
Here’s a graph of their interdependence (with the size of each dot being proportional to the size of the lemma it represents):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg10.png' alt='' title='' width='565' height='710'> </div> </p></div> <p>Zooming in on the top we see more detail:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg11.png' alt='' title='' width='595' height='375'> </div> </p></div> <p>We start from our axiom, then derive a whole sequence of lemmas—as we’ll see later, always <a href="https://www.wolframscience.com/metamathematics/proofs-in-accumulative-systems/">combining two lemmas to create a new one</a>. (And, yes, we could equally well call these things theorems—but we generate so many of them it seems more natural to call them “lemmas”.) </p> <p>So, OK, we’ve got a complicated proof. But how can we check that it’s correct? 
Well, from the symbolic representation of the proof in the Wolfram Language we can immediately generate a “proof function” that in effect contains executable versions of all the lemmas—implemented using simple structural operations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDXimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg11A.png' alt='' title='' width='551' height='453'> </div> </p></div> <p>And when you run this function, it applies all these lemmas and checks that the result comes out right:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDZimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg14.png' alt='' title='' width='259' height='43'> </div> </p></div> <p>And, yes, this is basically what one would do in a proof assistant system (like <a href="https://lean-lang.org/" target="_blank" rel="noopener">Lean</a> or <a href="https://us.metamath.org/index.html" target="_blank" rel="noopener">Metamath</a>)—except that here the steps in the proof were generated purely automatically, without any human guidance (or effort). And, by the way, the fact that we can readily translate our symbolic proof representation into a function that we can run provides an operational manifestation of the equivalence between proofs and programs. </p> <p>But let’s look back at our lemma-interdependence “proof graph”. One notable feature is that we see several nodes with high out-degree—corresponding to what we can think of as “pivotal lemmas” from which many other lemmas end up directly being proved. 
So here’s a list of the “most pivotal” lemmas in our proof:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01102025outdegreeimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01102025outdegreeimg1.png' alt='' title='' width='330' height='178'> </div> </p></div> <p>Or, more graphically, here are the results for all lemmas that occur:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01102025outdegreeimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01102025outdegreeimg2.png' alt='' title='' width='341' height='237'> </div> </p></div> <p>So what are the “pivotal lemmas”? <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em> we readily recognize as commutativity. But the others—despite their comparative simplicity—don’t seem to correspond to things that have specifically shown up before in the mathematical literature (or, as we’ll <a href="https://writings.stephenwolfram.com/2025/01/who-can-understand-the-proof-a-window-on-formalized-mathematics/#llms-to-the-rescue">discuss later</a>, that’s at least what the current generation of LLMs tell us).</p> <p>But looking at our proof graph something we can conclude is that a large fraction of the “heavy lifting” needed for the whole proof has already happened by the time we can prove <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>. 
So, for the sake of avoiding at least some of the hairy detail in the full proof, in most of what follows, we’ll concentrate on the proof of <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>—which <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> tells us we can accomplish in 104 steps, with a proof graph of the form</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDQimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg17.png' alt='' title='' width='666' height='824'> </div> </p></div> <p>with the sizes of successive lemmas (in what is basically a breadth-first traversal of the proof graph) being:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025proofCLOUDAimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025proofimg18A.png' alt='' title='' width='509' height='117'> </div> </p></div> <h2 id="the-machine-code-of-the-proof">The “Machine Code” of the Proof</h2> <p>It’s already obvious from the previous section that the proof as we currently know it is long, complicated, and fiddly—and in many ways reminiscent of something at a “machine-code” level. But to get a grounded sense of what’s going on in the proof, it’s useful to dive into the details—even if, yes, they can be seriously hard to wrap one’s head around. </p> <p>At a fundamental level, the way the proof—say of <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>—works is by starting from our axiom, and then progressively deducing new lemmas from pairs of existing lemmas. 
In the simplest case, that deduction works by <a href="https://www.wolframscience.com/metamathematics/the-metamodeling-of-axiomatic-mathematics/">straightforward symbolic substitution</a>. So, for example, let’s say we have the lemmas</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg1.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>and </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg2.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>Then it turns out that from these lemmas we can deduce:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg3.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>Or, in other words, knowing that the first two lemmas hold for any <em>a</em> gives us enough information about · that the third lemma must inevitably also hold. So how do we derive this?</p> <p>Our lemmas in effect <a href="https://www.wolframscience.com/metamathematics/rules-applied-to-rules/">define two-way equivalences</a>: their left-hand sides are defined as equal to their right-hand sides, which means that if we see an expression that (structurally) matches one side of a lemma, we can always replace it by the other side of the lemma. 
And to implement this, we can write our second lemma explicitly as a rule—where to avoid confusion we’re using <em>x</em> rather than <em>a</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg4.png' alt='' title='' width='141' height='14'> </div> </p></div> <p>But if we now look at our first lemma, we see that there’s part of it (indicated with a frame) that matches the left-hand side of our rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg5.png' alt='' title='' width='132' height='25'> </div> </p></div> <p>If we replace this part (which is at position {2,2}) using our rule we then get</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg6.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>which is precisely the lemma we wanted to deduce. 
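</p> <p>As a rough illustration of this kind of substitution step, here is a minimal Python sketch (it is not the Wolfram Language implementation). It assumes that terms are stored as nested tuples, and it assumes, purely for illustration, that the two input lemmas are <em>a</em> · <em>a</em> = <em>a</em> · (<em>a</em> · (<em>a</em> · <em>a</em>)) and <em>x</em> · (<em>x</em> · <em>x</em>) = (<em>x</em> · <em>x</em>) · <em>x</em>. The helper names <tt>part</tt>, <tt>replace_part</tt>, <tt>match</tt>, <tt>instantiate</tt> and <tt>substitute</tt> are hypothetical:</p>

```python
# Terms are nested tuples ("dot", left, right); leaves are plain strings.
# An equation is ("eq", lhs, rhs). Index 0 holds the head, so a part
# specification such as {2, 2} maps directly onto tuple indices (2, 2).

def dot(x, y):
    return ("dot", x, y)

def part(expr, pos):
    """Return the subexpression at a position such as (2, 2)."""
    for i in pos:
        expr = expr[i]
    return expr

def replace_part(expr, pos, new):
    """Return a copy of expr with the subexpression at pos replaced by new."""
    if not pos:
        return new
    i = pos[0]
    return expr[:i] + (replace_part(expr[i], pos[1:], new),) + expr[i + 1:]

def match(pattern, expr, bindings):
    """One-way match: only leaves of the pattern act as variables."""
    if isinstance(pattern, str):
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        return {**bindings, pattern: expr}
    if isinstance(expr, str) or pattern[0] != expr[0] or len(pattern) != len(expr):
        return None
    for p, e in zip(pattern[1:], expr[1:]):
        bindings = match(p, e, bindings)
        if bindings is None:
            return None
    return bindings

def instantiate(term, bindings):
    """Fill in the variables of term using the bindings found by match."""
    if isinstance(term, str):
        return bindings.get(term, term)
    return (term[0],) + tuple(instantiate(t, bindings) for t in term[1:])

def substitute(lemma, pos, rule):
    """Apply rule (lhs, rhs) to the subexpression of lemma at pos, if it matches."""
    lhs, rhs = rule
    bindings = match(lhs, part(lemma, pos), {})
    if bindings is None:
        return None
    return replace_part(lemma, pos, instantiate(rhs, bindings))

# Assumed inputs (illustrative stand-ins for the lemmas shown as images).
lemma1 = ("eq", dot("a", "a"), dot("a", dot("a", dot("a", "a"))))
rule2 = (dot("x", dot("x", "x")), dot(dot("x", "x"), "x"))

new_lemma = substitute(lemma1, (2, 2), rule2)
```

<p>With these assumed inputs, <tt>new_lemma</tt> is the tuple form of <em>a</em> · <em>a</em> = <em>a</em> · ((<em>a</em> · <em>a</em>) · <em>a</em>), obtained by rewriting the part at position {2,2}.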
</p> <p>We can summarize what happened here as a fragment of our proof graph—in which a “substitution event” node takes our first two lemmas as input, and “outputs” our final lemma:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg1.png' alt='' title='' width='277' height='99'> </div> </p></div> <p>As always, the symbolic expressions we’re working with here can be represented as trees:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg8.png' alt='' title='' width='391' height='158'> </div> </p></div> <p>The substitution event then corresponds to a tree rewriting:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg2.png' alt='' title='' width='275' height='242'> </div> </p></div> <p>The <a href="https://www.wolframscience.com/metamathematics/relations-to-automated-theorem-proving/">essence of automated theorem proving</a> is to find a particular sequence of substitutions etc. that get us from whatever axioms or lemmas we’re starting with, to whatever lemmas or theorems we want to reach. Or in effect to find a suitable “path” through the multiway graph of all possible substitutions etc. that can be made. 
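</p> <p>To make the idea of “all possible substitutions” a little more concrete, here is a hedged Python sketch that enumerates every lemma reachable from one lemma by a single ordinary substitution event using another lemma as a rule, in either direction. It assumes a nested-tuple representation of terms and the same illustrative pair of lemmas as stand-ins for the ones shown in the images; <tt>one_step</tt> and the other helper names are hypothetical, and the sketch makes no attempt to reproduce the canonicalization that an actual theorem prover would apply:</p>

```python
# Terms: nested tuples ("dot", left, right); leaves are strings.
# A lemma is a pair (lhs, rhs), read as a two-way equivalence.

def dot(x, y):
    return ("dot", x, y)

def positions(t, prefix=()):
    """Yield every position in a term, starting with the root ()."""
    yield prefix
    if not isinstance(t, str):
        for i, sub in enumerate(t[1:], start=1):
            yield from positions(sub, prefix + (i,))

def part(t, pos):
    for i in pos:
        t = t[i]
    return t

def replace_part(t, pos, new):
    if not pos:
        return new
    i = pos[0]
    return t[:i] + (replace_part(t[i], pos[1:], new),) + t[i + 1:]

def match(pattern, expr, b):
    """One-way match: leaves of the pattern are variables, expr is fixed."""
    if isinstance(pattern, str):
        if pattern in b:
            return b if b[pattern] == expr else None
        return {**b, pattern: expr}
    if isinstance(expr, str) or pattern[0] != expr[0] or len(pattern) != len(expr):
        return None
    for p, e in zip(pattern[1:], expr[1:]):
        b = match(p, e, b)
        if b is None:
            return None
    return b

def instantiate(t, b):
    if isinstance(t, str):
        return b.get(t, t)
    return (t[0],) + tuple(instantiate(x, b) for x in t[1:])

def one_step(target, source):
    """All lemmas obtained from target by one substitution event using source."""
    results = set()
    for lhs, rhs in (source, source[::-1]):      # use source in both directions
        for side in (0, 1):                      # rewrite either side of target
            for pos in positions(target[side]):
                b = match(lhs, part(target[side], pos), {})
                if b is None:
                    continue
                new_side = replace_part(target[side], pos, instantiate(rhs, b))
                if side == 0:
                    results.add((new_side, target[1]))
                else:
                    results.add((target[0], new_side))
    return results

# Assumed, illustrative lemmas (the originals appear only as images).
lemma1 = (dot("a", "a"), dot("a", dot("a", dot("a", "a"))))
lemma2 = (dot("x", dot("x", "x")), dot(dot("x", "x"), "x"))

cone = one_step(lemma1, lemma2)
```

<p>Each element of <tt>cone</tt> corresponds to one possible substitution event; a prover then has to choose which of these lemmas are worth keeping.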
</p> <p>So, for example, in the particular case we’re considering here, this is the graph that represents all possible transformations that can occur through a single substitution event:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg10.png' alt='' title='' width='582' height='285'> </div> </p></div> <p>The particular transformation (or “path”) we’ve used to prove <em>a</em> · <em>a</em> = <em>a</em> · ((<em>a</em> · <em>a</em>) · <em>a</em>) is highlighted. But as we can see, there are many other possible lemmas that can be generated, or in other words that can be proved from the two lemmas we’ve given as input. Put another way, we can think of our input lemmas as implying or entailing all the other lemmas shown here. And, by analogy to the concept of a light cone in physics, we can view the collection of everything entailed by given lemmas or given events as the (future) “<a href="https://www.wolframscience.com/metamathematics/metamathematical-space/#p-28">entailment cone</a>” of those lemmas or events. A proof that reaches a particular lemma is then effectively a path in this entailment cone—analogous in physics to a world line that reaches a particular spacetime point.</p> <p>If we continue building out the entailment cone from our original lemmas, then after two (substitution) events we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg11.png' alt='' title='' width='701' height='454'> </div> </p></div> <p>There are 21 lemmas generated here. 
But it turns out that beyond the lemma we already discussed there are only three (highlighted here) that appear in the proof we are studying here:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg12.png' alt='' title='' width='281' height='60'> </div> </p></div> <p>And indeed the <a href="https://writings.stephenwolfram.com/2018/11/logic-explainability-and-the-future-of-understanding/#the-mechanics-of-proof">main algorithmic challenge of theorem proving</a> is to figure out which lemmas to generate in order to get a path to the theorem one’s trying to prove. And, yes, as we’ll discuss later, there are typically many paths that will work, and different algorithms will yield different paths and therefore different proofs.</p> <p>But, OK, seeing how new lemmas can be derived from old by substitution is already quite complicated. But actually there’s something even more complicated we need to discuss: deriving lemmas not only by substitution but also by what we’ve called <a href="https://www.wolframscience.com/metamathematics/beyond-substitution-cosubstitution-and-bisubstitution/">bisubstitution</a>. </p> <p>We can think of both substitution and bisubstitution as turning one lemma X == Y into a transformation rule (either X <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > Y or Y <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > X), and then applying this rule to another lemma, to derive a new lemma. 
In ordinary substitution, the left-hand side of the rule directly matches (in a Wolfram Language pattern-matching sense) a subexpression in the lemma we’re transforming. But the key point is that all the variables that appear in both our lemmas are really “pattern variables” (<tt>x_</tt> etc. in Wolfram Language). So that means there’s another way that one lemma can transform another, in which in effect replacements are made not only in the lemma being transformed, but also in the lemma that’s doing the transforming. </p> <p>The net effect, though, is still to take two lemmas and derive another, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg3.png' alt='' title='' width='414' height='130'> </div> </p></div> <p>But in tracing through the details of our proof, we need to distinguish “substitution events” (shown yellowish) from “bisubstitution” ones (shown reddish). (In <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> in Wolfram Language, lemmas produced by ordinary substitution are called “substitution lemmas”, while lemmas produced by bisubstitution are called “critical pair lemmas”.)</p> <p>OK, so how does bisubstitution work? Let’s look at an example. 
We’re going to be transforming the lemma </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg14.png' alt='' title='' width='235' height='14'> </div> </p></div> <p>using the lemma (which in this case happens to be our original axiom)</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg15.png' alt='' title='' width='183' height='14'> </div> </p></div> <p>to derive the new lemma:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg16.png' alt='' title='' width='235' height='14'> </div> </p></div> <p>We start by creating a rule from the second lemma. In this case, the rule we need happens to be reversed relative to the way we wrote the lemma, and this means that (in the canonical form we’re using) it’s convenient to rename the variables that appear:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg17.png' alt='' title='' width='237' height='14'> </div> </p></div> <p>To do our bisubstitution we’re going to apply this rule to a subterm of our first lemma. 
We can write that first lemma with explicit pattern variables:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg18.png' alt='' title='' width='311' height='14'> </div> </p></div> <p>As always, the particular names of those variables don’t matter. And to avoid confusion, we’re going to rename them:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg19.png' alt='' title='' width='306' height='14'> </div> </p></div> <p>Now look at this subterm of this lemma (which is part {2,1,1,2} of the expression):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg20.png' alt='' title='' width='102' height='14'> </div> </p></div> <p>It turns out that with appropriate bindings for pattern variables this can be matched (or “unified”) with the left-hand side of our rule. 
This provides a way to find such bindings:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg21.png' alt='' title='' width='345' height='14'> </div> </p></div> <p>(Note that in these bindings things like c_ stand only for explicit expressions, like c_, not for expressions that the ordinary Wolfram Language pattern <tt>c_</tt> would match.)</p> <p>Now if we apply the bindings we’ve found to the left-hand side of our rule</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg22.png' alt='' title='' width='277' height='14'> </div> </p></div> <p>and to the subterm we picked out from our lemma</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg23.png' alt='' title='' width='277' height='14'> </div> </p></div> <p>we see that we get the same expression. Which means that with these bindings the subterm matches the left-hand side of our rule, and we can therefore replace this subterm with the right-hand side of the rule. 
To see all this in operation, we first apply the bindings we’ve found to the lemma we’re going to transform (and, as it happens, the binding for y_ is the only one that matters here):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg24.png' alt='' title='' width='615' height='14'> </div> </p></div> <p>Now we take this form and apply the rule at the position of the subterm we identified:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg25.png' alt='' title='' width='342' height='14'> </div> </p></div> <p>Renaming variables</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg26.png' alt='' title='' width='345' height='14'> </div> </p></div> <p> we now finally get exactly the lemma that we were trying to derive:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg27.png' alt='' title='' width='235' height='14'> </div> </p></div> <p>And, yes, getting here was a pretty complicated process. But with the symbolic character of our lemmas, it’s one that is inevitably possible, and so can be used in our proof. 
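</p> <p>The matching step in this process is what is usually called unification: bindings can be made for variables on both sides. As a hedged illustration, here is a minimal Python sketch of unification followed by a bisubstitution step. It assumes that terms are nested tuples in which every leaf is a pattern variable, and that the lemma and the rule have already been renamed so that they share no variable names. The example lemma and rule are hypothetical (they are not the lemma and axiom used in the text), and all function names are illustrative:</p>

```python
# Terms: nested tuples ("dot", left, right); every leaf is a pattern variable.
# An equation is ("eq", lhs, rhs); position (1, 1) means part {1, 1}.

def dot(x, y):
    return ("dot", x, y)

def walk(t, s):
    """Follow variable bindings in s until an unbound variable or a compound term."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """True if variable v appears inside term t under the bindings s."""
    t = walk(t, s)
    if t == v:
        return True
    if isinstance(t, str):
        return False
    return any(occurs(v, x, s) for x in t[1:])

def unify(a, b, s):
    """Return bindings that make a and b identical, or None if there are none."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def resolve(t, s):
    """Apply the bindings s throughout the term t."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(x, s) for x in t[1:])

def part(expr, pos):
    for i in pos:
        expr = expr[i]
    return expr

def replace_part(expr, pos, new):
    if not pos:
        return new
    i = pos[0]
    return expr[:i] + (replace_part(expr[i], pos[1:], new),) + expr[i + 1:]

def bisubstitute(lemma, pos, rule):
    """Unify the subterm at pos with the rule's left side, then rewrite it."""
    lhs, rhs = rule
    s = unify(part(lemma, pos), lhs, {})
    if s is None:
        return None
    # Replacing first and then applying the bindings everywhere gives the same
    # result as applying the bindings to the lemma and then rewriting at pos.
    return resolve(replace_part(lemma, pos, rhs), s)

# Hypothetical example: lemma (y.(z.y)).y = z and rule (p.q).r to q.
lemma = ("eq", dot(dot("y", dot("z", "y")), "y"), "z")
rule = (dot(dot("p", "q"), "r"), "q")
pos = (1, 1)

s = unify(part(lemma, pos), rule[0], {})
new_lemma = bisubstitute(lemma, pos, rule)
```

<p>In this hypothetical example the bindings send <tt>y</tt> to <tt>p·q</tt> and <tt>r</tt> to <tt>z·y</tt>, so that both the subterm and the left-hand side of the rule become the same expression, and the binding for <tt>y</tt> also changes the rest of the lemma being transformed.</p> <p>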
And in the end, out of the 101 lemmas used in the proof, 47 were derived by ordinary substitution, while 54 were derived by bisubstitution.</p> <p>And indeed the first few steps of the proof turn out to use only bisubstitution. An example is the first step—which effectively applies the original axiom to itself using bisubstitution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg4.png' alt='' title='' width='243' height='149'> </div> </p></div> <p>And, yes, even this very first step is pretty difficult to follow. </p> <p>If we start from the original axiom, there are 16 lemmas that can be derived purely by a single ordinary substitution (effectively of the axiom into itself)—resulting in the following entailment cone:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg29.png' alt='' title='' width='693' height='323'> </div> </p></div> <p>As it happens, though, none of the 16 new lemmas here actually get used in our proof. 
On the other hand, in the bisubstitution entailment cone</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg30.png' alt='' title='' width='693' height='251'> </div> </p></div> <p>there are 24 new lemmas, and 4 of them get used in the proof—as we can see from the first level of the proof graph (here rotated for easier rendering):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg31.png' alt='' title='' width='657' height='101'> </div> </p></div> <p>At the next level of the entailment cone from ordinary substitution, there are 5062 new lemmas—none of which get used in the proof. 
But of the 31431 new lemmas in the (pure) bisubstitution entailment cone, 13 do get used:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg32.png' alt='' title='' width='663' height='112'> </div> </p></div> <p>At the next level, lemmas generated by ordinary substitution also start to get used:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg33.png' alt='' title='' width='657' height='126'> </div> </p></div> <p>Here’s another rendering of these first few levels of the proof graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg34.png' alt='' title='' width='295' height='198'> </div> </p></div> <p>Going to another couple of levels we’re starting to see quite a few independent chains of lemmas developing</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg35.png' alt='' title='' width='454' height='289'> </div> </p></div> <p>which eventually join up when we assemble the whole proof graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg36.png' alt='' title='' width='605' height='631'> </div> </p></div> <p>A notable feature of this proof graph is that it has more bisubstitution events at the top, and more ordinary substitution events at the bottom. So why is that? Essentially it seems to be because bisubstitution events tend to produce larger lemmas, and ordinary substitution events tend to produce smaller ones—as we can see if we plot input and output lemma sizes for all events in the proof:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg37_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg37.png' alt='' title='' width='451' height='479'> </div> </p></div> <p>So in effect what seems to be happening is that the proof first has to “spread out in <a href="https://www.wolframscience.com/metamathematics/metamathematical-space/">metamathematical space</a>”, using bisubstitution to generate large lemmas “far out in metamathematical space”. Then later the proof has to “corral things back in”, using ordinary substitution to generate smaller lemmas. 
And for example, at the very end, it’s a substitution event that yields the final theorem we’re trying to prove:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg38_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg38.png' alt='' title='' width='277' height='100'> </div> </p></div> <p>And earlier in the graph, there’s a similar “collapse” to a small (and rather pivotal) lemma:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg39_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg39.png' alt='' title='' width='406' height='120'> </div> </p></div> <p>As the plot above indicates, ordinary substitution can lead to large lemmas, and indeed bisubstitution can also lead to smaller ones, as in</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg40_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg40.png' alt='' title='' width='367' height='87'> </div> </p></div> <p>or slightly more dramatically:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg41_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg41.png' alt='' title='' width='630' height='140'> </div> </p></div> <p>But, OK, so this is some of what’s going on at a “machine-code” level inside the proof we have. 
Of course, given our axiom and the operations of substitution and bisubstitution there are inevitably a huge number of different possible proofs that could be given. The particular proof we’re considering is what the Wolfram Language <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> gives. (In the Appendix, we’ll also look at results from some other automated theorem proving systems. The results will be very comparable, if usually a little lengthier.) </p> <p>We won’t discuss the detailed (and rather elaborate) algorithms inside <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt>. But fundamentally what they’re doing is to try constructing certain lemmas, then to find sequences of lemmas that eventually form a “path” to what we’re trying to prove. And as some indication of what’s involved in this, here’s a plot of the number of “candidate lemmas” that are being maintained as possible when different lemmas in the proof are generated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025machineCLOUDimg42_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025machineimg42.png' alt='' title='' width='354' height='145'> </div> </p></div> <p>And, yes, for a while there’s roughly exponential growth, leveling off at just over a million when we get to the “pulling everything together” stage of the proof.</p> <h2 id="unrolling-the-proof">Unrolling the Proof</h2> <p>In what we’ve done so far, we’ve viewed our proof as working by starting from an axiom, then <a href="https://www.wolframscience.com/nks/notes-12-9--proof-structures/">progressively building up lemmas</a>, until eventually we get to the theorem we want. 
But there’s an alternative view that’s in some ways useful in getting a more direct, “mechanical” intuition about what’s going on in the proof.</p> <p>Let’s say we’re trying to prove that our axiom implies that <em>p</em> · <em>q</em> = <em>q</em> · <em>p</em>. Well, then there must be some way to start from the expression <em>p</em> · <em>q</em> and just keep on judiciously applying the axiom until eventually we get to the expression <em>q</em> · <em>p</em>. And, yes, the number of axiom application steps required might be very large. But ultimately, if it’s true that the axiom implies <em>p</em> · <em>q</em> = <em>q</em> · <em>p</em> there must be a path that gets from <em>p</em> · <em>q</em> to <em>q</em> · <em>p</em>.</p> <p>But before considering the case of our full proof, let’s start with something simpler. Let’s assume that we’ve already established the lemmas:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg1.png' alt='' title='' width='121' height='37'> </div> </p></div> <p>Then we can treat them as axioms, and ask a question like whether they imply the lemma</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg2.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>or, in our current approach, whether they can be used to form a path from <em>a</em> · <em>a</em> to <em>a</em> · (<em>a</em> · (<em>a</em> · <em>a</em>)). </p> <p>Well, it’s not too hard to see that in fact there is such a path. 
Apply our second lemma to <em>a</em> · <em>a</em> to get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg3.png' alt='' title='' width='77' height='14'> </div> </p></div> <p>But now this subterm</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg4.png' alt='' title='' width='88' height='25'> </div> </p></div> <p>matches the left-hand side of the first lemma, so that it can be replaced by the right-hand side of that lemma (i.e. by <em>a</em> · (<em>a</em> · <em>a</em>)), giving in the end the desired <em>a</em> · (<em>a</em> · (<em>a</em> · <em>a</em>)).</p> <p>So now we can summarize this process as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg5.png' alt='' title='' width='252' height='138'> </div> </p></div> <p>In what follows, it’ll be convenient to label lemmas. 
We’ll call our original axiom A1, we’ll call our successive lemmas generated by ordinary substitution S<em>n</em> and the ones generated by bisubstitution B<em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg6.png' alt='' title='' width='399' height='621'> </div> </p></div> <p>In our proof we’ll also use <img style="margin-bottom: -8px" class='' src="https://content.wolfram.com/sites/43/2025/01/rightgreenarrow.png" width='18' height='25' > and <img style="margin-bottom: -8px" class='' src="https://content.wolfram.com/sites/43/2025/01/leftpinkarrow.png" width='18' height='25' > to indicate whether we’re going to use the lemma (say <nobr>X = Y)</nobr> in the “forward direction” X <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > Y or the “reverse direction” X <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/leftarrow.png" width='15' height='11' > Y. And with this labeling, the proof we just gave (which is for the lemma S23) becomes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg7.png' alt='' title='' width='168' height='138'> </div> </p></div> <p>Each step here is a pure substitution, and requires no replacement in the rule (i.e. “axiom”) being used. 
But proofs like this can also be done with bisubstitution, where replacements are applied to the rule to get it in a form where it can directly be applied to transform an expression:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg8.png' alt='' title='' width='529' height='136'> </div> </p></div> <p>OK, so how about the first lemma in our full proof? Here’s a proof that its left-hand side can be transformed to its right-hand side just by judiciously applying the original axiom:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg9.png' alt='' title='' width='504' height='140'> </div> </p></div> <p>Here’s a corresponding proof for the second lemma:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg10.png' alt='' title='' width='531' height='136'> </div> </p></div> <p>Both these involve bisubstitution. 
Here’s a proof of the first lemma derived purely by ordinary substitution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg11.png' alt='' title='' width='611' height='136'> </div> </p></div> <p>This proof is using not only the original axiom but also the lemma B5. Meanwhile, B5 can be proved using the original axiom together with B2:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg12.png' alt='' title='' width='693' height='181'> </div> </p></div> <p>But now, inserting the proof we just gave above for B2, we can give a proof of B5 just in terms of the original axiom:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg13.png' alt='' title='' width='693' height='278'> </div> </p></div> <p>And recursively continuing this unrolling process, we can then prove S1 purely using the original axiom:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg14.png' alt='' title='' width='693' height='329'> </div> </p></div> <p>What about the whole proof? 
Well, at the very end we have:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg15.png' alt='' title='' width='222' height='137'> </div> </p></div> <p>If we “unroll” one step we have</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg16.png' alt='' title='' width='275' height='349'> </div> </p></div> <p>and after 2 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg17.png' alt='' title='' width='435' height='444'> </div> </p></div> <p>In principle we could go on with this unrolling, in effect recursively replacing each rule by the sequence of transformations that represents its proof. Typically this process will, however, generate exponentially longer proof sequences. 
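</p> <p>As a rough illustration of why unrolling blows up, here is a minimal Python sketch (not the Wolfram Language code behind the pictures here). The lemma names and the <tt>PROOFS</tt> table are hypothetical stand-ins rather than lemmas from our actual proof, and the sketch only tracks which rule is applied at each step, ignoring the direction and position of each application:</p>

```python
# Hypothetical toy proof table (NOT the lemmas from the article): each lemma
# maps to the ordered list of rules its proof applies. "A1" is the axiom.
PROOFS = {
    "L1": ["A1", "A1"],
    "L2": ["L1", "A1", "L1"],
    "L3": ["L2", "L1", "L2"],
    "L4": ["L3", "L2", "L3"],
}


def unroll(rule):
    """Expand a rule into the flat sequence of axiom applications behind it."""
    if rule == "A1":
        return ["A1"]
    steps = []
    for used in PROOFS[rule]:
        # Splice in the full unrolled proof of every rule this proof uses.
        steps.extend(unroll(used))
    return steps


for name in PROOFS:
    print(name, "lemma-level steps:", len(PROOFS[name]),
          "unrolled steps:", len(unroll(name)))
```

<p>In this toy setup each lemma's proof has at most 3 lemma-level steps, but the unrolled lengths come out as 2, 5, 12 and 29, more than doubling each time a lemma is used twice in the next proof.</p> <p>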
But say for lemma S5</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg18.png' alt='' title='' width='334' height='14'> </div> </p></div> <p>the result is still very easily manageable:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg19.png' alt='' title='' width='670' height='274'> </div> </p></div> <p>We can summarize this result by in effect plotting the sizes of the intermediate expressions involved—and indicating what part of each expression is replaced at each step (with <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2025/01/01082025redbox.png' alt='' title='' width='15' height='15'> as above indicating “forward” use of the axiom A1 <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > and <img style="margin-bottom: -2px" loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/greenbox.png' alt='' title='' width='15' height='15'> “backward” A1 <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/leftarrow.png" width='15' height='11' >):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg20.png' alt='' title='' width='357' height='160'> </div> </p></div> <p>For lemma B33</p> <div> <div 
class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg21.png' alt='' title='' width='681' height='14'> </div> </p></div> <p>the unrolled proof is now 30 steps long</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg22.png' alt='' title='' width='357' height='160'> </div> </p></div> <p>while for lemma S11</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg23.png' alt='' title='' width='467' height='14'> </div> </p></div> <p>the unrolled proof is 88 steps long:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg24.png' alt='' title='' width='412' height='182'> </div> </p></div> <p>But here there is a new subtlety. 
Doing a direct substitution of the “proof paths” for the lemmas used to prove S11 in our original proof gives a proof of length 104:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg25.png' alt='' title='' width='466' height='177'> </div> </p></div> <p>But this proof turns out to be repetitive, with the whole gray section going from one copy to another of:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg26.png' alt='' title='' width='237' height='14'> </div> </p></div> <p>As an example of a larger proof, we can consider lemma B47:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg27.png' alt='' title='' width='157' height='14'> </div> </p></div> <p>And despite the simplicity of this lemma, our proof for it is 1008 steps long: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg28.png' alt='' title='' width='609' height='204'> </div> </p></div> <p>If we don’t remove repetitive sections, it’s 6805 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg29.png' alt='' title='' width='460' height='157'> </div> </p></div> <p>Can we unroll the whole proof of <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>? We can get closer by considering lemma S36:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg30.png' alt='' title='' width='121' height='14'> </div> </p></div> <p>Its proof is 27105 steps long:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg31A.png' alt='' title='' width='621' height='204'> </div> </p></div> <p>The distribution of expression sizes follows a roughly exponential distribution, with a maximum of 20107:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg32.png' alt='' title='' width='274' height='122'> </div> </p></div> <p>Plotting the expression sizes on a log scale one gets: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg33.png' alt='' title='' width='409' height='142'> 
</div> </p></div> <p>And what stands out most here is a kind of recursive structure—which is the result of long sequences that basically represent the analog of “subroutine calls” that go back and repeatedly prove lemmas that are needed.</p> <p>OK, so what about the whole proof of <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>? Yes, it can be unrolled—in terms of 83,314 applications of the original axiom. The sequence of expression sizes is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg34.png' alt='' title='' width='571' height='187'> </div> </p></div> <p>Or on a log scale:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg35_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg35.png' alt='' title='' width='519' height='175'> </div> </p></div> <p>The distribution of expression sizes now shows clear deviation from being exponential:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg36_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025unrollCLOUDimg36A.png' alt='' title='' width='359' height='160'> </div> </p></div> <p>The maximum is 63245, which occurs just 81 steps after the exact midpoint of the proof. In other words, in the middle, the proof has wandered incredibly far out in metamathematical space (there are altogether <tt><a href="http://reference.wolfram.com/language/ref/CatalanNumber.html">CatalanNumber</a></tt>[63245] ≈ 10<sup>38070</sup> possible expressions of the size it reaches). 
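</p> <p>The size of that Catalan number can be sanity-checked without constructing it. Here is a small Python sketch (not Wolfram Language) that assumes the standard closed form C(n) = (2n)! / ((n+1)! n!) and uses log-gamma to get the base-10 logarithm; the function name <tt>log10_catalan</tt> is just an illustrative choice:</p>

```python
import math


def log10_catalan(n):
    """Return log10 of the n-th Catalan number, C(n) = (2n)! / ((n+1)! n!)."""
    # math.lgamma(k + 1) equals ln(k!), so the huge integer is never built.
    ln_c = math.lgamma(2 * n + 1) - math.lgamma(n + 2) - math.lgamma(n + 1)
    return ln_c / math.log(10)


print(log10_catalan(63245))
```

<p>The printed value should come out just under 38070, consistent with the 10<sup>38070</sup> estimate above.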
</p> <p>The proof returns to small expressions just a few times; here are all the cases in which the size is below 10:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/img1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092024JGCLOUDimg8.png' alt='' title='' width='399' height='274'> </div> </p></div> <p>So, yes, it is possible to completely unroll the proof into a sequence of applications of the original axiom. But if one does this, it inevitably involves repeating lots of work. Being able to use intermediate lemmas in effect lets one “share common subparts” in the proof, so that one ends up with just 104 “rule applications”, rather than 83,314. Not that it’s easy to understand those 104 steps…</p> <h2 id="is-there-a-better-notation">Is There a Better Notation?</h2> <p> Looking at our proof—either in its original “lemma” form, or in its “unrolled” form—the most striking aspect of it is how complicated (and incomprehensible) it seems to be. But one might wonder whether much of that complexity is just the result of not “using the right notation”. In the end, we’ve got a huge number of expressions written in terms of · operations that we can interpret as <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt> (or <tt><a href="http://reference.wolfram.com/language/ref/Nor.html">Nor</a></tt>). And maybe it’s a little like seeing the operation of a microprocessor down at the level of individual gates implementing <tt>Nand</tt>s or <tt>Nor</tt>s. And might there perhaps be an analog of a higher-level representation—with higher-level operations (even like arithmetic) that are more accessible to us humans?</p> <p>It perhaps doesn’t help that <tt>Nand</tt> itself is a rather non-human construct. For example, not a single natural human language seems to have a word for <tt>Nand</tt>. 
But there are combinations of <tt>Nand</tt>s that have more <a href="https://www.wolframscience.com/nks/p807--implications-for-mathematics-and-its-foundations/">familiar interpretations</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025notationimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025notationimg1.png' alt='' title='' width='228' height='127'> </div> </p></div> <p>But what combinations actually occur in our proof? Here are the most common subexpressions that appear in lemmas in the proof:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092925NLCLOUDimg12.png' alt='' title='' width='290' height='316'> </div> </p></div> <p>And, yes, we could give the most common of these special names. But it wouldn’t really help in “compressing” the proof—or making it easier to understand.</p> <p>What about “upgrading” our “laws of inference”, i.e. the way that we can derive new lemmas from old? Perhaps instead of substitution and bisubstitution, which both take two lemmas and produce one more, we could set up more elaborate “tactics” that for example take in more input lemmas. We’ve seen that if we completely unroll the proof, it gets much longer. So perhaps there is a “higher-order” setup that for example dramatically shortens the proof. </p> <p>One way one might identify this is by seeing commonly repeating structures in the subgraphs that lead to lemmas. 
But in fact these subgraphs are quite diverse:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025notationCLOUDAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025notationimg3.png' alt='' title='' width='589' height='130'> </div> </p></div> <h2 id="what-are-the-popular-lemmas">What Are the Popular Lemmas?</h2> <p>A typical feature of human-written mathematical proofs is that they’re “anchored” by famous theorems or lemmas. They may have fiddly technical pieces. But usually there’s a backbone of “theorems people know”. </p> <p>We have the impression that the proof we’re discussing here “spends most of its time wandering around the wilds of metamathematical space”. But perhaps it visits waypoints that are somehow recognizable, or at least should be. Or in other words, perhaps out in the metamathematical space of lemmas there are ones that are somehow sufficiently popular that they’re worth giving names to, and learning—and can then be used as “reference points” in terms of which our proof becomes simpler and more human accessible.</p> <p>It’s a story very much like what happens with human language. There are things out there in the world, but when there’s a certain category of them that are somehow common or important enough, we make a word for them in our language, which we can then use to “compactly” refer to them. (It’s again the same story when it comes to computational language, and in particular the Wolfram Language, except that in that case it’s been my personal responsibility to come up with the appropriate definitions and names for functions to represent “common lumps of computation”.) </p> <p>But, OK, so what are the “popular lemmas” of <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt> proofs? 
One way to explore this is to enumerate statements that are “true about <tt>Nand</tt>”—then to look at proofs of these statements (say found with <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> from our axiom) and see what lemmas show up frequently in them. </p> <p><a href="https://www.wolframscience.com/nks/p818--implications-for-mathematics-and-its-foundations/">Enumerating statements “true about <tt>Nand</tt>”</a>, starting from the smallest, we get</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg1.png' alt='' title='' width='598' height='222'> </div> </p></div> <p>where we have highlighted statements from this list that appear as lemmas in our proof.</p> <p>Proving each of these statements from our original axiom, here are the <a href="https://www.wolframscience.com/nks/notes-12-9--proof-lengths-in-logic/">lengths of proofs we find</a> (for all 1341 distinct theorems with up to <tt><a href="http://reference.wolfram.com/language/ref/LeafCount.html">LeafCount</a></tt> 4 on each side):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg2.png' alt='' title='' width='471' height='204'> </div> </p></div> <p>A histogram shows that it’s basically a bimodal distribution</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg3.png' alt='' title='' 
width='359' height='134'> </div> </p></div> <p>with the smallest “long-proof” theorem being:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg4.png' alt='' title='' width='173' height='14'> </div> </p></div> <p>In aggregate, all these proofs use about 200,000 lemmas. But only about 1200 of these are distinct. And we can plot which lemmas are used in which proofs—and we see that there are indeed many lemmas that are used across wide ranges of proofs, while there are a few others that are “special” to each proof (the diagonal stripe is associated with lemmas close to the statement being proved):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg5.png' alt='' title='' width='428' height='401'> </div> </p></div> <p>If we rank all distinct lemmas from most frequently to least frequently used, we get the following distribution of lemma usage frequencies across all our proofs: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg6.png' alt='' title='' width='355' height='155'> </div> </p></div> <p>It turns out that there is a “common core” of 49 lemmas that are used in every single one of the proofs. So what are these lemmas? 
Here’s a plot of the usage frequency of lemmas against their size—with the “common ones” populating the top line: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg7.png' alt='' title='' width='438' height='192'> </div> </p></div> <p>And at first this might seem surprising. We might have expected that short lemmas would be the most frequent, but instead we’re seeing long lemmas that always appear, the very longest being:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg8.png' alt='' title='' width='516' height='40'> </div> </p></div> <p>So why is this? Basically it’s that these long lemmas are being used at the beginning of every proof. They’re the result of applying bisubstitution to the original axiom, and in some sense they seem to be laying down a kind of net in metamathematical space that then allows more diverse—and smaller—lemmas to be derived. </p> <p>But how are these “common core” popular lemmas distributed within proofs? Here are a few examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDXimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDXimg9.png' alt='' title='' width='566' height='579'> </div> </p></div> <p>And what we see is that while, yes, the common core lemmas are always at the beginning, they don’t seem to have a uniform way of “plugging into” the rest of the proof. 
And it doesn’t, for example, seem as if there’s just some small set of (perhaps simple) “waypoint” lemmas that one can introduce that will typically shorten these proofs.</p> <p>If one effectively allows all the common core lemmas to be used as axioms, then inevitably proofs will be shortened; for example, the proof of <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>—which only ends up using 5 of the common core lemmas—is now shortened to 51 lemmas:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/img3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDZimg10B.png' alt='' title='' width='282' height='479'> </div> </p></div> <p>It doesn’t seem to become easier to understand, though. And if it’s unrolled, it’s still 5013 steps. </p> <p>Still, one can ask what happens if one just introduces particular “recognizable” lemmas as additional axioms. For example, if we include “commutativity” <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em> then we find that, yes, we do manage to <a href="https://www.wolframscience.com/nks/notes-12-9--proof-lengths-in-logic/">reduce the lengths of some proofs</a>, but certainly not all:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDXimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg11.png' alt='' title='' width='554' height='280'> </div> </p></div> <p>Are there any other “pivotal” lemmas we could add? In particular, what about lemmas that can help with the length-200 or more proofs? 
It turns out that all of these proofs involve the lemma: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg12.png' alt='' title='' width='130' height='14'> </div> </p></div> <p>So what happens if we add this? Well, it definitely reduces proof lengths:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg13.png' alt='' title='' width='607' height='316'> </div> </p></div> <p>And sometimes it even seems like it brings proofs into “human range”. For example, a proof of</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025lemmasimg14.png' alt='' title='' width='104' height='14'> </div> </p></div> <p>from our original axiom has length 56. Adding in commutativity reduces it to length 18. And adding our third lemma reduces it to just length 9—and makes it not even depend directly on the original axiom:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDZimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025lemmasCLOUDZimg15.png' alt='' title='' width='272' height='249'> </div> </p></div> <p>But despite the apparent simplicity here, the steps involved—particularly when bisubstitution is used—are remarkably hard to follow. 
(Note the use of <em>a </em>= <em>a</em> as a kind of “implicit axiom”—something that has actually also appeared, without comment, in many of our other proofs.)</p> <h2 id="can-we-get-a-shorter-proof">Can We Get a Shorter Proof?</h2> <p>The proof that we’ve been studying can be seen in some ways as a rather arbitrary artifact. It’s the output of <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt>, with all its specific detailed internal algorithms and choices. In the Appendix, we’ll see that other automated theorem proving systems give very similar results. But we still might wonder whether actually the complexity of the proof as we’ve been studying it is just a consequence of the details of our automated theorem proving—and that in fact there’s a much shorter (and perhaps easier to understand) proof that exists.</p> <p>One approach we could take—reminiscent of higher category theory—is to think about just simplifying the proof we have, effectively using proof-to-proof transformations. And, yes, this is technically difficult, though it doesn’t seem impossible. But what if there are <a href="https://www.wolframscience.com/metamathematics/the-topology-of-proof-space/">“holes” in proof space</a>? Then a “continuous deformation” of one proof into another will get stuck, and even if there is a much shorter proof, we’re liable to get “topologically stuck” before we find it.</p> <p>One way to be sure we’re getting the shortest proof of a particular lemma is to explicitly find the first place that lemma appears in the (future) entailment cone of our original axiom. 
For example, as we saw above, a single substitution event leads to the entailment cone:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg1.png' alt='' title='' width='645' height='285'> </div> </p></div> <p>Every lemma produced here is, by construction, in principle derivable by a proof involving a single substitution event. But if we actually use <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> to prove these lemmas, the proofs we get mostly involve 2 events (and in one case 4):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025shorterAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025shorterAimg2.png' alt='' title='' width='645' height='75'> </div> </p></div> <p>If we take another step in the entailment cone, we get a total of 5062 lemmas. From the way we generated them, we know that all these lemmas can in principle be reached by proofs of length 2. But if we run <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> on them, we find a distribution of proof lengths:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg3.png' alt='' title='' width='314' height='141'> </div> </p></div> <p>And, yes, there is one lemma (with <tt><a href="http://reference.wolfram.com/language/ref/LeafCount.html">LeafCount</a></tt> 183) that is found only by a proof of length 15. 
But most often the proof length is 4—or about double what it could be. </p> <p>If we generate the entailment cone for lemmas using bisubstitution rather than just ordinary substitution, there are slightly more cases where <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> does worse at getting minimal proofs. </p> <p>For example, the lemma</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg4.png' alt='' title='' width='671' height='14'> </div> </p></div> <p>and 3 others can be generated by a single bisubstitution from the original axiom, but <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> gives only proofs of length 4 for all of these.</p> <p>What about unrolled proofs, in which one can generate an entailment cone by starting from a particular expression, and then applying the original axiom in all possible ways? 
For example, let’s say we start with:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg5.png' alt='' title='' width='77' height='14'> </div> </p></div> <p>Then applying bisubstitution with the original axiom once in all possible ways gives:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg6.png' alt='' title='' width='601' height='186'> </div> </p></div> <p>Applying bisubstitution a second time gives a larger entailment cone: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025shorterBimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025shorterCimg7.png' alt='' title='' width='451' height='406'> </div> </p></div> <p>But now it turns out that—as indicated—one of the expressions in this cone is: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg8.png' alt='' title='' width='262' height='14'> </div> </p></div> <p>So this shows that the lemma</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg9.png' alt='' title='' width='359' height='14'> </div> 
</p></div> <p>can in principle be reached with just two steps of “unrolled” proof:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025NMCLOUDimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg10.png' alt='' title='' width='441' height='41'> </div> </p></div> <p>And in this particular case, if we use <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> and then unroll the resulting proof we also get a proof of length 3—but it goes through a different intermediate expression:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025shorterimg11.png' alt='' title='' width='542' height='41'> </div> </p></div> <p>As it happens, this intermediate expression is also reached in the entailment cone that we get by starting from our “output” expression and then applying two bisubstitutions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025LVCLoudimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025shorterEimg12-min.png' alt='' title='' width='528' height='417'> </div> </p></div> <h2 id="what-actually-is-the--models-and-the-proof">What Actually Is the “·”? Models and the Proof</h2> <p>We can think of logic (or Boolean algebra) as being associated with a certain collection of theorems. And what our axiom does is to provide something from which all theorems of logic (and nothing but theorems of logic) can be derived. At some level, we can think of it as just being about symbolic expressions. 
But in our effort to understand what’s going on—say with our proof—it’s sometimes useful to ask how we can “concretely” interpret these expressions.</p> <p>For example, we might ask what the · operator actually is. And what kinds of things can our symbolic variables be? In effect we’re asking for what in model theory are called <a href="https://www.wolframscience.com/metamathematics/the-model-theoretic-perspective/">“models” of our axiom system</a>. And in aligning with logic the most obvious model to discuss is one in which variables can be <tt><a href="http://reference.wolfram.com/language/ref/True.html">True</a></tt> or <tt><a href="http://reference.wolfram.com/language/ref/False.html">False</a></tt>, and the · represents either the logical operator <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt> or the logical operator <tt><a href="http://reference.wolfram.com/language/ref/Nor.html">Nor</a></tt>.</p> <p>The truth table, say for <tt>Nand</tt>, is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg1.png' alt='' title='' width='156' height='152'> </div> </p></div> <p>And as expected, with this model for ·, we can confirm that our original axiom holds:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg2.png' alt='' title='' width='379' height='274'> </div> </p></div> <p>In general, though, our original axiom allows two size-2 models (that we can interpret as <tt>Nand</tt> and <tt>Nor</tt>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg4.png' alt='' title='' width='98' height='52'> </div> </p></div> <p>It allows no size-3 models, and in fact in general <a href="https://www.wolframscience.com/nks/notes-12-9--operators-on-sets/">allows only models of size 2<sup><em>n</em></sup></a>; for example, for size 4 its models are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg6.png' alt='' title='' width='551' height='210'> </div> </p></div> <p>So what about <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>? What models does it allow? For size 2, it’s all 8 possible models with symmetric “multiplication tables”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01082025actuallyimg7.png' alt='' title='' width='467' height='52'> </div> </p></div> <p>But the crucial point is that the 2 models for our original axiom system are part of these. In other words, at least for size-2 models, satisfying the original axiom system implies satisfying <nobr><em>a</em> · <em>b</em> = <em>b</em> · <em>a</em>.</nobr></p> <p>And indeed any lemma derived from our axiom system must allow the models associated with our original axiom system. But it may also allow more—and sometimes many more. 
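</p> <p>As a rough cross-check of the size-2 and size-3 counts quoted above, here is a minimal brute-force sketch in Python (not the Wolfram Language code behind the images). It assumes the original axiom has the form ((<em>a</em> · <em>b</em>) · <em>c</em>) · (<em>a</em> · ((<em>a</em> · <em>c</em>) · <em>a</em>)) = <em>c</em>; the helper names <tt>satisfies</tt>, <tt>all_tables</tt> and <tt>models</tt> are hypothetical, introduced only for illustration:</p>

```python
from itertools import product


def satisfies(table, n, law, arity):
    """Return True if `law` holds for every assignment of n values to its variables."""
    op = lambda x, y: table[x][y]
    return all(law(op, *vals) for vals in product(range(n), repeat=arity))


def wolfram_axiom(op, a, b, c):
    # Assumed form of the original axiom: ((a.b).c).(a.((a.c).a)) == c
    return op(op(op(a, b), c), op(a, op(op(a, c), a))) == c


def commutativity(op, a, b):
    # The lemma a.b == b.a
    return op(a, b) == op(b, a)


def all_tables(n):
    """Yield every n-by-n "multiplication table" with entries in range(n)."""
    for flat in product(range(n), repeat=n * n):
        yield tuple(flat[i * n:(i + 1) * n] for i in range(n))


def models(n, law, arity):
    """List all size-n tables that satisfy `law`."""
    return [t for t in all_tables(n) if satisfies(t, n, law, arity)]


# Truth tables indexed as table[x][y], with 1 = True and 0 = False
NAND = ((1, 1), (1, 0))
NOR = ((1, 0), (0, 0))

print(len(models(2, wolfram_axiom, 3)))   # 2 (Nand and Nor)
print(len(models(2, commutativity, 2)))   # 8 of the 16 possible tables
print(len(models(3, wolfram_axiom, 3)))   # 0
```

<p>Under these assumptions the two size-2 models of the axiom both appear among the 8 symmetric tables, consistent with the discussion above. Enumerating every table is only practical for very small sizes: there are already 4<sup>16</sup> (about 4.3 billion) candidate tables of size 4, so anything larger needs a smarter search.</p> <p>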
So here’s a map of our proof, showing how many models (out of 16 possible) each lemma allows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg5.png' alt='' title='' width='329' height='674'> </div> </p></div> <p>Here are the results for size-3 models:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01092025KDCLOUDimg6.png' alt='' title='' width='478' height='972'> </div> </p></div> <p>And, once again, these look complicated. We can think of models as defining—in some sense—<a href="https://www.wolframscience.com/nks/p804--implications-for-mathematics-and-its-foundations/">what lemmas are “about”</a>. So, for example, our original axiom is “about” <tt>Nand</tt> and <tt>Nor</tt>. The lemma <em>a</em> · <em>b</em> = <em>b</em> · <em>a</em> is “about” symmetric functions. And so on. And we might have hoped that we could gain some understanding of our proof by looking at how different lemmas that occur in it “sculpt” what is being talked about. But in fact we just seem to end up with complicated descriptions of sets that don’t seem to have any obvious relationship with each other.</p> <h2 id="what-about-a-higher-level-abstraction">What about a Higher-Level Abstraction?</h2> <p>If there’s one thing that stands out about our proof—and the analysis we’ve given of it here—it’s how fiddly and “in the weeds” it seems to be. But is that because we’re missing some big picture? Is there actually a more abstract way of discussing things, that gets to our result without having to go through all the details? 
</p> <p>In the history of mathematics many of the most important themes have been precisely about finding such higher-level abstractions. We could start from the <a href="https://www.wolframscience.com/nks/notes-12-9--groups-and-axioms/">explicit symbolic axioms</a></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg1.png' alt='' title='' width='120' height='59'> </div> </p></div> <p>or even</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg2.png' alt='' title='' width='165' height='22'> </div> </p></div> <p>and start building up theorems much as we’ve done here. Or we could recognize that these are axioms for group theory, and then start using the abstract ideas of group theory to derive our theorems.</p> <p>So is there some higher-level version of what we’re discussing here? Remember that the issue is not about the overall structure of Boolean algebra; rather it’s about the more metamathematical question of how one can prove that all of Boolean algebra can be generated from the axiom:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025abstractionimg3.png' alt='' title='' width='183' height='14'> </div> </p></div> <p>In the last few sections we’ve tried a few semi-empirical approaches to finding higher-level representations. But they haven’t gotten very far. 
And to get further we’re probably going to need a serious new idea.</p> <p>And, if history is a guide, we’re going to need to come up with an abstraction that somehow “goes outside of the system” before “coming back”. It’s like trying to figure out the real roots of a cubic equation, and realizing that the best way to do this is to introduce complex numbers, even though the imaginary parts will cancel at the end. </p> <p>In the direct exploration of our proof, it feels as if the intermediate lemmas we generate “wander off into the wilds of metamathematical space” before coming back to establish our final result. And if we were using a higher-level abstraction, we’d instead be “wandering off” into the space of that abstraction. But what we might hope is that—at least with the concepts we would use in discussing that abstraction—the path that would be involved would be “short enough to be accessible to human understanding”.</p> <p>Will we be able to find such an abstraction? It’s a subtle question. Because in effect it asks whether we can reduce the computational effort needed for the proof—or, in other words, whether we can find a pocket of computational reducibility in what in general will be a computationally irreducible process. But it’s not a question that can really be answered just for our specific proof on its own. After all, our “abstraction” could in principle just involve introducing a primitive that represents our whole proof or a large part of it. But to make it what we can think of as a real abstraction we need something that spans many different specific examples—and, in our case, likely many axiomatic systems or symbolic proofs.</p> <p>So is such an abstraction possible? In the history of mathematics the experience has been that after enough time (often measured in centuries) has passed, abstractions tend to be found. But at some level this has been self-fulfilling. 
Because the areas that are considered to have remained “interesting for mathematics” tend to be just those where general abstractions have in fact been found. </p> <p>In <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/">ruliology</a>, though, the typical experience has been different. Because there it’s been routine to <a href="https://www.wolframscience.com/nks/">sample the computational universe of possible simple programs</a> and encounter computational irreducibility. In the end it’s still inevitable that among the computational irreducibility there must be pockets of computational reducibility. But the issue is that these pockets of computational reducibility may not involve features of our system that we care about. </p> <p>So is a proof of the kind we’re discussing here more like ruliology, or more like “typical mathematics”? Insofar as it’s a mathematical-style proof of a mathematical statement it feels more like typical mathematics. But insofar as it’s something found by the computational process of automated theorem proving it perhaps seems more like ruliology. </p> <p>But what might a higher-level abstraction for it look like? Figuring that out is probably tantamount to finding the abstraction. But perhaps one can at least expect that in some ways it will be metamathematical, and more about the structure and character of proofs than about their content. Perhaps it will be something related to the framework of higher category theory, or some form of meta-algebra. 
But as of now, we really don’t know—and we can’t even say that such an abstraction with any degree of generality is possible.</p> <h2 id="llms-to-the-rescue">LLMs to the Rescue?</h2> <p>The unexpected success of LLMs in language generation and related tasks has led to the idea that perhaps eventually <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">systems like LLMs will be able to “do everything”</a>—including for example math. We already know—not least thanks to Wolfram Language—that <a href="https://www.wolfram.com/mathematica/">lots of math can be done computationally</a>. But often the computations are hard—and, as in the example of the proof we’re discussing here, incomprehensible to humans. So the question really is: can LLMs “humanize” what has to be done in math, turning everything into a human-accessible narrative? And here our proof seems like an excellent—if challenging—test case. </p> <p>But what happens if we just ask a current LLM to generate the proof from scratch? It’s not a good picture. Very often the LLM will eagerly generate a proof, but it’ll be completely wrong, often with the same kind of mistakes that a student somewhat out of their depth might make. Here’s a typical response where an LLM simply assumes that the · operator is associative (which it isn’t in Boolean algebra) then produces a proof that at first blush looks at least vaguely plausible, but is in fact completely wrong:</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01072025rescueimg1.png' alt='Inadequate LLM proof' title='Inadequate LLM proof' width='611' height='489'/></p> <p>Coming up with an explanation for what went wrong is basically an exercise in “LLM psychology”. But in a first approximation one might say the following. LLMs are trained to “fill in what’s typical”, where “typical” is defined by what appears in the training set. 
But (absent some recent Wolfram Language and <a href="https://www.wolframalpha.com/">Wolfram|Alpha</a> based technology of ours) what’s been available as a training set has been human-generated mathematical texts, where, yes, operators are often associative, and typical proofs are fairly short. And in the “psychology of LLMs” an LLM is much more likely to “do what’s typical” than to “rigorously follow the rules”. </p> <p>If you press the LLM harder, then it might just “abdicate”, and suggest using the <a href="https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/">Wolfram Language as a tool</a> to generate the proof. So what happens if we do that, then feed the finished proof to the LLM and ask it to explain? Well, typically it just does what LLMs do so well, and writes an essay:</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01072025rescueimg2.png' alt='LLM proof essay' title='LLM proof essay' width='614' height='516'/></p> <p>So, yes, it does fine in “generally framing the problem”. But not on the details. And if you press it for details, it’ll typically eventually just start parroting what it was given as input. </p> <p>How else might we try to get the LLM to help? One thing I’ve certainly wondered is how the lemmas in the proof relate to known theorems—perhaps in quite different areas of mathematics. It’s something one might imagine one would be able to answer by searching the literature of mathematics. But, for example, textual search won’t be sufficient: it has to be some form of <a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#vector-databases-and-semantic-search">semantic search</a> based on the meaning or symbolic structure of lemmas, not their (fairly arbitrary) textual presentation. 
A vector database might be all one needs, but one can certainly ask an LLM too:</p> <p><img src='https://content.wolfram.com/sites/43/2025/01/sw01072025rescueimg3.png' alt='LLM semantic search results' title='LLM semantic search results' width='619' height='485'/></p> <p>It’s not extremely helpful, though, charmingly, it correctly identifies the source of our original axiom. I’ve tried similar queries for our whole set of lemmas across a variety of LLMs, with a variety of RAG systems. Often the LLM will talk about an interpretation for some lemma—but the lemma isn’t actually present in our proof. But occasionally the LLM will mention possible connections (“band theory”; “left self-distributive operations in quandles”; “Moufang loops”)—though so far none have seemed to quite hit the mark.</p> <p>And perhaps this failure is itself actually a result—telling us that the lemmas that show up in our proof really are, in effect, out in the wilds of metamathematical space, probing places that haven’t ever been seriously visited before by human mathematics.</p> <p>But beyond LLMs, what about more general machine learning and neural net approaches? Could we imagine <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#science-as-narrative">using a neural net as a probe to find “exploitable regularities”</a> in our proof? It’s certainly possible, but I suspect that the systematic algorithmic methods we’ve already discussed for finding optimal notations, popular lemmas, etc. will tend to do better. I suppose it would be one thing if our systematic methods had failed to even find a proof. Then we might have wanted something like neural nets to try to guess the right paths to follow, etc. But as it is, our systematic methods rather efficiently do manage to successfully find a proof. </p> <p>Of course, there’s still the issue that we’re discussing here that the proof is very “non-human”. 
And perhaps we could imagine that neural nets, etc.—especially when trained on existing human knowledge—could be used to “form concepts” that would help us humans to understand the proof. </p> <p>We can get at least a rough analogy for how this might work by looking at <a href="https://writings.stephenwolfram.com/2023/07/generative-ai-space-and-the-mental-imagery-of-alien-minds/">visual images produced by a generative AI system</a> trained from billions of human-selected images. There’s a concept (like “a cube”) that exists somewhere in the feature space of possible images. But “around” that concept are other things—<a href="https://writings.stephenwolfram.com/2023/07/generative-ai-space-and-the-mental-imagery-of-alien-minds/#the-notion-of-interconcept-space">“out in interconcept space”</a>—that we don’t (at least yet) explicitly have words for:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/interconceptimg1_copy.txt' data-c2c-type='text/html'> <img src='https://content.wolfram.com/sites/43/2025/01/sw01072025rescueimg4.png' alt='Interconcept space' title='Interconcept space' width='494' height='494'/></div> </p></div> <p>And it’ll presumably be similar for math, though harder to represent in something like a visual way. There’ll be existing math concepts. But these will be embedded in a vast domain of “mathematical interconcept space” that we humans haven’t yet “colonized”. And what we can imagine is that—perhaps with the help of neural nets, etc.—we can identify a limited number of “points in interconcept space” that we can introduce as new concepts that will, for example, provide useful “waypoints” in understanding our proof.</p> <h2 id="but-why-is-the-theorem-true">But Why Is the Theorem True?</h2> <p>It’s a common human urge to think that anything that’s true must be true for a reason. But what about our theorem? Why is it true? Well, we’ve seen a proof. 
But somehow that doesn’t seem satisfactory. We want “an explanation we can understand”. But we know that in general we can’t always expect to get one.</p> <p>It’s a fundamental implication of computational irreducibility that things can happen where the only way to “see how they happen” is just to “watch them happen”; there’s no way to “compress the explanation”.</p> <p>Consider the following patterns. They’re all generated by cellular automata. And all <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">live exactly 100 steps before dying out</a>. But why?</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01072025theoremimg1_copy-1.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025theoremimg1-1.png' alt='' title='' width='597' height='509'> </div> </p></div> <p>In a few cases it seems like we can perhaps at least begin to imagine “narratively describing” a mechanism. But most of the time all we can say is basically that they “live 100 steps because they do”. </p> <p>It’s a quintessential consequence of computational irreducibility. It might not be what we’d expect, or hope for. But it’s reality in the computational universe. And it seems very likely that our theorem—and its proof—is like this too. The theorem in effect “just happens to be true”—and if you run the steps in the proof (or find the appropriate path in the entailment cone) you’ll find that it is. But there’s no “narrative explanation”. No “understanding of why it’s true”. </p> <h2 id="intuition-and-automated-theorem-proving">Intuition and Automated Theorem Proving</h2> <p>We’ve been talking a lot about the proof of our theorem. But where did the theorem to prove come from in the first place? 
Its immediate origin was an <a href="https://www.wolframscience.com/nks/notes-12-9--searching-for-logic-axioms/">exhaustive search I did of simple axiom systems</a>, filtering for ones that could conceivably generate Boolean algebra, followed by testing each of the candidates using automated theorem proving. </p> <p>But how did I even get the idea of searching for a simple axiom system for Boolean algebra? Based on the axiom systems for Boolean algebra known before—and the historical difficulty of finding them—one might have concluded that it was quite hopeless to find an axiom system for Boolean algebra by exhaustive search. But by 2000 I had nearly two decades of experience in exploring the computational universe—and I was well used to the <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment/">remarkable phenomenon</a> that even very simple computational rules can lead to behavior of great complexity. So the result was that when I came to think about axiom systems and the foundations of mathematics my intuition led me to imagine that perhaps the simplest axiom system for something like Boolean algebra might be simple enough to exhaustively search for.</p> <p>And indeed discovering the axiom system we’ve discussed here helped further expand and deepen my intuition about the consequences of simple rules. But what about the proof? What intuition might one get from the proof as we now know it, and as we’ve discussed here?</p> <p>There’s much intuition to be got from observing the world as it is. But for nearly half a century I’ve had another crucial source of intuition: observing the computational universe—and doing computational experiments. I was recently reflecting on how I came to start developing intuition in this way. 
And what it might mean for intuition I could now develop from things like automated theorem proving and AI.</p> <p>Back in the mid-1970s <a href="https://www.stephenwolfram.com/publications/academic/particle-physics">my efforts in particle physics</a> led me to start using computers to do not just numerical, but <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">also algebraic computations</a>. In numerical computations it was usual to just get a few numbers out, that perhaps one could plot to make a curve. But in algebraic computations one instead got out formulas—and <a href="https://content.wolfram.com/sw-publications/2020/07/effective-coupling-qcd.pdf" target="_blank" rel="noopener">often very ornate ones full of structure and detail</a>. And for me it was routine to get not just one formula, but many. And looking at these formulas I started to develop intuition about them. What functions would they involve? What algebraic form would they take? What kind of numbers would they involve? </p> <p>I don’t think I ever consciously realized that I was developing a new kind of computationally based intuition. But I soon began to take it for granted. And when—at the beginning of the 1980s—<a href="https://www.wolframscience.com/nks/chap-1--the-foundations-for-a-new-kind-of-science#sect-1-4--the-personal-story-of-the-science-in-this-book">I started to explore the consequences of simple abstract systems</a> like cellular automata it was natural to expect that I would get intuition from just “seeing” how they behaved. And here there was also another important element. Because part of the reason I concentrated on cellular automata was precisely because one could readily visualize their behavior on a computer. </p> <p>I don’t think I would have learned much if I’d just been printing out “numerical summaries” of what cellular automata do. But as it was, I was seeing their behavior in full detail. 
And—surprising though what I saw was—I was soon able to start getting an intuition for what could happen. It wasn’t a matter of knowing what the value of every cell would be. But I started doing things like identifying four general classes of cellular automata, and then recognizing the phenomenon of computational irreducibility. </p> <p>By the 1990s I was much more broadly exploring the computational universe—always trying to see what could happen there. And in almost all cases it was a story of defining simple rules, then running them, and making an explicit step-by-step visualization of what they do—and thereby in effect “seeing computation in action”.</p> <p>In recent years—spurred by our <a href="https://www.wolframphysics.org" target="_blank" rel="noopener">Physics Project</a>—I’ve increasingly explored not just computational processes, but also <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/">multicomputational ones</a>. And although it’s more difficult I’ve made every effort to visualize the behavior of multiway systems—and to get intuition about what they do. </p> <p>But what about automated theorem proving? In effect, automated theorem proving is about finding a particular path in a multiway system that leads to a theorem we want. We’re not getting to see “complete behavior”; we’re in effect just seeing one particular “solution” for how to prove a theorem. </p> <p>And after one’s seen many examples, the challenge once again is to develop intuition. And that’s a large part of what I’ve been trying to do here. It’s crucial, I think, to have some way to visualize what’s happening—in effect because visual input is the most efficient way to get information into our brains. 
And while the visualizations we’ve developed here aren’t as direct and complete as, say, for cellular automaton evolution, I think they begin to give some overall sense of our proof—and other proofs like it.</p> <p>In studying simple programs like cellular automata, the intuition I developed led me to things like my <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-2--four-classes-of-behavior">classification of cellular automaton behavior</a>, as well as to bigger ideas like the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> and computational irreducibility. So having now exposed myself to automated theorem proving as I exposed myself to algebraic computation and the running of simple rules in the past, what general principles might I begin to see? And might they, for example, somehow make the fact that our proof works ultimately seem “obvious”?</p> <p>In some ways yes, but in other ways no. Much as with simple programs, there are axiom systems so simple that, for example, the <a href="https://www.wolframscience.com/metamathematics/axiom-systems-in-the-wild/">multiway systems they generate are highly regular</a>. But beyond a low threshold, it’s common to get very complicated—and in many ways seemingly random—multiway system structures. Typically an infinite number of lemmas are generated, with little or no obvious regularity in their forms.</p> <p>And one can expect that—following the ideas of universal computation—it’ll typically be possible to encode in any one such multiway system the behavior of any other multiway system. In terms of axioms what one’s saying is that if one sets up the right translation between theorems, one will be able to use any one such axiom system to generate the theorems of any other. 
But the issue is that the translation will often make major changes to the structure of the theorems, and in effect define not just a “mathematical translation” (like between geometry and algebra) but a <a href="https://www.wolframscience.com/metamathematics/uniformity-and-motion-in-metamathematical-space/#p-146">metamathematical one (as one would need to get from Peano arithmetic to set theory)</a>. </p> <p>And what this means is that it isn’t surprising that even a very simple axiom system can generate a complicated set of possible lemmas. But knowing this doesn’t immediately tell one whether those lemmas will align with some particular existing theory—like Boolean algebra. And in a sense that’s a much more detailed question.</p> <p>At some metamathematical level it might not be a natural question. But at a “mathematical level” it is. And it’s what we have to address in connection with the theorem—and proof—we’re discussing here. Many aspects of the overall form and properties of the proof will be quite generic, and won’t depend on the particulars of the axiom system we’re using. But some will. And quite what intuition we may be able to get about these isn’t clear. And perhaps it’ll necessarily be fragmented and specific—in effect responding to the presence of computational irreducibility.</p> <p>It’s perhaps worth commenting that LLMs—and machine learning in general—represent another potential source of intuition. That intuition may well be more about the general features of us as observers and thinkers. But such intuition is potentially critical in framing just what we can experience, not only in the natural world, but also in the mathematical and metamathematical worlds. 
And perhaps the apparent impotence of LLMs when faced with the proof we’ve been discussing already tells us something significant about the nature of “mathematical observers” like us.</p> <h2 id="so-what-does-it-mean-for-the-future-of-mathematics">So What Does It Mean for the Future of Mathematics?</h2> <p>Let’s say we never manage to “humanize” the proof we’ve been discussing here. Then in effect we’ll end up with a “black-box theorem”—that we can be sure is true—but we’ll never know quite how or why. So what would that mean for mathematics?</p> <p>Traditionally, mathematics has tended to operate in a “white box” kind of way, trying to build narrative and understanding along with “facts”. And in this respect it’s very different from natural science. Because in natural science much of our knowledge has traditionally been empirical—derived from observing the world or experimenting on it—and without any certainty that we can “understand its origins”. </p> <p>Automated theorem proving of the kind we’re discussing here—or, for that matter, pretty much any exploratory computational experimentation—aligns mathematics much more with natural science, deriving what’s true without an expectation of having a narrative explanation of why. </p> <p>Could one imagine practicing mathematics that way? One’s already to some extent following such a path as soon as one introduces axiom systems to base one’s mathematics on. Where do the axiom systems come from? In <a href="https://writings.stephenwolfram.com/2020/09/the-empirical-metamathematics-of-euclid-and-beyond/">the time of Euclid</a> perhaps they were thought of as an idealization of nature. But in more modern times they are realistically much more the result of human choice and human aesthetics.</p> <p>So let’s say we determine (given a particular axiom system) that some black-box theorem is true. Well, then we can just add it, just as we could another axiom. 
Maybe one day it’ll be possible to prove <a href="https://www.wolframscience.com/nks/p765--undecidability-and-intractability/">P≠NP</a> or the <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/#classic-unsolved">Riemann Hypothesis</a> from existing axioms of mathematics (if they don’t in fact turn out to be independent). And—black box or not—we can expect to add them to what we assume in subsequent mathematics we do, much as they’re routinely added right now, even though their status isn’t yet known. </p> <p>But it’s one thing to add one or two “black-box theorems”. But what happens when black-box theorems—that we can think of as “experimentally determined”—start to dominate the landscape of mathematics? </p> <p>Well, then mathematics will take on much more of the character of ruliology—or of an experimental science. When it comes to the applications of mathematics, this probably won’t make much difference, except that in effect mathematics will be able to become much more powerful. But the “inner experience” of mathematics will be quite different—and much less “human”.</p> <p>If one indeed starts from axioms, it’s not at the outset obvious why everything in mathematics should not be mired in the kind of alien-seeming metamathematical complexity that we’ve encountered in the discussion of our proof here. But <a href="https://www.wolframscience.com/metamathematics/mathematics-and-physics-have-the-same-foundations/">what I’ve argued elsewhere</a> is that the fact that in our experience of doing mathematics it’s not is a reflection of how “mathematical observers like us” sample the raw metamathematical structure generated by axioms (or ultimately by the <a href="https://www.wolframscience.com/metamathematics/going-below-axiomatic-mathematics/">subaxiomatic structure of the ruliad</a>). 
</p> <p>The physics analogy I’ve used is that we succeed in doing mathematics at a “fluid dynamics level”, far above the detailed “molecular dynamics level” of things like the proof we’ve discussed here. Yes, we can ask questions—like ones about the structure of our proof—that probe the axiomatic “molecular dynamics level”. But it’s an important fact that in doing what we normally think of as mathematics we almost never have to; there’s a coherent way to operate purely at the “fluid dynamics level”.</p> <p>Is it useful to “dip down” to the molecular dynamics? Definitely yes, because that’s where we can readily do computations—like those in our proof, or in general those going on in the internals of the Wolfram Language. But a key idea in the design of the Wolfram Language is to provide a computational language that can express concepts at a humanized “fluid dynamics” level—in effect bridging between the way humans can think and understand things, and the way raw computation can be done with them.</p> <p>And it’s notable that while we’ve had great success over the years in defining “human-accessible” high-level representations for what amount to the “inputs” and “outputs” of computations, that’s been much less true of the “ongoing processes” of computation—or, for example, of the innards of proofs. </p> <p>Is there a good “human-level” way to represent proofs? If the proofs are short, it’s not too difficult (and the <a href="https://www.wolframalpha.com/pro/step-by-step-math-solver">step-by-step solutions technology of Wolfram|Alpha</a> provides a good large-scale example of what can be done). But—as we’ve discussed—computational irreducibility implies that some proofs will inevitably be long. </p> <p>If they’re not too long, then at least some parts of them might be constructed by human effort, say in a system like a proof assistant. 
But as soon as there’s much automation (whether with automated theorem proving or with LLMs) it’s basically inevitable that one will end up with things that at least approach what we’ve seen with the proof we’re discussing here. </p> <p>What can then be done? Well, that’s the challenge. Maybe there is some way to simplify, abstract or otherwise “humanize” the proof we’ve been discussing. But I rather doubt it. I think this is likely one of those cases where we inevitably find ourselves face to face with computational irreducibility. </p> <p>And, yes, there’s important science (particularly ruliology) to do on the structures we see. But it’s not mathematics as it’s traditionally been practiced. But that’s not to say that the results that come out of things like our proof won’t be useful for mathematics. They will be. But they make mathematics more like an experimental science—where what matters most is in effect the input and output rather than a “publishable” or human-readable derivation in between. And where the key issue in making progress is less in the innards of derivations than in defining clear computational ways to express input and output. Or, in effect, in capturing “human-level mathematics” in the primitives and structure of <a href="https://writings.stephenwolfram.com/2019/05/what-weve-built-is-a-computational-language-and-thats-very-important/">computational language</a>. </p> <h2 id="appendix-what-about-a-different-theorem-proving-system">Appendix: What about a Different Theorem Proving System?</h2> <p>The proof we’ve been discussing here was created using <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> in the Wolfram Language. But what if we were to use a different automated theorem proving system? How different would the results be? In the spectrum of things that automated theorem proving systems do, our proof here is on the difficult end. 
And many existing automated theorem proving systems don’t manage to do it at all. But some of the stronger ones do. And in the end—despite their different internal algorithms and heuristics—it’s remarkable how similar the results they give are to those from the Wolfram Language <tt>FindEquationalProof</tt> (differences in the way lemmas vs. inference steps, etc. are identified make detailed quantitative comparisons difficult):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2025/01/sw01082025appendixCLOUDZimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2025/01/sw01072025appendiximg1.png' alt='' title='' width='650' height='433'> </div> </p></div> <h2 id='thanks' style='font-size:1.2rem'>Thanks</h2> <p style='font-size:90%'>Thanks to Nik Murzin of the <a href="https://wolframinstitute.org/" target="_blank" rel="noopener">Wolfram Institute</a> for his extensive help as part of the Wolfram Institute Empirical Metamathematics Project. Thanks also to Roger Germundsson, Sergio Sandoval, Adam Strzebonski, Michael Trott, Liubov Tupikina, James Wiles and Carlos Zapata for input. Thanks to Arnim Buch and Thomas Hillenbrand for their work in the 1990s on Waldmeister, which is now part of <tt><a href="http://reference.wolfram.com/language/ref/FindEquationalProof.html">FindEquationalProof</a></tt> (also to Jonathan Gorard for his 2017 work on the interface for <tt>FindEquationalProof</tt>). I was first seriously introduced to automated theorem proving in the late 1980s by Dana Scott, and have interacted with many people about it over the years, including Richard Assar, Bruno Buchberger, David Hillman, Norm Megill, Todd Rowland and Matthew Szudzik. 
(I’ve also interacted with many people about proof assistant, proof presentation and proof verification systems, both recently and in the past.)</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2025/01/who-can-understand-the-proof-a-window-on-formalized-mathematics/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item> <title>Useful to the Point of Being Revolutionary: Introducing Wolfram Notebook Assistant</title> <link>https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/</link> <comments>https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/#respond</comments> <pubDate>Mon, 09 Dec 2024 18:38:15 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Artificial Intelligence]]></category> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Computational Thinking]]></category> <category><![CDATA[Education]]></category> <category><![CDATA[New Technology]]></category> <category><![CDATA[Software Design]]></category> <category><![CDATA[Wolfram Language]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=63574</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/12/nba-icon-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>Note: As of today, copies of Wolfram Version 14.1 are being auto-updated to allow subscription access to the capabilities described here. [For additional installation information see here.] Just Say What You Want! 
Turning Words into Computation Nearly a year and a half ago—just a few months after ChatGPT burst on the scene—we introduced the first […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/12/nba-icon-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><style> /****************************/ /* special scrolly lightbox */ /****************************/ .mfp-figure figure { overflow:auto; max-height:calc(100vh - 50px); display: block; } img.mfp-img { width: 100% !important; max-width: 670px !important; max-height: unset !important; padding: 10px !important; background: white; } .mfp-content button.mfp-close { top: -10px !important; } .mfp-container .mfp-content { max-width: 685px !important; } </style> <p><img class="aligncenter" title="Useful to the Point of Being Revolutionary: Introducing Wolfram Notebook Assistant" src="https://content.wolfram.com/sites/43/2024/12/nba-hero-min.png" alt="Useful to the Point of Being Revolutionary: Introducing Wolfram Notebook Assistant" width="620" height="439" /></p> <p style="font-size:14px;background:#e5f2f85c;padding:5px 15px;border:1px solid #cfdde3c7;max-width:620px;margin:25px 0px;"><em>Note: As of today, copies of Wolfram Version 14.1 are being auto-updated to allow subscription access to the capabilities described here. [For additional installation information see <a href="https://support.wolfram.com/67504">here</a>.]</em></p> <h2 id="just-say-what-you-want-turning-words-into-computation">Just Say What You Want! 
Turning Words into Computation</h2> <p>Nearly a year and a half ago—just a few months after <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">ChatGPT burst on the scene</a>—we introduced the <a href="https://writings.stephenwolfram.com/2023/06/introducing-chat-notebooks-integrating-llms-into-the-notebook-paradigm/">first version of our Chat Notebook technology</a> to integrate LLM-based chat into <a href="https://www.wolfram.com/notebooks/">Wolfram Notebooks</a>. For the past year and a half we’ve been building on those foundations. And today I’m excited to be able to announce that we’re releasing the fruits of those efforts: the first version of our <a href="https://www.wolfram.com/notebook-assistant-llm-kit/">Wolfram Notebook Assistant</a>.</p> <p>There are all sorts of gimmicky AI assistants out there. But Notebook Assistant isn’t one of them. It’s a serious, deep piece of new technology, and what’s more important, it’s really, really useful! In fact, I think it’s so useful as to be revolutionary. Personally, I thought I was a pretty efficient user of <a href="https://www.wolfram.com/language/">Wolfram Language</a>—but Notebook Assistant has immediately made me not only significantly more efficient, but also more ambitious in what I try to do. I hadn’t imagined just how useful Notebook Assistant was going to be. But seeing it now I can say for sure that it’s going to raise the bar for what everyone can do. And perhaps most important of all, it’s going to open up computational language and computational thinking to a vast range of new people, who in the past assumed that those things just weren’t accessible to them. 
</p> <p>Leveraging the decades of work we’ve done on the <a href="https://livestreams.stephenwolfram.com/category/live-ceoing/">design and implementation</a> of the Wolfram Language (and <a href="https://www.wolframalpha.com/">Wolfram|Alpha</a>), Notebook Assistant lets people just say in their own words what they want to do; then it does its best to crispen it up and give a computational implementation. Sometimes it goes all the way and just delivers the answer. But even when there’s no immediate “answer” it does remarkably well at building up structures where things can be represented computationally and tackled concretely. People really don’t need to know anything about <a href="https://writings.stephenwolfram.com/2019/05/what-weve-built-is-a-computational-language-and-thats-very-important/">computational language</a> or <a href="https://writings.stephenwolfram.com/2016/09/how-to-teach-computational-thinking/#what-is-computational-thinking">computational thinking</a> to get started; Notebook Assistant will take their ideas, rough as they may be, and frame them in computational language terms.<span id="more-63574"></span></p> <p>I’ve long seen Wolfram Language as uniquely providing the infrastructure and “notation” to enable “computational X” for all fields X. I’m excited to say that I think Notebook Assistant now bridges “the last mile” to let anyone—at almost any level—access the power of computational language, and “do computational X”. In its original conception, Wolfram Notebook Assistant was just intended to be “useful”. But it’s emerging as something much more than that; something positively revolutionary.</p> <p>“I can’t believe it’ll do anything useful with <em>that</em>”, I’ll think. But then I’ll try it. And, very often, something amazing will happen. Something that gets me past some sticking point or over some confusion. Something that gives me an unexpected new building block—or new idea—for what I’m trying to do. 
And that uses the medium of our computational language to take me beyond where I would ever have reached before.</p> <p>So how does one use Notebook Assistant? Once you’ve <a href="https://www.wolfram.com/notebook-assistant-llm-kit/#pricing">signed up</a> you can just go to the toolbar of any notebook, and open a Notebook Assistant chat window:</p> <p><img src='https://content.wolfram.com/sites/43/2024/12/sw12072024wordsAimg1.png' alt='Notebook Assistant chat window' title='Notebook Assistant chat window' width='608' height='292'/></p> <p>Now tell Notebook Assistant what you want to do. The more precise and explicit you are, the better. But you don’t have to have thought things through. Just type what comes into your mind. Imagine you’ve been working in a notebook, and (somehow) you’ve got a picture of some cats. You wonder “How can I find the cats in this picture?” Well, just ask Notebook Assistant!</p> <p><img src='https://content.wolfram.com/sites/43/2024/12/sw12072024wordsAimg2.png' alt='How can I find the cats in this picture?' title='How can I find the cats in this picture?' width='617' height='436'/></p> <p>Notebook Assistant gives some narrative text, and then a piece of Wolfram Language code—which you can just run in your notebook (by pressing <img loading='lazy' style="margin-bottom: -7px" src='https://content.wolfram.com/sites/43/2024/12/sw12072024wordsAimg3.png' alt='Click to enlarge' title='Click to enlarge' width='126' height='23'/>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12072024wordsAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/cats-v3.png' alt='' title='' width='570' height='307'> </div> </p></div> <p>It seems a bit like magic. You say something vague, and Notebook Assistant turns it into something precise and computational—which you can then run. 
It’s not always as straightforward as in this example. But the important thing is that in practice (at least in my rather demanding experience) Notebook Assistant essentially always does spectacularly well at being useful—and at telling me things that move forward what I’m trying to do.</p> <h2 id="big-or-small-just-try-it">Big or Small, Just Try It!</h2> <p>Imagine that sitting next to you, you had someone very knowledgeable about <a href="https://wolfram.com/language/">Wolfram Language</a> and about computational thinking in general. Think what you might ask them. That’s what you should ask Notebook Assistant. And if there’s one thing to communicate here, it’s “Just try it!” You might think what you’re thinking about is too vague, or too specific, or too technical. But just try asking Notebook Assistant. In my experience, you’ll be amazed at what it’s able to do, and how helpful it’s able to be.</p> <p>Maybe you’re an experienced Wolfram Language user who “knows there must be a way to do something”, but can’t quite remember how. Just ask Notebook Assistant. And not only will it typically be able to find the function (or whatever) you need; it’ll also usually be able to create a code fragment that does the very specific thing you asked about. And, by the way, it’ll save you lots of typing (and debugging) by filling in those fiddly options and so on just how you need them. And even if it doesn’t quite nail it, it’ll have given a skeleton of what you need, that you can then readily edit. (And, yes, the fact that it’s realistic to edit it relies on the fact that Wolfram Language represents it in a way that humans can readily read as well as write.)</p> <p>What if you’re a novice, who’s never used Wolfram Language before, and never really been exposed to computational thinking, or for that matter, “techie stuff” at all? Well, the remarkable thing is that Notebook Assistant will still be able to help you—a lot. 
You can ask it something very vague, that doesn’t even seem particularly computational. It does remarkably well at “computationalizing things”. Taking what you’ve said, and finding a way to address it computationally—and to lead you into the kind of computational thinking that’ll be needed for the particular thing you’re trying to do.</p> <p>In what follows, we’ll see a whole range of different ways to use Notebook Assistant. In fact, even as I’ve been writing this, I’ve discovered quite a few new ways to use it that I’d never thought of before. </p> <p>There are some general themes, though. The most important is the way Notebook Assistant pivotally relies on the Wolfram Language. In a sense, the main mission of Notebook Assistant is to make things computational. And the whole reason it can so successfully do that is that it has the Wolfram Language as its target. It’s leveraging the unique nature of the Wolfram Language as a full-scale computational language, able to coherently represent abstract and real-world things in a computational way.</p> <p>One might think that the Wolfram Language would in the end be mainly an “implementation layer”—serving to make what Notebook Assistant produces runnable. But in reality it’s very, very much more than that. In particular, it’s basically the medium—the language—in which computational ideas are communicated. When Notebook Assistant generates Wolfram Language, it’s not just something for the computer to run; it’s also something for humans to read. Yes, Notebook Assistant can produce text, and that’s useful, especially for contextualizing things. But the most concentrated and poignant communication comes in the Wolfram Language it produces. Want the TL;DR? Just look at the Wolfram Language code!</p> <p>Part of how Wolfram Language code manages to communicate so much so efficiently is that it’s precise. You can just mention the name of a function, and you know precisely what it does. 
You don’t have to “scaffold” it with text to make its meaning clear. </p> <p>But there’s something else as well. With its symbolic character—and with all the coverage and consistency that we’ve spent so much effort on over the decades—the Wolfram Language is uniquely able to “communicate in fragments”. Any fragment of Wolfram Language code can be run, and more important, it can smoothly fit into a larger structure. And that means that even small fragments of code that Notebook Assistant generates can be used as building blocks. </p> <p>It produces Wolfram Language code. You read the code (and it’s critical that it’s set up to be read). You figure out if it’s what you want. (And if it’s not, you edit it, or ask Notebook Assistant to do that.) Then you can use that code as a robust building block in whatever structure—large or small—that you might be building.</p> <p>In practice, a critical feature is that you don’t have to foresee how Notebook Assistant is going to respond to what you asked. It might nail the whole thing. Or it might just take steps in the right direction. But then you just look at what it produced, and decide what to do next. Maybe in the end you’ll have to “break the problem down” to get Notebook Assistant to deal with it. But there’s no need to do that in advance—and Notebook Assistant will often surprise you by how far it’s able to get on its own.</p> <p>You might imagine that Notebook Assistant would usually need you to break down what you’re asking into “pure computational questions”. But in effect it has good enough “general knowledge” that it doesn’t. And in fact it will usually do better the more context you give it about why you’re asking it to do something. (Is it for chemical engineering, or for sports analytics, or what?)</p> <p>But how ambitious can what you ask Notebook Assistant be? What if you ask it something “too big”? 
Yes, it won’t be able to solve that 100-year-old problem or build a giant software system in its immediate output. But it does remarkably well at identifying pieces that it can say something about, and that can help you understand how to get started. So, as with many things about Notebook Assistant, you shouldn’t assume that it won’t be helpful; just try it and see what happens! And, yes, the more you use Notebook Assistant, the more you’ll learn just what kind of thing it does best, and how to get the most out of it.</p> <p>So how should you ultimately think about Notebook Assistant? Mainly you should think of it like a very knowledgeable and hardworking expert. But at a more mundane level it can serve as a super-enhanced <a href="https://reference.wolfram.com/language/">documentation</a> lookup system or code completion system. It can also take something vague you might ask it, and somehow undauntedly find the “closest formalizable construct”—that it can then compute with. </p> <p>An important feature is that it is—in human terms—almost infinitely patient and hardworking. Where a human might think: “it’s too much trouble to write out all those details”, Notebook Assistant just goes ahead and does it. And, yes, it saves you huge amounts of typing. But, more important, it makes it “cheap” to do things more perfectly and more completely. So that means you actually end up labeling those plot axes, or adding a comment to your code, or coming up with meaningful names for your variables.</p> <p>One of the overarching points about Notebook Assistant is that it lowers the barrier to getting help. You don’t have to think carefully about formulating your question. You don’t have to go clicking through lots of links. And you don’t have to worry that the question is too trivial to waste a coworker’s time on. You can just ask Notebook Assistant. Oh, and it’ll give you a response immediately. 
(And you can go back and forth with it, and ask it to clarify and refine things.)</p> <h2 id="how-can-i-do-that">“How Can I Do That?”</h2> <p>At least for me it’s very common: you have something in your mind that you want to do, but you don’t quite know how to achieve it in the <a href="https://wolfram.com/language/">Wolfram Language</a>. Well, now you can just ask Notebook Assistant! </p> <p>I’ll show various examples here. It’s worth emphasizing that these examples typically won’t look exactly the same if you run them again. Notebook Assistant has a certain amount of “AI-style random creativity”—and it also routinely makes use of what you’ve done earlier in a session, etc. It also has to be said that Notebook Assistant will sometimes make mistakes—or will misunderstand what you’re asking it. But if you don’t like what it did, you can always press the <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/12/sw12092024regeneratebutton.png' alt='' title='' width='16' height='16'> button to generate a new response. </p> <p>Let’s start off with a basic computational operation: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12092024howupdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12092024howupdateimg1.png' alt='' title='' width='370' height='37'> </div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-01-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-01-output-v2.png" height="252" width="400" loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>As an experienced user of Wolfram Language, a simple “do it with <tt><a href="http://reference.wolfram.com/language/ref/FoldList.html">FoldList</a></tt>” would already have been enough. 
But Notebook Assistant goes all the way—generating specific code for exactly what I asked. Courtesy of Wolfram Language, the code is very short and easy to read. But Notebook Assistant does something else for one as well: it produces an example of the code in action—which lets one check that it really does what one wanted. Oh, and then it goes even further, and tells me about a function in the <a href="https://resources.wolframcloud.com/FunctionRepository">Wolfram Function Repository</a> (that I, for one, had never heard of; wait did I write it?) that directly does the operation I want.</p> <p>OK, so that was a basic computational operation. Now let’s try something a little more elaborate:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/NAtest3img3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-02-input-v2.png' alt='' title='' width='399' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-02-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-02-output-v2.png" width='620' height='430' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>This involves several steps, but Notebook Assistant nails it, giving a nice example. (And, yes, it’s reading the <a href="https://reference.wolfram.com/language/">Wolfram Language documentation</a>, so often its examples are based on that.) </p> <p>But even after giving an A+ result right at the top, Notebook Assistant goes on, talking about various options and extensions. And despite being (I think) quite an expert on what the Wolfram Language can do, I was frankly surprised by what it came up with; I didn’t know about these capabilities! 
</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024everestimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024everestimg1.png' alt='' title='' width='353' height='auto'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024everestimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024everestimg2.png' alt='' title='' width='353' height='auto'> </div> </p></div> <p>There’s an incredible amount of functionality built into the Wolfram Language (yes, four decades’ worth of it). And quite often things you want to do can be done with just a single Wolfram Language function. But which one? One of the great things about Notebook Assistant is that it’s very good at taking “raw thoughts”, sloppily worded, and figuring out what function you need. 
Like here, bam, “use <tt><a href="http://reference.wolfram.com/language/ref/LineGraph.html">LineGraph</a></tt>!”</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg10.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-04-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-04-output-v2.png" width='602' height='219' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>You can ask Notebook Assistant “fairly basic” questions, and it’ll respond with nice, synthesized-on-the-spot “custom documentation”: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg12.png' alt='' title='' width='350' height='35'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-05-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-05-output-v2.png" width='617' height='384' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>You can also ask it about obscure and technical things; it knows about every Wolfram Language function, with all its details and options:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg14.png' alt='' title='' width='431' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-06-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-06-output-v2.png" width='609' height='167' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg13.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg14.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg14.png" width='587' height='279' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg15.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg16.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024howAimg16.png" width='591' height='282' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Notebook Assistant is surprisingly good at writing quite minimal code that does sophisticated things: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg16.png' alt='' title='' width='453' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-07-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-07-output-v2.png" width='620' height='269' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>If you ask it open-ended questions, it’ll often answer with what amount to custom-synthesized <a href="https://writings.stephenwolfram.com/2017/11/what-is-a-computational-essay/">computational essays</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg18.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-08-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-08-output-v2.png" width='595' height='318' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Notebook Assistant is pretty good at “pedagogically explaining what you can do”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg20.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" 
href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-09-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-09-output-v2.png" width='591' height='243' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>In everything we’ve seen so far, the workflow is that you ask Notebook Assistant something, then it generates a result, and then you use it. But everything can be much more interactive, and you can go back and forth with Notebook Assistant—say refining what you want it to do. </p> <p>Here I had something in mind, but I was quite sloppy in describing it. And although Notebook Assistant came up with a reasonable interpretation of what I asked, it wasn’t really what I had in mind: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg22.png' alt='' title='' width='397' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-10-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-10-output-v2.png" width='620' height='236' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>So I went back and edited what I asked (right there in the Notebook Assistant window), and tried again:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg24.png' alt='' title='' width='432' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" 
href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-11-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-11-output-v2.png" width='620' height='229' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>The result was better, but still not right. But all I had to do was to tell it to make a change, and lo and behold, I got what I was thinking of:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg26.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-12-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-12-output-v2.png" width='620' height='210' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>By the way, you can also perfectly well ask about <a href="https://reference.wolfram.com/language/howto/DeployAWebPageInTheWolframCloud.html">deployment to the web</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg28.png' alt='' title='' width='439' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-13-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-13-output-v2.png" width='620' height='452' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And while I might have some minor 
quibbles (why use a string for the molecule name, not <tt>"Chemical"</tt>; why not use <tt><a href="http://reference.wolfram.com/language/ref/CloudPublish.html">CloudPublish</a></tt>; etc.) what Notebook Assistant produces works, and provides an excellent scaffold for further development. And, as it often does, Notebook Assistant adds a kind of “by the way, did you know?” at the end, showing how one could use <tt><a href="http://reference.wolfram.com/language/ref/ARPublish.html">ARPublish</a></tt> to produce output for augmented reality. </p> <p>Here’s one last example: creating a user interface element. I want to make a slider-like control that goes around (like an analog clock):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg30.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/How-Can-I-Do-That-14-output-zoom.gif"><img src="https://content.wolfram.com/sites/43/2024/12/How-Can-I-Do-That-14-output.gif" width='620' height='174' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Well, actually, I had in mind something more minimal:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg32.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-15-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-15-output-v2.png" 
width='620' height='320' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Impressive. Even if maybe it got that from some documentation or other example. But what if I wanted to tweak it? Well, actually, Notebook Assistant does seem to understand what it has:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11252024howimg34.png' alt='' title='' width='438' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-16-output-zoom-v2.png"><img src="https://content.wolfram.com/sites/43/2024/11/How-Can-I-Do-That-16-output-v2.png" width='620' height='302' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="can-you-just-do-that-for-me">“Can You Just Do That for Me?”</h2> <p>What we’ve seen so far are a few examples of asking Notebook Assistant to tell us how to do things. But you can also just ask Notebook Assistant to do things for you, in effect producing “finished goods”: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg1.png' alt='' title='' width='426' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-01-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg2.png" width='400' height='auto' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Pretty impressive! And it even just went ahead and made the picture. 
By the way, if I want the code packaged up into a single line, I can just ask for that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg3.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-02-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg4.png" width='610' height='114' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Notebook Assistant can generate interactive content too. And—very usefully—you don’t have to give precise specifications up front: Notebook Assistant will automatically pick “sensible defaults” (that, yes, you can trivially edit later, or just tell Notebook Assistant to change it for you):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg5.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-03-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg6.png" width='401' height='346' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Here’s an example that requires putting together several different ideas and functions. 
But Notebook Assistant manages it just fine—and in fact the code it produces is interesting and clarifying to read:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg7.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-04-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg8.png" width='550' height='257' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Notebook Assistant knows about <a href="https://reference.wolfram.com/language">every area of Wolfram Language functionality</a>—here <a href="https://reference.wolfram.com/language/tutorial/SyntheticGeometry.html">synthetic geometry</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg9.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-05-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg10.png" width='570' height='314' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And here <a href="https://www.wolfram.com/language/core-areas/chemistry">chemistry</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg11.png' alt='' title='' width='414' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-06-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg12.png" width='399' height='422' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>It also knows about things like the Wolfram Function Repository, here running a function from there that <a href="https://resources.wolframcloud.com/FunctionRepository/resources/SolarVideo/">generates a video</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg13.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-07-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg14.png" width='400' height='289' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Here’s something that again leverages Notebook Assistant’s encyclopedic knowledge of <a href="https://wolfram.com/language/">Wolfram Language</a> capabilities, now pulling in real-time data:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg15.png' alt='' title='' width='404' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" 
href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-08-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg16.png" width='615' height='237' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>I can’t resist trying a few more examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg17.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-09-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg18.png" width='398' height='494' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Let’s try something involving more sophisticated math:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-10-input.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-10-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg20.png" width='614' height='368' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>(I would have used <tt><a href="http://reference.wolfram.com/language/ref/RegularPolygon.html">RegularPolygon</a></tt><tt>[5]</tt>, and I don’t think <tt><a href="http://reference.wolfram.com/language/ref/DiscretizeRegion.html">DiscretizeRegion</a></tt> is necessary … but what 
Notebook Assistant did is still very impressive.)</p> <p>Or here’s some more abstract math:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024canimg21.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Can-You-Just-Do-That-11-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024canimg22.png" width='397' height='388' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>OK, so Notebook Assistant provides a very powerful way to go from words to computational results. So what then is the role of computational language and of “raw Wolfram Language”? First of all, it’s the Wolfram Language that makes everything we’ve seen here work; it’s what the words are being turned into so that they can be computed from. But there’s something much more than that. The Wolfram Language isn’t just for computers to compute with. It’s also for humans to think with. And it’s an incredibly powerful medium for that thinking. Like a great generalization of mathematical notation from the distant past, it provides a streamlined way to broadly formalize things in computational terms—and to systematically build things up. </p> <p>Notebook Assistant is great for getting started with things, and for producing a first level of results. But words aren’t ultimately an efficient way to say how to build up from there. You need the crisp, formal structure of computational language. In which even the tiny amounts of code you write can be incredibly powerful. 
</p> <p>Now that I’ve been using Notebook Assistant for a while I think I can say that on quite a few occasions it’s helped me launch things, it’s helped me figure out details, and it’s helped me debug things that have gone wrong. But the backbone of my computational progress has been me writing Wolfram Language myself (though quite often starting from something Notebook Assistant wrote). Notebook Assistant is an important new part of the “on ramp” to Wolfram Language; but it’s raw Wolfram Language that lets one really zoom forward to build new structures and achieve what’s computationally possible.</p> <h2 id="where-do-i-start">“Where Do I Start?”</h2> <p>Computational thinking is an incredibly powerful approach. But sometimes it’s hard to get started with, particularly if you’re not used to it. And although one might not imagine it, Notebook Assistant can be very useful here, essentially helping one brainstorm about what direction to take.</p> <p>I was explaining this to our head of Sales, and tried:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg1.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Where-Do-I-Start-01-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg2.png" width='618' height='521' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>I really didn’t expect this to do anything terribly useful … and I was frankly amazed at what happened. 
Pushing my luck I tried: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg3.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Where-Do-I-Start-02-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg4.png" width='618' height='356' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Obviously this isn’t the end of the story, but it’s a remarkably good beginning—going from a vague request to something that’s set up to be thought about computationally. </p> <p>Here’s another example. I’m trying to invent a good system for finding books in my library. I just took a picture of a shelf of books behind my desk:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12092024wherebooks.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Where-Do-I-Start-03-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg6.png" width='399' height='937' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Once again, a very impressive result. Not the final answer, but a surprisingly good start. 
That points me in the direction of <a href="https://reference.wolfram.com/language/guide/ImageProcessing.html">image processing</a> and <a href="https://reference.wolfram.com/language/guide/SegmentationAnalysis.html">segmentation</a>. At first, it’s running too slowly, so it downsamples the image. Then it tells me I might need to tweak the parameters. So I just ask it to create a tool to do that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg7.png' alt='' title='' width='380' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Where-Do-I-Start-04-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg8.png" width='399' height='546' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And then:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg9.png' alt='' title='' width='370' height='38'> </div> </p></div> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Where-Do-I-Start-05-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11262024whereimg10.png" width='717' height='438' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>It’s very impressive how much Notebook Assistant can help one go “from zero to computation”. And when one gets used to using it, it starts to be quite natural to just try it on all sorts of things one’s thinking about. 
But if it’s just “quick, tell me something to compute”, it’s usually harder to come up with anything.</p> <p>And that reminds me of the very first time I ever saw a computer in real life. It was 1969 and I was 9 years old (and the computer was an IBM mainframe). The person who was showing me the computer asked me: “So what do you want to compute?” I really had no idea at that time “what one might compute”. Rather lamely I said “the weight of a dinosaur”. So, 55 years later, let’s try that again:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg11.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg12.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg12.png" width='536' height='210' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And let’s try going further:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg13.png' alt='' title='' width='370' height='37'> </div> </p></div> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg14.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024whereBimg14.png" width='552' height='260' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="tweak-the-details-for-me">“Tweak the Details for Me”</h2> <p>Something I find very useful with Notebook Assistant is 
having it “tweak the details” of something I’ve already generated. For example, let’s say I have a basic plot of a sine curve in a notebook:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/Tweak-the-Details-img2.png"><img src="https://content.wolfram.com/sites/43/2024/12/Tweak-the-Details-img2.png" width='619' height='161' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Assuming I have that notebook in focus, Notebook Assistant will “see” what’s there. So then I can tell it to modify my sine curve—and what it will do is produce new code with extra details added:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg4.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg4.png" width='624' height='254' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>That’s a good result. But as a Wolfram Language aficionado I notice that the code is a bit more complicated than it needs to be. So what can I do about it? 
Well, I can just ask Notebook Assistant to simplify it:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg6.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg6.png" width='619' height='228' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>I can keep going, asking it to further “embellish” the plot:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg8.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg8.png" width='619' height='202' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Let’s push our luck and try going even further:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg10.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg10.png" width='619' height='239' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Oops. Something went wrong. No callouts, and a pink “error” box. I tried regenerating a few times. Often that helps. But this time it didn’t seem to. So I decided to give Notebook Assistant a suggestion:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg12.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024tweakBimg12.png" width='619' height='230' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And now it basically got it. And with a little more back and forth I can expect to get exactly what I want. </p> <p>In the Wolfram Language, functions (like <tt><a href="http://reference.wolfram.com/language/ref/Plot.html">Plot</a></tt>) are set up to have good automatic defaults. 
But when you want, for example, to achieve some particular, detailed look, you often have to end up specifying all sorts of additional settings. And Notebook Assistant is very good at doing this, and in effect, patiently typing out all those option settings, etc.</p> <h2 id="what-went-wrong-fix-it">“What Went Wrong? Fix It!”</h2> <p>Let’s say you wrote some <a href="https://wolfram.com/language/">Wolfram Language</a> (or perhaps Notebook Assistant did it for you). And let’s say it doesn’t work. Maybe it just produces the wrong output. Or maybe it generates all sorts of messages when it runs. Either way, you can just ask the Assistant “What went wrong?”</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/What-Went-Wrong-img2C.png"><img src="https://content.wolfram.com/sites/43/2024/12/What-Went-Wrong-img2C.png" width='620' height='105' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Here the Assistant rather patiently and clearly explained the message that was generated, then suggested “correct code”:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/What-Went-Wrong-img3C.png"><img src="https://content.wolfram.com/sites/43/2024/12/What-Went-Wrong-img3C.png" width='620' height='103' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>The Assistant tends to be remarkably helpful in situations like this—even for an experienced Wolfram Language user like me. In a sense, though, it has an “unfair advantage”. Not only has it learned “what’s reasonable” from seeing large amounts of Wolfram Language code; it also has access to “internal information”—like a stream of telemetry about messages that were generated (as well as stack traces, etc.). 
</p> <p>In general, Notebook Assistant is rather impressive at “spotting errors” even in long and sophisticated pieces of Wolfram Language code—and in suggesting possible fixes. And I can say that this is a way in which using Notebook Assistant has immediately saved me significant time in doing things with Wolfram Language.</p> <h2 id="improve-my-code">“Improve My Code”</h2> <p>Notebook Assistant doesn’t just know how to write <a href="https://wolfram.com/language/">Wolfram Language</a> code; it knows how to write good Wolfram Language code. And in fact if you give it even a sloppy “outline” of Wolfram Language code, the Assistant is usually quite good at making it clean and complete. And that’s important not only in being able to produce code that will run correctly; it’s also important in making code that’s clear enough that you can understand it (courtesy of the readability of good Wolfram Language code).</p> <p>Here’s an example starting with a rather horrible piece of Wolfram Language code on the right:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw112720240840img2.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw112720240840img2.png" width='620' height='349' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>The code on the right is quite buggy (it doesn’t initialize <tt>list</tt>, for example). But Notebook Assistant guesses what it’s supposed to do, and then makes nice “Wolfram style” versions, explaining what it’s doing.</p> <p>If the code you’re dealing with is long and complicated, Notebook Assistant may (like a person) get confused. But you can always select a particular part, then ask Notebook Assistant specifically about that. 
And the symbolic nature—and coherence—of the Wolfram Language will typically mean that Notebook Assistant will be able to act “modularly” on the piece that you’ve selected.</p> <p>Something I’ve found rather useful is to have Notebook Assistant refactor code for me. Here I’m starting from a sequence of separate inputs (yes, itself generated by Notebook Assistant) and I’m turning it into a single function:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw112720240840img4.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw112720240840img4.png" width='620' height='276' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Now we can use the function however we want:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw112720240840img5.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw112720240840img5.png" width='620' height='223' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Going the other way is useful too. 
And Notebook Assistant is surprisingly good at grokking what a piece of code is “about”, and coming up with reasonable names for variables, functions, etc.:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/Improve-My-Code-04-output-2.png"><img src="https://content.wolfram.com/sites/43/2024/11/Improve-My-Code-04-output-2.png" width='620' height='435' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Yet another thing Notebook Assistant is good at is knowing all sorts of tricks to make code run faster:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw112720240840img8.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw112720240840img8.png" width='620' height='199' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="explain-that-to-me">“Explain That to Me”</h2> <p>“What does that piece of code actually do?” <a href="https://www.wolfram.com/language/elementary-introduction/3rd-ed/47-writing-good-code.html">Good Wolfram Language code</a>—like good prose or good mathematical formalism—can succinctly communicate ideas, in its case in computational terms, precisely grounded in the definition of the language. But (as with prose and math) you sometimes need a more detailed exploration. And providing narrative explanations of code is something else that Notebook Assistant is good at. 
Here it’s taking a <a href="https://www.wolframscience.com/nks/notes-10-5--lengths-of-number-representations/">single line of (rather elegant) Wolfram Language code</a> and writing a whole essay about what the code is doing: </p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw11272024explainimg2.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11272024explainimg2.png" width='620' height='520' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>What if you have a long piece of code, and you just want some small part of it explained? Well, since Notebook Assistant sees selections you make, you can just select one part of your code, and Notebook Assistant will know that’s what you want explained.</p> <h2 id="fill-in-the-paperwork-for-me">“Fill in the Paperwork for Me”</h2> <p>The <a href="https://wolfram.com/language/">Wolfram Language</a> is carefully designed to have built-in functions that just “do what you need”, without having to use idioms or set up repeated boilerplate. But there are situations where there’s inevitably a certain amount of “bureaucracy” to do. For example, let’s say you’re writing a function to deploy to the Function Repository. You enter the definition for the function into a <a href="https://writings.stephenwolfram.com/2019/06/the-wolfram-function-repository-launching-an-open-platform-for-extending-the-wolfram-language/#contributing-to-the-repository">Function Resource Definition Notebook</a>. But now you have to fill in documentation, examples, etc. And in fact that’s often the part that takes the longest. But now you can ask Notebook Assistant to do it for you. 
Here I put the cursor in the Examples section:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw11272024paperworkimg2.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11272024paperworkimg2.png" width='620' height='253' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>It’s always a good idea to set up tests for functions you define. And this is another thing Notebook Assistant can help with:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw11272024paperworkimg4.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11272024paperworkimg4.png" width='620' height='846' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="the-inspiration-button">The “Inspiration Button” <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg2.png' alt='' title='' width='11' height='18'/></h2> <p>All the examples of interacting with Notebook Assistant that we’ve seen so far involve using the Notebook Assistant window, which you can open with the <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg3.png' alt='' title='' width='16' height='16'/> button on the notebook toolbar. But another method involves using the <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> button in the toolbar, which we’ve been calling the “inspiration button”.</p> <p>When you use the Notebook Assistant window, the Assistant will always try to figure out what you’re talking about. For example, if you say “Plot that” it’ll use what it knows about what notebook you’re using, and where you are in it, to try to work out what you mean by “that”. 
But when you use the <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> button it’ll specifically try to “provide inspiration at your current selection”. </p> <p>Let’s say you’ve typed <tt><a href="http://reference.wolfram.com/language/ref/Plot.html">Plot</a></tt><tt>[</tt><tt><a href="http://reference.wolfram.com/language/ref/Sin.html">Sin</a></tt><tt>[x]</tt>. Press <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> and it’ll suggest a possible completion:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/The-Inspiration-Button-01-output.gif"><img src="https://content.wolfram.com/sites/43/2024/11/The-Inspiration-Button-01-output.gif" width='620' height='155' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>After using that suggestion, you can keep going:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg8.png"><img src="https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg8.png" width='620' height='184' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>You can think of the <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> button as providing a sophisticated meaning-aware autocomplete.</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/swsw12092024inspriationXimg10.png"><img src="https://content.wolfram.com/sites/43/2024/12/swsw12092024inspriationXimg10.png" width='621' height='184' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>It also lets you 
do things like code simplification. Imagine you’ve written the (rather grotesque):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/swsw12092024inspriationXimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/The-Inspiration-Button-04-v3.png' alt='' title='' width='445' height='282'> </div> </p></div> <p>If you want to get rid of the <a href="https://reference.wolfram.com/language/ref/For.html"><tt>For</tt></a> loops, just select them and press the <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> button to get a much simpler version:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/The-Inspiration-Button-05-v3.png"><img src="https://content.wolfram.com/sites/43/2024/12/The-Inspiration-Button-05-v3.png" width='551' height='418' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>Want to go even further? Select that result and Notebook Assistant manages to get to a one-liner:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/The-Inspiration-Button-06-v3.png"><img src="https://content.wolfram.com/sites/43/2024/12/The-Inspiration-Button-06-v3.png" width='499' height='205' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="magic-writing-magic-coding">Magic Writing, Magic Coding</h2> <p>At some level it seems bizarre. Write a text cell that describes code to follow it. Start an Input cell, then press <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/> and Notebook Assistant will try to magically write the code! 
</p> <p><video width="620" height="340" autoplay loop muted><source src="https://content.wolfram.com/sites/43/2024/12/magic1input.mp4" type="video/mp4" /></video></p> <p>You can go the other way as well. Start with the code, then start a <span class="promptformatted">CodeText</span> cell above it, and it’ll “magically” write a caption:</p> <p><video width="620" height="340" autoplay loop muted><source src="https://content.wolfram.com/sites/43/2024/12/magic2codetext.mp4" type="video/mp4" /></video></p> <p>If you start a heading cell, it’ll try to make up a heading:</p> <p><video width="620" height="340" autoplay loop muted><source src="https://content.wolfram.com/sites/43/2024/12/magic3subsubsection2.mp4" type="video/mp4" /></video></p> <p>Start a <span class="promptformatted">Text</span> cell, and it’ll try to “magically” write relevant textual content:</p> <p><video width="620" height="450" autoplay loop muted><source src="https://content.wolfram.com/sites/43/2024/12/magic4text2.mp4" type="video/mp4" /></video></p> <p>You can go even further: just put the cursor underneath the existing content, and press <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/11/sw11272024inspirationimg4.png' alt='' title='' width='20' height='18'/>—and Notebook Assistant will start suggesting how you can go on:</p> <p><video width="620" height="720" autoplay loop muted><source src="https://content.wolfram.com/sites/43/2024/12/magic5insert2.mp4" type="video/mp4" /></video></p> <p>As I write this, of course I had to try it: what does Notebook Assistant think I should write next? 
Here’s what it suggests (and, yes, in this case, those aren’t such bad ideas):</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/sw12072024magicimg3.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12072024magicimg3.png" width='620' height='666' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <h2 id="the-practicalities-of-the-assistant">The Practicalities of the Assistant</h2> <p>One of the objectives for Notebook Assistant is to have it provide “hassle-free” access to AI and LLM technology integrated into the Wolfram System. And indeed, once you’ve set up your subscription (within <a href="https://account.wolfram.com/products" target="_blank">your Wolfram Account</a>), everything “just works”. Under the hood, there’s all sorts of technology, servers, etc. But you don’t have to worry about any of that; you can just use Notebook Assistant as a built-in part of the Wolfram Notebook experience.</p> <p>As you work with Notebook Assistant, you’ll get progressively better intuition about where it can best help you. (And, yes, we’ll be continually updating Notebook Assistant, so it’ll often be worth trying things again if a bit of time has passed.) Notebook Assistant—like any AI-based system—has definite human-like characteristics, including sometimes making mistakes. Often those mistakes will be obvious (e.g. code with incorrect syntax colored red); sometimes they may be more difficult to spot. But the great thing about Notebook Assistant is that it’s firmly anchored to the “solid ground” of Wolfram Language. And any time it writes Wolfram Language code that you can see does what you want, you can always confidently use it.</p> <p>There are some things that will help Notebook Assistant do its best for you. Particularly important is giving it the best view of the “context” for what you ask it. 
Notebook Assistant will generally look at whatever has already been said in a particular chat. So if you’re going to change the subject, it’s best to use the <img style="margin-top: 2px;position:relative;top:3px;" src='https://content.wolfram.com/sites/43/2024/12/newchat-buttom.png' alt='' title='' width='20' height='17'/> button to start a new chat, so Notebook Assistant will focus on the new subject, and not get confused by what you (or it) said before. </p> <p>When you open the Notebook Assistant chat window you’ll often want to talk about—or refer to—material in some other notebook. Generally Notebook Assistant will assume that the notebook you last used is the one that’s relevant—and that any selection you have in that notebook is the thing to concentrate on the most. If you want Notebook Assistant to focus exclusively on what you’re saying in the chat window, one way to achieve that is to start a blank notebook. Another approach is to use the <img style="margin-top: 2px;position:relative;top:6px;" loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/NA-sidechat-sources.png' alt='' title='' width='70' height='23'> menu, which provides more detailed control over what material Notebook Assistant will consider. (For now, it just deals with notebooks you have open—but external files, URLs, etc. are coming soon.) </p> <p>Notebook Assistant will by default store all your chat sessions. You can see your chat history (with chats automatically assigned names by the Assistant) by pressing the <img style="margin-top: 2px;position:relative;top:3px;" src='https://content.wolfram.com/sites/43/2024/12/NA-sidechat-history.png' alt='' title='' width='18' height='18'/> History button. You can delete chats from your history here. 
You can also “pop out” chats with <img style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/NA-sidechat-history-popout.png' alt='' title='' width='15' height='15'/>, creating standalone notebooks that you can save, send to other people, etc. </p> <p>So what’s inside Notebook Assistant? It’s quite a tower of technology. The core of its “linguistic interface” is an LLM (actually, several different LLMs)—trained on extensive Wolfram Language material, and with access to a variety of tools, especially Wolfram Language evaluators. Also critical to Notebook Assistant is its access to a variety of RAGs based on vector databases, which it uses for immediate semantic search of material such as Wolfram Language documentation. Oh, and then there’s a lot of technology to connect Notebook Assistant to the symbolic internal structure of notebooks, etc.</p> <p>So when you use Notebook Assistant, where is it actually running? Its larger LLM tasks are currently running on cloud servers. But a substantial part of its functionality is running right on your computer—using Wolfram Language (notably the <a href="https://reference.wolfram.com/language/guide/MachineLearning.html">Wolfram machine learning framework</a>, vector database system, etc.). And because these things are running locally, the Assistant can request access to local information on your computer—as well as avoiding the latency of accessing cloud-based systems.</p> <h2 id="chats-in-your-main-notebook-coming-soon">Chats in Your Main Notebook (Coming Soon)</h2> <p>Much of the time, you want your interactions with Notebook Assistant to be somehow “off on the side”—say in the Notebook Assistant window, or in the inspiration button menu. 
But sometimes you want your interactions to be right in your main notebook.</p> <p>And for this you’ll soon (in Version 14.2) be able to use an enhanced version of the <a href="https://writings.stephenwolfram.com/2023/06/introducing-chat-notebooks-integrating-llms-into-the-notebook-paradigm/ ">Chat Notebook</a> technology that we developed last year, not just in a separate “Chat Notebook”, but fully integrated into any notebook. </p> <p>At the beginning of a cell in any notebook, just press <span class="kbd"><kbd>‘</kbd></span>. You get a chat cell that communicates with Notebook Assistant:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12082024chatinputA.png' alt='Chat cell' title='Chat cell' width='711' height='43'></p> <p>And now the output from that chat cell is placed directly below in the notebook—so you can create a notebook that mixes standard notebook content with chat content.</p> <p>It all works basically just like a fully integrated version of our Chat Notebook technology. (And this functionality is already available in Version 14.1 if you explicitly create a chat notebook with <span class="promptformatted">File</span> > <span class="promptformatted">New</span> > <span class="promptformatted">Chat Notebook</span>.) As in Chat Notebooks, you use a chat break (with <span class="kbd"><kbd>~</kbd></span>) to start a new chat within the same notebook. 
(In general, when you use a chat cell in an ordinary notebook to access Notebook Assistant, the assistant will see only material that occurs before the chat, and within the same chat block.)</p> <h2 id="also-introducing-llm-kit">Also Introducing: LLM Kit</h2> <p>In mid-2023 we introduced <tt><a href="http://reference.wolfram.com/language/ref/LLMFunction.html">LLMFunction</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/LLMSynthesize.html">LLMSynthesize</a></tt> and related functions (as well as <tt><a href="http://reference.wolfram.com/language/ref/ChatEvaluate.html">ChatEvaluate</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/ImageSynthesize.html">ImageSynthesize</a></tt>, etc.) to let you access LLM functionality directly within the Wolfram Language. Until now these functions required connection to an external LLM provider. But along with Notebook Assistant we’re introducing today LLM Kit—which allows you to access all LLM functionality in the Wolfram Language directly through a subscription within your Wolfram Account. </p> <p>It’s all very easy: as soon as you enable your subscription, not only Notebook Assistant but also all LLM functionality will just work, going through our LLM service. (And, yes, Notebook Assistant is basically built on top of LLM Kit and the LLM service access it defines.)</p> <p>When you’ve enabled your Notebook Assistant + LLM Kit subscription, this is what you’ll see in the <span class="promptformatted">Preferences</span> panel:</p> <p><img src='https://content.wolfram.com/sites/43/2024/12/sw12082024llmkitimg1.png' alt='Notebook Assistant + LLM Kit Preferences panel' title='Notebook Assistant + LLM Kit Preferences panel' width='616' height='392'/></p> <p>Our LLM service is primarily aimed at “human speed” LLM usage, in other words, things like responding to what you ask the Notebook Assistant. 
But the service also seamlessly supports programmatic things like <tt><a href="http://reference.wolfram.com/language/ref/LLMFunction.html">LLMFunction</a></tt>. And for anything beyond small-scale uses of <tt><a href="http://reference.wolfram.com/language/ref/LLMFunction.html">LLMFunction</a></tt>, etc. you’ll probably want to upgrade from the basic “Essentials” subscription level to the “Pro” level. And if you want to go “industrial scale” in your LLM usage, you can do that by explicitly purchasing <a href="https://www.wolfram.com/service-credits/">Wolfram Service Credits</a>. </p> <p>Everything is set up to be easy if you use our Wolfram LLM service—and that’s what Notebook Assistant is based on. But for Chat Notebooks and programmatic LLM functionality, our Wolfram Language framework also supports connection to a wide range of external LLM service providers. You have to have your own external subscription to whatever external service you want to use. But once you have the appropriate access key you’ll be able to set things up so that you can pick that LLM provider interactively in Chat Notebooks, programmatically through <tt><a href="http://reference.wolfram.com/language/ref/LLMConfiguration.html">LLMConfiguration</a></tt>, or in the <span class="promptformatted">Preferences</span> panel. 
</p> <p>(By the way, we’re continually monitoring the performance of different LLMs on Wolfram Language generation; you can see weekly benchmark results at the <a href="https://www.wolfram.com/llm-benchmarking-project">Wolfram LLM Benchmark Project website</a>—or get the data behind that from the <a href="https://datarepository.wolframcloud.com/">Wolfram Data Repository</a>.)</p> <h2 id="opening-up-the-ability-to-go-computational">Opening Up the Ability to “Go Computational”</h2> <p>There’s really never been anything quite like it before: a way of automatically taking what can be quite vague human thoughts and ideas, and making them crisp and structured—by expressing them computationally. And, yes, this is made possible now by the unexpectedly effective linguistic interface that LLMs give us. But ultimately what makes it possible is that the LLMs have a target: the <a href="https://wolfram.com/language/">Wolfram Language</a> in all its breadth and depth. </p> <p>For me it’s an exciting moment. Because it’s a moment where everything we’ve been building these past four decades is suddenly much more broadly accessible. Expert users of Wolfram Language will be able to make use of all sorts of amazing nooks of functionality they never knew about. And people who’ve never used Wolfram Language before—or never even formulated anything computationally—will suddenly be able to do so.</p> <p>And it’s remarkable what kinds of things one can “make computational”. Let’s say you ask Wolfram Notebook Assistant to make up a story. 
Like pretty much anything today with LLMs inside, it’ll dutifully do that:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/Opening-Up-the-Ability-01-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12082024abilityimg1.png" width='601' height='237' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>But how can one make something like this computational? Well, just ask Notebook Assistant:</p> <p id="examplegallery"><a class="magnific image fullzoom" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/Opening-Up-the-Ability-02-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12082024abilityimg2.png" width='430' height='838' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p id="examplegallery"><a class="magnific image" alt="" title="" href="https://content.wolfram.com/sites/43/2024/12/Opening-Up-the-Ability-03-output-zoom.png"><img src="https://content.wolfram.com/sites/43/2024/12/sw12082024abilityimg3.png" width='619' height='177' loading="lazy" alt="Click to enlarge" title="Click to enlarge"></a></p> <p>And what it does is rather remarkable: it uses Wolfram Language to create an interactive agent-based computational game version of the story! </p> <p>Computation is the great paradigm of our times. And the development of “computational X” for all X seems destined to be the future of pretty much every field. The whole tower of ideas and technology that is the modern Wolfram Language was built precisely to provide the computational language that is needed. But now Notebook Assistant is dramatically broadening access to that—making it possible to get “computational language superpowers” using just ordinary (and perhaps even vague) natural language. 
</p> <p>And even though I’ve now been living the computational language story for more than four decades, Notebook Assistant keeps on surprising me with what it manages to make computational. It’s incredibly powerful to be able to “go computational”. And even if you can’t imagine how it could work in what you’re doing, you should still just try it! Notebook Assistant may well surprise you—and in that moment show you a path to leverage the great power of the computational paradigm in ways that you’ve never imagined. </p> <p></p> <p style="font-style: italic; color: #555;"> <style type="text/css"> div.bottomstripe { max-width:620px; margin-bottom:10px; background-color: #fff39a; border: solid 2px #ffd400; padding: 7px 10px 7px 10px; line-height: 1.2;} #blog .post_content .bottomstripe a, #blog .post_content .bottomstripe a:link, #blog .post_content .bottomstripe a:visited { font-family:"Source Sans Pro",Arial,Sans Serif; font-size:11pt; color:#aa0d00;} </style> <style type='text/css'>div.bottomstripe { background-color: #ff6516; border: solid 2px #ff6516; max-width: 100%; border-radius: 5px; } #blog .post_content .bottomstripe a, #blog .post_content .bottomstripe a:link, #blog .post_content .bottomstripe a:visited { color: white; }#blog #content .post_content a:hover { color: #ffe59b; }</style> <div class="bottomstripe"> <a href="https://www.wolfram.com/notebook-assistant-llm-kit/"><strong>Subscribe to Wolfram Notebook Assistant now »</strong></a> </div> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/12/useful-to-the-point-of-being-revolutionary-introducing-wolfram-notebook-assistant/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <enclosure url="https://content.wolfram.com/sites/43/2024/12/magic1input.mp4" length="175419" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/12/magic2codetext.mp4" length="153438" type="video/mp4" /> </item> <item> <title>Foundations of Biological Evolution: More Results & 
More Surprises</title> <link>https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/</link> <comments>https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/#comments</comments> <pubDate>Thu, 05 Dec 2024 23:13:27 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Life Science]]></category> <category><![CDATA[New Kind of Science]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=64272</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/12/bioevel-icon-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>This is a follow-on to Why Does Biological Evolution Work? A Minimal Model for Biological Evolution and Other Adaptive Processes [May 3, 2024]. 
Even More from an Extremely Simple Model A few months ago I introduced an extremely simple “adaptive cellular automaton” model that seems to do remarkably well at capturing the essence of what’s […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/12/bioevel-icon-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><style> .mfp-container .mfp-content { max-width: 655px !important; } a.magnific img:hover { cursor: zoom-in; } .mfp-iframe-holder .mfp-content { max-width: 820px !important; max-height: 634px; height: 100%; width: 100%; } .mfp-content .mfp-iframe-scaler button.mfp-close { right: -12px; top: -12px !important; } </style> <p><img class="aligncenter" title="Foundations of Biological Evolution: More Results & More Surprises" src="https://content.wolfram.com/sites/43/2024/12/BioEvol2-hero-v3-min.png" alt="Foundations of Biological Evolution: More Results & More Surprises" width="649" height="405" /></p> <p style="font-size:14px;background:#e5f2f85c;padding:5px 15px;border:1px solid #cfdde3c7;max-width:620px;margin:25px 0px;"><em>This is a follow-on to <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">Why Does Biological Evolution Work? A Minimal Model for Biological Evolution and Other Adaptive Processes</a> [May 3, 2024].</em></p> <h2 id="even-more-from-an-extremely-simple-model">Even More from an Extremely Simple Model</h2> <p><a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">A few months ago I introduced an extremely simple “adaptive cellular automaton” model</a> that seems to do remarkably well at capturing the essence of what’s happening in biological evolution. 
But over the past few months I’ve come to realize that the model is actually even richer and deeper than I’d imagined. And here I’m going to describe some of what I’ve now figured out about the model—and about the often-surprising things it implies for the foundations of biological evolution.</p> <p>The starting point for <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-model">the model</a> is to view biological systems in abstract computational terms. We think of an organism as having a genotype that’s represented by a program that’s then run to produce its phenotype. So, for example, the <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment#sect-2-1--how-do-simple-programs-behave">cellular automaton</a> rules on the left correspond to a genotype, which is then run to produce the phenotype on the right (starting from a “seed” of a single red cell):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg1.png' alt='' title='' width='475' height='394'> </div> </p></div> <p><span id="more-64272"></span></p> <p>The key idea in our model is to adaptively evolve the genotype rules—say by making single “point mutations” to the list of outcomes from the rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg2.png' alt='' title='' width='210' height='78'> </div> </p></div> <p>At each step in the adaptive evolution we “accept” a mutation if it leads to a phenotype that has a higher—or at least equal—fitness relative to what 
we had before. So, for example, taking our fitness function to be the height (i.e. lifetime) of the phenotype pattern (with patterns that are infinite being assigned zero fitness), a sequence of (randomly chosen) adaptive evolution steps that go from the null rule to the rule above might be:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg3.png' alt='' title='' width='646' height='289'> </div> </p></div> <p>What if we make a different sequence of randomly chosen adaptive evolution steps? Here are a few examples of what happens—each in a sense “using a different idea” for how to achieve high fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg4.png' alt='' title='' width='623' height='1014'> </div> </p></div> <p>And, yes, one can’t help but be struck by how “lifelike” this all looks—both in the complexity of these patterns, and in their diversity. But what is ultimately responsible for what we’re seeing? It’s long been a core question about biological evolution. 
Are the forms it produces the result of careful “sculpting” by the environment (and by the fitness functions it implies)—or are their most important features somehow instead a consequence of something more intrinsic and fundamental that doesn’t depend on details of fitness functions?</p> <p>Well, let’s say we pick a <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#other-adaptive-evolution-strategies:~:text=what%20about%20other%20kinds%20of%20objectives">different fitness function</a>—for example, not the height of a phenotype pattern, but instead its width (or, more specifically, the width of its bounding box). Here are some results of adaptive evolution in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg5.png' alt='' title='' width='581' height='776'> </div> </p></div> <p>And, yes, the patterns we get are now ones that achieve larger “bounding box width”. But somehow there’s still a remarkable similarity to what we saw with a rather different fitness function above. And, for example, in both cases, high fitness, it seems, is normally achieved in a complicated and hard-to-understand way. (The last pattern is a bit of an exception; as can also happen in biology, this is a case where for once there’s a “mechanism” in evidence that we can understand.) </p> <p>So what in the end is going on? 
As I discussed <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">when I introduced the model a few months ago</a>, it seems that the “dominant force” is not selection according to fitness functions, but instead the fundamental computational phenomenon of <a href="https://www.wolframscience.com/nks/p737--computational-irreducibility/">computational irreducibility</a>. And what we’ll find here is that in fact what we see is, more than anything, the result of an interplay between the computational irreducibility of the process by which our phenotypes develop, and the computational boundedness of typical forms of fitness functions.</p> <p>The importance of such an interplay is something that’s very much come into focus as a result of our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>. And indeed it now <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/#mathematics-and-physics-have-the-same-foundations">seems that the foundations</a> of both physics and mathematics are—more than anything—reflections of this interplay. And now it seems that’s true of biological evolution as well. </p> <p>In studying our model, there are many detailed phenomena we’ll encounter—most of which seem to have surprisingly direct analogs in actual biological evolution. 
For example, here’s what happens if we plot the behavior of the fitness function for our first example above over the course of the adaptive evolution process:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg6.png' alt='' title='' width='663' height='227'> </div> </p></div> <p>We see a sequence of “plateaus”, punctuated by jumps in fitness that reflect some “breakthrough” being made. In the picture, each red dot represents the fitness associated with a genotype that was tried. Many fall below the line of “best results so far”. But there are also plenty of red dots that lie right on the line. And these correspond to genotypes that yield the same fitness that’s already been achieved. But here—as in actual biological evolution—it’s important that there can be “<a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#:~:text=phenomena.%20There%20are-,%E2%80%9Cfitness%2Dneutral%E2%80%9D%20mutations,-that%20can%20%E2%80%9Cgo">fitness-neutral evolution</a>”, where genotypes change, but the fitness does not. Usually such changes of genotype yield not just the same fitness, but also the exact same phenotype. 
Sometimes, however, there can be multiple phenotypes with the same fitness—and indeed this happens at one stage in the example here</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg7.png' alt='' title='' width='644' height='277'> </div> </p></div> <p>and at multiple stages in the second example we showed above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/12032024modelimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/12032024modelimg8.png' alt='' title='' width='646' height='236'> </div> </p></div> <h2 id="the-multiway-graph-of-all-possible-evolutions">The Multiway Graph of All Possible Evolutions</h2> <p>In the previous section we saw examples of the results of a few particular random sequences of mutations. But what if we were to look at all possible sequences of mutations? As I <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-multiway-graph-of-all-possible-mutation-histories">discussed when I introduced the model</a>, it’s possible to construct a <a href="https://www.wolframscience.com/nks/chap-5--two-dimensions-and-beyond#sect-5-6--multiway-systems">multiway graph</a> that represents all possible mutation paths. 
Here’s what one gets for symmetric <em>k</em> = 2, <em>r</em> = 2 rules—starting from the null rule, and using height as a fitness function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg1.png' alt='' title='' width='633' height='445'> </div> </p></div> <p>The way this graph is constructed, there are arrows from a given phenotype to all phenotypes with larger (finite) height that can be reached by a single mutation. </p> <p>But what if our fitness function is width rather than height? Well, then we get a different multiway graph in which arrows go to phenotypes not with larger height but instead with larger width:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg2.png' alt='' title='' width='606' height='375'> </div> </p></div> <p>So what’s really going on here? 
Ultimately one can think of there being an underlying graph (that one might call the “<a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#:~:text=graph%20of%20every%20possible%20way">mutation graph</a>”) in which every edge represents a transformation between two phenotypes that can be achieved by a single mutation in the underlying genotype: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg3.png' alt='' title='' width='635' height='429'> </div> </p></div> <p>At this level, the transformations can go either way, so this graph is undirected. But the crucial point is that as soon as one imposes a fitness function, it defines a particular direction for each transformation (at least, each transformation that isn’t fitness neutral for this fitness function). And then if one starts, say, from the null rule, one will pick out a certain “evolution cone” subgraph of the original mutation graph.</p> <p>So, for example, with width as the fitness function, the subgraph one gets is what’s highlighted here:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg4.png' alt='' title='' width='495' height='278'> </div> </p></div> <p>There are several subtleties here. First, we simplified the multiway graph by doing <a href="https://reference.wolfram.com/language/ref/TransitiveReductionGraph.html">transitive reduction</a> and drawing only the minimal edges necessary to define the connectivity of the graph. 
If we want to see all possible single-mutation transformations between phenotypes we need to do <a href="https://reference.wolfram.com/language/ref/TransitiveClosureGraph.html">transitive completion</a>, in which case for the width fitness function the multiway graph we get is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg5.png' alt='' title='' width='611' height='336'> </div> </p></div> <p>But now there’s another subtlety. The edges in the multiway graph represent fitness-changing transformations. But there are also fitness-neutral transformations. And occasionally these can even lead to different (though equal-fitness) phenotypes, so that really each node in the graph above (say, the transitively reduced one) should sometimes be associated with multiple phenotypes</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg6.png' alt='' title='' width='667' height='347'> </div> </p></div> <p>which can “fitness neutrally” transform into each other, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg7.png' alt='' title='' width='580' height='318'> </div> </p></div> <p>But even this isn’t the end of the subtleties. Fitness-neutral sets typically contain many genotypes differing by changes of rule cases that don’t affect the phenotype they produce. 
But it may be that just one or a few of these genotypes are “primed” to be able to generate another phenotype with just one additional mutation. Or, in other words, each node in the multiway graph above represents a whole class of genotypes “equivalent under fitness-neutral transformations”, and when we draw an arrow it indicates that some genotype in that class can be transformed by a single mutation to some genotype in the class associated with a different phenotype:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg8.png' alt='' title='' width='664' height='125'> </div> </p></div> <p>But beyond the subtleties, the key point is that particular fitness functions in effect just define particular orderings on the underlying mutation graph. It’s somewhat like choices of reference frames or families of simultaneity surfaces in physics. Different choices of fitness function in effect define different ways in which the underlying mutation graph can be “navigated” by evolution over the course of time.</p> <p>As it happens, the results are not so different between height and width fitness functions. 
Here’s a combined multiway graph, indicating transformations variously allowed by these different fitness functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg9.png' alt='' title='' width='623' height='385'> </div> </p></div> <p>Homing in on a small part of this graph, we see that there are different “flows” associated with maximizing height and maximizing width: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024multiwayimg10.png' alt='' title='' width='297' height='309'> </div> </p></div> <p>With a single fitness function that for any two phenotypes systematically treats one phenotype as fitter than another, the multiway graph must always define a definite flow. But as soon as one considers changing fitness functions in the course of evolution, it’s possible to get cycles in the multiway graph, as in the example above—so that, in effect, “evolution can repeat itself”. </p> <h2 id="fitness-functions-based-on-aspect-ratio">Fitness Functions Based on Aspect Ratio</h2> <p>We’ve looked at fitness functions based on maximizing height and on maximizing width. But what if we try to combine these? 
Here’s a plot of the widths and heights of all phenotypes that occur in the symmetric <em>k</em> = 2, <em>r</em> = 2 case we studied above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg1.png' alt='' title='' width='636' height='404'> </div> </p></div> <p>We could imagine a variety of ways to define “fitness frontiers” here. But as a specific example, let’s consider fitness functions that are based on trying to achieve specific aspect ratios—i.e. phenotypes that are as close as possible to a particular constant-aspect-ratio line in the plot above. </p> <p>With the symmetric <em>k</em> = 2, <em>r</em> = 2 rules we’re using here, only a certain set of aspect ratios can ever be obtained:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessBimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg2.png' alt='' title='' width='638' height='36'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg3.png' alt='' title='' width='638' height='76'> </div> </p></div> <p>The corresponding phenotypes (with their aspect ratios) are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg4.png' alt='' title='' width='622' height='357'> </div> 
</p></div> <p>As we change the aspect ratio that we’re trying to achieve, the evolution multiway graph will change:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessBimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg5.png' alt='' title='' width='646' height='626'> </div> </p></div> <p>In all cases we’re starting from the null rule. For target aspect ratio 1.0 this rule itself already achieves that aspect ratio—so the multiway graph in that case is trivial. But in general, different aspect ratios yield evolution multiway graphs that are different subgraphs of the complete mutation graph we saw above. </p> <p>So if we follow all possible paths of evolution, how close can we actually get to any given target aspect ratio? This plot shows what final aspect ratios can be achieved as a function of target aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessBimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessBimg6.png' alt='' title='' width='427' height='276'> </div> </p></div> <p>And in a sense this is a summary of the effect of “developmental constraints” for “adaptive cellular automaton organisms” like this. If there were no constraints then for every target aspect ratio it’d be possible to get an “organism” with that aspect ratio—so in the plot there’d be a point lying on the red line. But in actuality the process of cellular automaton growth imposes constraints—that in particular allow only certain phenotypes, with certain aspect ratios, to exist. 
And beyond that, which phenotypes can actually be reached by adaptive evolution depends on the evolution multiway graph, with “different turns” on the graph leading to different fitness (i.e. different aspect ratio) phenotypes.</p> <p>But what the plot above shows overall is that for a certain range of target aspect ratios, adaptive evolution is successfully able to get at least close to those aspect ratios. If the target aspect ratio gets out of that range, however, “developmental constraints” come in that prevent the target from being reached.</p> <p>With “larger genomes”, i.e. rules with larger numbers of cases to specify, it’s possible to do better, and to more accurately achieve particular aspect ratios, over larger ranges of values. And indeed we can see some version of this effect even for symmetric <em>k</em> = 2, <em>r</em> = 2 rules by plotting aspect ratios that can be achieved as a function of the number of cases that need to be specified in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg7.png' alt='' title='' width='394' height='176'> </div> </p></div> <p>As an alternative visualization, we can plot the “best convergence to the target” as a function of the number of rule cases—and once again we see that larger numbers of rule cases let us get closer to target aspect ratios:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg8.png' alt='' title='' width='466' height='153'> </div> </p></div> <p>It’s worth mentioning that—just as we discussed for height and width fitness functions above—there are 
subtleties here associated with fitness-neutral sets. For example, here are sets of phenotypes that all have the specified aspect ratios—with phenotypes that can be reached by single point mutations being joined:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg9.png' alt='' title='' width='611' height='199'> </div> </p></div> <p>In the evolution multiway graphs above, we included only one phenotype for each fitness-neutral set. But here’s what we get for target aspect ratio 0.7 if we show all phenotypes with a given fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg10.png' alt='' title='' width='371' height='318'> </div> </p></div> <p>Note that on the top line, we don’t just get the null rule. Instead, we get four phenotypes, all of which, like the null rule, have aspect ratio 1, and so are equally far from the target aspect ratio 0.7. </p> <p>The picture above is only the transitively reduced graph. 
But if we include all possible transformations associated with single point mutations, we get instead:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg11.png' alt='' title='' width='452' height='318'> </div> </p></div> <p>Based on this graph, we can now make what amounts to a foliation, showing collections of phenotypes reached by a certain minimum number of mutations, progressively approaching our target aspect ratio (here 0.7):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg12.png' alt='' title='' width='309' height='339'> </div> </p></div> <p>Here’s what we get from the range of target aspect ratios shown above (where, as above, “terminal phenotypes” are highlighted):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024fitnessAimg13.png' alt='' title='' width='538' height='542'> </div> </p></div> <p>In a sense these sequences show us what phenotypes can appear at progressive stages in the “fossil record” for different (aspect-ratio) fitness functions in our very simple model. The highlighted cases are “evolutionary dead ends”. The others can evolve further.</p> <h2 id="unreachable-cases">Unreachable Cases</h2> <p>Our model takes the process of adaptive evolution to never “go backwards”, or, in other words, to never evolve from a particular genotype to one with lower fitness. 
But this means that starting with a certain genotype (say the null rule) there may be genotypes (and hence phenotypes) that will never be reached. </p> <p>With height as a fitness function, there are <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-whole-space-exhaustive-search-vs-adaptive-evolution">just two single (“orphan”) phenotypes</a> that can’t be reached:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg1.png' alt='' title='' width='524' height='370'> </div> </p></div> <p>And with width as the fitness function, it turns out the very same phenotypes also can’t be reached:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg2.png' alt='' title='' width='482' height='295'> </div> </p></div> <p>But if we use a fitness function that, for example, tries to achieve aspect ratio 0.7, we get many more phenotypes that can’t be reached starting from the null rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024unreachableimg3.png' alt='' title='' width='365' height='339'> </div> </p></div> <p>In the original mutation graph all the phenotypes appear. 
But when we foliate (or, more accurately, order) that graph using a particular fitness function, some phenotypes become unreachable by evolutionarily-possible transformations—in a rough analogy to the way some events in physics can become unreachable in the presence of an event horizon.</p> <h2 id="multiway-graphs-for-larger-rule-spaces">Multiway Graphs for Larger Rule Spaces</h2> <p>So far we’ve discussed multiway graphs here only for symmetric <em>k</em> = 2, <em>r</em> = 2 rules. There are a total of 524,288 (= 2<sup>19</sup>) possible such rules, producing <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-whole-space-exhaustive-search-vs-adaptive-evolution:~:text=77%20distinct%20phenotypic%20patterns">77 distinct phenotypes</a>. But what about larger classes of rules? As an example, we can consider all <em>k</em> = 2, <em>r</em> = 2 rules, without the constraint of symmetry. 
There are 2,147,483,648 (= 2<sup>31</sup>) possible such rules, and there turn out to be<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg3_copy.txt' data-c2c-type='text/html'>3137</span>distinct phenotypes.</p> <p>For the height fitness function, the complete multiway graph in this case is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg3.png' alt='' title='' width='623' height='420'> </div> </p></div> <p>or, annotated with actual phenotypes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg4_copy.txt' data-c2c-type='text/html'> <a class="magnific iframe" href="https://www.wolframcloud.com/obj/blog-posts/BiologicalEvolution2/k2r2/index.html" style="border:none;"></p> <figure><img src="https://www.wolframcloud.com/obj/blog-posts/BiologicalEvolution2/k2r2/k2r2.png" width='663' height='458' loading="lazy" alt="Click to enlarge" title="Click to enlarge"><font size="-1"></p> <figcaption><em>Click image to zoom and pan</em></figcaption> <p></font></figure> <p></a> </div> </p></div> <p>If instead we just show bounding boxes, it’s easier to see where long-lifetime phenotypes occur:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg5.png' alt='' title='' width='664' height='451'> </div> </p></div> <p>With a different graph layout the evolution multiway graph (with initial node indicated) becomes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg6.png' alt='' title='' width='603' height='403'> </div> </p></div> <p>One subtlety here is that the null rule has <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-whole-space-exhaustive-search-vs-adaptive-evolution:~:text=procedure%20never%20reaches%20the%20null%20rule">no successors with single point mutation</a>. When we were talking about symmetric <em>k</em> = 2, <em>r</em> = 2 rules, we took a “single point mutation” always to change both a particular rule case and its mirror image. But if we don’t have the symmetry requirement, a single point mutation really can just change a single rule case. And if we start from the null rule and look at the results of changing just one bit (i.e. the output of just one rule case) in all possible ways, we find that we either get the same pattern as with the null rule, or we get a pattern that grows without bound:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg7.png' alt='' title='' width='632' height='73'> </div> </p></div> <p>Or, put another way, we can’t get anywhere with single bit mutations starting purely from the null rule. 
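</p> <p>As a rough check of that claim, here is a minimal Python sketch (not the code behind the figures above). It assumes that a <em>k</em> = 2, <em>r</em> = 2 rule is stored as an integer whose bit <em>v</em> gives the output for the 5-cell neighborhood with binary value <em>v</em>, that patterns start from a single black cell, and that a pattern still alive after 50 steps counts as growing without bound. The names <code>step</code> and <code>lifetime</code> are purely illustrative.</p>

```python
# Sketch under stated assumptions: rule bit v = output for the 5-cell
# neighborhood whose cells, read left to right, spell the binary number v.

def step(cells, rule):
    """Apply one step of a k = 2, r = 2 rule to a set of black-cell positions."""
    if not cells:
        return set()
    new = set()
    # Cells further out see an all-white neighborhood, whose output stays 0 here.
    for x in range(min(cells) - 2, max(cells) + 3):
        v = 0
        for dx in (-2, -1, 0, 1, 2):
            v = 2 * v + (1 if x + dx in cells else 0)
        if (rule >> v) % 2 == 1:
            new.add(x)
    return new

def lifetime(rule, max_steps=50):
    """Steps until the pattern dies out, or None if it is still alive at max_steps."""
    cells = {0}  # single black cell as the initial condition (assumption)
    for t in range(1, max_steps + 1):
        cells = step(cells, rule)
        if not cells:
            return t
    return None

NULL_RULE = 0
# All 31 single-bit mutants; the all-white neighborhood (bit 0) is kept at 0.
mutants = [NULL_RULE + 2 ** v for v in range(1, 32)]
survivors = [m for m in mutants if lifetime(m) is None]
died = [m for m in mutants if lifetime(m) == 1]
print(len(survivors), len(died))
```

<p>Under these assumptions, the only mutants that survive are the ones whose new rule case is a neighborhood containing exactly one black cell, and each of these simply carries a single cell along forever. Every other mutant dies at the first step, just like the null rule.</p> <p>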
So what we’ve done is instead to start our multiway graph from <em>k</em> = 2, <em>r</em> = 2 rule 20, which has two bits “on”, and gives phenotype:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg8.png' alt='' title='' width='65' height='39'> </div> </p></div> <p>But starting from this, just one mutation (together with a sequence of fitness-neutral mutations) is sufficient to give<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg4_copy.txt' data-c2c-type='text/html'>94</span>phenotypes—or<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg5_copy.txt' data-c2c-type='text/html'>49</span>after removing mirror images: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg9.png' alt='' title='' width='685' height='181'> </div> </p></div> <p>The total number of new phenotypes we can reach after successively more (non-fitness-neutral) mutations is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg10.png' alt='' title='' width='228' height='75'> </div> </p></div> <p>while the successive <a 
href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-whole-space-exhaustive-search-vs-adaptive-evolution">longest-lifetime patterns</a> are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg11.png' alt='' title='' width='602' height='236'> </div> </p></div> <p>And what we see here is that it’s in principle possible to achieve long lifetimes even with fairly few mutations. But when the mutations are done at random, it can still take a very large number of steps to successfully “random walk” to long lifetime phenotypes. </p> <p>And out of a total of<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg6_copy.txt' data-c2c-type='text/html'>2407</span>distinct phenotypes,<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg7_copy.txt' data-c2c-type='text/html'>984</span>are “dead ends” where no further evolution is possible. 
Some of these dead ends have long lifetimes</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg12.png' alt='' title='' width='600' height='169'> </div> </p></div> <p>but others have very short lifetimes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg13.png' alt='' title='' width='634' height='57'> </div> </p></div> <p>There’s much more to explore in this multiway graph—and we’ll continue a bit below. But for now let’s look at another evolution multiway graph of accessible size: the one for symmetric <nobr><em>k</em> = 3,</nobr> <em>r</em> = 1 rules. There are a total of 129,140,163 (= 3<sup>17</sup>) possible such rules, that yield a total of<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg9_copy.txt' data-c2c-type='text/html'>14,778</span>distinct phenotypes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg2_copy.txt' data-c2c-type='text/html'> <a class="magnific iframe" href="https://www.wolframcloud.com/obj/blog-posts/BiologicalEvolution2/k3r1/zoom.html" style="border:none;"></p> <figure><img loading="lazy" src="https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg2.png" alt="Click to enlarge" title="Click to enlarge" width="633" height="435"><font size="-1"></p> <figcaption><em>Click image to zoom and pan</em></figcaption> <p></font></figure> <p></a> </div> </p></div> <p>Showing only bounding boxes of patterns this becomes:</p> <div> 
<div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12122024C2Cupdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg3.png' alt='' title='' width='631' height='425'> </div> </p></div> <p>Unlike the <em>k</em> = 2, <em>r</em> = 2 case, we can now start this whole graph with the null rule. However, if we look at all possible symmetric <em>k</em> = 3, <em>r</em> = 1 rules, there turn out to be 6 “isolates” that can’t be reached from the null rule by adaptive evolution with the height fitness function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg17.png' alt='' title='' width='364' height='279'> </div> </p></div> <p>Starting from the null rule, the number of phenotypes reached after successively more (non-fitness-neutral) mutations is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg18.png' alt='' title='' width='274' height='87'> </div> </p></div> <p>and the successive <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-whole-space-exhaustive-search-vs-adaptive-evolution:~:text=maximum%20lifetime%20found%20is%20not%20just%20308%2C%20but%202194">longest-lived of these phenotypes</a> are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg19_copy.txt' data-c2c-type='text/html'> <img 
loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024ruleimg19.png' alt='' title='' width='594' height='295'> </div> </p></div> <h2 id="aspect-ratio-fitness">Aspect Ratio Fitness</h2> <p>Just as we looked at fitness functions based on aspect ratio above for symmetric <em>k</em> = 2, <em>r</em> = 2 rules, so now we can do this for the whole space of all possible <em>k</em> = 2, <em>r</em> = 2 rules. Here’s a plot of the heights and widths of patterns that can be achieved with these rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg1.png' alt='' title='' width='668' height='421'> </div> </p></div> <p>These are the possible aspect ratios this implies:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg2.png' alt='' title='' width='648' height='35'> </div> </p></div> <p>And here’s their distribution (on a log scale):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg3.png' alt='' title='' width='646' height='78'> </div> </p></div> <p>The range of possible values extends much further than for symmetric <em>k</em> = 2, <em>r</em> = 2 rules: <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg4.png' width= '53' height='34' align='absmiddle'></span> to <span class='InlineFormula'><img 
src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg5.png' width= '80' height='37' align='absmiddle'></span> rather than <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg6.png' width= '53' height='34' align='absmiddle'></span> to <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg7.png' width= '68' height='36' align='absmiddle'></span>. The patterns now with the largest aspect ratios are</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg8.png' alt='' title='' width='658' height='260'> </div> </p></div> <p>while those with the smallest aspect ratios are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg9.png' alt='' title='' width='661' height='60'> </div> </p></div> <p>Note that just as for symmetric <em>k</em> = 2, <em>r</em> = 2 rules, to reach a wider range of aspect ratios, more cases in the rule have to be specified:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg10.png' alt='' title='' width='404' height='184'> </div> </p></div> <p>So what happens if we use adaptive evolution to try to reach different possible target aspect ratios? 
Most of the time (at least up to aspect ratio ≈ 3) there’s some sequence of mutations that will do it—though often we can get stuck at a different aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12122024C2Cupdateimg2_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg11.png' alt='' title='' width='650' height='283'> </div> </p></div> <p>If we look at the “best convergence” to a given target aspect ratio then we see that this improves as we increase the number of cases specified in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg12.png' alt='' title='' width='659' height='215'> </div> </p></div> <p>So what does the multiway graph look like for a fitness function associated with a particular aspect ratio? Here’s the result for aspect ratio 3:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg13.png' alt='' title='' width='636' height='425'> </div> </p></div> <p>The initial node involves patterns with aspect ratio 1—actually a fitness-neutral set of 263 of them. And as we go through the multiway graph, the aspect ratios get nearer to 3. 
The very closest they get, though, are for the patterns (whose locations are indicated on the graph):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg14.png' alt='' title='' width='158' height='245'> </div> </p></div> <p>But actually (as we saw in the lineup above), there is a rule that gives aspect ratio exactly 3:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg15.png' alt='' title='' width='66' height='195'> </div> </p></div> <p>But it turns out that this rule can’t be reached by adaptive evolution using single point mutations. In effect, adaptive evolution isn’t “strong enough” to achieve the exact aspect ratio we want; we can think of it as being “unpredictably prevented” by computationally irreducible “developmental constraints”.</p> <p>OK, so what about the <a href="https://writings.internal.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#symmetric">symmetric <em>k</em> = 3, <em>r</em> = 1 rules</a>? 
Here’s how they’re distributed in width and height:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg16.png' alt='' title='' width='698' height='435'> </div> </p></div> <p>And, yes, in a typical “there are always surprises” story, there’s a strange height 265, width 173 pattern that shows up:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg17.png' alt='' title='' width='405' height='284'> </div> </p></div> <p>The overall possible aspect ratios are now</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg18.png' alt='' title='' width='648' height='35'> </div> </p></div> <p>and their (log) distribution is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg19.png' alt='' title='' width='646' height='78'> </div> </p></div> <p>The phenotypes with the largest aspect ratios are</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg20.png' alt='' title='' width='652' height='388'> </div> 
</p></div> <p>while those with the smallest aspect ratios are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg21.png' alt='' title='' width='729' height='44'> </div> </p></div> <p>Once again, to reach a larger range of aspect ratios, one has to specify more cases in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg22.png' alt='' title='' width='453' height='201'> </div> </p></div> <p>If we try to target a certain aspect ratio, there’s somewhat more of a tendency to get stuck than for <em>k</em> = 2, <em>r</em> = 2 rules—perhaps somewhat as a result of there now being fewer total rules (though more phenotypes) available:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12122024C2Cupdateimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024aspectimg23.png' alt='' title='' width='609' height='264'> </div> </p></div> <h2 id="branching-in-the-multiway-evolution-graph">Branching in the Multiway Evolution Graph</h2> <p>Looking at a typical multiway evolution graph such as</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg1.png' alt='' title='' width='596' height='419'> </div> </p></div> <p>we see that different phenotypes can be quite separated in the graph—a 
bit like organisms on different branches of the tree of life in actual biology. But how can we characterize this separation? One approach is to compute the so-called <a href="https://reference.wolfram.com/language/ref/DominatorTreeGraph.html">dominator tree</a> of the graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg2.png' alt='' title='' width='644' height='453'> </div> </p></div> <p>We can think of this as a way to provide a map of the least common ancestors of all nodes. The tree is set up so that given two nodes you just trace up the tree to find their common ancestor. Another interpretation of the tree is that it shows you what nodes you have no choice but to pass through in getting from the initial node to any given node—or, in other words, what phenotypes adaptive evolution has to produce on the way to a given phenotype. </p> <p>Here’s another rendering of the tree:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg3.png' alt='' title='' width='425' height='450'> </div> </p></div> <p>We can think of this as the analog of the biological tree of life, with successive branchings picking out finer and finer “taxonomic domains” (analogous to kingdoms, phyla, etc.)</p> <p>The tree also shows us something else: how significant different links or nodes are—and how much of the tree one would “lop off” if they were removed. 
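</p> <p>To make the "lop off" idea concrete, here is a small Python sketch on a made-up five-node graph. It is a plain stand-in for what a function like <code>DominatorTreeGraph</code> computes, using the simple iterative set-intersection method rather than any particular library's algorithm. The graph, the node names and the helper names <code>dominator_tree</code> and <code>lopped_off</code> are all hypothetical.</p>

```python
def dominator_tree(succ, root):
    """Immediate dominators of all nodes reachable from root (iterative method)."""
    # Collect reachable nodes.
    order, seen, stack = [], {root}, [root]
    while stack:
        n = stack.pop()
        order.append(n)
        for m in succ.get(n, ()):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    # Predecessor lists restricted to reachable nodes.
    preds = {n: [] for n in order}
    for n in order:
        for m in succ.get(n, ()):
            preds[m].append(n)
    # dom[n] = set of nodes that every path from root to n must pass through.
    dom = {n: set(order) for n in order}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in order:
            if n == root:
                continue
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    # The immediate dominator is the strict dominator closest to n,
    # i.e. the one with the largest dominator set of its own.
    idom = {}
    for n in order:
        if n != root:
            idom[n] = max(dom[n] - {n}, key=lambda d: len(dom[d]))
    return idom

def lopped_off(idom, node):
    """Number of nodes cut off from the root if `node` is blocked (its subtree size)."""
    count = 0
    for n in idom:
        d = n
        while d is not None:
            if d == node:
                count += 1
                break
            d = idom.get(d)
    return count

# Hypothetical toy "evolution graph": two routes from "null" to "c", then on to "d".
graph = {"null": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
idom = dominator_tree(graph, "null")
print(idom["c"], idom["d"], lopped_off(idom, "c"), lopped_off(idom, "a"))
```

<p>In this toy graph, blocking <code>a</code> removes only <code>a</code> itself, because <code>c</code> can still be reached through <code>b</code>, whereas blocking <code>c</code> also cuts off <code>d</code>.</p> <p>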
Or, put a different way, how much would be achieved by blocking a certain link or node—as one might imagine doing to try to block the evolution of bacteria or tumor cells?</p> <p>What if we look at larger multiway evolution graphs, like the complete <em>k</em> = 2, <em>r</em> = 2 one? Once again we can construct a dominator tree:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg4.png' alt='' title='' width='457' height='462'> </div> </p></div> <p>It’s notable that there’s tremendous variance in the “fan out” here, with the phenotypes with largest successor counts being the rather undistinguished:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg5.png' alt='' title='' width='655' height='59'> </div> </p></div> <p>But what if one’s specifically trying to reach, say, one of the maximum lifetime (length<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg11_copy.txt' data-c2c-type='text/html'>308)</span> phenotypes? 
Well, then one has to follow the paths in a particular subgraph of the original multiway evolution graph</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg6.png' alt='' title='' width='572' height='386'> </div> </p></div> <p>corresponding to the phenotype graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024branchingimg7.png' alt='' title='' width='540' height='425'> </div> </p></div> <p>If one goes off this “narrow path” then one simply can’t reach the length-308 phenotype; one inevitably gets stuck in what amounts to another branch of the analog of the “tree of life”. So if one is trying to “guide evolution” to a particular outcome, this tells one that one needs to block off lots of “exit ramps”.</p> <p>But what “fraction of the whole graph” is the subgraph that leads to the length-308 phenotype? The whole graph has<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg12_copy.txt' data-c2c-type='text/html'>2409</span>vertices and 3878 edges, while the subgraph has 64 vertices and 119 edges, i.e. in both cases about 3%. A different measure is what fraction of all paths through the graph lead to the length-308 phenotype. The total number of paths is<span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg15_copy.txt' data-c2c-type='text/html'>606,081,</span>while the number leading to the length-308 phenotype is 1260, or about 0.2%. 
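</p> <p>The percentages quoted here follow directly from the counts in the text, as this short Python check shows (the variable names are just for illustration):</p>

```python
# Counts taken from the text above.
graph_vertices, graph_edges = 2409, 3878
sub_vertices, sub_edges = 64, 119
total_paths, target_paths = 606081, 1260

vertex_fraction = sub_vertices / graph_vertices
edge_fraction = sub_edges / graph_edges
path_fraction = target_paths / total_paths

# Prints: 2.7% 3.1% 0.21%
print(f"{vertex_fraction:.1%} {edge_fraction:.1%} {path_fraction:.2%}")
```

<p>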
Does this tell us what the probability of reaching that phenotype will be if we just make a random sequence of mutations? Not quite, because in the multiway evolution graph many equivalencings have been done, notably for fitness-neutral sets. And if we don’t do such equivalencings, it turns out (as we’ll discuss below) that the corresponding number is significantly smaller—about 0.007%.</p> <h2 id="exact-match-fitness-functions">Exact-Match Fitness Functions</h2> <p>The fitness functions we’ve been considering so far look only at coarse features of phenotype patterns—like their height, width and aspect ratio. But what happens if we have a fitness function that’s maximal only for a phenotype that exactly matches a particular pattern? </p> <p>As an example, let’s consider <em>k</em> = 2, <em>r</em> = 1 cellular automata with phenotypes grown for a specific number of steps—and with a fitness function that counts the number of cells that agree with ones in a target:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg1.png' alt='' title='' width='555' height='106'> </div> </p></div> <p>Let’s say we start with the null rule, then adaptively evolve by making single point mutations to the rule (here just 8 bits). With a target of the rule 30 pattern, this is the multiway graph we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg2.png' alt='' title='' width='599' height='617'> </div> </p></div> <p>And what we see is that after a grand tour of nearly a third of all possible rules, we can successfully reach the rule 30 pattern. 
But we can also get stuck at rule 86 and rule 190 patterns—even though their fitness values are much lower:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg3.png' alt='' title='' width='692' height='139'> </div> </p></div> <p>If we consider all possible <em>k</em> = 2, <em>r</em> = 1 cellular automaton patterns as targets, it turns out that these can always be reached by adaptive evolution from the null rule—though a little less than half the time there are other possible endpoints (here specified by rule numbers) at which the evolution process can get stuck:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg4.png' alt='' title='' width='659' height='431'> </div> </p></div> <p>So far we’ve been assuming that we have a fitness function that’s maximized by matching some pattern generated by a cellular automaton. But what if we pick some quite different pattern to match against? 
Say our pattern is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg5.png' alt='' title='' width='96' height='109'> </div> </p></div> <p>With <em>k</em> = 2, <em>r</em> = 1 rules (running with wraparound in a finite-size region), we can construct a multiway graph</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg6.png' alt='' title='' width='451' height='453'> </div> </p></div> <p>and find out that the maximum fitness endpoints are the not-very-good approximations: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg7.png' alt='' title='' width='106' height='55'> </div> </p></div> <p>We can also get to these by applying random mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg8.png' alt='' title='' width='416' height='207'> </div> </p></div> <p>But what if we try a larger rule space, say <em>k</em> = 2, <em>r</em> = 2 rules? 
Our approximations to the “A” image get a bit better:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg9.png' alt='' title='' width='608' height='377'> </div> </p></div> <p>Going to <em>k</em> = 2, <em>r</em> = 3 leads to slightly better (but not great) final approximations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg10.png' alt='' title='' width='566' height='43'> </div> </p></div> <p>If we try to do the same thing with our target instead being</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg11.png' alt='' title='' width='43' height='57'> </div> </p></div> <p>we get for example</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg12.png' alt='' title='' width='566' height='50'> </div> </p></div> <p>while with target</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg13.png' alt='' title='' width='52' height='59'> </div> </p></div> <p>we get (even less convincing) results like:</p> 
<div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024exactimg14.png' alt='' title='' width='566' height='43'> </div> </p></div> <p>What’s going on here? Basically it’s that if we try to set up too intricate a fitness function, then our rule spaces won’t contain rules that successfully maximize it, and our adaptive evolution process will end up with a variety of not-very-good approximations.</p> <h2 id="how-fitness-builds-up">How Fitness Builds Up</h2> <p>When one looks at an evolution process like</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg1.png' alt='' title='' width='487' height='209'> </div> </p></div> <p>one typically has the impression that successive phenotypes are achieving greater fitness by somehow progressively “building on the ideas” of earlier ones. And to get a more granular sense of this we can highlight cells at each step that are using “newly added cases” in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg2.png' alt='' title='' width='615' height='331'> </div> </p></div> <p>We can think of new rule cases as a bit like new genes in biology. 
So what we’re seeing here is the analog of new genes switching on (or coming into existence) as we progress through the process of biological evolution.</p> <p>Here’s what happens for some other paths of evolution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg4.png' alt='' title='' width='632' height='1241'> </div> </p></div> <p>What we see is quite variable. There are a few examples where new rule cases show up only at the end, as if a new “incrementally engineered” pattern was being “grafted on at the end”. But most of the time new rule cases show up sparsely dotted all over the pattern. And somehow those few “tweaks” lead to higher fitness—even though there’s no obvious reason why, and no obvious way to predict where they should be.</p> <p>It’s interesting to compare this with actual biology, where it’s pretty common to see what appear to be “random gratuitous changes” between apparently very similar organisms. (And, yes, this can lead to all sorts of problems in things like comparing toxicity or drug effectiveness in model animals versus humans.) </p> <p>There are many ways to consider quantitatively characterizing how “rule utilization” builds up. 
As just one example, here are plots, for successive phenotypes along the evolution paths shown above, of the stages of growth at which new rule cases show up:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' style="margin-bottom: -15px" src='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg4.png' alt='' title='' width='398' height='116'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024fitnessimg5.png' alt='' title='' width='670' height='199'> </div> </p></div> <h2 id="but-is-it-explainable">But Is It Explainable?</h2> <p>Here are two “adaptively evolved” long-lifetime rules that we discussed at the beginning:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg1.png' alt='' title='' width='370' height='430'> </div> </p></div> <p>We can always run these rules and see what patterns they produce. But is there a way to explain what they do? And for example to analyze how they manage to yield such long lifetimes? Or is what we’re seeing in these rules basically “pure computational irreducibility” where the only way to tell what patterns they will generate—and how long they’ll live—is just explicitly to run them step by step?</p> <p>The second rule here seems to have a bit more regularity than the first, so let’s tackle it first. Let’s look at the “blade” part. 
Once such an object—of any width—has formed, its behavior will basically be repetitive, and it’s easy to predict what will happen:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg2.png' alt='' title='' width='103' height='253'> </div> </p></div> <p>The left-hand edge moves by 1 position every 7 steps, and the right-hand edge by 4 positions every 12 steps. And since <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg3.png' width= '41' height='34' align='absmiddle'></span>, however wide the initial configuration is, it’ll always die out, after a number of steps that’s roughly <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg4.png' width= '67' height='34' align='absmiddle'></span> times the initial width.</p> <p>But OK, how does a configuration like this get produced? Well, that’s far from obvious. Here’s what happens with a sequence of few-cell initial conditions <img loading='lazy' style="margin-bottom: -3px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024commaboxesA.png' alt='' title='' width='91' height='15'> …:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg5.png' alt='' title='' width='674' height='103'> </div> </p></div> <p>So, yes, it doesn’t always directly make the “blade”. 
Sometimes, for example, it instead makes things like these, some of which basically just become repetitive, and live forever:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg6.png' alt='' title='' width='641' height='323'> </div> </p></div> <p>And even if it starts with a “blade texture” unexpected things can happen:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg7.png' alt='' title='' width='99' height='367'> </div> </p></div> <p>There are <a href="https://www.wolframscience.com/nks/p268--special-initial-conditions/">repetitive patterns that can persist</a>—and indeed the “blade” uses one of these:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg8.png' alt='' title='' width='566' height='104'> </div> </p></div> <p>Starting from a random initial condition one sees various kinds of behavior, with the blade being fairly common:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg9.png' alt='' title='' width='678' height='287'> </div> </p></div> <p>But none of this really makes much of a dent in “explaining” why with this rule, starting from a single red cell, 
we get a long-lived pattern. Yes, once the “blade” forms, we know it’ll take a while to come to a point. But beyond this little pocket of computational reducibility we can’t say much in general about what the rule does—or why, for example, a blade forms with this initial condition.</p> <p>So what about our other rule? There’s no obvious interesting pocket of reducibility there at all. Looking at a sequence of few-cell initial conditions we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg10.png' alt='' title='' width='641' height='118'> </div> </p></div> <p>And, yes, there’s all sorts of different behavior that can occur:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg11.png' alt='' title='' width='358' height='344'> </div> </p></div> <p>The first of these patterns is basically periodic, simply shifting 2 cells to the left every 56 steps. The third one dies out after 369 steps, and the fourth one becomes basically periodic (with period 56) after 1023 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg12.png' alt='' title='' width='190' height='430'> </div> </p></div> <p>If we start from a random initial condition we see a few places where things die out in a repeatable pattern. 
But mostly everything just looks very complicated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg13.png' alt='' title='' width='678' height='287'> </div> </p></div> <p><a href="https://www.wolframscience.com/nks/p267--special-initial-conditions/">As always happens</a>, the rule supports regions of repetitive behavior, but they don’t normally extend far enough to introduce any significant computational reducibility:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024explainableimg14.png' alt='' title='' width='319' height='95'> </div> </p></div> <p>So what’s the conclusion? Basically it’s that these rules—like pretty much all others we’ve seen here—behave in essentially computationally irreducible ways. Why do they have long lifetimes? All we can really say is “because they do”. Yes, we can always run them and see what happens. But we can’t make any kind of “explanatory theory”, for example of the kind we’re used to in mathematical approaches to physics.</p> <h2 id="distribution-in-morphospace">Distribution in Morphospace</h2> <p>We can think of the pattern of growth seen in each phenotype as defining what we might call in biology its “<a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-6--growth-of-plants-and-animals">morphology</a>”. So what happens if we try to operate as “pure taxonomists”, laying out different phenotypes in “morphospace”? 
Here’s a result based on using <a href="https://reference.wolfram.com/language/ref/FeatureSpacePlot.html">machine learning and <tt>FeatureSpacePlot</tt></a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg1.png' alt='' title='' width='645' height='399'> </div> </p></div> <p>And, yes, this tends to group “visually similar” phenotypes together. But how does proximity in morphospace relate to proximity in genotypes? Here is the same arrangement of phenotypes as above, but now indicating the transformations associated with single mutations in genotype:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg2.png' alt='' title='' width='649' height='376'> </div> </p></div> <p>If for example we consider maximizing for height, only some of the phenotypes are picked out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg3.png' alt='' title='' width='356' height='220'> </div> </p></div> <p>For width, a somewhat different set is picked out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg4.png' alt='' title='' width='358' height='221'> </div> </p></div> <p>And here is 
what happens if our fitness function is based on aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg5.png' alt='' title='' width='666' height='216'> </div> </p></div> <p>In other words, different fitness functions “select out” different regions in morphospace. </p> <p>We can also construct a morphospace not just for symmetric but for all <em>k</em> = 2, <em>r</em> = 2 rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024nikupdatesimg5A.png' alt='' title='' width='648' height='391'> </div> </p></div> <p>The detailed pattern here is not particularly significant, and, more than anything, just reflects the method of dimension reduction that we’ve used. What is more meaningful, however, is how different fitness functions select out different regions in morphospace. 
This shows the results for fitness functions based on height and on width—with points colored according to the actual values of height and width for those phenotypes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceCimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024morphospaceimg8.png' alt='' title='' width='613' height='209'> </div> </p></div> <p>Here are the corresponding results for fitness functions based on different aspect ratios, where now the coloring is based on closeness to the target aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12062024colorupdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12062024colorupdateimg1.png' alt='' title='' width='576' height='226'> </div> </p></div> <p>What’s the main conclusion here? We might have expected that different fitness functions would cleanly select visibly different parts of morphospace. But at least with our machine-learning-based way of laying out morphospace that’s not what we’re seeing. And it seems likely that this is actually a general result—and that there is no layout procedure that can make any “easy to describe” fitness function “geometrically simple” in morphospace. 
And once again, this is presumably a consequence of underlying computational irreducibility—and of the fact that we can’t expect any morphospace layout procedure to be able to provide a way to “untangle the irreducibility” that will work for all fitness functions.</p> <h2 id="probabilities-and-the-time-course-of-evolution">Probabilities and the Time Course of Evolution</h2> <p>In what we’ve done so far, we’ve mostly been concerned with things like what sequences of phenotypes can ever be produced by adaptive evolution. But in making analogies to actual biological evolution—and particularly to how it’s captured in the fossil record—it’s also relevant to discuss time, and to ask not only what phenotypes can be produced, but also when, and how frequently. </p> <p>For example, let’s assume there’s a constant rate of point mutations in time. Then starting from a given rule (like the null rule) there’ll be a certain rate at which transitions to other rules occur. Some of these transitions will lead to rules that are selected out. Others will be kept, but will yield the same phenotype. And still others will lead to transitions to different phenotypes.</p> <p>We can represent this by a “phenotype transition diagram” in which the thickness of each outgoing edge from a given phenotype indicates the fraction of all possible mutations that lead to the transition associated with that edge: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg1.png' alt='' title='' width='625' height='447'> </div> </p></div> <p>Gray self-loops in this diagram represent transitions that lead back to the same phenotype (because they change cases in the rule that don’t matter). Pink self-loops correspond to transitions that lead to rules that are selected out. 
We don’t show rules that have been selected out here; instead we assume that in this case we just “wait at the original phenotype” and don’t make a transition. </p> <p>We can annotate the whole symmetric <em>k</em> = 2, <em>r</em> = 2 multiway evolution graph with transition probabilities:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg2.png' alt='' title='' width='577' height='625'> </div> </p></div> <p>Underlying this graph is a matrix of transition probabilities between all 2<sup>19</sup> possible symmetric <nobr><em>k</em> = 2, <em>r</em> = 2</nobr> rules (where the structure reflects the fact that many rules transform to rules which differ only by one bit):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg4.png' alt='' title='' width='187' height='187'> </div> </p></div> <p>Keeping only distinct phenotypes and ordering by lifetime, we can then make a matrix of phenotype transition probabilities:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg5.png' alt='' title='' width='203' height='203'> </div> </p></div> <p>Treating the transitions as a Markov process, this allows us to compute the expected frequency of each phenotype as a function of time (i.e. 
number of mutations):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg6.png' alt='' title='' width='619' height='331'> </div> </p></div> <p>What’s basically happening here is that there’s steady evolution away from the single-cell phenotype. There are some intermediate phenotypes that come and go, but in the end, everything “flows” to the final (“leaf”) phenotypes on the multiway evolution graph—leading to a limiting “equilibrium” probability distribution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg7.png' alt='' title='' width='572' height='188'> </div> </p></div> <p>Stacking the different curves, we get an alternative visualization of the evolution of phenotype frequencies:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg8.png' alt='' title='' width='468' height='291'> </div> </p></div> <p>If we were “running evolution” with enough separate individuals, these would be the limiting curves we’d get. If we reduced the number of individuals, we’d start to see fluctuations—and there’d be a certain probability, for example, for a particular phenotype to end up with zero individuals, and effectively go extinct.</p> <p>So what happens with a different fitness function? 
Here’s the result using width instead of height:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg9.png' alt='' title='' width='633' height='339'> </div> </p></div> <p>And here are results for fitness functions based on a sequence of targets for aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg10.png' alt='' title='' width='618' height='364'> </div> </p></div> <p>And, yes, the fitness function definitely influences the time course of our adaptive evolution process.</p> <p>So far we’ve been looking only at symmetric <em>k</em> = 2, <em>r</em> = 2 rules. If we look at the space of all possible <em>k</em> = 2, <em>r</em> = 2 rules, the behavior we see is similar. For example, here’s the time evolution of possible phenotypes based on our standard height fitness function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg11.png' alt='' title='' width='646' height='352'> </div> </p></div> <p>And this is what we see if we look only at the longest-lifetime (i.e. 
largest-height) cases:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg12.png' alt='' title='' width='439' height='226'> </div> </p></div> <p>As the scale here indicates, such long-lived phenotypes are quite rare—though most still occur with nonzero frequency even after arbitrarily large times (which is inevitable given that they appear as “maximal fitness” terminal nodes in the multiway graph). </p> <p>And indeed if we plot the final frequencies of phenotypes against their lifetimes we see that there is a wide range of different cases:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg13.png' alt='' title='' width='645' height='400'> </div> </p></div> <p>The phenotypes with the highest “equilibrium” frequencies are</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesXimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024probabilitiesBimg14.png' alt='' title='' width='596' height='195'> </div> </p></div> <p>with some having fairly small lifetimes, and others larger. </p> <h2 id="the-macroscopic-flow-of-evolution">The Macroscopic Flow of Evolution</h2> <p>In the previous section, we looked at the time course of evolution with various different—but fixed—fitness functions. But what if we had a fitness function that changes with time—say analogous to an environment for biological evolution that changes with time? 
</p> <p>Here’s what happens if we have an aspect ratio fitness function whose target value increases linearly with time:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg1.png' alt='' title='' width='646' height='353'> </div> </p></div> <p>The behavior we see is quite complex, with certain phenotypes “winning for a while” but then dying out, often quite precipitously—with others coming to take their place.</p> <p>If instead the target aspect ratio decreases with time, we see rather different behavior: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg2.png' alt='' title='' width='560' height='348'> </div> </p></div> <p>(The discontinuous derivatives here are basically associated with the sudden appearance of new phenotypes at particular target aspect ratio values.)</p> <p>It’s also possible to give a “shock to the system” by suddenly changing the target aspect ratio:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg3.png' alt='' title='' width='620' height='184'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg4.png' alt='' title='' width='620' 
height='184'> </div> </p></div> <p>And what we see is that sometimes this shock leads to fewer surviving phenotypes, and sometimes to more.</p> <p>We can think of a changing fitness function as being something that applies a “macroscopic driving force” to our system. Things happen quickly down at the level of individual mutation and selection events—but the fitness function defines overall “goals” for the system that in effect change only slowly. (It’s a bit like a fluid where there are fast molecular-scale processes, but typically slow changes of macroscopic parameters like pressure.) </p> <p>But if the fitness function defines a goal, how well does the system manage to meet it? Here’s a comparison between an aspect ratio goal (here, linearly increasing) and the distribution of actual aspect ratios achieved, with the darker curve indicating the mean aspect ratio obtained by a weighted average over phenotypes, and the lighter blue area indicating the standard deviation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg5.png' alt='' title='' width='409' height='257'> </div> </p></div> <p>And, yes, as we might have expected from earlier results, the system doesn’t do particularly well at achieving the goal. Its behavior is ultimately not “well sculpted” by the forces of a fitness function; instead it is mostly dominated by the intrinsic (computationally irreducible) dynamics of the underlying adaptive evolution process.</p> <p>One important thing to note however is that our results depend on the value of a parameter: essentially the rate at which underlying mutations occur relative to the rate of change of the fitness function. In the picture above 5000 mutations occur over the time the fitness function goes from minimum to maximum value. 
This is what happens if we change the number of mutations that occur (or, in effect, the “mutation rate”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024macroscopicimg6.png' alt='' title='' width='603' height='145'> </div> </p></div> <p>Generally—and not surprisingly—adaptive evolution does better at achieving the target when the mutation rate is higher, though in both of the cases shown here, nothing gets terribly close to the target.</p> <p>In their general character our results here seem reminiscent of what one might expect in typical studies of continuum systems, say based on differential equations. And indeed one can imagine that there might be “continuum equations of adaptive evolution” that govern situations like the ones we’ve seen here. But it’s important to understand that it’s far from self-evident that this is possible. Because underneath everything is a multiway evolution graph with a definite and complicated structure. And one might think that the details of this structure would matter to the overall “continuum evolution process”. And indeed sometimes they will. </p> <p>But—as we have seen throughout our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>—underlying computational irreducibility leads to a certain inevitable simplicity when looking at phenomena perceived by <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">computationally bounded observers</a>. And we can expect that something similar can happen with biological evolution (and indeed <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/">adaptive evolution in general</a>). 
Assuming that our fitness functions (and their process of change) are computationally bounded, then we can expect that their “aggregate effects” will follow comparatively simple laws—which we can perhaps think of as laws for the “flow of evolution” in response to external input. </p> <h2 id="can-evolution-be-reversed">Can Evolution Be Reversed?</h2> <p>In the previous section we saw that with different fitness functions, different time series of phenotypes appear, with some phenotypes, for example, sometimes “going extinct”. But let’s say evolution has proceeded to a certain point with a particular fitness function—and certain phenotypes are now present. Then one question we can ask is whether it’s possible to “reverse” that evolution, and revert to phenotypes that were present before. In other words, if we change the fitness function, can we make evolution “go backwards”?</p> <p>We’ve often discussed a fitness function based on maximizing total (finite) lifetime. But what if, after using this fitness function for a while, we “reverse it”, now minimizing total lifetime? </p> <p>Consider the multiway evolution graph for symmetric <em>k</em> = 2, <em>r</em> = 2 rules starting from the null rule, with the fitness function yet again being maximizing lifetime:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg1.png' alt='' title='' width='435' height='306'> </div> </p></div> <p>But what if we now say the fitness function minimizes lifetime? 
If we start from the longest-lifetime phenotype we get the “lifetime minimization” multiway graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg2.png' alt='' title='' width='123' height='346'> </div> </p></div> <p>We can compare this “reversed graph” to the “forward graph” based on all paths from the null rule to the maximum-lifetime rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg3.png' alt='' title='' width='119' height='349'> </div> </p></div> <p>And in this case we see that the phenotypes that occur are almost the same, with the exception of the fact that <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024tbox.png' alt='' title='' width='29' height='18'/> can appear in the reverse case. </p> <p>So what happens when we look at all <em>k</em> = 2, <em>r</em> = 2 rules? Here’s the “reverse graph” starting from the longest-lifetime phenotype:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg5.png' alt='' title='' width='634' height='458'> </div> </p></div> <p>A total of 345 phenotypes appear here eventually leading all the way back to <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024boxes.png' alt='' title='' width='26' height='18'/>. 
In the overall “forward graph” (which has to start from <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024lbox.png' alt='' title='' width='23' height='18'/> rather than <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024boxes.png' alt='' title='' width='26' height='18'/>) a total of <span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg21_copy.txt' data-c2c-type='text/html'>2409</span> phenotypes appear, though (as we saw above) only 64 occur in paths that eventually lead to the maximum lifetime phenotype:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024reversedimg9.png' alt='' title='' width='505' height='398'> </div> </p></div> <p>And what we see here is that the forward and reverse graphs look quite different. But could we perhaps construct a fitness function for the reverse graph that will successfully corral the evolution process to precisely retrace the steps of the forward graph? </p> <p>In general, this isn’t something we can expect to be able to do. Because to do so would in effect require “breaking the computational irreducibility” of the system. It would require having a fitness function that can in essence predict every detail of the evolution process—and in so doing be in a position to direct it. But to achieve this, the fitness function would in a sense have to be computationally as sophisticated as the evolution process itself. </p> <p>It’s a variant of an argument we’ve used several times here. Realistic fitness functions are computationally bounded (and in practice often very coarse). 
And that means that they can’t expect to match the computational irreducibility of the underlying evolution process. </p> <p>There’s an analogy to the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">Second Law of thermodynamics</a>. Just as the microscopic collisions of individual molecules are in principle easy to reverse, so potentially are individual transitions in the evolution graph. But putting many collisions or many transitions together leads to a process that is computationally sophisticated enough that the fairly coarse means at our disposal can’t “decode” and reverse it. </p> <p>Put another way, there is in practice a certain inevitable irreversibility to both molecular dynamics and biological evolution. Yes, with enough computational effort—say carefully controlling the fitness function for every individual organism—it might in principle be possible to precisely “reverse evolution”. But in practice the kinds of fitness functions that exist in nature—or that one can readily set up in a lab—are computationally much too weak. And as a result one can’t expect to be able to get evolution to precisely retrace its steps.</p> <h2 id="random-or-selected-can-one-tell">Random or Selected? Can One Tell?</h2> <p>Given only a genotype, is there a way to tell whether it’s “just random” or whether it’s <a href="https://writings.stephenwolfram.com/2018/01/showing-off-to-the-universe-beacons-for-the-afterlife-of-our-civilization/">actually the result of some long and elaborate process</a> of adaptive evolution? From the genotype one can in principle use the rules it defines to “grow” the corresponding phenotype—and then look at whether it has an “unusually large” fitness. 
But the question is whether it’s possible to tell anything directly from the genotype, without going through the computational effort of generating the phenotype.</p> <p>At some level it’s like asking whether, say, from a cellular automaton rule, one can predict the ultimate behavior of the cellular automaton. And a core consequence of computational irreducibility is that one can’t in general expect to do this. Still, one might imagine that one could at least make a “reasonable guess” about whether a genotype is “likely” to have been chosen “purely randomly” or to have been “carefully selected”. </p> <p>To explore this, we can look at the genotypes for symmetric <em>k</em> = 2, <em>r</em> = 2 rules, say ordered by their lifetime-based fitness—with black and white here representing “required” rule cases, and gray representing undetermined ones (which can all independently be either black or white):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg1.png' alt='' title='' width='285' height='602'> </div> </p></div> <p>On the right is a summary of how many white, black and undetermined (gray) outcomes are present in each genotype. And as we have seen several times, to achieve high fitness all or almost all of the outcomes must be determined—so that in a sense all or almost all of the genome is “being used”. But we still need to ask whether, given a certain actual pattern of outcomes, we can successfully guess whether or not a genotype is the result of selection.</p> <p>To get more of a sense of this, we can look at plots of the probabilities for different outcomes for each case in the rule, first (trivially) for all combinatorially possible genotypes, then for all genotypes that give viable (i.e. 
in our case, finite-lifetime) phenotypes, and then for “selected genotypes”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg2.png' alt='' title='' width='479' height='167'> </div> </p></div> <p>Certain cases are always completely determined for all viable genomes—but rather trivially so, because, for example, if <img style="margin-bottom: -2px" loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/randomboxrow.png' alt='' title='' width='73' height='13'/> then the pattern generated will expand at maximum speed forever, and so cannot have a finite lifetime. </p> <p>So what happens for all <em>k</em> = 2, <em>r</em> = 2 rules? Here are the actual genomes that lead to particular fitness levels:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg4.png' alt='' title='' width='615' height='427'> </div> </p></div> <p>And now here are the corresponding probabilities for different outcomes for each case in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12032024randomimg5.png' alt='' title='' width='479' height='171'> </div> </p></div> <p>And, yes, given a particular setup we could imagine working out from results like these at least an approximation to the likelihood for a given randomly chosen genome to be a selected one. But what’s true in general? 
Is there something that can be determined with bounded computational effort (i.e. without explicitly computing phenotypes and their fitnesses) that gives a good estimate of whether a genome is selected? There are good reasons to believe that computational irreducibility will make this impossible. </p> <p>It’s a different story, of course, if one’s given a “fully computed” phenotype. But at the genome level—without that computation—it seems unlikely that one can expect to distinguish random from “selected-somehow” genotypes.</p> <h2 id="adaptive-evolution-of-initial-conditions">Adaptive Evolution of Initial Conditions</h2> <p>In making our idealized model of biological evolution we’ve focused (as biology seems to) on the adaptive evolution of the genotype—or, in our case, the underlying rule for our cellular automata. But what if instead of changing the underlying rule, we change the initial condition used to “grow each organism”?</p> <p>For example, let’s say that we start with the “single cell” we’ve been using so far, but then at each step in adaptive evolution we change the value of one cell in the initial condition (say within a certain distance of our original cell)—then keep any initial condition that does not lead to a shorter lifetime:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg1.png' alt='' title='' width='648' height='244'> </div> </p></div> <p>The sequence of lifetimes (“fitness values”) obtained in this process of adaptive evolution is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg2.png' alt='' title='' 
width='326' height='136'> </div> </p></div> <p>and the “breakthrough” initial conditions are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg3.png' alt='' title='' width='127' height='217'> </div> </p></div> <p>The basic setup is similar to what we’ve seen repeatedly in the adaptive evolution of rules rather than initial conditions. But one immediate difference is that, at least in the example we’ve just seen, changing initial conditions does not as obviously “introduce new ideas” for how to increase lifetime; instead, it gives more of an impression of just directly extending “existing ideas”. </p> <p>So what happens more generally? Rules with <em>k</em> = 2, <em>r</em> = 1 tend to show either infinite growth or no growth—with finite lifetimes arising only from direct “erosion” of initial conditions (here for rules 104 and 164): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg4.png' alt='' title='' width='597' height='77'> </div> </p></div> <p>For <em>k</em> = 2, <em>r</em> = 2 rules the story is more complicated, even in the symmetric case. 
Here are the sequences of longest lifetime patterns obtained with all possible progressively wider initial conditions with various rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg5.png' alt='' title='' width='634' height='739'> </div> </p></div> <p>Again, there is a certain lack of “fundamentally new ideas” in evidence, though there are definitely “mechanisms” that get progressively extended with larger initial conditions. (One notable regularity is that the maximum lifetimes of patterns often seem roughly proportional to the width of initial condition allowed.)</p> <p>Can adaptive evolution “discover more”? Typically, when it’s just modifying initial conditions in a fixed region, it doesn’t seem so—again it seems to be more about “extending existing mechanisms” than introducing new ones:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' style="margin-bottom: -15px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg6.png' alt='' title='' width='653' height='229'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024adaptiveimg7.png' alt='' title='' width='636' height='375'> </div> </p></div> <h2 id="2d-cellular-automata">2D Cellular Automata</h2> <p>Everything we’ve done so far has been for 1D cellular automata. So what happens if we go to 2D? 
In the end, the story is going to be very similar to 1D—except that the rule spaces even for quite minimal neighborhoods are vastly larger.</p> <p>With <em>k</em> = 2 colors, it turns out that with a 5-cell neighborhood <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024blackplus.png' alt='' title='' width='18' height='18'> one can’t “escape from the null rule” by single point mutations. The issue is that any single case one adds in the rule will either do nothing, or will lead only to unbounded growth. And even with a 9-cell neighborhood <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024blackboxes.png' alt='' title='' width='18' height='18'> one can’t get rules that show growth that is neither limited nor infinite with a single-cell initial condition. But with a <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/12/sw12042024blackplus.png' alt='' title='' width='18' height='18'> initial condition this is possible, and for example here is a sequence of phenotype patterns generated by adaptive evolution using lifetime as a fitness function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg1.png' alt='' title='' width='677' height='354'> </div> </p></div> <p>Here’s what these patterns look like when “viewed from above”: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg2.png' alt='' title='' width='498' height='168'> </div> </p></div> <p>And here’s how the fitness 
progressively increases in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg3.png' alt='' title='' width='415' height='131'> </div> </p></div> <p>There are a total of <a href="https://www.wolframscience.com/nks/notes-5-2--numbers-of-possible-2d-cellular-automaton-rules/">2<sup>512</sup> ≈ 10<sup>154</sup> possible 9-neighbor rules</a>, and in this vast rule space it’s easy for adaptive evolution to find rules with long finite lifetimes. (By the way, I’ve no idea what the absolute maximum “busy beaver” lifetime in this space is.) </p> <p>Just as in 1D, there’s a fair amount of variation in the behavior one sees. Here are some examples of the “final rules” for various instances of adaptive evolution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12042024cellularimg5.png' alt='' title='' width='597' height='1040'> </div> </p></div> <p>In a few cases one can readily “see the mechanism” for the lifetime—say associated with collisions between localized structures. But mostly, as in the other examples we’ve seen, there’s no realistic “narrative explanation” for how these rules achieve long yet finite lifetimes. </p> <h2 id="the-turing-machine-case">The Turing Machine Case</h2> <p>OK, so we’ve now looked at 2D as well as 1D cellular automata. But what about systems that aren’t cellular automata at all? Will we still see the same core phenomena of adaptive evolution that we’ve identified in cellular automata? 
The <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> would certainly lead one to expect that we would. But to check at least one example let’s look at <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-4--turing-machines">Turing machines</a>. </p> <p>Here’s a Turing machine with <em>s</em> = 3 states for its head, and <em>k</em> = 2 colors for cells on its tape:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg1.png' alt='' title='' width='480' height='294'> </div> </p></div> <p>The Turing machine is set up to halt if it ever reaches a case in the rule where the output is <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg2.png' alt='' title='' width='9' height='9'/>. Starting from a blank initial condition, this particular Turing machine halts after 19 steps. </p> <p>So what happens if we try to adaptively evolve Turing machines with long lifetimes (i.e. that take many steps to halt)? Say we start from a “null rule” that halts in all cases, and then we make a sequence of single point mutations in the rule, keeping ones that don’t lead the Turing machine to halt in fewer steps than before. 
Here’s an example where the adaptive evolution eventually reaches a Turing machine that takes 95 steps to halt:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingQimg3.png' alt='' title='' width='637' height='420'> </div> </p></div> <p>The sequence of (“breakthrough”) mutations involved here is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg4.png' alt='' title='' width='248' height='368'> </div> </p></div> <p>corresponding to a fitness curve of the form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg5.png' alt='' title='' width='359' height='151'> </div> </p></div> <p>And, yes, all of this is very analogous to what we’ve seen in cellular automata. But one difference is that with Turing machines there are routinely much larger jumps in halting times. And the basic reason for this is just that Turing machines have much less going on at any particular step than typical cellular automata do—so it can take them much longer to achieve some particular state, like a halting state. 
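</p> <p>As a rough illustration of the procedure just described, here is a minimal Python sketch of adaptive evolution on Turing machine rules. It is not the code used to produce the pictures above. The function names (<code>lifetime</code>, <code>mutate</code>, <code>evolve</code>), the rule encoding, the <code>max_steps</code> cap used to stand in for “does not halt”, and the convention that the null rule has lifetime 0 are all assumptions made for the sake of the example:</p>

```python
import random

HALT = None  # assumed marker for a rule case whose outcome is "halt"

def lifetime(rule, max_steps=10_000):
    """Number of steps taken from a blank tape before a halting case is reached.
    Returns None if the machine is still running after max_steps (treated as non-halting)."""
    tape = {}            # sparse tape: cells not present have color 0
    state, pos = 0, 0
    for step in range(max_steps):
        outcome = rule[(state, tape.get(pos, 0))]
        if outcome is HALT:
            return step
        state, color, move = outcome
        tape[pos] = color
        pos += move
    return None

def mutate(rule, s, k, rng):
    """Single point mutation: give one randomly chosen case a different outcome."""
    case = (rng.randrange(s), rng.randrange(k))
    options = [HALT] + [(ns, nc, mv) for ns in range(s)
                        for nc in range(k) for mv in (-1, 1)]
    options.remove(rule[case])      # make sure the rule really changes
    new_rule = dict(rule)
    new_rule[case] = rng.choice(options)
    return new_rule

def evolve(s=3, k=2, mutations=2000, seed=0):
    """Start from the null rule (every case halts) and keep any mutation that
    halts and does not halt in fewer steps than the current rule."""
    rng = random.Random(seed)
    rule = {(st, c): HALT for st in range(s) for c in range(k)}
    best = lifetime(rule)
    breakthroughs = [best]
    for _ in range(mutations):
        candidate = mutate(rule, s, k, rng)
        t = lifetime(candidate)
        if t is not None and t >= best:
            if t > best:
                breakthroughs.append(t)   # record only strict improvements
            rule, best = candidate, t
    return rule, best, breakthroughs
```

<p>Calling <code>evolve()</code> returns the final rule, its halting time, and the sequence of “breakthrough” halting times. Fitness-neutral mutations are accepted as well as improving ones, which in this sketch is what lets the process drift between rules with the same lifetime before the next breakthrough.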
</p> <p>Here’s an example of adaptive evolution in the space of <em>s</em> = 3, <em>k</em> = 3 Turing machines—and in this case the final halting time is long enough that we’ve had to squash the image vertically (by a factor of 5):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg6.png' alt='' title='' width='605' height='383'> </div> </p></div> <p>The fitness curve in this case is best viewed on a logarithmic scale:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg7.png' alt='' title='' width='367' height='149'> </div> </p></div> <p>But while the largest-lifetime cellular automata that we saw above typically seemed to have very complex behavior, the largest-lifetime Turing machine here seems, at least on the face of it, to operate in a much more “systematic” and “mechanical” way. 
And indeed this becomes even more evident if we compress our visualization by looking only at steps on which the Turing machine head reverses its direction:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg8.png' alt='' title='' width='123' height='189'> </div> </p></div> <p>Long-lifetime Turing machines found by adaptive evolution are not always so simple, though they still tend to show more regularity than long-lifetime cellular automata:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg9.png' alt='' title='' width='688' height='263'> </div> </p></div> <p>But—presumably because Turing machines are “less efficient” than cellular automata—the very longest possible lifetimes can be very large. It’s not clear whether rules with such lifetimes can be found by adaptive evolution—not least because even to evaluate the fitness function for any particular candidate rule could <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-issue-of-undecidability">take an unbounded time</a>. 
And indeed among <em>s</em> = 3, <em>k</em> = 3 rules the <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#computation-theoretic-perspectives-and-busy-beavers">very longest possible is about 10<sup>17</sup> steps</a>—achieved by the rule</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg12.png' alt='' title='' width='371' height='32'> </div> </p></div> <p>with the following “very pedantic behavior”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg13.png' alt='' title='' width='651' height='167'> </div> </p></div> <p>So what about multiway evolution graphs? There are a total of 20,736 <em>s</em> = 2, <em>k</em> = 2 Turing machines with halting states allowed. 
From these there are 37 distinct finite-lifetime phenotypes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingQimg13.png' alt='' title='' width='522' height='149'> </div> </p></div> <p>Just as in other cases we’ve investigated, there are fitness-neutral sets such as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingBimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingQimg14.png' alt='' title='' width='670' height='146'> </div> </p></div> <p>Taking just one representative from each of these 18 sets, we can then construct a multiway evolution graph for 2,2 Turing machines with lifetime as our fitness function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg16.png' alt='' title='' width='581' height='311'> </div> </p></div> <p>Here’s the analogous result for 3,2 Turing machines—with <span class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12162024C2Cnumbersimg22_copy.txt' data-c2c-type='text/html'>2250</span> distinct phenotypes, and a maximum lifetime of 21 steps (and the patterns produced by the machines are just shown as “slabs”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12122024C2Cupdateimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024turingimg17B.png' alt='' 
title='' width='617' height='412'> </div> </p></div> <p>We could pick other fitness functions (like maximum pattern width, number of head reversals, etc.). But the basic structure and consequences of adaptive evolution seem to work very much the same in Turing machines as in cellular automata—much as we expect from the Principle of Computational Equivalence.</p> <h2 id="multiway-turing-machines">Multiway Turing Machines</h2> <p>Ordinary Turing machines (as well as ordinary cellular automata) in effect always follow a single path of history, producing a definite sequence of states based on their underlying rule. But it’s also possible to study <a href="https://www.wolframphysics.org/bulletins/2021/02/multiway-turing-machines/" target="_blank" rel="noopener">multiway Turing machines</a> in which many paths of history can be followed. Consider for example the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg1.png' alt='' title='' width='263' height='41'> </div> </p></div> <p>The <img loading='lazy' style="margin-bottom: -3px" src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesicon.png' alt='' title='' width='19' height='19'/> case in this rule has two possible outcomes—so this is a multiway system, and to represent its behavior we need a multiway graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg3.png' alt='' title='' width='258' height='192'> </div> </p></div> <p>From a biological point of view, we can potentially think of such a multiway system as an idealized model for a process of adaptive 
evolution. So now we can ask: can we evolve this evolution? Or, in other words, can we apply adaptive evolution to systems like multiway Turing machines?</p> <p>As an example, let’s assume that we make single point mutation changes to just one case in a multiway Turing machine rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg4.png' alt='' title='' width='223' height='283'> </div> </p></div> <p>Many multiway Turing machines won’t halt, or at least won’t halt on all their branches. But for our fitness function let’s assume we require multiway Turing machines to halt on all branches (or at least go into loops that revisit the same states), and then let’s take the fitness to be the <a href="https://bulletins.wolframphysics.org/2021/02/multiway-turing-machines/#the-halting-problem-and-busy-beavers" target="_blank" rel="noopener">total number of nodes in the multiway graph when everything has halted</a>. (And, yes, this is a direct generalization of our lifetime fitness function for ordinary Turing machines.) </p> <p>So with this setup here are some examples of sequences of “breakthroughs” in adaptive evolution processes: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12122024C2Cupdateimg5_copy.txt' data-c2c-type='text/html'> <img src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg5.png' alt='Breakthrough sequences' title='Breakthrough sequences' width='620' height='1003'/> </div> </p></div> <p>But what about looking at all possible paths of evolution for multiway Turing machines? 
Or, in other words, what about making a multiway graph of the evolution of multiway Turing machines?</p> <p>Here’s an example of what we get by doing this (showing at each node just a single example of a fitness-neutral set):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/12/sw12052024machinesimg6.png' alt='' title='' width='651' height='459'> </div> </p></div> <p>So what’s really going on here? We’ve got a multiway graph of multiway graphs. But it’s worth understanding that the inner and outer multiway graphs are a bit different. The outer one is effectively a <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">rulial multiway graph</a>, in which different parts correspond to following different rules. The inner one is effectively a branchial multiway graph, in which different parts correspond to different ways of applying a particular rule. Ultimately, though, we can at least in principle expect to encode branchial transformations as rulial ones, and vice versa. </p> <p>So we can think of the adaptive evolution of multiway Turing machines as a first step in exploring “higher-order evolution”: the evolution of evolution, etc. And ultimately in exploring inevitable limits of recursive evolution in the ruliad—and how these might relate to the formation of observers in the ruliad.</p> <h2 id="some-conclusions">Some Conclusions</h2> <p>What does all this mean for the foundations of biological evolution? First and foremost, it reinforces the idea of <a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-5--fundamental-issues-in-biology">computational irreducibility as a dominant force in biology</a>. 
One might have imagined that what we see in biology must have been “carefully sculpted” by fitness constraints (say imposed by the environment). But what we’ve found here suggests that instead much of what we see is actually just a direct reflection of computational irreducibility. And in the end, more than anything else, what biological evolution seems to be doing is to “recruit” lumps of irreducible computation, and set them up so as to achieve “fitness objectives”.</p> <p>It is, as I recently discovered, very similar to <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/">what happens in machine learning</a>. And in both cases this picture implies that there’s a limit to the kind of explanations one can expect to get. If one asks why something has the form it does, the answer will often just be: “because that’s the lump of irreducible computation that happened to be picked up”. And there isn’t any reason to think that there’ll be a “narrative explanation” of the kind one might hope for in traditional science.</p> <p>The simplicity of models makes it possible to study not just particular possible paths of adaptive evolution, but complete multiway graphs of all possible paths. And what we’ve seen here is that fitness functions in effect define a kind of <a href="https://www.wolframphysics.org/technical-introduction/the-updating-process-for-string-substitution-systems/foliations-and-coordinates-on-causal-graphs" target="_blank" rel="noopener">traversal order or (roughly) foliation for such multiway graphs</a>. If such foliations could be arbitrarily complex, then they could potentially pick out specific outcomes for evolution—in effect successfully “sculpting biology” through the details of natural selection and fitness functions. </p> <p>But the point is that fitness functions and resulting foliations of multiway evolution graphs don’t get arbitrarily complex. 
And even as the underlying processes by which phenotypes develop are full of computational irreducibility, the fitness functions that are applied are computationally bounded. And in effect the complexity that is perhaps the single most striking immediate feature of biological systems is therefore a consequence of the interplay between the computational boundedness of selection processes, and the computational irreducibility of underlying processes of growth and development. </p> <p>All of this relies on the fundamental idea that biological evolution—and biology—are at their core computational phenomena. And given this interpretation, there’s then a remarkable unification that’s emerging. </p> <p>It begins with the <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">ruliad</a>—the abstract object corresponding to the entangled limit of all possible computational processes. We’ve talked about the ruliad as the ultimate foundation for physics, and for <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/">mathematics</a>. And we now see that we can think of it as the ultimate foundation for biology too. </p> <p>In physics what’s crucial is that <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">observers like us “parse” the ruliad</a> in certain ways—and that these ways lead us to have a perception of the ruliad that follows core known laws of physics. And similarly, when observers like us do mathematics, we can think of ourselves as “extracting that mathematics” from the way we parse the ruliad. And now what we’re seeing is that biology emerges because of the way selection from the environment, etc. “parses” the ruliad. </p> <p>And what makes this view powerful is that we have to assume surprisingly little about how selection works to still be able to deduce important things about biology. 
In particular, if we assume that the selection operates in a computationally bounded way, then just from the inevitable underlying computational irreducibility “inherited” from the ruliad, we immediately know that biology must have certain features. </p> <p>In physics, the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">Second Law of thermodynamics arises from the interplay</a> of underlying computational irreducibility of mechanical processes involving many particles or other objects, and our computational boundedness as observers. We have the impression that “randomness is increasing” because as computationally bounded observers we can’t “decrypt” the underlying computational irreducibility. </p> <p>What’s the analog of this in biology? Much as we can’t expect to “say what happens” in a system that follows the Second Law, so we can’t expect to “explain by selection” what happens in a biological system. Or, put another way, much of what we see in biology is just the way it is because of computational irreducibility—and try as we might it won’t be “explainable” by some fitness criterion that we can describe. </p> <p>But that doesn’t mean that we can’t expect to deduce “general laws of biology”, much as there are general laws about gases whose detailed structure follows the Second Law. And in what we’ve done here we can begin to see some hints of what those general laws might look like.</p> <p>They’ll be things like bulk statements about possible paths of evolution, and the effect of changing the constraints on them—a bit like laws of fluid mechanics but now applied to the rulial space of possible genotypes. But if there’s one thing that’s clear it’s that the minimal model we’ve developed of biological evolution has remarkable richness and potential. 
In the past it’s been possible to say things about what amounts to the pure combinatorics of evolution; now we can start talking in a structured way about what evolution actually does. And in doing this we go in the direction of finally giving biology a foundation as a theoretical science. </p> <h2 id='theres-so-much-more-to-study'>There’s So Much More to Study!</h2> <p>Even though this is my second long piece about my minimal model of biological evolution, I’ve barely scratched the surface of what can be done with it. First and foremost there are many detailed connections to be made with actual phenomena that have been observed—or could be observed—in biology. But there are also many things to be investigated directly about the model itself—and in effect much ruliology to be done on it. And what’s particularly notable is how accessible a lot of that ruliology is. (And, yes, you can click any picture here to get the Wolfram Language code that generates it.) What are some obvious things to do? Here are a few. Investigate other fitness functions. Other rule spaces. Other initial conditions. Other evolution strategies. Investigate evolving both rules and initial conditions. Investigate different kinds of changes of fitness functions during evolution. Investigate the effect of having a much larger rule space. Investigate robustness (or not) to perturbations. </p> <p>In what I’ve done here, I’ve effectively aggregated identical genotypes (and phenotypes). But one could also investigate what happens if one in effect “traces every individual organism”. 
The result will be abstract structures that generalize the multiway systems we’ve shown here—and that are associated with higher levels of abstract formalism capable of describing phenomena that in effect go “below species”.</p> <p><span class="Subsubsection"><a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#historical-notes">For historical notes see here »</a></span></p> <h2 id="thanks" style='font-size:1.2rem'>Thanks</h2> <p style='font-size:90%'>Thanks to Wolfram Institute fellows Richard Assar and Nik Murzin for their help, as well as to the supporters of the new Wolfram Institute initiative in theoretical biology. Thanks also to Brad Klee for his help. Related student projects were done at our <a href="https://education.wolfram.com/programs/">Summer Programs</a> this year by Brian Mboya, Tadas Turonis, Ahama Dalmia and Owen Xuan. </p> <p style='font-size:90%'>Since writing <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">my first piece about biological evolution</a> in March, I’ve had occasion to attend two biology conferences: SynBioBeta and WISE (“Workshop on Information, Selection, and Evolution” at the Carnegie Institution). I thank many attendees at both conferences for their enthusiasm and input. 
Curiously, before the WISE conference in October 2024 the last conference I had attended on biological evolution was more than 40 years earlier: the June 1984 Mountain Lake Conference on Evolution and Development.</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item> <title>On the Nature of Time</title> <link>https://writings.stephenwolfram.com/2024/10/on-the-nature-of-time/</link> <comments>https://writings.stephenwolfram.com/2024/10/on-the-nature-of-time/#comments</comments> <pubDate>Tue, 08 Oct 2024 21:41:58 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Philosophy]]></category> <category><![CDATA[Physics]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=63438</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/10/swblog-time-icon-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>The Computational View of Time Time is a central feature of human experience. But what actually is it? In traditional scientific accounts it’s often represented as some kind of coordinate much like space (though a coordinate that for some reason is always systematically increasing for us). But while this may be a useful mathematical description, […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/10/swblog-time-icon-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><h2 id="the-computational-view-of-time">The Computational View of Time</h2> <p>Time is a central feature of human experience. But what actually is it? 
In traditional scientific accounts it’s often represented as some kind of coordinate much like space (though a coordinate that for some reason is always systematically increasing for us). But while this may be a useful mathematical description, it’s not telling us anything about what time in a sense “intrinsically is”. </p> <p>We get closer as soon as we start thinking in computational terms. Because then it’s natural for us to think of successive states of the world as being computed one from the last by the progressive application of some computational rule. And this suggests that we can identify the progress of time with the “progressive doing of <nobr id="link-test"> computation</nobr> by the universe”. </p> <p>But does this just mean that we are replacing a “time coordinate” with a “computational step count”? No. Because of the phenomenon of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>. With the traditional mathematical idea of a time coordinate one typically imagines that this coordinate can be “set to any value”, and that then one can immediately calculate the state of the system at that time. But computational irreducibility implies that it’s not that easy. Because it says that there’s often essentially no better way to find what a system will do than by explicitly tracing through each step in its evolution.<span id="more-63438"></span></p> <p>In the pictures on the left there’s computational reducibility, and one can readily see what the state will be after any number of steps <em>t</em>. 
But in the pictures on the right there’s (presumably) computational irreducibility, so that the only way to tell what will happen after <em>t</em> steps is effectively to run all those steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/10/sw10082024timeBimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/10/sw10082024timeBimg1.png' alt='' title='' width='642' height='178'> </div> </p></div> <p>And what this implies is that there’s a certain robustness to time when viewed in these computational terms. There’s no way to “jump ahead” in time; the only way to find out what will happen in the future is to go through the irreducible computational steps to get there. </p> <p>There are simple idealized systems (say with purely periodic behavior) where there’s computational reducibility, and where there isn’t any robust notion of the progress of time. But the point is that—as the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> implies—our universe is inevitably full of computational irreducibility which in effect defines a robust notion of the progress of time.</p> <h2 id="the-role-of-the-observer">The Role of the Observer</h2> <p>That time is a reflection of the progress of computation in the universe is an important starting point. But it’s not the end of the story. For example, here’s an immediate issue. If we have a computational rule that determines each successive state of a system it’s at least in principle possible to know the whole future of the system. So given this why then do we have the experience of the future only “unfolding as it happens”?</p> <p>It’s fundamentally because of the way <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">we are as observers</a>. 
If the underlying system is computationally irreducible, then to work out its future behavior requires an irreducible amount of computational work. But it’s a core feature of observers like us that we are computationally bounded. So we can’t do all that irreducible computational work to “know the whole future”—and instead we’re effectively stuck just doing computation alongside the system itself, never able to substantially “jump ahead”, and only able to see the future “progressively unfold”.</p> <p>In essence, therefore, we experience time because of the interplay between our computational boundedness as observers, and the computational irreducibility of underlying processes in the universe. If we were not computationally bounded, we could “perceive the whole of the future in one gulp” and we wouldn’t need a notion of time at all. And if there wasn’t underlying computational irreducibility there wouldn’t be the kind of “progressive revealing of the future” that we associate with our experience of time.</p> <p>A notable feature of our everyday perception of time is that it seems to “flow only in one direction”—so that for example it’s generally much easier to remember the past than to predict the future. And this is closely related to the Second Law of thermodynamics, which (as <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">I’ve argued at length elsewhere</a>) is once again a result of the interplay between underlying computational irreducibility and our computational boundedness. Yes, the microscopic laws of physics may be reversible (and indeed if our system is simple—and computationally reducible—enough of this reversibility may “shine through”). But the point is that computational irreducibility is in a sense a much stronger force.</p> <p>Imagine that we prepare a state to have orderly structure. 
If its evolution is computationally irreducible then this structure will effectively be “encrypted” to the point where a computationally bounded observer can’t recognize the structure. Given underlying reversibility, the structure is in some sense inevitably “still there”—but it can’t be “accessed” by a computationally bounded observer. And as a result such an observer will perceive a definite flow from orderliness in what is prepared to disorderliness in what is observed. (In principle one might think it should be possible to set up a state that will “behave antithermodynamically”—but the point is that to do so would require predicting a computationally irreducible process, which a computationally bounded observer can’t do.)</p> <p>One of the longstanding confusions about the nature of time has to do with its “mathematical similarity” to space. And indeed ever since the early days of relativity theory it’s seemed convenient to talk about “spacetime” in which notions of space and time are bundled together.</p> <p>But in our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a> that’s not at all how things fundamentally work. At the lowest level the state of the universe is <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#what-is-space">represented by a hypergraph</a> which captures what can be thought of as the “spatial relations” between discrete “atoms of space”. <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#time">Time then corresponds</a> to the progressive rewriting of this hypergraph.</p> <p>And in a sense the “atoms of time” are the elementary “rewriting events” that occur. 
If the “output” from one event is needed to provide “input” to another, then we can think of the first event as preceding the second event in time—and the events as being “timelike separated”. And in general we can construct a <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#the-graph-of-causal-relationships">causal graph</a> that shows the dependencies between different events. </p> <p>So how does this relate to time—and spacetime? As we’ll discuss below, our everyday experience of time is that it follows a single thread. And so we tend to want to “parse” the causal graph of elementary events into a series of slices that we can view as corresponding to “successive times”. As in <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#deriving-special-relativity">standard relativity theory</a>, there typically isn’t a unique way to assign a sequence of such “simultaneity surfaces”, with the result that there are different “reference frames” in which the identifications of space and time are different. </p> <p>The complete causal graph bundles together what we usually think of as space with what we usually think of as time. But ultimately the progress of time is always associated with some choice of successive events that “computationally build on each other”. And, yes, it’s more complicated because of the possibilities of different choices. But the basic idea of the progress of time as “the doing of computation” is very much the same. 
(In a sense time represents “computational progress” in the universe, while space represents the “layout of its data structure”.)</p> <p>Very much as in the derivation of the Second Law (or of fluid mechanics from molecular dynamics), the <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#general-relativity-and-gravity">derivation of Einstein’s equations</a> for the large-scale behavior of spacetime from the underlying causal graph of hypergraph rewriting depends on the fact that we are computationally bounded observers. But even though we’re computationally bounded, we still have to “have something going on inside”, or we wouldn’t record—or sense—any “progress in time”.</p> <p>It seems to be the essence of observers like us—as captured in my <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">recent Observer Theory</a>—that we equivalence many different states of the world to derive our internal perception of “what’s going on outside”. And at some rough level we might imagine that we’re sensing time passing by the rate at which we add to those internal perceptions. If we’re not adding to the perceptions, then in effect time will stop for us—as happens if we’re asleep, anesthetized or dead. </p> <p>It’s worth mentioning that in some extreme situations it’s not the internal structure of the observer that makes perceived time stop; instead it’s the underlying structure of the universe itself. As we’ve mentioned, the “progress of the universe” is associated with successive rewriting of the underlying hypergraph. 
But when there’s been “too much activity in the hypergraph” (which physically corresponds roughly to too much energy-momentum), one can end up with a situation in which “there are no more rewrites that can be done”—so that in effect some part of the universe can no longer progress, <a href="https://www.wolframphysics.org/technical-introduction/potential-relation-to-physics/cosmology-expansion-and-singularities" target="_blank" rel="noopener">and “time stops” there</a>. It’s analogous to what happens at a spacelike singularity (normally associated with a black hole) in traditional general relativity. But now it has a very direct computational interpretation: one’s reached a “fixed point” at which there’s no more computation to do. And so there’s no progress to make in time.</p> <h2 id="multiple-threads-of-time">Multiple Threads of Time</h2> <p>Our strong human experience is that time progresses as a single thread. But now our <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#the-inevitability-of-quantum-mechanics">Physics Project suggests</a> that at an underlying level time is actually in effect multithreaded, or, in other words, that there are many different “paths of history” that the universe follows. And it is only because of the way we as observers sample things that we experience time as a single thread. </p> <p>At the level of a particular underlying hypergraph the point is that there may be many different updating events that can occur, and each sequence of such updating events defines a different “path of history”. 
We can summarize all these paths of history in a multiway graph in which we merge identical states that arise:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/10/sw10082024timeAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/10/sw10082024timeAimg2.png' alt='' title='' width='194' height='242'> </div> </p></div> <p>But given this underlying structure, why is it that we as observers believe that time progresses as a single thread? It all has to do with the notion of <a href="https://www.wolframphysics.org/technical-introduction/the-updating-process-for-string-substitution-systems/the-concept-of-branchial-graphs/" target="_blank" rel="noopener">branchial space</a>, and our presence within branchial space. The presence of many paths of history is what leads to quantum mechanics; the fact that we as <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/#the-case-of-quantum-mechanics">observers ultimately perceive just one path</a> is associated with the traditionally-quite-mysterious phenomenon of “measurement” in quantum mechanics. </p> <p>When we talked about causal graphs above, we said that we could “parse” them as a series of “spacelike” slices corresponding to instantaneous “states of space”—represented by spatial hypergraphs. And by analogy we can similarly imagine breaking multiway graphs into “instantaneous slices”. But now these slices don’t represent states of ordinary space; instead they represent states of what we call branchial space.</p> <p>Ordinary space is “knitted together” by updating events that have causal effects on other events that can be thought of as “located at different places in space”. (Or, said differently, space is knitted together by the overlaps of the elementary light cones of different events.) 
Now we can think of branchial space as being “knitted together” by updating events that have effects on events that end up on different branches of history. </p> <p>(In general there is a close analogy between ordinary space and branchial space, and we can define a multiway causal graph that includes both “spacelike” and “branchlike” directions—with the branchlike direction supporting not light cones but what we can call entanglement cones.)</p> <p>So how do we as observers parse what’s going on? A key point is that we are inevitably part of the system we’re observing. So the branching (and merging) that’s going on in the system at large is also going on in us. So that means we have to ask how a “branching mind” will perceive a branching universe. Underneath, there are lots of branches, and lots of “threads of history”. And there’s lots of computational irreducibility (and even what we can call <a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/#humanizing-multicomputational-processes">multicomputational irreducibility</a>). But computationally bounded observers like us have to equivalence most of those details to wind up with something that “fits in our finite minds”. </p> <p>We can make an analogy to what happens in a gas. Underneath, there are lots of molecules bouncing around (and behaving in computationally irreducible ways). But observers like us are big compared to molecules, and (being computationally bounded) we don’t get to perceive their individual behavior, but only their aggregate behavior—from which we extract a thin set of computationally reducible “fluid-dynamics-level” features. </p> <p>And it’s basically the same story with the underlying structure of space. Underneath, there’s an elaborately changing network of discrete atoms of space. 
But as large, computationally bounded observers we can only sample aggregate features in which many details have been equivalenced, and in which space tends to seem continuous and describable in basically computationally reducible ways. </p> <p>So what about branchial space? Well, it’s basically the same story. Our minds are “big”, in the sense that they span many individual branches of history. And they’re computationally bounded so they can’t perceive the details of all those branches, but only certain aggregated features. And in a first approximation what then emerges is in effect a single aggregated thread of history. </p> <p>With sufficiently careful measurements we can sometimes see “quantum effects” in which multiple threads of history are in evidence. But at a direct human level we always seem to aggregate things to the point where what we perceive is just a single thread of history—or in effect a single thread of progression in time.</p> <p>It’s not immediately obvious that any of these “aggregations” will work. It could be that important effects we perceive in gases would depend on phenomena at the level of individual molecules. Or that to understand the large-scale structure of space we’d continually be having to think about detailed features of atoms of space. Or, similarly, that we’d never be able to maintain a “consistent view of history”, and that instead we’d always be having to trace lots of individual threads of history. </p> <p>But the key point is that for us to stay as computationally bounded observers we have to pick out only features that are computationally reducible—or in effect boundedly simple to describe. </p> <p>Closely related to our computational boundedness is the <a href="https://writings.stephenwolfram.com/2021/03/what-is-consciousness-some-new-perspectives-from-our-physics-project/">important assumption</a> we make that we as observers have a certain persistence. 
At every moment in time, we are made from different atoms of space and different branches in the multiway graph. Yet we believe we are still “the same us”. And the crucial physical fact (that has to be derived in our model) is that in ordinary circumstances there’s no inconsistency in doing this.</p> <p>So the result is that even though there are many “threads of time” at the lowest level—representing many different “quantum branches”—observers like us can (usually) successfully still view there as being a single consistent perceived thread of time.</p> <p>But there’s another issue here. It’s one thing to say that a single observer (say a single human mind or a single measuring device) can perceive history to follow a single, consistent thread. But what about different human minds, or different measuring devices? Why should they perceive any kind of consistent “objective reality”?</p> <p>Essentially the answer, I think, is that they’re all sufficiently nearby in branchial space. If we think about physical space, observers in different parts of the universe will clearly “see different things happening”. The “laws of physics” may be the same—but what star (if any) is nearby will be different. Yet (at least for the foreseeable future) for all of us humans it’s always the same star that’s nearby. </p> <p>And so it is, presumably, in branchial space. There’s some small patch in which we humans—with our shared origins—exist. And it’s presumably because that patch is small relative to all of branchial space that all of us perceive a consistent thread of history and a common objective reality. </p> <p>There are many subtleties to this, many of which aren’t yet fully worked out. In physical space, we know that effects can in principle spread at the speed of light. 
And in branchial space the analog is that effects can spread at the maximum entanglement speed (whose value we don’t know, though it’s <a href="https://www.wolframphysics.org/technical-introduction/potential-relation-to-physics/units-and-scales/" target="_blank" rel="noopener">related by Planck unit conversions to the elementary length and elementary time</a>). But in maintaining our shared “objective” view of the universe it’s crucial that we’re not all going off in different directions at the speed of light. And of course the reason that doesn’t happen is that we don’t have zero mass. And indeed presumably nonzero mass is a critical part of being observers like us. </p> <p>In our Physics Project it’s roughly the density of events in the hypergraph that determines the density of energy (and mass) in physical space (with their associated gravitational effects). And similarly it’s roughly the density of events in the multiway graph (or in branchial graph slices) that determines the density of action—the relativistically invariant analog of energy—in branchial space (with its associated effects on quantum phase). And though it’s not yet completely clear how this works, it seems likely that once again when there’s mass, effects don’t just “go off at the maximum entanglement speed in all directions”, but instead stay nearby.</p> <p>There are definitely connections between “staying at the same place”, believing one is persistent, and being computationally bounded. But these are what seem necessary for us to have our typical view of time as a single thread. In principle we can imagine observers very different from us—say with minds (like the inside of an idealized quantum computer) capable of experiencing many different threads of history. But the Principle of Computational Equivalence suggests that there’s a high bar for such observers. 
They need not only to be able to deal with computational irreducibility but also multicomputational irreducibility, in which one includes both the process of computing new states, and the process of equivalencing states. </p> <p>And so for observers that are “anything like us” we can expect that once again time will tend to be as we normally experience it, following a single thread, consistent between observers. </p> <p>(It’s worth mentioning that all of this only works for observers like us “in situations like ours”. For example, at the <a href="https://www.wolframphysics.org/technical-introduction/potential-relation-to-physics/event-horizons-and-singularities-in-spacetime-and-quantum-mechanics/" target="_blank" rel="noopener">“entanglement horizon” for a black hole</a>—where branchially-oriented edges in the multiway causal graph get “trapped”—time as we know it in some sense “disintegrates”, because an observer won’t be able to “knit together” the different branches of history to “form a consistent classical thought” about what happens.)</p> <h2 id="time-in-the-ruliad">Time in the Ruliad</h2> <p>In what we’ve discussed so far we can think of the progress of time as being associated with the repeated application of rules that progressively “rewrite the state of the universe”. In the previous section we saw that these rules can be applied in many different ways, leading to many different underlying threads of history. </p> <p>But so far we’ve imagined that the rules that get applied are always the same—leaving us with the mystery of “Why those rules, and not others?” But this is where <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">the ruliad</a> comes in. Because the ruliad involves no such seemingly arbitrary choices: it’s what you get by following all possible computational rules.</p> <p>One can imagine many bases for the ruliad. One can make it from all possible hypergraph rewritings. Or all possible (multiway) Turing machines. 
But in the end it’s a single, unique thing: the entangled limit of all possible computational processes. There’s a sense in which “everything can happen somewhere” in the ruliad. But what gives the ruliad structure is that there’s a definite (essentially geometrical) way in which all those different things that can happen are arranged and connected.</p> <p>So what is our perception of the ruliad? Inevitably we’re part of the ruliad—so we’re observing it “from the inside”. But the crucial point is that what we perceive about it depends on what we are like as observers. And my big surprise in the past few years has been that assuming even just a little about what we’re like as observers immediately implies that what we perceive of the ruliad follows the core laws of physics we know. In other words, by assuming what we’re like as observers, we can in effect derive our laws of physics.</p> <p>The key to all this is the interplay between the computational irreducibility of underlying behavior in the ruliad, and our computational boundedness as observers (together with our related assumption of our persistence). And it’s this interplay that gives us the Second Law in statistical mechanics, the Einstein equations for the structure of spacetime, and (we think) the path integral in quantum mechanics. In effect what’s happening is that our computational boundedness as observers makes us equivalence things to the point where we are sampling only computationally reducible slices of the ruliad, whose characteristics can be described using recognizable laws of physics.</p> <p>So where does time fit into all of this? A central feature of the ruliad is that it’s unique—and everything about it is “abstractly necessary”. Much as given the definition of numbers, addition and equality it’s inevitable that one gets 1 + 1 = 2, so similarly given the definition of computation it’s inevitable that one gets the ruliad. 
Or, in other words, there’s no question about whether the ruliad exists; it’s just an abstract construct that inevitably follows from abstract definitions. </p> <p>And so at some level this means that the ruliad inevitably just “exists as a complete thing”. And so if one could “view it from outside” one could think of it as just a single timeless object, with no notion of time. </p> <p>But the crucial point is that we don’t get to “view it from the outside”. We’re embedded within it. And, what’s more, we must view it through the “lens” of our computational boundedness. And this is why we inevitably end up with a notion of time. </p> <p>We observe the ruliad from some point within it. If we were not computationally bounded then we could immediately compute what the whole ruliad is like. But in actuality we can only discover the ruliad “one computationally bounded step at a time”—in effect progressively applying bounded computations to “move through rulial space”. </p> <p>So even though in some abstract sense “the whole ruliad is already there” we only get to explore it step by step. And that’s what gives us our notion of time, through which we “progress”. </p> <p>Inevitably, there are many different paths that we could follow through the ruliad. And indeed every mind (and every observer like us)—with its distinct inner experience—presumably follows a different path. But much as we described for branchial space, the reason we have a shared notion of “objective reality” is presumably that we are all very close together in rulial space; we form in a sense a tight “rulial flock”.</p> <p>It’s worth pointing out that not every sampling of the ruliad that may be accessible to us conveniently corresponds to exploration of progressive slices of time. Yes, that kind of “progression in time” is characteristic of our physical experience, and our typical way of describing it. 
But what about our experience, say, of mathematics?</p> <p>The first point to make is that just as the ruliad contains all possible physics, it also <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/">contains all possible mathematics</a>. If we construct the ruliad, say from hypergraphs, the nodes are now not “atoms of space”, but instead abstract elements (that in general <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/#metamath-emes">we call emes</a>) that form pieces of mathematical expressions and mathematical theorems. We can think of these abstract elements as being laid out now not in physical space, but in some abstract metamathematical space. </p> <p>In our physical experience, we tend to remain localized in physical space, branchial space, etc. But in “doing mathematics” it’s more as if we’re progressively expanding in metamathematical space, carving out some domain of “theorems we assume are true”. And while we could identify some kind of “path of expansion” to let us define some analog of time, it’s not a necessary feature of the way we explore the ruliad.</p> <p>Different places in the ruliad in a sense correspond to describing things using different rules. And by analogy to the concept of motion in physical space, we can effectively “move” from one place to another in the ruliad by translating the computations done by one set of rules to computations done by another. (And, yes, it’s nontrivial to even have the <a href="https://writings.stephenwolfram.com/2022/03/on-the-concept-of-motion/">possibility of “pure motion”</a>.) But if we indeed remain localized in the ruliad (and can maintain what we can think of as our “coherent identity”) then it’s natural to think of there being a “path of motion” along which we progress “with time”. 
But when we’re just “expanding our horizons” to encompass more paradigms and to bring more of rulial space into what’s covered by our minds (so that in effect we’re “expanding in rulial space”), it’s not really the same story. We’re not thinking of ourselves as “doing computation in order to move”. Instead, we’re just identifying equivalences and using them to expand our definition of ourselves, which is something that we can at least approximate (much like in “quantum measurement” in traditional physics) as happening “outside of time”. Ultimately, though, everything that happens must be the result of computations that occur. It’s just that we don’t usually “package” these into what we can describe as a definite thread of time.</p> <h2 id="so-what-in-the-end-is-time">So What in the End Is Time?</h2> <p>From the paradigm (and Physics Project ideas) that we’ve discussed here, the question “What is time?” is at some level simple: time is what progresses when one applies computational rules. But what’s critical is that time can in effect be defined abstractly, independent of the details of those rules, or the “substrate” to which they’re applied. And what makes this possible is the Principle of Computational Equivalence, and the ubiquitous phenomenon of computational irreducibility that it implies.</p> <p>To begin with, the fact that time can robustly be thought of as “progressing”, in effect in a linear chain, is a consequence of computational irreducibility—because computational irreducibility is what tells us that computationally bounded observers like us can’t in general ever “jump ahead”; we just have to follow a linear chain of steps. </p> <p>But there’s something else as well. The Principle of Computational Equivalence implies that there’s in a sense just one (ubiquitous) kind of computational irreducibility. So when we look at different systems following different irreducible computational rules, there’s inevitably a certain universality to what they do. 
In effect they’re all “accumulating computational effects” in the same way. Or in essence progressing through time in the same way.</p> <p>There’s a close analogy here with heat. It could be that there’d be detailed molecular motion that even on a large scale <a href="https://writings.stephenwolfram.com/2023/01/how-did-we-get-here-the-tangled-history-of-the-second-law-of-thermodynamics/#what-is-heat">worked noticeably differently in different materials</a>. But the fact is that we end up being able to characterize any such motion just by saying that it represents a certain amount of heat, without getting into more details. And that’s very much the same kind of thing as being able to say that such-and-such an amount of time has passed, without having to get into the details of how some clock or other system that reflects the passage of time actually works.</p> <p>And in fact there’s more than a “conceptual analogy” here. Because the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">phenomenon of heat is again a consequence of computational irreducibility</a>. And the fact that there’s a uniform, “abstract” characterization of it is a consequence of the universality of computational irreducibility.</p> <p>It’s worth emphasizing again, though, that just as with heat, a robust concept of time depends on us being computationally bounded observers. If we were not, then we’d be able to break the Second Law by doing detailed computations of molecular processes, and we wouldn’t just describe things in terms of randomness and heat. And similarly, we’d be able to break the linear flow of time, either jumping ahead or following different threads of time. </p> <p>But as computationally bounded observers of computationally irreducible processes, it’s basically inevitable that—at least to a good approximation—we’ll view time as something that forms a single one-dimensional thread. 
</p> <p>In traditional mathematically based science there’s often a feeling that the goal should be to “predict the future”—or in effect to “outrun time”. But computational irreducibility tells us that in general we can’t do this, and that the only way to find out what will happen is just to run the same computation as the system itself, essentially step by step. But while this might seem like a letdown for the power of science, we can also see it as what gives meaning and significance to time. If we could always jump ahead then at some level nothing would ever fundamentally be achieved by the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-7--the-phenomenon-of-free-will">passage of time (or, say, by the living of our lives)</a>; we’d always be able to just say what will happen, without “living through” how we got there. But computational irreducibility gives time and the process of it passing a kind of hard, tangible character.</p> <p>So what does all this imply for the various classic issues (and apparent paradoxes) that arise in the way time is usually discussed? </p> <p>Let’s start with the question of reversibility. The traditional laws of physics basically apply both forwards and backwards in time. And the ruliad is inevitably symmetrical between “forward” and “backward” rules. So why is it then that in our typical experience time always seems to “run in the same direction”? </p> <p>This is closely related to the Second Law, and once again it’s a consequence of our computational boundedness interacting with underlying computational irreducibility. In a sense what defines the direction of time for us is that we (typically) find it much easier to remember the past than to predict the future. Of course, we don’t remember every detail of the past. We only remember what amounts to certain “filtered” features that “fit in our finite minds”. 
And when it comes to predicting the future, we’re limited by our inability to “outrun” computational irreducibility. </p> <p>Let’s recall how the Second Law works. It basically says that if we set up some state that’s “ordered” or “simple” then this will tend to “degrade” to one that’s “disordered” or “random”. (We can think of the evolution of the system as effectively “encrypting” the specification of our starting state to the point where we—as computationally bounded observers—can no longer recognize its ordered origins.) But because our underlying laws are reversible, this degradation (or “encryption”) must happen when we go both forwards and backwards in time:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/10/sw10082024timeAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/10/sw10082024timeAimg3.png' alt='' title='' width='169' height='167'> </div> </p></div> <p>But the point is that our “experiential” definition of the direction of time (in which the “past” is what we remember, and the “future” is what we find hard to predict) is inevitably aligned with the “thermodynamic” direction of time we observe in the world at large. And the reason is that in both cases we’re defining the past to be something that’s computationally bounded (while the future can be computationally irreducible). In the experiential case the past is computationally bounded because that’s what we can remember. In the thermodynamic case it’s computationally bounded because those are the states we can prepare. In other words, the “arrows of time” are aligned because in both cases we are in effect “requiring the past to be simpler”. </p> <p>So what about time travel? It’s a concept that seems natural—and perhaps even inevitable—if one imagines that “time is just like space”. 
But it becomes a lot less natural when we think of time in the way we’re doing here: as a process of applying computational rules. </p> <p>Indeed, at the lowest level, these rules are by definition just sequentially applied, producing one state after another—and in effect “progressing in one direction through time”. But things get more complicated if we consider not just the raw, lowest-level rules, but what we might actually observe of their effects. For example, what if the rules lead to a state that’s identical to one they’ve produced before (as happens, for example, in a system with periodic behavior)? If we equivalence the state now and the state before (so we represent both as a single state) then we can <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#black-holes-singularities-etc">end up with a loop in our causal graph (a “closed timelike curve”)</a>. And, yes, in terms of the raw sequence of applying rules these states can be considered different. But the point is that if they are identical in every feature then any observer will inevitably consider them the same. </p> <p>But will such equivalent states ever actually occur? As soon as there’s computational irreducibility it’s basically inevitable that the states will never perfectly match up. And indeed for the states to contain an observer like us (with “memory”, etc.) it’s basically impossible that they can match up. </p> <p>But can one imagine an observer (or a “timecraft”) that would lead to states that match up? Perhaps somehow it could carefully pick particular sequences of atoms of space (or elementary events) that would lead it to states that have “happened before”. And indeed in a computationally simple system this might be possible. But as soon as there’s computational irreducibility, this simply isn’t something one can expect any computationally bounded observer to be able to do. 
And, yes, this is directly analogous to why one <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/#maxwells-demon-and-the-character-of-observers">can’t have a “Maxwell’s demon”</a> observer that “breaks the Second Law”. Or why one can’t have something that carefully navigates the lowest-level structure of space to <a href="https://writings.stephenwolfram.com/2020/10/faster-than-light-in-our-model-of-physics-some-preliminary-thoughts/">effectively travel faster than light</a>. </p> <p>But even if there can’t be time travel in which “time for an observer goes backwards”, there can still be changes in “perceived time”, say as a result of relativistic effects associated with motion. For example, one classic relativistic effect is time dilation, in which “time goes slower” when objects go faster. And, yes, given certain assumptions, there’s a straightforward mathematical derivation of this effect. But in our effort to understand the nature of time we’re led to ask what its physical mechanism might be. And it turns out that in our Physics Project it has a surprisingly direct—and almost “mechanical”—explanation.</p> <p>One starts from the fact that in our Physics Project space and everything in it is represented by a hypergraph which is continually getting rewritten. And the evolution of any object through time is then defined by these rewritings. But if the object moves, then in effect it has to be “re-created at a different place in space”—and this process takes up a certain number of rewritings, leaving fewer for the intrinsic evolution of the object itself, and thus causing time to effectively “run slower” for it. (And, yes, while this is a qualitative description, one can make it quite formal and precise, and recover the usual formulas for relativistic time dilation.)</p> <p>Something similar happens with gravitational fields. 
In our Physics Project, energy-momentum (and thus gravity) is effectively associated with greater activity in the underlying hypergraph. And the presence of this greater activity leads to more rewritings, causing “time to run faster” for any object in that region of space (corresponding to the traditional “gravitational redshift”). </p> <p>More extreme versions of this occur in the context of black holes. (Indeed, one can roughly think of spacelike singularities as places where “time ran so fast that it ended”.) And in general—as we discussed above—there are many “relativistic effects” in which notions of space and time get mixed in various ways. </p> <p>But even at a much more mundane level there’s a certain crucial relationship between space and time for observers like us. The key point is that observers like us tend to “parse” the world into a sequence of “states of space” at successive “moments in time”. But the fact that we do this depends on some quite specific features of us, and in particular our effective physical scale in space as compared to time. </p> <p>In our everyday life we’re typically looking at scenes involving objects that are perhaps tens of meters away from us. And given the speed of light that means photons from these objects get to us in less than a microsecond. But it takes our brains milliseconds to register what we’ve seen. And this disparity of timescales is what leads us to view the world as consisting of a sequence of states of space at successive moments in time. </p> <p>If our brains “ran” a million times faster (i.e. at the speed of digital electronics) we’d perceive photons arriving from different parts of a scene at different times, and we’d presumably no longer view the world in terms of overall states of space existing at successive times. 
</p> <p>The same kind of thing would happen if we kept the speed of our brains the same, but dealt with scenes of a much larger scale (as we already do in dealing with spacecraft, astronomy, etc.).</p> <p>But while this affects what it is that we think time is “acting on”, it doesn’t ultimately affect the nature of time itself. Time remains that computational process by which successive states of the world are produced. Computational irreducibility gives time a certain rigid character, at least for computationally bounded observers like us. And the Principle of Computational Equivalence allows there to be a robust notion of time independent of the “substrate” that’s involved: whether us as observers, the everyday physical world, or, for that matter, the whole universe. </p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/10/on-the-nature-of-time/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item> <title>Nestedly Recursive Functions</title> <link>https://writings.stephenwolfram.com/2024/09/nestedly-recursive-functions/</link> <comments>https://writings.stephenwolfram.com/2024/09/nestedly-recursive-functions/#respond</comments> <pubDate>Fri, 27 Sep 2024 17:50:59 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Historical Perspectives]]></category> <category><![CDATA[Mathematics]]></category> <category><![CDATA[New Kind of Science]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=62558</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/09/swblog-recursive-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>Yet Another Ruliological Surprise Integers. Addition. Subtraction. Maybe multiplication. 
Surely that’s not enough to be able to generate any serious complexity. In the early 1980s I had made the very surprising discovery that very simple programs based on cellular automata could generate great complexity. But how widespread was this phenomenon? At the beginning of the […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/09/swblog-recursive-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><p><img class="aligncenter" title="Nestedly Recursive Functions" src="https://content.wolfram.com/sites/43/2024/09/recursive-hero-v2-1242.png" alt="Nestedly Recursive Functions" width="621px" height="324" /></p> <h2 id="yet-another-ruliological-surprise">Yet Another Ruliological Surprise</h2> <p>Integers. Addition. Subtraction. Maybe multiplication. Surely that’s not enough to be able to generate any serious complexity. In the early 1980s I had made the very surprising discovery that <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment/">very simple programs based on cellular automata could generate great complexity</a>. But how widespread was this phenomenon?</p> <p>At the beginning of the 1990s I had set about exploring this. Over and over I would consider <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs/">some type of system</a> and be sure it was too simple to “do anything interesting”. And over and over again I would be wrong. 
And so it was that on the night of August 13, 1993, I thought I should check what could happen with integer functions defined using just addition and subtraction.<span id="more-62558"></span></p> <p>I knew, of course, about <a href="https://www.wolframscience.com/nks/notes-3-5--fibonacci-numbers/">defining functions by recursion</a>, like <tt><a href="http://reference.wolfram.com/language/ref/Fibonacci.html">Fibonacci</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024surpriseAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg1A.png' alt='' title='' width='330' height='17'> </div> </p></div> <p>But could I find something like this that would have complex behavior? I did the analog of what I have done so many times, and just started (symbolically) enumerating possible definitions. And immediately I saw cases with nested functions, like:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024surpriseAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg2A.png' alt='' title='' width='332' height='17'> </div> </p></div> <p>(For some reason I wanted to keep the same initial conditions as <tt>Fibonacci</tt>: <em>f</em>[1] = <em>f</em>[2] = 1.) What would functions like this do? 
My original notebook records the result in this case:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg3.png' alt='Nestedly recursive function' title='Nestedly recursive function' width='383' height='225'/></p> <p>But a few minutes later I found something very different: a simple nestedly recursive function with what seemed like highly complex behavior: </p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg4.png' alt='Simple nestedly recursive function with complex behavior' title='Simple nestedly recursive function with complex behavior' width='383' height='225'/></p> <p>I remembered seeing a <a href="https://www.wolframscience.com/nks/notes-4-3--history-of-recursive-sequences/">somewhat similarly defined function discussed before</a>. But the behavior I’d seen reported for that function, while intricate, was <a href="https://www.wolframscience.com/nks/p130--recursive-sequences/">nested and ultimately highly regular</a>. And, so far as I could tell, much like with <a href="https://www.wolframscience.com/nks/p27--how-do-simple-programs-behave/">rule 30</a> and all the other systems I’d investigated, nobody had ever seen serious complexity in simple recursive functions.</p> <p>It was a nice example. But it was one among many. 
And when I published <em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em> in 2002, I devoted just <a href="https://www.wolframscience.com/nks/chap-4--systems-based-on-numbers#sect-4-3--recursive-sequences">four pages</a> (and <a href="https://www.wolframscience.com/nks/sect-4-3--recursive-sequences--notes/">7 notes</a>) to “recursive sequences”—even though the <a href="https://www.wolframscience.com/nks/p130--recursive-sequences/">gallery I made</a> of their behavior became a favorite page of mine:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg5.png' alt='Recursive sequences gallery' title='Recursive sequences gallery' width='383' height='422'/></p> <p>A year after the book was published we held our first <a href="https://education.wolfram.com/summer-school">Wolfram Summer School</a>, and as an opening event I decided to <a href="https://writings.stephenwolfram.com/2007/07/science-live-and-in-public/">do a live computer experiment</a>—in which I would try to make a real-time science discovery. The subject I chose was nestedly recursive functions. It took a few hours. But then, yes, we made a discovery! 
We found that there was a nestedly recursive function simpler than the ones I’d discussed in <em>A New Kind of Science</em> that already seemed to have very complex behavior: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024surpriseBimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg6A.png' alt='' title='' width='388' height='17'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024surpriseimg7.png' alt='' title='' width='585' height='153'> </div> </p></div> <p>Over the couple of decades that followed I returned many times to nestedly recursive functions—particularly in explorations I did <a href="https://education.wolfram.com/summer-camp">with high school</a> and other students, or in suggestions I made for student projects. Then recently I used them several times as <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/#combinators-in-the-wild-some-zoology">“intuition-building examples” in various investigations</a>. </p> <p>I’d always felt my work with nestedly recursive functions was unfinished. Beginning about five years ago—particularly energized by our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>—I started looking at harvesting seeds I’d sown in <em>A New Kind of Science</em> and before. I’ve <a href="https://writings.stephenwolfram.com/2024/08/five-most-productive-years-what-happened-and-whats-next/">been on quite a roll</a>, with a few pages or even footnotes repeatedly flowering into rich book-length stories. 
And finally—particularly after my work last year on “<a href="https://writings.stephenwolfram.com/2023/09/expression-evaluation-and-fundamental-physics/">Expression Evaluation and Fundamental Physics</a>”—I decided it was time to try to finish my exploration of nestedly recursive functions.</p> <p>Our modern <a href="https://www.wolfram.com/language/">Wolfram Language</a> tools—as well as ideas from our Physics Project—provided some new directions to explore. But I still thought I pretty much knew what we’d find. And perhaps after all these years I should have known better. Because somehow in the computational universe—and in the world of <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/#the-pure-basic-science-of-ruliology">ruliology</a>—there are always surprises.</p> <p>And here, yet again, there was indeed quite a surprise.</p> <h2 id="the-basic-idea">The Basic Idea</h2> <p>Consider the definition (later we’ll call this “P312”)</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024basicAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg1A.png' alt='' title='' width='350' height='17'> </div> </p></div> <p>which we can also write as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024basicAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg2A.png' alt='' title='' width='300' height='17'> </div> </p></div> <p>The first few values for <em>f</em>[<em>n</em>] generated from this definition are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg3_copy.txt' data-c2c-type='text/html'> <img 
loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg3.png' alt='' title='' width='459' height='13'> </div> </p></div> <p>Continuing further we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg4.png' alt='' title='' width='589' height='157'> </div> </p></div> <p>But how are these values actually computed? To see that we can make an “evaluation graph” in which we show how each value of <em>f</em>[<em>n</em>] is computed from ones with smaller values of <em>n</em>, here starting from <em>f</em>[20]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg5.png' alt='' title='' width='302' height='436'> </div> </p></div> <p>The gray nodes represent initial conditions: places where <em>f</em>[<em>n</em>] was sampled for <em>n </em>≤ 0. 
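</p> <p>As a rough check on which initial conditions actually get sampled, here is a minimal Wolfram Language sketch. It assumes the initial condition <em>f</em>[<em>n</em>] = 1 for all <em>n</em> ≤ 0 used in this piece; the <tt>sampled</tt> list and the memoization are illustrative choices, not necessarily how the evaluation graphs here were generated:</p>

```wolfram
ClearAll[f, sampled]
sampled = {};  (* hypothetical helper: arguments n <= 0 that get requested *)

(* initial conditions: record the argument, then return 1 *)
f[n_ /; n <= 0] := (AppendTo[sampled, n]; 1)

(* P312, storing each value for n > 0 once it has been computed *)
f[n_] := f[n] = 3 + f[n - f[n - 2]]

Table[f[n], {n, 1, 20}];
Union[sampled]
(* {-3, -2, -1, 0} *)
```

<p>Up to <em>n</em> = 20 the only initial conditions touched are <em>f</em>[–3] through <em>f</em>[0], which correspond to the four gray nodes.</p> <p>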
The two different colors of edges correspond to the two different computations done in evaluating <nobr>each <em>f</em>[<em>n</em>]</nobr>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg1.png' alt='' title='' width='162' height='30'> </div> </p></div> <p>Continuing to <em>f</em>[30] we get: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg7.png' alt='' title='' width='416' height='489'> </div> </p></div> <p>But what’s the structure of this graph? If we pull out the “red” graph on its own, we can see that it breaks into two path graphs, that consist of the sequences of the <em>f</em>[<em>n</em>] for odd and even <em>n</em>, respectively:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg8.png' alt='' title='' width='108' height='368'> </div> </p></div> <p>The “blue” graph, on the other hand, breaks into four components—each always a tree—leading respectively to the four different initial conditions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg9.png' alt='' title='' width='603' height='425'> </div> </p></div> <p>And for example we can now plot <em>f</em>[<em>n</em>], showing which tree 
each <em>f</em>[<em>n</em>] ends up being associated with:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg10.png' alt='' title='' width='590' height='147'> </div> </p></div> <p>We’ll be using this same basic setup throughout, though for different functions. We’ll mostly consider recursive definitions with a single term (i.e. with a single “outermost <em>f</em>”, not two, as in Fibonacci recurrences). </p> <p>The specific families of recursive functions we’ll be focusing on are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024basicimg11.png' alt='' title='' width='523' height='93'> </div> </p></div> <p>And with this designation, the function we just introduced is P312.</p> <h2 id="a-closer-look-at-p312--fn--3--fn---fn---2">A Closer Look at P312 ( f[n_] := 3 + f[n – f[n – 2]] )</h2> <p>Let’s start off by looking in more detail at the function we just introduced. Here’s what it does up to <em>n </em>= 500:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg1.png' alt='' title='' width='589' height='157'> </div> </p></div> <p>It might seem as if it’s going to go on “seemingly randomly” forever. 
But if we take it further, we get a surprise: it seems to “resolve itself” to something potentially simpler: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg2.png' alt='' title='' width='588' height='155'> </div> </p></div> <p>What’s going on? Let’s plot this again, but now showing which “blue graph tree” each value is associated with:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg3.png' alt='' title='' width='589' height='145'> </div> </p></div> <p>And now what we see is that the <em>f</em>[–3] and <em>f</em>[–2] trees stop contributing to <em>f</em>[<em>n</em>] when <em>n</em> is (respectively) 537 and 296, and these trees are finite (and have sizes 53 and 15):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg4.png' alt='' title='' width='595' height='499'> </div> </p></div> <p>The overall structures of the “remaining” trees—here shown up to <em>f</em>[5000]—eventually start to exhibit some regularity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg5.png' alt='' title='' width='599' height='195'> </div> </p></div> <p>We can home in on this regularity by arranging these trees in 
layers, starting from the root, then plotting the number of nodes in each successive layer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg6.png' alt='' title='' width='453' height='229'> </div> </p></div> <p>These pictures suggest that there should be some kind of more-or-less direct “formula” for <em>f</em>[<em>n</em>], at least for large <em>n</em>. They also suggest that such a formula should have some kind of mod-6 structure. And, yes, there does turn out to be essentially a “formula”. Though the “formula” is quite complicated—and reminiscent of several other “strangely messy” formulas in other ruliological cases—like <a href="https://www.wolframscience.com/nks/notes-12-8--turing-machine-600720/">Turing machine 600720</a> discussed in <em>A New Kind of Science</em> or <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/#combinators-in-the-wild-some-zoology">combinator <span class='clipboard-inline'><tt>s[s[s]][s][s][s][s]</tt></span></a>. </p> <p>Later on, we’ll see the much simpler recursive function P111 (<em>f</em>[<em>n</em>_] := 1 + <em>f</em>[<em>n</em> – <em>f</em>[<em>n</em> – 1]]). The values for this function form a sequence in which successive blocks of length <em>k</em> have value <em>k</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg7.png' alt='' title='' width='466' height='13'> </div> </p></div> <p>P312 has the same kind of structure, but much embellished. First, it has 6 separate riffled (“mod”) subsequences. 
Each subsequence then consists of a sequence of blocks. Given a value <em>n</em>, the following computes which subsequence <em>n</em> falls in, which block of that subsequence it’s in, and where it sits within that block:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg8.png' alt='' title='' width='540' height='47'> </div> </p></div> <p>So, for example, here are results for multiples of 1000:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg9.png' alt='' title='' width='671' height='150'> </div> </p></div> <p>For <em>n </em>= 1000 we’re not yet in the “simple” regime: we can’t describe the sequence in any simple way, and our “indices” calculation is meaningless. For <em>n </em>= 2000 it so happens that we are at block 0 for the mod-1 subsequence. And the way things are set up, we just start by giving exactly the form of block 0 for each mod. So for mod 1 the block is: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg10.png' alt='' title='' width='565' height='62'> </div> </p></div> <p>But now <em>n </em>= 2000 has offset 16 within this block, so the final value of <em>f</em>[2000] is simply the 16th value from this list, or 100. <em>f</em>[2001] is then simply the next element within this block, or 109. And so on—until we reach the end of the block. </p> <p>But what if we’re not dealing with block 0? 
For example, according to the table above, <em>f</em>[3000] is determined by mod-3 block 1. It turns out there’s a straightforward, if messy, way to compute any block <em>b</em> (for mod <em>m</em>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg11.png' alt='' title='' width='603' height='76'> </div> </p></div> <p>So now we have a way to compute the value, say of <em>f</em>[3000], effectively just by “evaluating a formula”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg12.png' alt='' title='' width='202' height='43'> </div> </p></div> <p>And what’s notable is that this evaluation doesn’t involve any recursion. In other words, at the cost of “messiness” we’ve—somewhat surprisingly—been able to unravel all the recursion in P312 to arrive at a “direct formula” for the value of <em>f</em>[<em>n</em>] for any <em>n</em>.</p> <p>So what else can we see about the behavior of <em>f</em>[<em>n</em>] for P312? One notable feature is its overall growth rate. 
For large <em>n</em>, it turns out that (as can be seen by substituting this form into the recursive definition and taking a limit):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024closerAimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg13A.png' alt='' title='' width='100' height='23'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg14.png' alt='' title='' width='549' height='147'> </div> </p></div> <p>One thing this means is that our evaluation graph eventually has a roughly conical form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg15.png' alt='' title='' width='569' height='194'> </div> </p></div> <p>This can be compared to the very regular cone generated by P111 (which has asymptotic value <img style="margin-bottom: 15px" src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg16.png' width= '35' height='20' align='absmiddle'>): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg17.png' alt='' title='' width='531' height='152'> </div> </p></div> <p>If one just looks at the form of the recursive definition for P312 it’s far from obvious “how far back” it will need to probe, or, in other words, what values 
of <em>f</em>[<em>n</em>] one will need to specify as initial conditions. As it turns out, though, the only values needed are <em>f</em>[–3], <em>f</em>[–2], <em>f</em>[–1] and <em>f</em>[0].</p> <p>How can one see this? In 3 + <em>f</em>[<em>n</em> – <em>f</em>[<em>n</em> – 2]] it’s only the outer <em>f</em> that can probe “far back” values. But how far it actually goes back depends on how large <em>f</em>[<em>n</em> – 2] gets compared to <em>n</em>. Plotting <em>f</em>[<em>n</em> – 2] and <em>n</em> together we have:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg18.png' alt='' title='' width='579' height='154'> </div> </p></div> <p>And the point is that only for very few values of <em>n</em> does <em>f</em>[<em>n</em> – 2] reach or exceed <em>n</em>—and it’s these values that probe back. Meanwhile, for larger <em>n</em>, there can never be additional “lookbacks”, because <em>f</em>[<em>n</em>] only grows like <img style="margin-bottom: 15px" src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg19.png' width= '35' height='20' align='absmiddle'>. </p> <p>So does every P312 recursion have the same lookback? So far, we’ve considered specifically the initial condition <em>f</em>[<em>n</em>] = 1 for all <em>n</em> ≤ 0. But what if we change the value of <em>f</em>[0]? 
Here are plots of <em>f</em>[<em>n</em>] for different cases:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg20.png' alt='' title='' width='608' height='532'> </div> </p></div> <p>And it turns out that with <em>f</em>[0] = <em>z</em>, the lookback goes to –<em>z</em> for <em>z </em>≥ 3, and to <em>z</em> – 4 for 1 ≤ <em>z</em> ≤ 2. </p> <p>(If <em>z </em>≤ 0 the function <em>f</em>[<em>n</em>] is basically not defined, because the recursion is trying to compute <em>f</em>[<em>n</em>] from <em>f</em>[<em>n</em>], <em>f</em>[<em>n</em> + 1], etc., so never “makes progress”.)</p> <p>The case <em>f</em>[0] = 2 (i.e. <em>z</em> = 2) is the one that involves the least lookback—and a total of 3 initial values. Here is the evaluation graph in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg21.png' alt='' title='' width='346' height='361'> </div> </p></div> <p>By comparison, here is the evaluation graph for the case <em>f</em>[0] = 5, involving 6 initial values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg22.png' alt='' title='' width='491' height='409'> </div> </p></div> <p>If we plot the value of <em>f</em>[<em>n</em>] as a function of <em>f</em>[0] we get the following:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg23.png' alt='' title='' width='610' height='219'> </div> </p></div> <p>For <em>n</em> &lt; 3 <em>f</em>[0], <em>f</em>[<em>n</em>] always has simple behavior, and is essentially periodic in <em>n</em> with period 3:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg24.png' alt='' title='' width='334' height='94'> </div> </p></div> <p>And it turns out that for any specified initial configuration of values, there is always only bounded lookback—with the bound apparently being determined by the largest of the initial values <em>f</em>[<em>n</em><sub>init</sub>].</p> <p>So what about the behavior of <em>f</em>[<em>n</em>] for large <em>n</em>? Just like in our original <em>f</em>[0] = 1 case, we can construct “blue graph trees” rooted at each of the initial conditions. In the case <em>f</em>[0] = 1 we found that of the 4 trees only two continue to grow as <em>n</em> increases. As we vary <em>f</em>[0], the number of “surviving trees” varies quite erratically: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg26.png' alt='' title='' width='516' height='140'> </div> </p></div> <p>What if instead of just changing <em>f</em>[0], and keeping all other <em>f</em>[–<em>k</em>] = 1, we set <em>f</em>[<em>n</em>] = <em>s</em> for all <em>n</em> ≤ 0? 
The result is somewhat surprising:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg27.png' alt='' title='' width='611' height='153'> </div> </p></div> <p>For <em>s</em> ≥ 2, the behavior turns out to be simple—and similar to the behavior of P111. </p> <p>So what can P312 be made to do if we change its initial conditions? With <em>f</em>[<em>n</em>] = 2 for <em>n </em>&lt; 0, we see that for small <em>f</em>[0] the behavior remains “tame”, but as <em>f</em>[0] increases it starts showing its typical complexity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg28.png' alt='' title='' width='611' height='255'> </div> </p></div> <p>One question to ask is what set of values <em>f</em>[<em>n</em>] takes on. Since each <em>f</em>[<em>n</em>] is 3 plus some earlier value, every value must have the same residue mod 3 as one of the initial values. But apart from this constraint, it seems that all values for <em>f</em>[<em>n</em>] are obtained—which is not surprising given that <em>f</em>[<em>n</em>] grows only like <img style="margin-bottom: 15px" src='https://content.wolfram.com/sites/43/2024/09/sw09192024Acloserimg29.png' width= '23' height='20' align='absmiddle'>. 
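</p> <p>The residue constraint is easy to check directly. Here is a minimal sketch, assuming the original initial condition <em>f</em>[<em>n</em>] = 1 for all <em>n</em> ≤ 0 (so every value should be congruent to 1 mod 3); the memoized definition and the range up to 500 are illustrative choices:</p>

```wolfram
ClearAll[f]
f[n_ /; n <= 0] = 1;                  (* assumed initial condition *)
f[n_] := f[n] = 3 + f[n - f[n - 2]]   (* P312, storing values once computed *)

(* distinct residues mod 3 among the first 500 values *)
Union[Mod[Table[f[n], {n, 1, 500}], 3]]
(* {1} *)
```

<p>Changing the initial values changes which residues can appear, but not the fact that they are inherited from the initial conditions.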
</p> <h2 id="the-p-family-fn--a--fn--b-fn--c">The “P Family”: f[n_] := a + f[n – b f[n – c]]</h2> <p>P312 is just one example of the “P family” of sequences defined by: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg1.png' alt='' title='' width='173' height='30'> </div> </p></div> <p>Here is the behavior of some other P<em>abc</em> sequences:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg2.png' alt='' title='' width='608' height='456'> </div> </p></div> <p>And here are their evaluation graphs:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg3.png' alt='' title='' width='576' height='309'> </div> </p></div> <p>P312 is the first “seriously complex” example. 
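</p> <p>For experimenting with other members of the family, here is a minimal sketch of a generic generator. The name <tt>pSequence</tt> is hypothetical, and the sketch assumes <em>f</em>[<em>n</em>] = 1 for all <em>n</em> ≤ 0 as the initial condition; for some choices of <em>a</em>, <em>b</em>, <em>c</em> the recursion may fail to terminate:</p>

```wolfram
ClearAll[pSequence]

(* first nmax values of f[n] = a + f[n - b f[n - c]], with f[n] = 1 for n <= 0 *)
pSequence[{a_, b_, c_}, nmax_] := Module[{f},
  f[n_ /; n <= 0] = 1;
  f[n_] := f[n] = a + f[n - b f[n - c]];  (* store each value once computed *)
  Table[f[n], {n, 1, nmax}]]

pSequence[{3, 1, 2}, 10]
(* {4, 7, 4, 4, 7, 10, 4, 4, 10, 13} *)

pSequence[{1, 1, 1}, 10]
(* {2, 2, 3, 3, 3, 4, 4, 4, 4, 5} *)
```

<p>The first call reproduces the start of P312, and the second gives the block structure of P111.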
</p> <p>P111 (as mentioned earlier) has a particularly simple form</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg4.png' alt='' title='' width='451' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg5.png' alt='' title='' width='413' height='113'> </div> </p></div> <p>which corresponds to the simple formula:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg6.png' alt='' title='' width='266' height='14'> </div> </p></div> <p>The evaluation graph in this case is just:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg7.png' alt='' title='' width='208' height='101'> </div> </p></div> <p>Only a single initial condition <em>f</em>[0] = 1 is used, and there is only a single “blue graph tree” with a simple form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg8.png' alt='' title='' width='138' height='138'> </div> </p></div> <p>Another interesting case is P123:</p> <div> <div 
class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg9.png' alt='' title='' width='451' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg10.png' alt='' title='' width='411' height='113'> </div> </p></div> <p>Picking out only odd values of <em>n</em> we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg11.png' alt='' title='' width='358' height='99'> </div> </p></div> <p>This might look just like the behavior of P111. But it’s not. 
The lengths of the successive “plateaus” are now</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg12.png' alt='' title='' width='422' height='13'> </div> </p></div> <p>with differences:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg13.png' alt='' title='' width='451' height='13'> </div> </p></div> <p>But this turns out to be exactly a nested sequence generated by joining together the successive steps in the evolution of the <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-5--substitution-systems">substitution system</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg14.png' alt='' title='' width='188' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg15.png' alt='' title='' width='345' height='99'> </div> </p></div> <p>P123 immediately “gets into its final behavior”, even for small <em>n</em>. But—as we saw rather dramatically with P312—there can be “transient behavior” that doesn’t “resolve” until <em>n</em> is large. A smaller case of this phenomenon occurs with P213. 
Above <em>n</em> = 68 it shows a simple “square root” pattern of behavior, basically like P111. But for smaller <em>n</em> it’s a bit more complicated: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg16.png' alt='' title='' width='611' height='114'> </div> </p></div> <p>And in this case the transients aren’t due to “blue graph trees” that stop growing. Instead, there are only two trees (associated with <em>f</em>[0] and <em>f</em>[–1]), but both of them soon end up growing in very regular ways:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024pfamimg17.png' alt='' title='' width='352' height='177'> </div> </p></div> <h2 id="the-t-family">The “T Family”: f[n_] := a f[n – b f[n – c]]</h2> <p>What happens if our outermost operation is not addition, but multiplication?</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg1.png' alt='' title='' width='161' height='30'> </div> </p></div> <p>Here are some examples of the behavior one gets. 
In each case we’re plotting on a log scale—and we’re not including T1<em>xx</em> cases, which are always trivial:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg2.png' alt='' title='' width='610' height='686'> </div> </p></div> <p>We see that some sequences have regular and readily predictable behavior, but others do not. And this is reflected in the evaluation graphs for these functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg3.png' alt='' title='' width='590' height='246'> </div> </p></div> <p>The first “complicated case” is T212: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg4.png' alt='' title='' width='151' height='30'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg5.png' alt='' title='' width='607' height='160'> </div> </p></div> <p>The evaluation graph for <em>f</em>[50] in this case has the form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg6.png' alt='' 
title='' width='476' height='449'> </div> </p></div> <p>And something that’s immediately notable is that in addition to “looking back” to the values of <em>f</em>[0] and <em>f</em>[–1], this also looks back to the value of <em>f</em>[–24]. Meanwhile, the evaluation graph for <em>f</em>[51] looks back not only to <em>f</em>[0] and <em>f</em>[–1] but also to <em>f</em>[–3] and <em>f</em>[–27]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg7.png' alt='' title='' width='450' height='423'> </div> </p></div> <p>How far back does it look in general? Here’s a plot showing which lookbacks are made as a function of <em>n</em> (with the roots of the “blue graph trees” highlighted):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg8.png' alt='' title='' width='611' height='222'> </div> </p></div> <p>There’s alternation between behaviors for even and odd <em>n</em>. But apart from that, additional lookbacks are just steadily added as <em>n</em> increases—and indeed the total number of lookbacks seems to follow a simple pattern:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg9.png' alt='' title='' width='660' height='138'> </div> </p></div> <p>But—just for once—if one looks in more detail, it’s not so simple. 
The lengths of the successive “blocks” are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg10.png' alt='' title='' width='489' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg11.png' alt='' title='' width='608' height='177'> </div> </p></div> <p>So, yes, the lookbacks are quite “unpredictable”. But the main point here is that—unlike for the P family—the number of lookbacks isn’t limited. In a sense, to compute T212 for progressively larger <em>n</em>, progressively more information about its initial conditions is needed. </p> <p>When one deals with ordinary, unnested recurrence relations, one’s always dealing with a fixed lookback. And the number of initial conditions then just depends on the lookback. (So, for example, the Fibonacci recurrence has lookback 2, so needs two initial conditions, while the standard factorial recurrence has lookback 1, so needs only one initial condition.)</p> <p>But for the nested recurrence relation T212 we see that this is no longer true; there can be an unboundedly large lookback. </p> <p>OK, but let’s look back at the actual T212 sequence. 
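</p> <p>For readers who want to reproduce the sequence and its lookbacks, here is a minimal sketch in Python (rather than Wolfram Language). It assumes T212 means <em>f</em>[<em>n</em>] = 2 <em>f</em>[<em>n</em> – <em>f</em>[<em>n</em> – 2]] with <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0, as read off from the T family definition; the function name <code>t212</code> and the dependency bookkeeping are illustrative choices, not taken from the original:</p>

```python
def t212(n_max):
    """Tabulate f[n] = 2 f[n - f[n - 2]] for n = 1..n_max, with f[n] = 1 for n <= 0.

    Returns (values, lookbacks). lookbacks[n] is the set of initial-condition
    indices (all <= 0) that the evaluation of f[n] ultimately depends on.
    """
    values = {}
    lookbacks = {}

    def val(m):
        # Initial condition: f[m] = 1 for every m <= 0.
        return values[m] if m > 0 else 1

    def deps(m):
        # An initial condition depends only on itself.
        return lookbacks[m] if m > 0 else frozenset([m])

    for n in range(1, n_max + 1):
        inner = n - 2              # argument of the inner f
        outer = n - val(inner)     # argument of the outer f; always < n since f >= 1
        values[n] = 2 * val(outer)
        lookbacks[n] = deps(inner) | deps(outer)
    return values, lookbacks


values, lookbacks = t212(51)
print([values[n] for n in range(1, 11)])      # [2, 4, 4, 2, 4, 4, 8, 4, 4, 8]
print(sorted(lookbacks[50], reverse=True))    # [0, -1, -24]
print(sorted(lookbacks[51], reverse=True))    # [0, -1, -3, -27]
```

<p>Under these assumptions the initial conditions reached from <em>f</em>[50] and <em>f</em>[51] agree with the evaluation graphs described above, and every tabulated value is a power of 2.</p> <p>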
Here it is up to larger values of <em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg12.png' alt='' title='' width='608' height='158'> </div> </p></div> <p>Or, plotting each point as a dot:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg13.png' alt='' title='' width='610' height='159'> </div> </p></div> <p>Given the recursive definition of <em>f</em>[<em>n</em>], the values of <em>f</em>[<em>n</em>] must always be powers of 2. This shows where each successive power of 2 is first reached as a function of <em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg14.png' alt='' title='' width='545' height='146'> </div> </p></div> <p>Meanwhile, this shows the accumulated average of <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg15.png' width= '27' height='16' align='absmiddle'></span><em>f</em>[<em>n</em>] as a function of <em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg16.png' alt='' title='' width='538' height='146'> </div> </p></div> <p>This is well fit by 0.38 Log[<em>n</em>], implying that, at least with this averaging, 
<em>f</em>[<em>n</em>] asymptotically approximates <em>n</em><sup>0.26</sup>. And, yes, it is somewhat surprising that what seems like a very “exponential” recursive definition should lead to an <em>f</em>[<em>n</em>] that increases only like a power. But, needless to say, this is the kind of surprise one has to expect in the computational universe.</p> <p>It’s worth noticing that <em>f</em>[<em>n</em>] fluctuates very intensely as a function of <em>n</em>. The overall distribution of values is very close to exponential—for example, with the distribution of logarithmic values of <em>f</em>[<em>n</em>] for <em>n</em> between 9 million and 10 million being:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg18.png' alt='' title='' width='213' height='130'> </div> </p></div> <p>What else can we say about this sequence? Let’s say we take the exponent of 2 in each <em>f</em>[<em>n</em>] and reduce it mod 2. Then we get a sequence which starts: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg19.png' alt='' title='' width='635' height='75'> </div> </p></div> <p>This is definitely not “uniformly random”. 
But if one looks at blocks of sequential values, one can plot at what <em>n</em> each of the 2<sup><em>b</em></sup> possible configurations of a length-<em>b</em> block first appears: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamAimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg21.png' alt='' title='' width='548' height='200'> </div> </p></div> <p>And eventually it seems as if all length-<em>b</em> blocks for any given <em>b</em> will appear.</p> <p>By the way, whereas in the P family there was always a limited number of “blue graph trees” (associated with the limited number of initial conditions), for T212 the number of such trees increases with <em>n</em>, as more initial conditions are used. So, for example, here are the trees for <em>f</em>[50] and <em>f</em>[51]: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg22.png' alt='' title='' width='674' height='215'> </div> </p></div> <p>We’ve so far discussed T212 only with the initial condition <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0. The fact that <em>f</em>[<em>n</em>] is always a power of 2 relies on every initial value also being a power of 2. 
But here’s what happens, for example, if <em>f</em>[<em>n</em>] = 2<sup><em>s</em></sup> for <em>n</em> ≤ 0:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg24.png' alt='' title='' width='629' height='50'> </div> </p></div> <p>In general, one can think of T212 as transforming an ultimately infinite sequence of initial conditions into an infinite sequence of function values, with different forms of initial conditions potentially giving very different sequences of function values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024tfamimg25.png' alt='' title='' width='332' height='191'> </div> </p></div> <p>(Note that not all choices of initial conditions are possible; some lead to “<em>f</em>[<em>n</em>] = <em>f</em>[<em>n</em>]” or <nobr>“<em>f</em>[<em>n</em>] = <em>f</em>[<em>n </em>+ 1]”</nobr> situations, where the evaluation of the function can’t “make progress”.)</p> <h2 id="the-summer-school-sequence-t311-fn--3-fn--fn--1">The “Summer School” Sequence T311 (f[n_] := 3 f[n – f[n – 1]])</h2> <p>Having explored T212, let’s now look at T311—the original one-term nestedly recursive function discovered at the 2003 Wolfram Summer School:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg1.png' alt='' title='' width='151' height='30'> </div> </p></div> <p>Here’s its basic behavior:</p> <div> <div class='wolfram-c2c-wrapper 
writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg2.png' alt='' title='' width='605' height='158'> </div> </p></div> <p>And here is its evaluation graph—which immediately reveals a lot more lookback than T212:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg3.png' alt='' title='' width='448' height='454'> </div> </p></div> <p>Plotting lookbacks as a function of <em>n </em>we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg4.png' alt='' title='' width='608' height='217'> </div> </p></div> <p>Much as with T212, the total number of lookbacks varies with <em>n</em> in a fairly simple way (~ 0.44 <em>n</em>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg5A.png' alt='' title='' width='540' height='119'> </div> </p></div> <p>Continuing the T311 sequence further, it looks qualitatively very much like T212:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg6.png' alt='' title='' width='609' height='159'> </div> 
</p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024summerimg7.png' alt='' title='' width='609' height='159'> </div> </p></div> <p>And indeed T311—despite its larger number of lookbacks—seems to basically behave like T212. In a story typical of the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a>, T212 seems to have already “filled out the computational possibilities”, so T311 “doesn’t have anything to add”. </p> <h2 id="the-s-family-fn--n--ffn--a--b">The “S Family”: f[n_] := n – f[f[n – a] – b]</h2> <p>As another (somewhat historically motivated) example of nestedly recursive functions, consider what we’ll call the “S family”, defined by:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg1.png' alt='' title='' width='159' height='30'> </div> </p></div> <p>Let’s start with the very minimal case S10 (or “S1”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg2.png' alt='' title='' width='140' height='30'> </div> </p></div> <p>Our standard initial condition <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0 doesn’t work here, because it implies that <em>f</em>[1] = 1 – <em>f</em>[1]. 
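</p> <p>This failure is easy to reproduce. Below is a minimal Python sketch (Python rather than Wolfram Language; the helper name <code>s1_values</code> and its self-reference check are illustrative assumptions, not from the original) that evaluates S1, <em>f</em>[<em>n</em>] = <em>n</em> – <em>f</em>[<em>f</em>[<em>n</em> – 1]], and reports when a value ends up defined in terms of itself:</p>

```python
def s1_values(n_max, init_upto):
    """Evaluate f[n] = n - f[f[n - 1]] with f[n] = 1 for n <= init_upto.

    Returns [f[1], ..., f[n_max]], or raises ValueError if some f[n]
    would have to be computed from itself.
    """
    known = {}
    in_progress = set()

    def f(n):
        if n <= init_upto:
            return 1                      # initial condition
        if n in known:
            return known[n]               # already computed
        if n in in_progress:
            raise ValueError(f"f[{n}] is defined in terms of itself")
        in_progress.add(n)
        value = n - f(f(n - 1))
        in_progress.discard(n)
        known[n] = value
        return value

    return [f(n) for n in range(1, n_max + 1)]


try:
    s1_values(5, init_upto=0)
except ValueError as err:
    print("n <= 0 fails:", err)

print(s1_values(10, init_upto=1))   # [1, 1, 2, 3, 3, 4, 4, 5, 6, 6]
```

<p>With the initial condition extended one step, the same definition evaluates without trouble.</p> <p>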
But if we take <em>f</em>[<em>n</em>] = 1 for <em>n </em>≤ 1 we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg3.png' alt='' title='' width='388' height='107'> </div> </p></div> <p>Meanwhile, with <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 3 we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg4A.png' alt='' title='' width='390' height='107'> </div> </p></div> <p>The first obvious feature of both these results is their overall slope: 1/ϕ ≈ 0.618, where ϕ is the golden ratio. It’s not too hard to see why one gets this slope. Assume that for large <em>n</em> we can take <em>f</em>[<em>n</em>] = σ <em>n</em>. Then substitute this form into both sides of the recursive definition for the S family to get σ <em>n</em> <span style="display: inline-block;width: .5rem;overflow: hidden;height: 1.25rem;">=</span><span style="display: inline-block;margin-left: -.08rem;display: inline-block;width: .4rem;overflow: hidden;height: 1.25rem;">=</span> <em>n</em> – σ (σ (<em>n</em> – <em>a</em>) – <em>b</em>). 
For large <em>n</em> all that survives is the condition for the coefficients <nobr>of <em>n</em></nobr></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg2.png' alt='' title='' width='70' height='14'> </div> </p></div> <p>which has solution σ = 1/ϕ.</p> <p>Plotting <em>f</em>[<em>n</em>] – <em>n</em>/ϕ for the case <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 1 we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg6.png' alt='' title='' width='571' height='150'> </div> </p></div> <p>The evaluation graph in this case has a fairly simple form</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg7.png' alt='' title='' width='594' height='248'> </div> </p></div> <p>as we can see even more clearly with a different graph layout:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg8.png' alt='' title='' width='616' height='264'> </div> </p></div> <p>It’s notable that only the initial condition <em>f</em>[1] = 1 is used—leading to a single “blue graph tree” that turns out to have a very simple “Fibonacci tree” form (which, as we’ll discuss below, has been known since the 1970s): </p> <div> <div 
class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg9.png' alt='' title='' width='545' height='216'> </div> </p></div> <p>From this it follows that <em>f</em>[<em>n</em>] is related to the <a href="https://www.wolframscience.com/nks/notes-3-5--properties-of-substitution-systems/">“Fibonacci-like” substitution system</a></p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg10.png' alt='' title='' width='155' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg11.png' alt='' title='' width='212' height='147'> </div> </p></div> <p>and in fact the sequence of values of <em>f</em>[<em>n</em>] can be computed just as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg12A.png' alt='' title='' width='533' height='17'> </div> </p></div> <p>And indeed it turns out that in this case <em>f</em>[<em>n</em>] is given exactly by:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg13.png' alt='' title='' 
width='208' height='14'> </div> </p></div> <p>What about when <em>f</em>[<em>n</em>] = 1 not just for <em>n</em> ≤ 1 but beyond? For <em>n</em> ≤ 2 the results are essentially the same as for <em>n</em> ≤ 1. But for <em>n</em> ≤ 3 there’s a surprise: the behavior is considerably more complicated—as we can see if we plot <em>f</em>[<em>n</em>] – <em>n</em>/ϕ: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg14.png' alt='' title='' width='611' height='164'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg15.png' alt='' title='' width='609' height='160'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg16.png' alt='' title='' width='612' height='160'> </div> </p></div> <p>Looking at the evaluation graph in this case we see that the only initial conditions sampled are <em>f</em>[1] = 1 and <em>f</em>[3] = 1 (with <em>f</em>[2] only being reached if one specifically starts with <em>f</em>[2]):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg17.png' alt='' title='' width='298' height='246'> </div> </p></div> <p>And continuing the evaluation graph we see a mixture of irregularity and 
comparative regularity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg18.png' alt='' title='' width='353' height='327'> </div> </p></div> <p>The plot of <em>f</em>[<em>n</em>] has a strange “hand-drawn” appearance, with overall regularity but detailed apparent randomness. The most obvious large-scale feature is “bursting” behavior (interspersed in an <a href="https://www.wolframcloud.com/obj/sw-writings0/NestedlyRecursive/RecursiveAudio">audio rendering</a> with an annoying hum). The bursts all seem to have approximately (though not exactly) the same structure—and get systematically larger. The lengths of successive “regions of calm” between bursts (characterized by runs with Abs[<em>f</em>[<em>n</em>] – <em>n</em>/ϕ] < 3) seem to consistently increase by a factor ϕ:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg19.png' alt='' title='' width='356' height='99'> </div> </p></div> <p>What happens to S1 with other initial conditions? Here are a few examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg20.png' alt='' title='' width='669' height='167'> </div> </p></div> <p>So how does S<em>a</em> depend on <em>a</em>? 
Sometimes there’s at least a certain amount of clear regularity; sometimes it’s more complicated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg21.png' alt='' title='' width='664' height='166'> </div> </p></div> <p>As is very common, adding the parameter <em>b</em> in the definition doesn’t seem to lead to fundamentally new behavior—though for <em>b </em>> 0 the initial condition <em>f</em>[<em>n</em>] = 1, <em>n </em>≤ 0 can be used:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09202024sfamimg22.png' alt='' title='' width='657' height='246'> </div> </p></div> <p>In all cases, only a limited number of initial conditions are sampled (bounded by the value of <em>a </em>+ <em>b</em> in the original definition). But as we can see, the behavior can either be quite simple, or can be highly complex.</p> <h2 id="more-complicated-rules">More Complicated Rules</h2> <p>Highly complex behavior arises even from very simple rules. It’s a phenomenon one sees all over the computational universe. And we’re seeing it here in nestedly recursive functions. But if we make the rules (i.e. 
definitions) for our functions more complicated, will we see fundamentally different behavior, or just more of the same?</p> <p>The <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> (as well as many empirical observations of other systems) suggests that it’ll be “more of the same”: that once one’s passed a fairly low threshold the computational sophistication—and complexity—of behavior will no longer change. </p> <p>And indeed this is what one sees in nestedly recursive functions. But below the threshold different kinds of things can happen with different kinds of rules. </p> <p>There are several directions in which we can make rules more complicated. One that we won’t discuss here is to use <a href="https://community.wolfram.com/groups/-/m/t/3210833?p_p_auth=sp4pFdlb">operations (conditional, bitwise, etc.) that go beyond arithmetic</a>. Others tend to involve adding more instances of <em>f</em> in our definitions.</p> <p>An obvious way to do this is to take <em>f</em>[<em>n</em>_] to be given by a sum of terms, “Fibonacci style”. There are various specific forms one can consider. As a first example—that we can call <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg1.png' width= '15' height='16' align='absmiddle'></span><em>ab—</em>let’s look at:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg2.png' alt='' title='' width='202' height='30'> </div> </p></div> <p>The value of <em>a</em> doesn’t seem to matter much. 
But changing <em>b</em> we see:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg3.png' alt='' title='' width='661' height='248'> </div> </p></div> <p><span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg4.png' width= '15' height='16' align='absmiddle'></span>12 has unbounded lookback (at least starting with <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0), but for larger <em>b</em>, <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg5.png' width= '15' height='16' align='absmiddle'></span>1<em>b</em> has bounded lookback. In both <span class='InlineFormula'><img style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg6.png' width= '15' height='16' align='absmiddle'></span>13 and <span class='InlineFormula'><img style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg7.png' width= '15' height='16' align='absmiddle'></span>15 there is continuing large-scale structure (here visible in log plots)</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg8.png' alt='' title='' width='646' height='215'> </div> </p></div> <p>though this does not seem to be reflected in the corresponding evaluation graphs:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg9.png' 
alt='' title='' width='626' height='147'> </div> </p></div> <p>As another level of Fibonacci-style definition, we can consider <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg10.png' width= '15' height='16' align='absmiddle'></span><em>ab</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg11.png' alt='' title='' width='242' height='30'> </div> </p></div> <p>But the typical behavior here does not seem much different from what we already saw with one-term definitions involving only two <em>f</em>’s:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg12.png' alt='' title='' width='608' height='152'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg13.png' alt='' title='' width='607' height='152'> </div> </p></div> <p>(Note that <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg14.png' width= '15' height='16' align='absmiddle'></span><em>aa</em> is equivalent to <span class='InlineFormula'><img style="margin-bottom: 2px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg15.png' width= '14' height='14' align='absmiddle'></span><em>a</em>. 
Cases like <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg16.png' width= '15' height='16' align='absmiddle'></span>13 lead after a transient to pure exponential growth.)</p> <p>A somewhat more unusual case is what we can call <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg17.png' width= '17' height='16' align='absmiddle'></span><em>abc</em>: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg18.png' alt='' title='' width='234' height='30'> </div> </p></div> <p>Subtracting overall linear trends we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg19.png' alt='' title='' width='659' height='248'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg20.png' alt='' title='' width='661' height='248'> </div> </p></div> <p>For <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg21.png' width= '17' height='16' align='absmiddle'></span>111 using initial conditions <em>f</em>[1] = <em>f</em>[2] = 1 and plotting <em>f</em>[<em>n</em>] – <em>n</em>/2 we get</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg22.png' alt='' title='' width='667' height='122'> </div> </p></div> <p>which has a <a href="https://www.wolframscience.com/nks/p130--recursive-sequences/">nested structure</a> that is <a href="https://www.wolframscience.com/nks/notes-4-3--properties-of-recursive-sequences/">closely related</a> to the result of <a href="https://www.wolframscience.com/nks/notes-4-5--concatenation-sequences/">concatenating binary digit sequences</a> of successive integers:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg23.png' alt='' title='' width='532' height='14'> </div> </p></div> <p>But despite the regularity in the sequence of values, the evaluation graph for this function is not particularly simple:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg24.png' alt='' title='' width='465' height='230'> </div> </p></div> <p>So how else might we come up with more complicated rules? One possibility is that instead of “adding <em>f</em>’s by adding terms” we can add <em>f</em>’s by additional nesting. 
So, for example, we can consider what we can call S<sub>3</sub>1 (here shown with initial condition <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 3):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg26.png' alt='' title='' width='161' height='36'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg27.png' alt='' title='' width='436' height='119'> </div> </p></div> <p>We can estimate the overall slope here by solving for <em>x</em> in <em>x</em> <span style="display: inline-block;width: .5rem;overflow: hidden;height: 1.25rem;">=</span><span style="display: inline-block;margin-left: -.08rem;display: inline-block;width: .4rem;overflow: hidden;height: 1.25rem;">=</span> 1 – <em>x</em><sup>3</sup> to get ≈ 0.682. Subtracting this off we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg29.png' alt='' title='' width='608' height='161'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg30.png' alt='' title='' width='608' height='158'> </div> </p></div> <p>We can also consider deeper nestings. 
At depth <em>d</em> the slope is the solution to <em>x</em> <span style="display: inline-block;width: .5rem;overflow: hidden;height: 1.25rem;">=</span><span style="display: inline-block;margin-left: -.08rem;display: inline-block;width: .4rem;overflow: hidden;height: 1.25rem;">=</span> 1 – <em>x</em><sup><em>d</em></sup>. Somewhat remarkably, in all cases the only initial conditions probed are <em>f</em>[1] = 1 and <em>f</em>[3] = 1:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg32.png' alt='' title='' width='609' height='229'> </div> </p></div> <p>As another example of “higher nesting” we can consider the class of functions (that we call <span class='InlineFormula'><img style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg33.png' width= '18' height='16' align='absmiddle'></span><em>a</em>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg34.png' alt='' title='' width='180' height='36'> </div> </p></div> <p>Subtracting a constant 1/ϕ slope we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg35_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg35.png' alt='' title='' width='653' height='163'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg3.png' alt='' title='' width='651' height='163'> </div> </p></div> <p>The evaluation graph for <span class='InlineFormula'><img style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg37.png' width= '18' height='16' align='absmiddle'></span>1 is complicated, but has some definite structure:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg38_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg38.png' alt='' title='' width='632' height='300'> </div> </p></div> <p>What happens if we nest even more deeply, say defining <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg39.png' width= '18' height='16' align='absmiddle'></span><em>a</em> functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg40_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg40.png' alt='' title='' width='201' height='43'> </div> </p></div> <p>With depth-<em>d</em> nesting, we can estimate the overall slope of <em>f</em>[<em>n</em>] by solving for <em>x</em> in</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg5.png' alt='' title='' width='115' height='19'> </div> </p></div> <p>or</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg6_copy.txt' data-c2c-type='text/html'> <img 
loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg6.png' alt='' title='' width='117' height='15'> </div> </p></div> <p>so that for the <em>d </em>= 3 case here the overall slope is the real root of <span class='InlineFormula'><img style="margin-bottom: 6px" src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg43.png' width= '103' height='14' align='absmiddle'></span> or about 0.544. Subtracting out this overall slope we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg44_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg44.png' alt='' title='' width='652' height='244'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg5.png' alt='' title='' width='642' height='241'> </div> </p></div> <p>And, yes, the sine-curve-like form of <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg46.png' width= '18' height='16' align='absmiddle'></span>5 is very odd. Continuing 10x longer, though, things are “squaring off”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg9A.png' alt='' title='' width='641' height='213'> </div> </p></div> <p>What happens if we continue nesting deeper? 
<span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg48.png' width= '29' height='14' align='absmiddle'></span> stays fairly tame:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg49_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg49.png' alt='' title='' width='645' height='242'> </div> </p></div> <p>However, <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg50.png' width= '31' height='14' align='absmiddle'></span> already allows for more complicated behavior:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg51_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg51.png' alt='' title='' width='643' height='241'> </div> </p></div> <p>And for different values of <em>a</em> there are different regularities:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg52_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09232024rulesimg52.png' alt='' title='' width='646' height='242'> </div> </p></div> <p>There are all sorts of other extensions and generalizations one might consider. Some involve alternate functional forms; others involve introducing additional functions, or allowing multiple arguments to our function <em>f</em>.</p> <h2 id="an-aside-the-continuous-case">An Aside: The Continuous Case</h2> <p>In talking about recursive functions <em>f</em>[<em>n</em>] we’ve been assuming—as one normally does—that <em>n</em> is always an integer. 
But can we generalize what we’re doing to functions <em>f</em>[<em>x</em>] where <em>x</em> is a continuous real number?</p> <p>Consider for example a continuous analog of the Fibonacci recurrence:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg1_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg2.png' alt='' title='' width='292' height='14'> </div> </p></div> <p>This produces a staircase-like function whose steps correspond to the usual Fibonacci numbers:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg3.png' alt='' title='' width='422' height='113'> </div> </p></div> <p>Adjusting the initial condition produces a slightly different result:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg4_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg5.png' alt='' title='' width='292' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg6.png' alt='' title='' width='414' height='111'> </div> </p></div> <p>We can think of these as being solutions to a kind of “Fibonacci delay equation”—where we’ve given initial conditions not at discrete points, but instead on an interval.</p> <p>So what happens with nestedly recursive functions? 
We can define an analog of S1 as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg7_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg8.png' alt='' title='' width='267' height='14'> </div> </p></div> <p>Plotting this along with the discrete result we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg9.png' alt='' title='' width='620' height='86'> </div> </p></div> <p>In more detail, we get</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg10.png' alt='' title='' width='502' height='135'> </div> </p></div> <p>where now the plateaus occur at the (“<a href="https://oeis.org/A000201" target="_blank" rel="noopener">Wythoff numbers</a>”) <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg11.png' width= '150' height='14' align='absmiddle'></span>. 
</p> <p>Changing the initial condition to be <em>x</em> ≤ 3 we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg12.png' alt='' title='' width='619' height='85'> </div> </p></div> <p>Removing the overall slope by subtracting <em>x</em>/ϕ gives:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg13.png' alt='' title='' width='631' height='169'> </div> </p></div> <p>One feature of the continuous case is that one can continuously change initial conditions—though the behavior one gets typically breaks into “domains” with discontinuous boundaries, as in this case where we’re plotting the value of <em>f</em>[<em>x</em>] as a function of <em>x</em> and the “cutoff” <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg14.png' width= '15' height='11' align='absmiddle'></span> in the initial conditions <em>f</em>[<em>x</em>], <em>x</em> ≤ <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg15.png' width= '15' height='11' align='absmiddle'></span>: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg16.png' alt='' title='' width='618' height='229'> </div> </p></div> <p>So what about other rules? 
A rule like P312 (<em>f</em>[<em>n</em>_] := 3 + <em>f</em>[<em>n</em> – <em>f</em>[<em>n</em> – 2]]) given “constant” initial conditions effectively just copies and translates the initial interval, and gives a simple order-0 interpolation of the discrete case. With initial condition <em>f</em>[<em>x</em>] = <em>x</em> some segments get “tipped”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg17.png' alt='' title='' width='385' height='106'> </div> </p></div> <p>None of the cases we’ve considered here “look back” to negative values, in either the discrete or continuous case. But what about a rule like T212 (<em>f</em>[<em>n</em>_] := 2 <em>f</em>[<em>n</em> – 1 <em>f</em>[<em>n</em> – 2]]) that progressively “looks back further”? With the initial condition <em>f</em>[<em>x</em>] = 1 for <em>x</em> ≤ 0, one gets the same result as in the discrete case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg7.png' alt='' title='' width='341' height='108'> </div> </p></div> <p>But if one uses the initial condition <em>f</em>[<em>x </em>] = Abs[<em>x</em> – 1] for <em>x</em> ≤ 0 (the Abs[<em>x</em> <span style="font-size:14px">–</span> 1] is needed to avoid ending up with <em>f</em>[<em>x</em>] depending on <em>f</em>[<em>y</em>] for <em>y</em> > <em>x</em>) one instead has</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg19_copy.txt' data-c2c-type='text/html'><img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg20.png' alt='' title='' width='350' height='14'> </div> </p></div> <p>yielding the rather different result:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg21.png' alt='' title='' width='608' height='173'> </div> </p></div> <p>Continuing for larger <em>x</em> (on a log scale) we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg22.png' alt='' title='' width='600' height='169'> </div> </p></div> <p>Successively zooming in on one of the first “regions of noise” we see that it ultimately consists just of a large number of straight segments:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg23B.png' alt='' title='' width='634' height='182'> </div> </p></div> <p>What’s going on here? 
If we count the number of initial conditions that are used for different values of <em>x</em> we see that this has discontinuous changes, leading to disjoint segments in <em>f</em>[<em>x</em>]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg24.png' alt='' title='' width='635' height='185'> </div> </p></div> <p>Plotting over a larger range of <em>x</em> values the number of initial conditions used is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg25.png' alt='' title='' width='596' height='159'> </div> </p></div> <p>And plotting the actual values of those initial conditions we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg11A.png' alt='' title='' width='590' height='169'> </div> </p></div> <p>If we go to later, “more intense” regions of noise, we see more fragmentation—and presumably in the limit <em>x</em> <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > ∞ we get the analog of an essential singularity in <em>f</em>[<em>x</em>]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg27_copy.txt' data-c2c-type='text/html'> <img src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg9.png' alt='' 
title='' width='594' height='162'/> </div> </p></div> <p>For the S family, with its overall <em>n</em>/ϕ trend, even constant initial conditions—say for S1—already lead to tipping, here shown compared to the discrete case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024asideimg28.png' alt='' title='' width='620' height='84'> </div> </p></div> <h2 id="how-do-you-actually-compute-recursive-functions">How Do You Actually Compute Recursive Functions?</h2> <p>Let’s say we have a recursive definition—like the standard Fibonacci one:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg1_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg2.png' alt='' title='' width='327' height='14'> </div> </p></div> <p>How do we actually use this to compute the value of, say, <em>f</em>[7]? Well, we can start from <em>f</em>[7], then use the definition to write this as <em>f</em>[6] + <em>f</em>[5], then write <em>f</em>[6] as <em>f</em>[5] + <em>f</em>[4], and so on. 
And we can <a href="https://writings.stephenwolfram.com/2023/09/expression-evaluation-and-fundamental-physics/">represent this using an evaluation graph</a>, in the form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg10.png' alt='' title='' width='548' height='271'> </div> </p></div> <p>But this computation is in a sense very wasteful; for example, it’s independently computing <em>f</em>[3] five separate times (and of course getting the same answer each time). But what if we just stored each <em>f</em>[<em>n</em>] as soon as we computed it, and then just retrieved that stored (“cached”) value whenever we needed it again?</p> <p>In the Wolfram Language, it’s a very simple change to our original definition:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg4_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg5.png' alt='' title='' width='381' height='14'> </div> </p></div> <p>And now our evaluation graph becomes much simpler:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg11.png' alt='' title='' width='135' height='204'> </div> </p></div> <p>And indeed it’s this kind of minimal evaluation graph that we’ve been using in everything we’ve discussed so far. </p> <p>What’s the relationship between the “tree” evaluation graph, and this minimal one? 
The tree graph is basically an “unrolled” version of the minimal graph, in which all the possible paths that can be taken from the root node to the initial condition nodes have been treed out. </p> <p>In general, the number of edges that come out of a single node in an evaluation graph will be equal to the number of instances of the function <em>f</em> that appear on the right-hand side of the recursive definition we’re using (i.e. 2 in the case of the standard Fibonacci definition). So this means that if the maximum length of path from the root to the initial conditions is <em>s</em>, the maximum number of nodes that can appear in the “unrolled” graph is 2<sup><em>s</em></sup>. And whenever there is a fixed set of initial conditions (i.e. if there’s always the same lookback), the maximum path length is essentially <em>n</em>—implying in the end that the maximum possible number of nodes in the unrolled graph will be 2<sup><em>n</em></sup>.</p> <p>(In the actual case of the Fibonacci recurrence, the number of nodes in the unrolled graph grows like ϕ<sup><em>n</em></sup>, or about 1.6<sup><em>n</em></sup>.)</p> <p>But if we actually evaluate <em>f</em>[7]—say in the Wolfram Language—what is the sequence of <em>f</em>[<em>n</em>]’s that we’ll end up computing? Or, in effect, how will the evaluation graph be traversed? Here are the results for the unrolled and minimal evaluation graphs—i.e. without and with caching:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg22.png' alt='' title='' width='621' height='84'> </div> </p></div> <p>Particularly in the first case this isn’t the only conceivable result we could have gotten. 
It’s the way it is here because of the particular “<a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/#the-question-of-evaluation-order">leftmost innermost</a>” evaluation order that the Wolfram Language uses by default. In effect, we’re traversing the graph in a depth-first way. In principle we could use other traversal orders, leading to <em>f</em>[<em>n</em>]’s being evaluated in different orders. But unless we allow other operations (like <em>f</em>[3] + <em>f</em>[3] <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > 2 <em>f</em>[3]) to be interspersed with <em>f</em> evaluations, we’ll still always end up with the same number of <em>f</em> evaluations for a given evaluation graph.</p> <p>But which is the “correct” evaluation graph? The unrolled one? Or the minimal one? Well, it depends on the computational primitives we’re prepared to use. With a pure stack machine, the unrolled graph is the only one possible. But if we allow (random-access) memory, then the minimal graph becomes possible.</p> <p>OK, so what happens with nestedly recursive functions? 
Here, for example, are unrolled and minimal graphs for T212: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg12.png' alt='' title='' width='634' height='268'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg13A.png' alt='' title='' width='582' height='227'> </div> </p></div> <p>Here are the sequences of <em>f</em>[<em>n</em>]’s that are computed: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg10.png' alt='' title='' width='619' height='104'> </div> </p></div> <p>And here’s a comparison of the number of nodes (i.e. <em>f</em> evaluations) from unrolled and minimal evaluation graphs (roughly 1.2<sup><em>n</em></sup> and 0.5 <em>n</em>, respectively):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg11.png' alt='' title='' width='523' height='126'> </div> </p></div> <p>Different recursive functions lead to different patterns of behavior. 
The differences are less obvious in evaluation graphs, but can be quite obvious in the actual sequence of <em>f</em>[<em>n</em>]’s that are evaluated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024computeimg12.png' alt='' title='' width='620' height='104'> </div> </p></div> <p>But although looking at evaluation sequences from unrolled evaluation graphs can be helpful as a way of classifying behavior, the exponentially larger number of steps involved in the unrolled graph typically makes this impractical.</p> <h2 id="primitive-recursive-or-not">Primitive Recursive or Not?</h2> <p>Recursive functions have a fairly long history, which we’ll be discussing below. And for nearly a hundred years there’s been a distinction made between “<a href="https://www.wolframscience.com/nks/notes-4-3--primitive-recursive-functions/">primitive recursive functions</a>” and “general recursive functions”. 
Primitive recursive functions are basically ones where there’s a “known-in-advance” pattern of computation that has to be done; general recursive functions are ones that may in effect make one have to “search arbitrarily far” to get what one needs.</p> <p>In Wolfram Language terms, primitive recursive functions are roughly ones that can be constructed directly using functions like <tt><a href="http://reference.wolfram.com/language/ref/Nest.html">Nest</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/Fold.html">Fold</a></tt> (perhaps nested); general recursive functions can also involve functions like <tt><a href="http://reference.wolfram.com/language/ref/NestWhile.html">NestWhile</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/FoldWhile.html">FoldWhile</a></tt>.</p> <p>So, for example, with the Fibonacci definition</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg2.png' alt='' title='' width='327' height='14'> </div> </p></div> <p>the function <em>f</em>[<em>n</em>] is primitive recursive and can be written, say, as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg4.png' alt='' title='' width='391' height='14'> </div> </p></div> <p>Lots of the functions one encounters in practice are similarly primitive recursive—including most “typical mathematical functions” (<tt><a href="http://reference.wolfram.com/language/ref/Plus.html">Plus</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Power.html">Power</a></tt>, <tt><a 
href="http://reference.wolfram.com/language/ref/GCD.html">GCD</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Prime.html">Prime</a></tt>, …). And for example functions that give the results of <em>n</em> steps in the <a href="https://reference.wolfram.com/language/ref/TuringMachine.html">evolution of a Turing machine</a>, <a href="https://reference.wolfram.com/language/ref/CellularAutomaton.html">cellular automaton</a>, etc. are also primitive recursive. But functions that for example test whether a Turing machine will ever halt (or give the state that it achieves if and when it does halt) are not in general primitive recursive. </p> <p>On the face of it, our nestedly recursive functions seem like they must be primitive recursive, since they don’t for example appear to be “searching for anything”. But things like the presence of longer and longer lookbacks raise questions. And then there’s the potential confusion of the very first example (dating from the late 1920s) of a recursively defined function known not to be primitive recursive: the <a href="https://www.wolframscience.com/nks/notes-4-3--ackermann-functions/">Ackermann function</a>. 
</p> <p>The Ackermann function has three (or sometimes two) arguments—and, notably, its definition (here given in its classic form) includes nested recursion:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg6_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg6.png' alt='' title='' width='518' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg7.png' alt='' title='' width='267' height='30'> </div> </p></div> <p>This is what the evaluation graphs look like for some small cases:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg8.png' alt='' title='' width='506' height='594'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09262024updatesfinalBimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg9.png' alt='' title='' width='570' height='347'> </div> </p></div> <p>Looking at these graphs we can begin to see a pattern. And in fact there’s a simple interpretation: <em>f</em>[<em>m</em>, <em>x</em>, <em>y</em>] for successive <em>m</em> is doing progressively more nested iterations of integer successor operations. <em>f</em>[0, <em>x</em>, <em>y</em>] computes <em>x</em> + <em>y</em>; <em>f</em>[1, <em>x</em>, <em>y</em>] does “repeated addition”, i.e. 
computes <em>x</em> <em>× y</em>; <em>f</em>[2, <em>x</em>, <em>y</em>] does “repeated multiplication”, i.e. computes <em>x</em><sup><em>y</em></sup>; <em>f</em>[3, <em>x</em>, <em>y</em>] does “<a href="https://mathworld.wolfram.com/PowerTower.html">tetration</a>”, i.e. computes the “power tower” Nest[<em>x</em><sup>#&</sup>, 1, <em>y</em>]; etc. </p> <p>Or, alternatively, these can be given explicitly in successively more nested form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg11.png' alt='' title='' width='147' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg13.png' alt='' title='' width='248' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg15.png' alt='' title='' width='353' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg17.png' alt='' title='' width='459' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg18_copy.txt' data-c2c-type='text/html'> <img 
loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg19.png' alt='' title='' width='566' height='14'></div> </p></div> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitivedots.png' alt='' title='' width='54' height='13'></p> <p>And at least in this form <em>f</em>[<em>m</em>, <em>x</em>, <em>y</em>] involves <em>m</em> nestings. But a given primitive recursive function can involve only a fixed number of nestings. It might be conceivable that we could rewrite <nobr><em>f</em>[<em>m</em>, <em>x</em>, <em>y</em>]</nobr> in certain cases to involve only a fixed number of nestings. But if we look at <nobr><em>f</em>[<em>m</em>, <em>m</em>, <em>m</em>]</nobr> then this turns out to inevitably grow too rapidly to be represented by a fixed number of nestings—and thus cannot be primitive recursive. </p> <p>But it turns out that the fact that this can happen depends critically on the Ackermann function having more than one argument—so that one can construct the “diagonal” <em>f</em>[<em>m</em>, <em>m</em>, <em>m</em>]. </p> <p>So what about our nestedly recursive functions? Well, at least in the form that we’ve used them, they can all be written in terms of <tt><a href="http://reference.wolfram.com/language/ref/Fold.html">Fold</a></tt>. The key idea is to accumulate a list of values so far (conveniently represented as an association)—sampling whichever parts are needed—and then at the end take the last element. 
So for example the “Summer School function” T311</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg21.png' alt='' title='' width='335' height='14'> </div> </p></div> <p>can be written: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg23.png' alt='' title='' width='518' height='38'> </div> </p></div> <p>An important feature here is that we’re getting <tt><a href="http://reference.wolfram.com/language/ref/Lookup.html">Lookup</a></tt> to give 1 if the value it’s trying to look up hasn’t been filled in yet, implementing the fact that <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0. </p> <p>So, yes, our recursive definition might look back further and further. But it always just finds value 1—which is easy for us to represent without, for example, any extra nesting, etc.</p> <p>The ultimate (historical) definition of primitive recursion, though, doesn’t involve subsets of the Wolfram Language (the definition was given almost exactly 100 years too early!). 
Instead, it involves a <a href="https://www.wolframscience.com/nks/notes-4-3--primitive-recursive-functions/">specific set of simple primitives</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024finalupdatesimg14.png' alt='' title='' width='669' height='174'> </div> </p></div> <p>(An alternative, equivalent definition for recursion—explicitly involving <tt><a href="http://reference.wolfram.com/language/ref/Fold.html">Fold</a></tt>—is <nobr><tt>r[g_, h_] := Fold[{u, v} <img style="margin-bottom: -2px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow.png" width='18' height='12' > h[u, v, ##2], g[##2]</tt></nobr><tt>, </tt><tt><a href="http://reference.wolfram.com/language/ref/Range.html">Range</a></tt><tt>[0, #1 - 1]] &</tt>.)</p> <p>So can our nestedly recursive functions be written purely in terms of these primitives? The answer is yes, though it’s seriously complicated. A simple function like <tt><a href="http://reference.wolfram.com/language/ref/Plus.html">Plus</a></tt> can for example be written as <tt>r[p[1], s]</tt>, so that e.g. <tt>r[p[1], s][2,3]<img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' >5</tt>. <tt><a href="http://reference.wolfram.com/language/ref/Times.html">Times</a></tt> can be written as <tt>r[z, c[Plus, p[1], p[3]]]</tt> or <tt>r[z, c[r[p[1], s], p[1], p[3]]]</tt>, while <tt><a href="http://reference.wolfram.com/language/ref/Factorial.html">Factorial</a></tt> can be written as <tt>r[c[s, z], c[Times, p[1], c[s, p[2]]]]</tt>. 
But even <tt><a href="http://reference.wolfram.com/language/ref/Fibonacci.html">Fibonacci</a></tt>, for example, seems to require a very much longer specification.</p> <p>In writing “primitive-recursive-style” definitions in Wolfram Language we accumulated values in lists and associations. But in the ultimate definition of primitive recursion, there are no such constructs; the only form of “data” is non-negative integers. But for our definitions of nestedly recursive functions we can use a “<a href="https://arxiv.org/pdf/1706.04129" target="_blank" rel="noopener">tupling function</a>” that “packages up” any list of integer values into a single integer (and an untupling function that unpacks it). And we can do this, say, based on a pairing (2-element-tupling) function like:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg26.png' alt='' title='' width='216' height='17'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg27.png' alt='' title='' width='117' height='126'> </div> </p></div> <p>But what about the actual <tt><a href="http://reference.wolfram.com/language/ref/If.html">If</a></tt><tt>[n ≤ 0, 1, ...]</tt> lookback test itself? Well, <tt>If</tt> can be written in primitive recursive form too: for example, <tt>r[c[s, z], c[f, c[s, p[2]]]][n]</tt> is equivalent to <tt>If</tt><tt>[n ≤ 0, 1, f[n]]</tt>.</p> <p>So our nestedly recursive functions as we’re using them are indeed primitive recursive. Or, more strictly, finding values <em>f</em>[<em>n</em>] is primitive recursive. 
Asking questions like “For what <em>n</em> does <em>f</em>[<em>n</em>] reach 1000?” might not be primitive recursive. (The obvious way of answering them involves a <tt><a href="http://reference.wolfram.com/language/ref/FoldWhile.html">FoldWhile</a></tt>-style non-primitive-recursive search, but proving that there’s no primitive recursive way to answer the question is likely very much harder.)</p> <p>By the way, it’s worth commenting that while for primitive recursive functions it’s always possible to compute a value <em>f</em>[<em>n</em>] for any <em>n</em>, that’s not necessarily true for general recursive functions. For example, if we ask “For what <em>n</em> does <em>f</em>[<em>n</em>] reach 1000?” there might simply be no answer to this; <em>f</em>[<em>n</em>] might never reach 1000. And when we look at the computations going on underneath, the key distinction is that in evaluating primitive recursive functions, the computations always halt, while for general recursive functions, they may not.</p> <p>So, OK. Our nestedly recursive functions can be represented in “official primitive recursive form”, but they’re very complicated in that form. So that raises the question: what functions can be represented simply in this form? 
In <em>A New Kind of Science</em> I <a href="https://www.wolframscience.com/nks/notes-4-3--primitive-recursive-functions/">gave some examples</a>, each minimal for the output it produces:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg28.png' alt='' title='' width='693' height='162'> </div> </p></div> <p>And then there’s the most interesting function I found:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg29.png' alt='' title='' width='148' height='14'> </div> </p></div> <p>It’s the simplest primitive recursive function whose output has no obvious regularity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg30.png' alt='' title='' width='607' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg31.png' alt='' title='' width='660' height='175'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg32.png' alt='' 
title='' width='659' height='172'> </div> </p></div> <p>Because it’s primitive recursive, it’s possible to express it in terms of functions like <tt><a href="http://reference.wolfram.com/language/ref/Fold.html">Fold</a></tt>—though it’s two deep in those, making it in some ways more complicated (at least as far as the <a href="https://www.wolframscience.com/nks/notes-4-3--ackermann-functions/">Grzegorczyk hierarchy</a> that counts “Fold levels” is concerned) than our nestedly recursive functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg33.png' alt='' title='' width='532' height='65'> </div> </p></div> <p>But there’s still an issue to address with nestedly recursive functions and primitive recursion. When we have functions (like T212) that “reach back” progressively further as <em>n</em> increases, there’s a question of what they’ll find. We’ve simply assumed <em>f</em>[<em>n</em>] = 1 for <em>n</em> ≤ 0. But what if there were something more complicated there? Even if <em>f</em>[–<em>m</em>] was given by some primitive recursive function, say <em>p</em>[<em>m</em>], it seems possible that in computing <em>f</em>[<em>n</em>] one could end up somehow “bouncing back and forth” between positive and negative arguments, and in effect searching for an <em>m</em> for which <em>p</em>[<em>m</em>] has some particular value, and in doing that searching one could find oneself outside the domain of primitive recursive functions.</p> <p>And this raises yet another question: are all definitions we can give of nestedly recursive functions consistent? 
Consider for example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024primitiveimg35.png' alt='' title='' width='319' height='14'> </div> </p></div> <p>Now ask: what is <em>f</em>[1]? We apply the recursive definition. But it gives us <nobr><em>f</em>[1] = 1 – <em>f</em>[<em>f</em>[0]]</nobr> or <em>f</em>[1] = 1 – <em>f</em>[1], or, in other words, an inconsistency. There are many such inconsistencies that seem to “happen instantly” when we apply definitions. But it seems conceivable that there could be “insidious inconsistencies” that show up only after many applications of a recursive definition. And it’s also conceivable that one could end up with “loops” like <em>f</em>[<em>i</em>] = <em>f</em>[<em>i</em>]. And things like this could be reasons that <em>f</em>[<em>n</em>] might not be a “total function”, defined for all <em>n</em>.</p> <p>We’ve seen all sorts of complex behavior in nestedly recursive functions. And what the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> suggests is that whenever one sees complex behavior, one must in some sense be dealing with computations that are “as sophisticated as any computation can be”. And in particular one must be dealing with computations that can somehow support <a href="https://www.wolframscience.com/nks/p642--the-phenomenon-of-universality/">computation universality</a>. </p> <p>So what would it mean for a nestedly recursive function to be universal? For a start, one would need some way to “program” the function. There seem to be a couple of possibilities. First, one could imagine packing both “code” and “data” into the argument <em>n</em> of <em>f</em>[<em>n</em>]. 
So, for example, one might use some form of tupling function to take a description of a rule and an initial state for a Turing machine, together with a specification of a step number, then package all these things into an integer <em>n</em> that one feeds into one’s universal nestedly recursive function <em>f</em>. Then the idea would be that the value computed for <em>f</em>[<em>n</em>] could be decoded to give the state of the Turing machine at the specified step. (Such a computation by definition always halts—but much as one computes with Turing machines by successively asking for the next steps in their evolution, one can imagine setting up a “harness” that just keeps asking for values of <em>f</em>[<em>n</em>] at an infinite progression of values <em>n</em>.)</p> <p>Another possible approach to making a universal nestedly recursive function is to imagine feeding in a “program” through the initial conditions one gives for the function. There might well need to be decoding involved, but in some sense what one might hope is that just by changing its initial conditions one could get a nestedly recursive function with a specific recursive definition to emulate a nestedly recursive function with any other recursive definition (or, say, for a start, any <a href="https://reference.wolfram.com/language/ref/LinearRecurrence.html">linear recurrence</a>).</p> <p>Perhaps one could construct a complicated nestedly recursive function that would have this property. But what the Principle of Computational Equivalence suggests is that it should be possible to find the property even in “naturally occurring cases”—like P312 or T212. 
</p> <p>The situation is probably going to be quite analogous to what happened with the <a href="https://www.wolframscience.com/nks/chap-11--the-notion-of-computation#sect-11-8--the-rule-110-cellular-automaton">rule 110 cellular automaton</a> or the <a href="https://www.wolframscience.com/prizes/tm23/"><em>s</em> = 2, <em>k</em> = 3 596440 Turing machine</a>. By looking at the actual typical behavior of the system one got some intuition about what was likely to be going on. And then later, with great effort, it became possible to actually prove computation universality. </p> <p>In the case of nestedly recursive functions, we’ve seen here examples of just how diverse the behavior generated by changing initial conditions can be. It’s not clear how to harness this diversity to extract some form of universality. But it seems likely that the “raw material” is there. And that nestedly recursive functions will show themselves as able to join so many other systems in fitting into the framework defined by the Principle of Computational Equivalence.</p> <h2 id="some-history">Some History</h2> <p>Once one has the concept of functions and the concept of recursion, nestedly recursive functions aren’t in some sense a “complicated idea”. And between this fact and the fact that nestedly recursive functions haven’t historically had a clear place in any major line of mathematical or other development, it’s quite difficult to be sure one’s accurately tracing their history. But I’ll describe here at least what I currently know.</p> <p>The concept of something like recursion is very old. It’s closely related to mathematical induction, which was already being used for proofs by Euclid around 300 BC. 
And in a quite different vein, around the same time (though not recorded in written form until many centuries later) <a href="https://www.wolframscience.com/nks/notes-3-5--fibonacci-numbers/">Fibonacci numbers</a> arose in Indian culture in connection with the <a href="https://www.wolframscience.com/nks/notes-2-3--the-concept-of-rules/">enumeration of prosody</a> (“How many different orders are there in which to say the Sanskrit words in this veda?”). </p> <p>Then in 1202 Leonardo Fibonacci, at the end of his calculational math book <em>Liber Abaci</em> (which was notable for popularizing Hindu-Arabic numerals in the West) stated—more or less as a recreational example—his “rabbit problem” in recursive form, and explicitly listed the Fibonacci numbers up to 377. But despite this early appearance, explicit recursively defined sequences remained largely a curiosity until as late as the latter part of the twentieth century.</p> <p>The concept of an abstract function began to emerge with calculus in the late 1600s, and became more solidified in the 1700s—but basically always in the context of continuous arguments. A variety of specific examples of recurrence relations—for binomial coefficients, <a href="https://writings.stephenwolfram.com/2015/12/untangling-the-tale-of-ada-lovelace/#the-bernoulli-number-computation">Bernoulli numbers</a>, etc.—were in fairly widespread use. But there doesn’t yet seem to have been a sense that there was a general mathematical structure to study. </p> <p>In the course of the 1800s there had been an <a href="https://www.wolframscience.com/nks/notes-12-9--history-of-concept-of-mathematics/">increasing emphasis on rigor and abstraction</a> in mathematics, leading by the latter part of the century to a serious effort to axiomatize concepts associated with numbers. 
Starting with concepts like the recursive definition of integers by repeated application of the successor operation, by the time of <a href="https://www.wolframscience.com/nks/notes-12-9--axioms-for-arithmetic/">Peano’s axioms for arithmetic</a> in 1891 there was a clear general notion (particularly related to the induction axiom) that (integer) functions could be defined recursively. And when <a href="https://writings.stephenwolfram.com/2020/12/where-did-combinators-come-from-hunting-the-story-of-moses-schonfinkel/#gottingen-center-of-the-mathematical-universe">David Hilbert</a>’s program of axiomatizing mathematics got underway at the beginning of the 1900s, it was generally assumed that all (integer) functions of interest could actually be defined specifically using primitive recursion.</p> <p>The notation for recursively specifying functions gradually got cleaner, making it easier to explore more elaborate examples. And in 1927 Wilhelm Ackermann (a student of Hilbert’s) introduced (in completely modern notation) a “<a href="https://www.wolframscience.com/nksonline/page-906c/">reasonable mathematical function</a>” that—as we discussed above—he showed was not primitive recursive. And right there, in his paper, without any particular comment, is a nestedly recursive function definition:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg1.png' alt='Ackermann nestedly recursive function paper' title='Ackermann nestedly recursive function paper' width='533' height='113'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg2.png' alt='Ackermann nestedly recursive function definition' title='Ackermann nestedly recursive function definition' width='533' height='139'/></p> <p>In 1931 Kurt Gödel further streamlined the representation of recursion, and solidified the notion of general recursion. 
There soon developed a whole field of recursion theory—though most of it was concerned with general issues, not with specific, concrete recursive functions. A notable exception was the work of Rózsa Péter (Politzer), beginning in the 1930s, and leading in 1957 to her book <em>Recursive Functions—</em>which contains a chapter on “Nested Recursion” (here in English translation):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg3.png' alt='Nested recursion book chapter' title='Nested recursion book chapter' width='468' height='261'/></p> <p>But despite the many specific (mostly primitive) recursive functions discussed in the rest of the book, this chapter doesn’t stray far from the particular function Ackermann defined (or at least Péter’s variant of it). </p> <p>What about the recreational mathematics literature? By the late 1800s there were all sorts of publications involving numbers, games, etc. that at least implicitly involved recursion (an example being Édouard Lucas’s 1883 <a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/#towers-of-hanoi-etc">Tower of Hanoi puzzle</a>). But—perhaps because problems tended to be stated in words rather than mathematical notation—it doesn’t seem as if nestedly recursive functions ever showed up. </p> <p>In the theoretical mathematics literature, a handful of somewhat abstract papers about “nested recursion” did appear, an example being one in 1961 by William Tait, then at Stanford: </p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg4.png' alt='Nested recursion paper by William Tait' title='Nested recursion paper by William Tait' width='504' height='165'/></p> <p>But, meanwhile, the general idea of recursion was slowly beginning to go from purely theoretical to more practical. 
<a href="https://writings.stephenwolfram.com/2023/08/remembering-the-improbable-life-of-ed-fredkin-1934-2023-and-his-world-of-ideas-and-stories/#computers">John McCarthy</a>—who had coined the term “artificial intelligence”—was designing LISP as “the language for AI” and by 1960 was writing papers with titles like “<a href="https://writings.stephenwolfram.com/2020/12/combinators-and-the-story-of-computation/#practical-computation">Recursive Functions of Symbolic Expressions and Their Computation by Machine</a>”. </p> <p>In 1962 McCarthy came to Stanford to found the AI Lab there, bringing with him enthusiasm for both AI and recursive functions. And by 1968 these two topics had come together in an effort to use “AI methods” to prove properties of programs, and in particular programs involving recursive functions. And in doing this, John McCarthy came up with an example he intended to be awkward—that’s exactly a nestedly recursive function:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg5.png' alt='John McCarthy nestedly recursive function example' title='John McCarthy nestedly recursive function example' width='623' height='185'/></p> <p>In our notation, it would be:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg6.png' alt='' title='' width='251' height='30'> </div> </p></div> <p>And it became known as “McCarthy’s 91-function” because, yes, for many <em>n</em>, <em>f</em>[<em>n</em>] = 91. 
These days it’s trivial to evaluate this function—and to find out that <em>f</em>[<em>n</em>] = 91 only for <em>n</em> up to 101:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg7.png' alt='' title='' width='271' height='77'> </div> </p></div> <p>But even the evaluation graph is somewhat large</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg8.png' alt='' title='' width='473' height='371'> </div> </p></div> <p>and in pure recursive evaluation the recursion stack can get deep—which back then was a struggle for LISP systems to handle. </p> <p>There were efforts at theoretical analysis, for example by Zohar Manna, who in 1974 published <em>Mathematical Theory of Computation</em> which—in a section entitled “Fixpoints of Functionals”—presents the 91-function and other nestedly recursive functions, particularly in the context of <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/#the-question-of-evaluation-order">evaluation-order questions</a>. 
</p> <p>In the years that followed, a variety of nestedly recursive functions were considered in connection with proving theorems about programs, and with practical assessments of LISP systems, a notable example being Ikuo Takeuchi’s 1978 triple recursive function:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg9.png' alt='Ikuo Takeuchi triple recursive function example' title='Ikuo Takeuchi triple recursive function example' width='623' height='261'/></p> <p>But in all these cases the focus was on how these functions would be evaluated, not on what their behavior would be (and it was typically very simple). </p> <p>But now we have to follow another thread in the story. Back in 1961, right on the Stanford campus, a then-16-year-old Douglas Hofstadter was being led towards nestedly recursive functions. As Doug tells it, it all started with him seeing that squares are interspersed with gaps of 1 or 2 between triangular numbers, and then noticing patterns in those gaps (and later realizing that they <a href="https://www.wolframscience.com/nks/notes-4-2--relation-of-powers-to-substitution-systems/">showed nesting</a>). Meanwhile, at Stanford he had access to a computer running Algol, a language which (like LISP and unlike Fortran) supported recursion (though this wasn’t particularly advertised, since recursion was still generally considered quite obscure). </p> <p>And as Doug tells it, within a year or two he was using Algol to do things like recursively create trees representing English sentences. Meanwhile—in a kind of imitation of the Eleusis “guess-a-card-rule” game—Doug was apparently challenging his fellow students to a “function game” based on guessing a math function from specified values. 
And, as he tells it, he found that functions that were defined recursively were the ones people found it hardest to guess.</p> <p>That was all in the early 1960s, but it wasn’t until the mid-1970s that Doug Hofstadter returned to such pursuits. After various adventures, Doug was back at Stanford—writing what became his book <em>Gödel, Escher, Bach</em>. And in 1977 he sent a letter to Neil Sloane, creator of the 1973 <em>A Handbook of Integer Sequences</em> (and what’s now the <a href="https://oeis.org/" target="_blank" rel="noopener">Online Encyclopedia of Integer Sequences, or OEIS</a>):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg10.png' alt='Douglas Hofstadter letter to Neil Sloane' title='Douglas Hofstadter letter to Neil Sloane' width='623' height='354'/></p> <p>As suggested by the accumulation of “sequence ID” annotations on the letter, Doug’s “eta sequences” had <a href="https://www.wolframscience.com/nks/notes-4-2--relation-of-powers-to-substitution-systems/">actually been studied in number theory before</a>—in fact, since at least the 1920s (they are now usually called <a href="https://mathworld.wolfram.com/BeattySequence.html">Beatty sequences</a>). But the letter went on, now introducing some related sequences—that had nestedly recursive definitions: </p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg11.png' alt='Sequences with nestedly recursive definitions' title='Sequences with nestedly recursive definitions' width='623' height='388'/></p> <p>As Doug pointed out, these particular sequences (which were derived from golden ratio versions of his “eta sequences”) have a very regular form—which we would now call nested. And it was the properties of this form that Doug seemed most concerned about in his letter. But actually, as we saw above, just a small change in initial conditions in what I’m calling S1 would have led to much wilder behavior. 
But that apparently wasn’t something Doug happened to notice. A bit later in the letter, though, there was another nestedly recursive sequence—that Doug described as a “horse of an entirely nother color”: the “absolutely CRAZY” Q sequence:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg12.png' alt='Crazy Q sequence' title='Crazy Q sequence' width='623' height='211'/></p> <p>Two years later, Doug’s <em>Gödel, Escher, Bach</em> book was published—and in it, tucked away at the bottom of page 137, a few pages after a discussion of recursive generation of text (with examples such as “the strange bagels that the purple cow without horns gobbled”), there was the Q sequence: </p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg13.png' alt='Chaotic Q sequence' title='Chaotic Q sequence' width='623' height='195'/></p> <p>Strangely, though, there was no picture of it, and Doug listed only 17 terms (which, until I was writing this, was all I assumed he had computed):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg14.png' alt='17 Q-sequence terms' title='17 Q-sequence terms' width='623' height='455'/></p> <p>So now nestedly recursive sequences were out in the open—in what quickly became a very popular book. But I don’t think many people noticed them there (though, as I’ll discuss, I did). <em>Gödel, Escher, Bach</em> is primarily a playful book focused on exposition—and not the kind of place you’d expect to find a new, mathematical-style result. 
</p> <p>Still—quite independent of the book—Neil Sloane showed Doug’s 1977 letter to his Bell Labs colleague Ron Graham, who within a year made a small mention of the Q sequence in a staid academic math publication (in a characteristic “state-it-as-a-problem” Erdös form):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg15.png' alt='Erdös and Graham math paper' title='Erdös and Graham math paper' width='533' height='174'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg16.png' alt='Erdös and Graham math paper continued' title='Erdös and Graham math paper continued' width='533' height='181'/></p> <p>There was a small and tight-knit circle of serious mathematicians—essentially all of whom, as it happens, I personally knew—who would chase these kinds of easy-to-state-but-hard-to-solve problems. Another was Richard Guy, who soon included the sequence as part of problem E31 in his <em>Unsolved Problems in Number Theory</em>, and mentioned it again a few years later.</p> <p>But for most of the 1980s little was heard about the sequence. As it later turned out, a senior British applied mathematician named Brian Conolly (who wasn’t part of the aforementioned tight-knit circle) had—presumably as a kind of hobby project—made some progress, and in 1986 had written to Guy about it. Guy apparently misplaced the letter, but later told Conolly that John Conway and Sol Golomb had worked on similar things.</p> <p>Conway presumably got the idea from Hofstadter’s work (though he had a habit of obfuscating his sources). 
But in any case, on July 15, 1988, Conway gave a talk at Bell Labs entitled “Some Crazy Sequences” (note the word “crazy”, just like in Hofstadter’s letter to Sloane) in which he discussed the <a href="https://oeis.org/A004001" target="_blank" rel="noopener">regular-enough-to-be-mathematically-interesting sequence</a> (which we’re calling G<sub>3</sub>111 here):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg18_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg19.png' alt='' title='' width='404' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg20.png' alt='' title='' width='358' height='98'> </div> </p></div> <p>Despite its visual regularity, Conway couldn’t mathematically prove certain features of the wiggles in the sequence—and in his talk offered a $10,000 prize for anyone who could. By August a Bell Labs mathematician named Colin Mallows had done it. Conway claimed (later to be contradicted by video evidence) that he’d only offered $1000—and somehow the whole affair landed as a story in the August 30 <em>New York Times</em> under the heading “<a href="https://www.nytimes.com/1988/08/30/science/intellectual-duel-brash-challenge-swift-response.html" target="_blank" rel="noopener">Intellectual Duel: Brash Challenge, Swift Response</a>”. But in any case, this particular nestedly recursive sequence became known as “Conway’s Challenge Sequence”. </p> <p>So what about <a href="https://writings.stephenwolfram.com/2016/05/solomon-golomb-19322016/">Sol Golomb</a>? 
It turns out he’d started writing a paper—though he never finished it:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg21.png' alt='Discrete Chaos paper' title='Discrete Chaos paper' width='533' height='249'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg22.png' alt='Discrete Chaos paper continued' title='Discrete Chaos paper continued' width='533' height='254'/></p> <p>He’d computed 280 terms of the Q sequence (he wasn’t much of a computer user) and noticed a few coincidences. But he also mentioned another kind of nestedly recursive sequence, no doubt inspired by his work on feedback shift registers:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg23_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg24.png' alt='' title='' width='178' height='14'> </div> </p></div> <p>As he noted, the behavior depends greatly on the initial conditions, though it is always eventually periodic—with his student Unjeng Cheng having found long-period examples.</p> <p>OK, so by 1988 nestedly recursive functions had at least some notoriety. So what happened next? Not so much. There’s a <a href="https://content.wolfram.com/sites/43/2024/09/NestedlyRecursiveFunctions-Bibliography-9-27.pdf">modest academic literature</a> that’s emerged over the last few decades, mostly concentrated very specifically around “Conway’s Challenge Sequence”, Hofstadter’s Q function, or very similar “meta Fibonacci” generalizations of them. 
And so far as I know, even the first published large-scale picture of the Q sequence only appeared in 1998 (though I had pictures of it many years earlier):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg25.png' alt='Klaus Pinn Q-sequence paper' title='Klaus Pinn Q-sequence paper' width='533' height='287'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024historyimg26.png' alt='Klaus Pinn Q-sequence paper continued' title='Klaus Pinn Q-sequence paper continued' width='533' height='365'/></p> <p>Why wasn’t more done with nestedly recursive functions? At some level it’s because they tend to have too much computational irreducibility—making it pretty difficult to say much about them in traditional mathematical ways. But perhaps more important, studying them broadly is really a matter of <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/#the-pure-basic-science-of-ruliology">ruliology</a>: it requires the idea of exploring spaces of rules, and of expecting the kinds of behavior and phenomena that are characteristic of systems in the computational universe. And that’s something that’s still not nearly as widely understood as it should be.</p> <h2 id="my-personal-story-with-nestedly-recursive-functions">My Personal Story with Nestedly Recursive Functions</h2> <p>I think 1979 was the year when I first took recursion seriously. I’d heard about the Fibonacci sequence (though not under that name) as a young child a decade earlier. I’d implicitly (and sometimes explicitly) encountered recursion (sometimes through error messages!) in computer algebra systems I’d used. In science, I’d studied fractals quite extensively (<a href="https://www.stephenwolfram.com/publications/the-father-of-fractals/?ref=josephnoelwalker.com">Benoit Mandelbrot</a>’s book having appeared in 1977), and I’d been exposed to things like iterated maps. 
And I’d quite extensively studied cascade processes, <a href="https://content.wolfram.com/sw-publications/2020/07/qcd-model-annihilation.pdf">notably of quarks and gluons in QCD</a>.</p> <p>As I think about it now, I realize that for several years I’d written programs that made use of recursion (and I had quite a lot of exposure to LISP, and the culture around it). But it was in 1979—<a href="https://writings.stephenwolfram.com/2016/04/my-life-in-technology-as-told-at-the-computer-history-museum/">having just started using C</a>—that I first remember writing a program (for doing percolation theory) where I explicitly thought “this is using recursion”. But then, in late 1979, I began to design <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">SMP (“Symbolic Manipulation Program”)</a>, the forerunner of the modern Wolfram Language. And in doing this I quickly solidified my knowledge of mathematical logic and the (then-fledgling) field of theoretical computer science.</p> <p>My concept of repeated transformations for symbolic expressions—which is still the core of Wolfram Language today—is somehow fundamentally recursive. And by the time we had the first signs of life for our SMP system, Fibonacci was one of our very first tests. We soon tried the Ackermann function too. And in 1980 I became very interested in the problem of evaluation order, particularly for recursive functions—and the best treatment I found of it (though at the time not very useful to me) was in none other than the book by Zohar Manna that I mentioned above. (In a strange twist, I was at that time also studying gauge choices in physics—and it was <a href="https://writings.stephenwolfram.com/2023/09/expression-evaluation-and-fundamental-physics/">only last year that I realized</a> that they’re fundamentally the same thing as evaluation orders.) </p> <p>It was soon after it came out in 1979 that I first saw Douglas Hofstadter’s book. 
At the time I wasn’t too interested in its Lewis-Carroll-like aspects, or its exposition; I just wanted to know what the “science meat” in it was. And somehow I found the page about the Q sequence, and filed it away as “something interesting”. </p> <p>I’m not sure when I first implemented the Q sequence in SMP, but by the time we released Version 1.0 in July 1981, there it was: an external package (hence the “X” prefix) <a href="https://files.wolframcdn.com/pub/www.stephenwolfram.com/pdf/smp-library.pdf" target="_blank">for evaluating “Hofstadter’s recursive function”</a>, elegantly using memoization—with the description I gave saying (presumably because that’s what I’d noticed) that its values “have several properties of randomness”:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg1.png' alt='Hofstadter recursive function' title='Hofstadter recursive function' width='583' height='315'/></p> <p>Firing up a copy of SMP today—running on a virtual machine that still thinks it’s 1986—I can run this code, and easily compute the function:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09252024personalAimg2A.png' alt='SMP evaluation' title='SMP evaluation' width='581' height='345'/></p> <p>I can even plot it—though without an emulator for a 1980s-period storage-tube display, only the ASCIIfied rendering works:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09252024personalAimg3A-v3.png' alt='ASCIIfied rendering' title='ASCIIfied rendering' width='581' height='314'/></p> <p>So what did I make of the function back in 1981? I was interested in <a href="https://www.wolframscience.com/nks/chap-1--the-foundations-for-a-new-kind-of-science/#sect-1-4--the-personal-story-of-the-science-in-this-book">how complexity and randomness could occur in nature</a>. But at the time, I didn’t have enough of a framework to understand the connection. 
And, as it was, I was just starting to explore cellular automata, which seemed a lot more “nature like”—and which soon led me to things like <a href="https://writings.stephenwolfram.com/2019/10/announcing-the-rule-30-prizes">rule 30</a> and the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">phenomenon of computational irreducibility</a>. </p> <p>Still, I didn’t forget the Q sequence. And when I was building Mathematica I again used it as a test (the .tq file extension came from the brief period in 1987 when we were trying out “Technique” as the name of the system):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg4.png' alt='Combinatorial functions' title='Combinatorial functions' width='583' height='123'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg5.png' alt='Combinatorial functions continued' title='Combinatorial functions continued' width='583' height='200'/></p> <p>When Mathematica 1.0 was released on June 23, 1988, the Q sequence appeared again, this time as an example in the soon-in-every-major-bookstore Mathematica book:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg6.png' alt='Q sequence in Mathematica book' title='Q sequence in Mathematica book' width='623' height='81'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg7.png' alt='Q sequence in Mathematica book continued' title='Q sequence in Mathematica book continued' width='623' height='254'/></p> <p>I don’t think I was aware of Conway’s lecture that occurred just 18 days later. And for a couple of years I was consumed with tending to a young product and a young company. But by 1991, I was getting ready to launch into basic science again. 
Meanwhile, the number theorist (and today horologist) Ilan Vardi—yet again from Stanford—had been working at Wolfram Research and writing a book entitled <em>Computational Recreations in Mathematica</em>, which included a long section on the analysis of Takeuchi’s nested recursive function (“TAK”). My email archive records an exchange I had with him:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg8.png' alt='Wolfram–Vardi email' title='Wolfram–Vardi email' width='483' height='603'/></p> <p>He suggested a “more symmetrical” nested recursive function. I responded, including a picture that made it fairly clear that this particular function would have only nested behavior, and not seem “random”:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg9.png' alt='Wolfram–Vardi followup email' title='Wolfram–Vardi follow-up email' width='483' height='392'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg10.png' alt='Nested recursive function graphic' title='Nested recursive function graphic' width='263' height='218'/></p> <p>By the summer of 1991 I was in the thick of exploring different kinds of systems with simple rules, discovering the remarkable complexity they could produce, and filling out what became <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs/">Chapter 3 of <em>A New Kind of Science</em>: “The World of Simple Programs”</a>. But then came <a href="https://www.wolframscience.com/nks/chap-4--systems-based-on-numbers/">Chapter 4: “Systems Based on Numbers”</a>. I had known since the mid-1980s about the <a href="https://www.wolframscience.com/nks/p120--elementary-arithmetic/">randomness in things like digit sequences</a> produced by successive arithmetic operations. But what about randomness in pure sequences of integers? I resolved to find out just what it would take to produce randomness there. 
And so it was that on August 13, 1993, I came to be enumerating possible symbolic forms for recursive functions—and selecting ones that could generate at least 10 terms:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg11.png' alt='Symbolic forms for recursive functions' title='Symbolic forms for recursive functions' width='423' height='203'/></p> <p>As soon as I plotted the “survivors” I could see that interesting things were going to happen:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg12.png' alt='Recursive function graphs' title='Recursive function graphs' width='612' height='492'/></p> <p>Was this complexity somehow going to end? I checked out to 10 million terms. And soon I started <a href="https://www.wolframscience.com/nks/p129--recursive-sequences/">collecting my “prize specimens” and making a gallery of them</a>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg13.png' alt='Recursive functions gallery' title='Recursive functions gallery' width='533' height='434'/></p> <p>I had some one-term recurrences, and some two-term ones. Somewhat shortsightedly I was always using “Fibonacci-like” initial conditions <em>f</em>[1] = <em>f</em>[2] = 1—and I rejected any recurrence that tried to “look back” to <em>f</em>[0], <em>f</em>[–1], etc. And at the time, with this constraint, I only found “really interesting” behavior in two-term recurrences. 
</p> <p>In 1994 I returned briefly to recursive sequences, <a href="https://www.wolframscience.com/nks/notes-4-3--properties-of-recursive-sequences/">adding a note</a> “solving” a few of them, and discussing the evaluation graphs of others:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg14.png' alt='Properties of sequences' title='Properties of sequences' width='483' height='284'/></p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg15.png' alt='Evaluation graphs' title='Evaluation graphs' width='483' height='360'/></p> <p>When I finally finished <em>A New Kind of Science</em> in 2002, I included a <a href="https://www.wolframscience.com/nks/notes-2-3--close-approaches-to-core-discoveries/">list of historical “Close approaches”</a> to its core discoveries, one of them being Douglas Hofstadter’s work on recursive sequences:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg16-v2.png' alt='Douglas Hofstadter work on recursive sequences' title='Douglas Hofstadter work on recursive sequences' width='620' height='511'/></p> <p>In retrospect, back in 1981 I should have been able to take the “Q sequence” and recognize in it the essential “rule 30 phenomenon”. But as it was, it took another decade—and many other explorations in the computational universe—for me to build up the right conceptual framework to see this. In <em>A New Kind of Science</em> I studied many kinds of systems, probing them far enough, I hoped, to see their most important features. But recursive functions were an example where I always felt there was more to do; I felt I’d only just scratched the surface.</p> <p>In June 2003—a year <a href="https://writings.stephenwolfram.com/2022/05/the-making-of-a-new-kind-of-science/#how-to-publish-a-book">after <em>A New Kind of Science</em> was published</a>—we held our first <a href="https://education.wolfram.com/summer-school">summer school</a>. 
And as a way to introduce methodology—and be sure that people knew I was fallible and approachable—I decided on the first day of the summer school to do a “<a href="https://writings.stephenwolfram.com/2007/07/science-live-and-in-public/">live experiment</a>”, and try to stumble my way to discovering something new, live and in public. </p> <p>A few minutes before the session started, I picked the subject: recursive functions. I began with some examples I knew. Then it was time to go exploring. At first lots of functions “didn’t work” because they were looking back too far. But then someone piped up “Why not just say that <em>f</em>[<em>n</em>] is 1 whenever <em>n</em> isn’t a positive integer?” Good idea! And very easy to try. </p> <p>Soon we had the “obvious” functions written (today <tt><a href="http://reference.wolfram.com/language/ref/Apply.html">Apply</a></tt><tt>[</tt><tt><a href="http://reference.wolfram.com/language/ref/Plus.html">Plus</a></tt><tt>, ...]</tt> could be just <tt><a href="http://reference.wolfram.com/language/ref/Total.html">Total</a></tt><tt>[...]</tt>, but otherwise there’s nothing “out of date” here):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024personalboxc2cimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09252024personalboxc2cimg1.png' alt='' title='' width='595' height='164'> </div> </p></div> <p>In a typical story of Wolfram-Language-helps-one-think-clearly, the obvious function was also very general, and allowed a recurrence with any number of terms. So why not start with just one term? 
And immediately, there it was—what we’re now calling T311:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg19.png' alt='T311' title='T311' width='593' height='304'/></p> <p>And then a plot (yes, after Version 6 one didn’t need the <tt><a href="http://reference.wolfram.com/language/ref/Show.html">Show</a></tt> or the trailing “;”):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg20.png' alt='RSValues plot' title='RSValues plot' width='589' height='400'/></p> <p>Of course, as is the nature of computational constructions, this is something timeless—that looks the same today as it did 21 years ago (well, except that now our plots display with color by default).</p> <p>I thought this was a pretty neat discovery. And I just couldn’t believe that years earlier I’d failed to see the obvious generalization of having “infinite” initial conditions. </p> <p>The next week I did a followup session, this time talking about how one would write up a discovery like this. We started off with possible titles (including audience suggestions):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg21-v2.png' alt='Suggested titles' title='Suggested titles' width='610' height='320'/></p> <p>And, yes, the first title listed is exactly the one I’ve now used here. 
In the notebook I created back then, there were first some notes (some of which should still be explored!):</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg22.png' alt='Title notes' title='Title notes' width='353' height='474'/></p> <p>Three hours later (on the afternoon of July 11, 2003) there’s another notebook, with the beginnings of a writeup:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg23.png' alt='Initial recursive functions writeup' title='Initial recursive functions writeup' width='632' height='423'/></p> <p>By the way, another thing we came up with at the summer school was the (non-nestedly) recursive function: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg25.png' alt='' title='' width='352' height='14'> </div> </p></div> <p>Plotting <em>g</em>[<em>n</em> + 1] – <em>g</em>[<em>n</em>] gives:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/09/sw09252024personalAimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg26.png' alt='' title='' width='482' height='129'> </div> </p></div> <p>And, yes, bizarrely (and reminiscent of McCarthy’s 91-function) for <em>n</em> ≥ 396, <em>g</em>[<em>n</em> + 1] – <em>g</em>[<em>n</em>] is always 97, and <em>g</em>[<em>n</em>] = 38606 + 97 (<em>n</em> – 396).</p> <p>But in any case, a week or so after my “writeups” session, the summer school was over. 
In January 2004 I briefly picked the project up again, and made some pictures that, yes, show interesting structure that perhaps I should investigate now:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg27.png' alt='f[n - f[n - 1]]' title='f[n - f[n - 1]]' width='453' height='619'/></p> <p>In the years that followed, I would occasionally bring nestedly recursive functions out again—particularly in interacting with high school and other students. And at our summer programs I suggested projects with them for a number of students.</p> <p>In 2008, they seemed like an “obvious interesting thing” to add to our <a href="https://demonstrations.wolfram.com/">Demonstrations Project</a>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/09/sw09242024personalimg28.png' alt='NKS summer school live experiment' title='NKS summer school live experiment' width='623' height='305'/></p> <p>But mostly, they languished. Until, that is, my <a href="https://writings.stephenwolfram.com/2024/08/five-most-productive-years-what-happened-and-whats-next/">burst of “finish this” intellectual energy</a> that followed the <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/">launch of our Physics Project in 2020</a>. So here now, finally, after a journey of 43 years, I feel like I’ve been able to do some justice to nestedly recursive functions, and provided a bit more illumination to yet another corner of the computational universe. </p> <p>(Needless to say, there are many, many additional questions and issues to explore. Different primitives, e.g. <tt><a href="http://reference.wolfram.com/language/ref/Mod.html">Mod</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Floor.html">Floor</a></tt>, etc. Multiple functions that refer to each other. Multiway cases. Functions based on rational numbers. 
And endless potential approaches to analysis, identifying pockets of regularity and computational reducibility.)</p> <h2 id="thanks" style='font-size:1.2rem'>Thanks</h2> <p style='font-size:90%'>Thanks to Brad Klee for extensive help. Thanks also to those who’ve worked on nestedly recursive functions as students at our summer programs over the years, including Roberto Martinez (2003), Eric Rowland (2003), Chris Song (2021) and Thomas Adler (2024). I’ve benefitted from interactions about nestedly recursive functions with Ilan Vardi (1991), Tal Kubo (1993), Robby Villegas (2003), Todd Rowland (2003 etc.), Jim Propp (2004), Matthew Szudzik (2005 etc.), Joseph Stocke (2021 etc.), Christopher Gilbert (2024) and Max Niedermann (2024). Thanks to Doug Hofstadter for extensive answers to questions about history for this piece. It’s perhaps worth noting that I’ve personally known many of the people mentioned in the history section here (with the dates I met them indicated): John Conway (1984), Paul Erdös (1986), Sol Golomb (1981), Ron Graham (1983), Benoit Mandelbrot (1986), John McCarthy (1981) and Neil Sloane (1983).</p> <p><span></p> <h2 id="bibliography-of-nestedly-recursive-functions"><img src="https://content.wolfram.com/sites/43/2021/05/biblio-icon.png" width="164" style="float:left;margin-right:5px;margin-top: -21px;margin-bottom:30px;"><a href="https://content.wolfram.com/sites/43/2024/09/NestedlyRecursiveFunctions-Bibliography-9-27.pdf">Bibliography of Nestedly Recursive Functions »</a></h2> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/09/nestedly-recursive-functions/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item> <title>Five Most Productive Years: What Happened and What’s Next</title> <link>https://writings.stephenwolfram.com/2024/08/five-most-productive-years-what-happened-and-whats-next/</link> 
<comments>https://writings.stephenwolfram.com/2024/08/five-most-productive-years-what-happened-and-whats-next/#comments</comments> <pubDate>Thu, 29 Aug 2024 16:31:46 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Life & Times]]></category> <category><![CDATA[Other]]></category> <category><![CDATA[Personal Analytics]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=62482</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/08/swblog-5years-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>So… What Happened? Today is my birthday—for the 65th time. Five years ago, on my 60th birthday, I did a livestream where I talked about some of my plans. So… what happened? Well, what happened was great. And in fact I’ve just had the most productive five years of my life. Nine books. 3939 pages […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/08/swblog-5years-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><p><img class="aligncenter" src="https://content.wolfram.com/sites/43/2024/08/sw-5-years-hero-v6.png" max-width="650px" height="auto" alt="Five Most Productive Years: What Happened and What's Next" title="Five Most Productive Years: What Happened and What's Next"></p> <h2 id="so--what-happened">So… What Happened?</h2> <style type="text/css"> #blog #content img#sideStripe {float: right; margin-left: 25px; width: 50px;} @media (max-width: 1200px) and (min-width: 1180px) {#blog #content img#sideStripe {width:auto;height:31650px}} @media (max-width: 1020px) and (min-width: 901px) {#blog #content img#sideStripe {width:auto;height:31250px}} @media (max-width: 500px) {#blog #content img#sideStripe {display:none;}} body#blog.home #content img#sideStripe {display:none;} </style> 
<p><img src="https://content.wolfram.com/sites/43/2024/08/blog-image-strip-v2.png" id="sideStripe"></p> <p><a href="https://www.wolframalpha.com/input?i=how+old+is+stephen+wolfram">Today is my birthday</a>—for the 65th time. Five years ago, on my 60th birthday, I did a <a href="https://www.youtube.com/watch?v=2-aAi6QXsl0" target="_blank" rel="noopener">livestream where I talked about some of my plans</a>. So… what happened? Well, what happened was great. And in fact I’ve just had the most productive five years of my life. Nine <a href="https://www.amazon.com/stores/Stephen-Wolfram/author/B01HTZP7PQ?ref=sr_ntt _srch _lnk _ 1&qid=1724798966&sr=8-1&isDramIntegrated=true&shoppingPortalEnabled=true" target="_blank" rel="noopener">books</a>. 3939 pages of <a href="https://writings.stephenwolfram.com/">writings</a> (1,283,267 words). 499 hours of <a href="https://podcasters.spotify.com/pod/show/stephenwolfram" target="_blank" rel="noopener">podcasts</a> and 1369 hours of <a href="https://livestreams.stephenwolfram.com/" target="_blank" rel="noopener">livestreams</a>. 14 <a href="https://writings.stephenwolfram.com/category/new-technology/">software product releases</a> (with our great team). Oh, and a bunch of big—and beautiful—ideas and results.</p> <p>It’s been wonderful. And unexpected. I’ve spent my life alternating between technology and basic science, progressively building a taller and taller tower of practical capabilities and intellectual concepts (and sharing what I’ve done with the world). Five years ago everything was going well, and making steady progress. But then there were the questions I never got to. Over the years I’d come up with a certain number of big questions. And some of them, within a few years, I’d answered. But others I never managed to get around to. </p> <p>And five years ago, as I explained in my birthday livestream, I began to think “it’s now or never”. I had no idea how hard the questions were. 
Yes, I’d spent a lifetime building up tools and knowledge. But would they be enough? Or were the questions just not for our time, but only perhaps for some future century?<span id="more-62482"></span></p> <p>At several points before in my life I’d faced such issues—and things had worked out well (<em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em>, <a href="https://www.wolframalpha.com/">Wolfram|Alpha</a>, etc.). And from this, I had gotten a certain confidence about what might be possible. In addition, as a serious <a href="https://writings.stephenwolfram.com/category/historical-perspectives">student of intellectual history</a>, I had a sense of what kind of boldness was needed. Five years ago there wasn’t really anything that made me need to do something big and new. But I thought: “What the heck. I might as well try. I’ll never know what’s possible unless I try.”</p> <p>A major theme of my work since the early 1980s had been <a href="https://www.wolframscience.com/nksonline/toc.html">exploring the consequences of simple computational rules</a>. And I had found the surprising result that even extremely simple rules could lead to immensely complex behavior. So what about the universe? Could it be that at a fundamental level <a href="https://www.wolframscience.com/nks/chap-9--fundamental-physics/">our whole universe is just following some simple computational rule</a>? </p> <p>I had begun my career in the 1970s as a teenager <a href="https://www.stephenwolfram.com/publications/academic/">studying the frontiers of existing physics</a>. And at <a href="https://writings.stephenwolfram.com/2020/04/how-we-got-here-the-backstory-of-the-wolfram-physics-project/#maybe-it-could-apply-to-physics">first I couldn’t see</a> how computational rules could connect to what is known in physics. But in the early 1990s I had an idea, and by the late 1990s I had developed it and gotten some very suggestive results. 
But when I published these in <em>A New Kind of Science</em> in 2002, even my friends in the physics community <a href="https://writings.stephenwolfram.com/2020/04/how-we-got-here-the-backstory-of-the-wolfram-physics-project/#please-dont-do-that-project">didn’t seem to care</a>—and I decided to concentrate my efforts elsewhere (e.g. building Wolfram|Alpha, <a href="https://reference.wolfram.com/language/">Wolfram Language</a>, etc.).</p> <p>But I didn’t stop thinking “one day I need to get back to my physics project”. And in 2019 I decided: “What the heck. Let’s try it now.” It helped that I’d made a piece of technical progress the year before, and that <a href="https://writings.stephenwolfram.com/2020/04/how-we-got-here-the-backstory-of-the-wolfram-physics-project/#two-young-physicists-and-a-little-idea">now two young physicists</a> were enthusiastic to work with me on the project. </p> <p>And so it was, soon after <a href="https://x.com/stephen_wolfram/status/1167118524537610240" target="_blank" rel="noopener">my birthday in 2019</a>, that we embarked on our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>. It was a mixture of computer experiments and big concepts. But before the end of 2019 it was clear: <a href="https://writings.stephenwolfram.com/2020/04/how-we-got-here-the-backstory-of-the-wolfram-physics-project/#oh-my-gosh-its-actually-going-to-work">it was going to work</a>! It was an amazing experience. Thing after thing in physics that had always been mysterious I suddenly understood. And it was beautiful—a theory of such strength built on a structure of such incredible simplicity and elegance. </p> <p><a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/">We announced</a> what we’d figured out in April 2020, right when the pandemic was in full swing. There was still much to do (and there still is today). 
But the overall picture was clear. I later learned that a century earlier many well-known physicists were beginning to think in a similar direction (matter is discrete, light is discrete; space must be too) but back then they hadn’t had the computational paradigm or the other tools needed to move this forward. And now the responsibility had fallen on us to do this. (Pleasantly enough, given our framework, many modern areas of mathematical physics seemed to fit right in.)</p> <p>And, yes, figuring out the basic “machine code” for our universe was of course pretty exciting. But seeing an old idea of mine blossom like this had another very big effect on me. It made me think: “OK, what about all those other projects I’ve been meaning to do? Maybe it’s time to do those too.”</p> <p>And something else had happened as well. In doing the Physics Project we’d developed a new way of thinking about things—not just computational, <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/">but “multicomputational”</a>. And actually, the <a href="https://www.wolframscience.com/nks/chap-5--two-dimensions-and-beyond#sect-5-6--multiway-systems">core ideas behind this</a> were in <em>A New Kind of Science</em> too. But somehow I’d never taken them seriously enough before, and never extended my intuition to encompass them. But now with the Physics Project I was doing this. And I could see that the ideas could also go much further. </p> <p>So, yes, I had a new and powerful conceptual framework for doing science. And I had all the technology of the modern <a href="https://www.wolfram.com/language/">Wolfram Language</a>. But in 2020 I had another thing too—in effect, a new distribution channel for my ideas and efforts. Early in my career I had used <a href="https://www.stephenwolfram.com/publications/academic/">academic papers</a> as my “channel” (at one point in 1979 even averaging a paper every few weeks). 
But in the late 1980s I had a very different kind of channel: embodying my ideas in the design and implementation of <a href="https://www.wolfram.com/mathematica/">Mathematica</a> and what’s now the Wolfram Language. Then in the 1990s I had another channel: putting everything together into what became my book <em>A New Kind of Science</em>. </p> <p>After that was published in 2002 I would occasionally write small posts—for the <a href="https://community.wolfram.com/content?curTag=wolfram%20science">community site</a> around the science in my book, for our <a href="https://blog.wolfram.com/">corporate blog</a>, etc. And in 2010 I <a href="https://writings.stephenwolfram.com/2010/09/welcome-to-the-blog/">started my own blog</a>. At first I mostly just wrote small, fun pieces. But by 2015—partly driven by telling historical stories (<a href="https://writings.stephenwolfram.com/2015/11/george-boole-a-200-year-view/">200th anniversary of George Boole</a>, <a href="https://writings.stephenwolfram.com/2015/12/untangling-the-tale-of-ada-lovelace/">200th anniversary of Ada Lovelace</a>, …)—the things I was writing were getting ever meatier. (There’d actually already been some <a href="https://writings.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/">meaty ones</a> about <a href="https://writings.stephenwolfram.com/category/personal-analytics/">personal analytics</a> in 2012.)</p> <p>And by 2020 my pattern was set and I would routinely write 50+ -page pieces, full of pictures (all with immediately runnable “<a href="https://reference.wolfram.com/language/ref/ClickToCopy.html">click-to-copy</a>” code) and intended for anyone who cared to read them. Finally I had a good channel again. And I started using it. As I’d found over the years—whether with <a href="https://reference.wolfram.com/language/">language documentation</a> or with <em>A New Kind of Science</em>—the very act of exposition was a critical part of organizing and developing my ideas. 
</p> <p>And now I started producing pieces. Some were directly about specific topics around the Physics Project. But within two months I was already writing about a “spinoff”: “<a href="https://www.wolframphysics.org/bulletins/2020/06/exploring-rulial-space-the-case-of-turing-machines/" target="_blank" rel="noopener">Exploring Rulial Space: The Case of Turing Machines</a>”. I had realized that one of the places the ideas of the Physics Project should apply was to the foundations of mathematics, and to metamathematics. In a footnote to <em>A New Kind of Science</em> I had introduced the idea of “<a href="https://www.wolframscience.com/nks/notes-12-9--empirical-metamathematics/">empirical metamathematics</a>”. And in the summer of 2020, fuelled by my newfound “finish those old projects” mindset, I ended up writing an 80-page piece on “<a href="https://writings.stephenwolfram.com/2020/09/the-empirical-metamathematics-of-euclid-and-beyond/">The Empirical Metamathematics of Euclid and Beyond</a>”. </p> <p>December 7, 1920 was the date a certain <a href="https://writings.stephenwolfram.com/2021/03/a-little-closer-to-finding-what-became-of-moses-schonfinkel-inventor-of-combinators/">Moses Schönfinkel introduced</a> what we <a href="https://writings.stephenwolfram.com/2020/12/combinators-and-the-story-of-computation/">now call combinators</a>: the very first clear foundations for universal computation. I had always found combinators interesting (if hard to understand). I had used ideas from them back around 1980 in <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">the predecessor of what’s now the Wolfram Language</a>. And I had <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-10--symbolic-systems">talked about them a bit</a> in <em>A New Kind of Science</em>. 
But as the centenary approached, I decided to make a <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/">more definitive study</a>, in particular using methods from the Physics Project. And, for good measure, even in the middle of the pandemic I tracked down the <a href="https://writings.stephenwolfram.com/2021/03/a-little-closer-to-finding-what-became-of-moses-schonfinkel-inventor-of-combinators/">mysterious history of Moses Schönfinkel</a>.</p> <p>In March 2021, there was <a href="https://www.youtube.com/watch?v=ultMxODJE7o" target="_blank" rel="noopener">another centenary</a>, this time of <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/#more-about-the-history">Emil Post’s tag system</a>, and again I decided to finish <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-7--tag-systems">what I’d started</a> in <em>A New Kind of Science</em>, and write <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/">a definitive piece</a>, this time running to about 75 pages.</p> <p>One might have thought that the excursions into <a href="https://writings.stephenwolfram.com/2020/09/the-empirical-metamathematics-of-euclid-and-beyond/">empirical metamathematics</a>, <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/">combinators</a>, <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/">tag systems</a>, <a href="https://www.wolframphysics.org/bulletins/2020/06/exploring-rulial-space-the-case-of-turing-machines/" target="_blank" rel="noopener">rulial</a> and <a href="https://www.wolframphysics.org/bulletins/2021/02/multiway-turing-machines/" target="_blank" 
rel="noopener">multiway Turing machines</a> would be distractions. But they were not. Instead, they just deepened my understanding and intuition for the new ideas and methods that had come out of the Physics Project. As well as finishing projects that I’d wondered about for decades (and the world had had open for a century). </p> <p>Perhaps not surprisingly given its fundamental nature, the Physics Project also engaged with some deep <a href="https://writings.stephenwolfram.com/category/philosophy/">philosophical issues</a>. People would ask me about them with some regularity. And in March 2021 I started writing a bit about them, beginning with <a href="https://writings.stephenwolfram.com/2021/03/what-is-consciousness-some-new-perspectives-from-our-physics-project/">a piece on consciousness</a>. The next month I wrote “<a href="https://writings.stephenwolfram.com/2021/04/why-does-the-universe-exist-some-perspectives-from-our-physics-project/">Why Does the Universe Exist? Some Perspectives from Our Physics Project</a>”. (This piece of writing happened to coincide with the few days in my life when I’ve needed to do active cryptocurrency trading—so I was in the amusing position of thinking about a philosophical question about as deep as they come, interspersed with making cryptocurrency trades.)</p> <p>Everything kept weaving together. These philosophical questions made me internalize just how important the nature of the observer is in our Physics Project. Meanwhile I started thinking about the relationship of methods from the Physics Project to distributed computing, and to economics. 
And in May 2021 that intersected with practical <a href="https://www.youtube.com/watch?v=h94VrSuPFJc" target="_blank" rel="noopener">blockchain questions</a>, which caused me to write about “<a href="https://writings.stephenwolfram.com/2021/05/the-problem-of-distributed-consensus/">The Problem of Distributed Consensus</a>”—which would soon show up again in the <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">science and philosophy of observers</a>. </p> <p>The fall of 2021 involved really leaning into the new <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/">multicomputational paradigm</a>, among other things giving a <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/#potential-application-areas">long list of where it might apply</a>: metamathematics, chemistry, molecular biology, evolutionary biology, neuroscience, immunology, linguistics, economics, machine learning, distributed computing. And, yes, in a sense this was my “to do” list. In many ways, half the battle was just defining this. And I’m happy to say that just three years later, we’ve already made a big dent in it. </p> <p>While all of this was going on, I was also energetically pursuing my “day job” as CEO of Wolfram Research. <a href="https://writings.stephenwolfram.com/2020/03/in-less-than-a-year-so-much-new-launching-version-12-1-of-wolfram-language-mathematica/">Version 12.1</a> of the Wolfram Language had come out less than a month before the Physics Project was announced. <a href="https://writings.stephenwolfram.com/2020/12/launching-version-12-2-of-wolfram-language-mathematica-228-new-functions-and-much-more/">Version 12.2</a> right after the combinator centenary. And in 2021 there were two new versions. 
In all 635 new functions, all of which <a href="https://livestreams.stephenwolfram.com/category/live-ceoing/" target="_blank" rel="noopener">I had carefully reviewed</a>, and many of which I’d been deeply involved in designing. </p> <p>It’s a pattern in the history of science (as well as technology): some new methodology or some new paradigm is introduced. And suddenly vast new areas are opened up. And there’s lots of juicy “low-hanging fruit” to be picked. Well, that’s what had happened with the ideas from our Physics Project, and the concept of multicomputation. There were many directions to go, and many people wanting to get involved. And in 2021 it was becoming clear that something organizational had to be done: this wasn’t a job for a company (even for one as terrific and innovative as ours is), it was <a href="https://writings.stephenwolfram.com/2022/04/weve-got-a-science-opportunity-overload-its-time-to-launch-the-wolfram-institute/">a job for something like an institute</a>. (And, yes, in 2022, we indeed launched what’s now the <a href="https://www.wolframinstitute.org/" target="_blank" rel="noopener">Wolfram Institute for Computational Foundations of Science</a>.)</p> <p>But back in 1986, I had started the <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/#theres-a-whole-new-field-to-build">very first institute concentrating on complexity</a> and how it could arise from simple rules. Running it hadn’t been a good fit for me back then, and <a href="https://writings.stephenwolfram.com/2016/04/my-life-in-technology-as-told-at-the-computer-history-museum/">very quickly I started our company</a>. In 2002, when <a href="https://writings.stephenwolfram.com/2022/05/the-making-of-a-new-kind-of-science/"><em>A New Kind of Science</em> was published</a>, I’d thought again about starting an institute. But it didn’t happen. But now there really seemed to be no choice. 
I started reflecting on what had happened to “complexity”, and whether there was something to leverage from the institutional structure that had grown up around it. Nearly 20 years after the publication of <em>A New Kind of Science</em>, what should “complexity” be now?</p> <p>I wrote “<a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/">Charting a Course for ‘Complexity’: Metamodeling, Ruliology and More</a>”—and in doing so, finally invented a word for the “pure basic science of what simple rules do”: <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/#the-pure-basic-science-of-ruliology">ruliology</a>. </p> <p>My original framing of what became our Physics Project had been to try to “find a computational rule that gives our universe”. But I’d always found this unsatisfying. Because even if we had the rule, we’d still be left asking “why this one, and not another?” But in 2020 there’d been a dawning awareness of a possible answer.</p> <p>Our Physics Project is based on the idea of <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#how-it-works">applying rules to abstract hypergraphs</a> that represent space and everything in it. But given a particular rule, there are in general many ways it can be applied. 
And a key idea in our Physics Project is that somehow it’s always <a href="https://www.wolframphysics.org/technical-introduction/the-updating-process-for-string-substitution-systems/" target="_blank" rel="noopener">applied in all these ways</a>—leading to many separate threads of history, that branch and merge—and, importantly, giving us a way to <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#the-inevitability-of-quantum-mechanics">understand quantum mechanics</a>.</p> <p>We talked about these different threads of history corresponding to different places in <a href="https://www.wolframphysics.org/technical-introduction/the-updating-process-for-string-substitution-systems/the-concept-of-branchial-graphs/" target="_blank" rel="noopener">branchial space</a>—and about how the laws of quantum mechanics are <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#general-relativity-and-quantum-mechanics-are-the-same-idea">the direct analogs in branchial space</a> (or branchtime) of the laws of classical mechanics (and gravity) in physical space (or spacetime). But what if instead of just applying a given rule in all possible ways, we applied all possible rules in all possible ways?</p> <p>What would one get? In November 2021 I came up with a name for it: <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">the ruliad</a>. A year and a half earlier I’d already been <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#why-this-universe-the-relativity-of-rules">starting to talk about rulial space</a>—and the idea of us as observers perceiving the universe in terms of our particular sampling of rulial space. But naming the ruliad really helped to crystallize the concept. 
And I began to realize that I had come upon a breathtakingly broad intellectual arc. </p> <p>The ruliad is the biggest computational thing there can be: it’s the entangled limit of all possible computations. It’s abstract and it’s unique—and it’s as inevitable in its structure as 2 + 2 = 4. It encompasses everything computational—including us. So what then is physics? Well, it’s a description of how <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/#experiencing-the-ruliad">observers like us embedded in the ruliad</a> perceive the ruliad.</p> <p><a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#computational-irreducibility-and-rule-30">Back in 1984 I’d introduced</a> what<a href="https://www.wolframscience.com/nks/notes-12-6--history-of-computational-irreducibility/"> I saw as being the very central concept</a> of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>: the idea that there are many computational processes whose outcomes can be found only by following them step by step—with no possibility of doing what mathematical science was used to, and being able to “jump ahead” and make predictions without going through each step. At the beginning of the 1990s, when I began to work on <em>A New Kind of Science</em>, I’d invented the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a>—the idea that systems whose behavior isn’t obviously simple will always tend to be equivalent in the sophistication of the computations they do. </p> <p>Given the Principle of Computational Equivalence, computational irreducibility was inevitable. 
It followed from the fact that the observer could only be as computationally sophisticated as the system they were observing, and so would never be able to “jump ahead” and shortcut the computation. There’d come to be a belief that eventually science would always let one predict (and control) things. But here—from inside science—was a fundamental limitation on the power of science. All these things I’d known in some form since the 1980s, and with clarity since the 1990s. </p> <p>But the ruliad took things to another level. For now I could see that the very laws of physics we know were determined by the way we are as observers. I’d always imagined that the laws of physics just are the way they are. But now I realized that we could potentially derive them from the inevitable structure of the ruliad, and very basic features of what we’re like as observers. </p> <p>I hadn’t seen this philosophical twist coming. But somehow it immediately made sense. We weren’t getting our laws of physics from nothing; we were getting them from being the way we are. Two things seemed to be critical: that as observers we are computationally bounded, and that (somewhat relatedly) we believe we are persistent in time (i.e. we have a continuing thread of experience through time). </p> <p>But even as I was homing in on the idea of the ruliad as it applied to physics, I was also thinking about another application: the <a href="https://www.wolframscience.com/metamathematics/">foundations of mathematics</a>. I’d been interested in the foundations of mathematics for a very long time; in fact, in the design of Mathematica (and what’s now the Wolfram Language) and <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">its predecessor SMP</a>, I’d made central use of ideas that I’d developed from thinking about the foundations of mathematics. 
And in <em>A New Kind of Science</em>, I’d included <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-9--implications-for-mathematics-and-its-foundations">a long section on the foundations of mathematics</a>, discussing things like the network of all possible theorems, and the space of all possible axiom systems.</p> <p>But now I was developing a clearer picture. The ruliad represented not only all possible physics, but also all possible mathematics. And the actual mathematics that we perceive—like the actual physics—would be determined by our nature as observers, in this case mathematical observers. There were lots of technical details, and it wasn’t until March 2022 that I published “<a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/">The Physicalization of Metamathematics and Its Implications for the Foundations of Mathematics</a>”. </p> <p>In some ways this finished what I’d started in the mid-1990s. But it went much further than I expected, in particular in providing a sweeping unification of the foundations of physics and mathematics. It talked about what the ultimate limit of mathematics would be like. And it talked about how “human-level mathematics”—where we can discuss things like the Pythagorean theorem rather than just the microdetails of underlying axioms—emerges for observers like us just like our human-level impression of physical space emerges from the underlying network of atoms of space. </p> <p>One of the things I’d discovered in computational systems is how common computational irreducibility is, along with undecidability. And I had always wondered why undecidability wasn’t more common in typical mathematics. But now I had an answer: it just isn’t what mathematical observers like us “see” in the ruliad. At some level, this was a very philosophical result. 
But for me it also had <a href="https://www.wolframscience.com/metamathematics/implications-for-the-future-of-mathematics/">practical implications</a>, notably greatly validating the idea of using higher-level computational language to represent useful human-level mathematics, rather than trying to drill down to “axiomatic machine code”. </p> <p>October 22, 2021 had marked <a href="https://writings.stephenwolfram.com/2021/10/celebrating-a-third-of-a-century-of-mathematica-and-looking-forward/">a third of a century of Mathematica</a>. And May 14, 2022 was the <a href="https://writings.stephenwolfram.com/2022/05/twenty-years-later-the-surprising-greater-implications-of-a-new-kind-of-science/">20th anniversary of <em>A New Kind of Science</em></a>. And in contextualizing my activities, and planning for the future, I’ve increasingly found it useful to reflect on what I’ve done before, and how it’s worked out. And in both these cases I could see that seeds I’d planted many years earlier had blossomed, sometimes in ways I’d suspected they might, and sometimes in ways that far exceeded what I’d imagined. </p> <p>What had I done right? The key, it seemed, was drilling down to find the essence of things, and then developing that. Even if I hadn’t been able to imagine quite what could be built on them, I’d been able to construct solid foundations, that successfully encapsulated things in the cleanest and simplest ways. </p> <p>In talking about observers and the ruliad—and in fact our Physics Project in general—I kept on making analogies to the way that the gas laws and fluid dynamics emerge from the complicated underlying dynamics of molecules. And at the core of this is the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">Second Law of thermodynamics</a>. </p> <p>Well, as it happens, the very first foundational question in physics that I ever seriously studied was the origin of the Second Law. 
But that was <a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/">when I was 12 years old</a>, in 1972. For more than a century the Second Law <a href="https://writings.stephenwolfram.com/2023/01/how-did-we-get-here-the-tangled-history-of-the-second-law-of-thermodynamics/">had been quite mysterious</a>. But when I discovered computational irreducibility in 1984 I soon realized that it <a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#computational-irreducibility-and-rule-30">might be the key to the Second Law</a>. And in the summer of 2022—armed with a new perspective on the importance of observers—I decided I’d better once and for all write down how the Second Law works.</p> <p>Once again, there were lots of <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">technical details</a>. And as a way to check my ideas I decided to go back and try to untangle the rather confused <a href="https://writings.stephenwolfram.com/2023/01/how-did-we-get-here-the-tangled-history-of-the-second-law-of-thermodynamics/">150-year history of the Second Law</a>. It was an interesting exercise, satisfying for seeing how my new ways of thinking clarified things, but cautionary in seeing how wrong turns had been taken—and solidified—in the past. But in the end, there it was: the Second Law was a consequence of the interplay between underlying computational irreducibility, and our limitations as observers.</p> <p>It had taken half a century, but finally I had finished the project I’d started when I was 12 years old. I was on a roll finishing things. But I was also realizing that a bigger structure than I’d ever imagined was emerging. The Second Law project completed what I think is the most beautiful thing I’ve ever discovered. 
That all three of the core theories of twentieth century physics—general relativity, quantum mechanics and the Second Law (statistical mechanics)—have the same origin: the interplay between the underlying computational structure of the ruliad, and our characteristics and limitations as observers. </p> <p>And I knew it didn’t stop there. I’d already applied the same kind of thinking to the foundations of mathematics. And I was ready to start applying it to all sorts of deep questions in science, in philosophy, and beyond. But at the end of 2022, just as I was finishing my pieces about the Second Law, there was a surprise: <a href="https://openai.com/index/chatgpt/" target="_blank" rel="noopener">ChatGPT</a>.</p> <p>I’d been <a href="https://writings.stephenwolfram.com/2015/05/wolfram-language-artificial-intelligence-the-image-identification-project/#personal-backstory">following AI and neural nets</a> for decades. I first simulated a neural net in 1981. <a href="https://writings.stephenwolfram.com/2016/04/my-life-in-technology-as-told-at-the-computer-history-museum/">My first company, started in 1981</a>, had, to my chagrin, been labeled an “AI company”. And from the early 2010s we’d integrated neural nets into the Wolfram Language. But—like the creators of ChatGPT—I didn’t expect the capabilities that emerged in ChatGPT. And as soon as I saw ChatGPT I started trying to understand it. What was it really doing? What would its capabilities be?</p> <p>In the world at large, there was a sense of shock: if AI can do this now, soon it’ll be able to do everything. But I immediately thought about computational irreducibility. Yes, it imposes limitations on us. But those limitations would inevitably apply to AIs as well. There would be things that couldn’t be “quickly figured out by pure thought”—by humans and AIs alike. 
And, by the way, I’d just spent four decades building a way to represent things computationally, and actually do systematic computations on them—because that was the point of the Wolfram Language. </p> <p>So immediately I could see <a href="https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt/">we were in a very interesting position</a>. The Wolfram Language had the completely unique mission of creating a full-scale computational language. And now this was a crucial tool for AIs. The AIs could provide a very interesting and useful broad linguistic interface. But when it came to solid computation, they were—like humans—going to need a tool. Conveniently, Wolfram|Alpha already communicated in natural language. And it took only a few weeks to hook up Wolfram|Alpha—and Wolfram Language—to ChatGPT. We’d <a href="https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/">given “computational superpowers” to the AI</a>. </p> <p>ChatGPT was everywhere. And people kept asking me about it. And over and over again I ended up explaining things about it. So at the beginning of February 2023 I decided it’d be better for me just to write down once and for all what I knew. It took a little over a week (yes, I’m a fast writer)—and then I had an “<a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">explainer</a>” (that ran altogether to 76 pages) of ChatGPT. </p> <p>Partly it talked in general about how machine learning and neural nets work, and how ChatGPT in particular works. But what a lot of people wanted to know was not “how” but “why” ChatGPT works. Why was something like that possible? Well, in effect ChatGPT was <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#what-really-lets-chatgpt-work">showing us a new science discovery</a>—about language. 
Everyone knows that there’s a certain syntactic grammar of language—like that, in English, sentences typically have the form noun-verb-noun. But what ChatGPT was showing us is that there’s also a <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#semantic-grammar-and-the-power-of-computational-language">semantic grammar</a>—some pattern of rules for what words can be put together and make sense. </p> <p>I’ve thought about the <a href="https://writings.stephenwolfram.com/category/language-and-communication">foundations of language</a> for a long time (which isn’t too surprising, given the four decades I’ve spent as a computational language designer). So in effect I was well primed to think about its interaction with ChatGPT. And it also helped that—as I’ll talk about below—one of my long-unfinished projects is precisely on a formal framework for capturing meaning that I call “<a href="https://writings.stephenwolfram.com/2016/10/computational-law-symbolic-discourse-and-the-ai-constitution/">symbolic discourse language</a>”.</p> <p>In technology and other things I always like best situations where basically nothing is known, and one has to invent everything from scratch. And that’s what was happening for functionality based on LLMs in the middle of 2023. How would <a href="https://writings.stephenwolfram.com/2023/05/the-new-world-of-llm-functions-integrating-llm-technology-into-the-wolfram-language/">LLM-based Wolfram Language functions</a> work? How would a <a href="https://writings.stephenwolfram.com/2023/06/prompts-for-work-play-launching-the-wolfram-prompt-repository/">prompt repository</a> work? How would <a href="https://writings.stephenwolfram.com/2023/06/introducing-chat-notebooks-integrating-llms-into-the-notebook-paradigm/">LLMs interact with notebooks</a>?</p> <p>Meanwhile, there was still lots of ferment in the world about the “AI shock”. 
Before the arrival of the Physics Project in 2019, I’d been <a href="https://writings.stephenwolfram.com/2019/06/testifying-at-the-senate-about-a-i-selected-content-on-the-internet/">quite involved</a> in <a href="https://writings.stephenwolfram.com/2016/10/a-short-talk-on-ai-ethics/">AI philosophy</a>, AI ethics, etc. And in March 2023 I wrote a piece on “<a href="https://writings.stephenwolfram.com/2023/03/will-ais-take-all-our-jobs-and-end-human-history-or-not-well-its-complicated/">Will AIs Take All Our Jobs and End Human History—or Not?</a>” In the end—after all sorts of philosophical arguments, and an <a href="https://writings.stephenwolfram.com/2023/03/will-ais-take-all-our-jobs-and-end-human-history-or-not-well-its-complicated/#afterword-looking-at-some-actual-data">analysis of actual historical data</a>—the answer was: “It’s Complicated”. But along the way computational irreducibility and the ruliad were central elements: limiting the controllability of AIs, allowing for an infinite frontier of invention, and highlighting the inevitable meaninglessness of everything in the absence of human choice. </p> <p>By this point (and actually, with remarkable speed) my explainer on ChatGPT had turned into a <a href="https://www.wolfram-media.com/products/what-is-chatgpt-doing-and-why-does-it-work/">book</a>—that proved extremely popular (and now, for example, exists in over 10 languages). It was nice that people found the book useful—and perhaps it helped remove some of the alarming mystique of AI. But I couldn’t help noticing that of all the many things I’d written, this had been one of the fastest to write, yet it was garnering one of the largest readerships.</p> <p>One might have imagined that AI was pretty far from our Physics Project, the ruliad, etc. But actually it soon became clear that there were close connections, and that there were things to learn in both directions. 
In particular, I’d come to think of minds that work in different ways as occupying different positions in the ruliad. But how could one get intuition about what such minds would experience—or observe? Well, I realized, one could just look at generative AI. In July I wrote “<a href="https://writings.stephenwolfram.com/2023/07/generative-ai-space-and-the-mental-imagery-of-alien-minds/">Generative AI Space and the Mental Imagery of Alien Minds</a>”. I called this the “cats in hats piece”, because, yes, it has lots of pictures of (often bizarrely distorted) cats (in hats)—used as examples of what happens if one moves a mind around in rulial space. But despite the whimsy of the cats, this piece provided a surprisingly useful window into what for me has been a very longstanding question of how <a href="https://writings.stephenwolfram.com/2018/01/showing-off-to-the-universe-beacons-for-the-afterlife-of-our-civilization/">other minds might perceive things</a>. </p> <p>And this fed quite directly into my piece on “<a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">Observer Theory</a>” in December 2023. Ever since things like Turing machines we’ve had a formal model for the process of computation. My goal was to do the same kind of thing for the process of observation. In a sense, computation constructs sequences of new things, say with time. Observation, on the other hand, equivalences things together, so they fit in finite minds. And just what equivalencing is done—by our senses, our measuring devices, our thinking—determines what our ultimate perceptions will be. Or, put another way, if we can characterize well enough what we’re like as observers, it’ll show us how we sample the ruliad, and what we’ll perceive the laws of physics to be.</p> <p>When I started the Physics Project I wasn’t counting on it having any applications for hundreds of years. 
But quite soon it became clear that actually there were going to be all sorts of near-term applications, particularly of the formalism of multicomputation. And every time one used that formalism one could get more intuition about features of the Physics Project, particularly related to quantum mechanics. I ended up writing a variety of “ruliological” pieces, all, as it happens, expanding on footnotes in <em>A New Kind of Science</em>. There was “<a href="https://www.wolframphysics.org/bulletins/2021/10/multicomputation-with-numbers-the-case-of-simple-multiway-systems/" target="_blank" rel="noopener">Multicomputation with Numbers</a>” (October 2021), “<a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/">Games and Puzzles as Multicomputational Systems</a>” (June 2022) and “<a href="https://writings.stephenwolfram.com/2023/11/aggregation-and-tiling-as-multicomputational-processes/">Aggregation and Tiling as Multicomputational Processes</a>” (November 2023). And in September 2023 there was also “<a href="https://writings.stephenwolfram.com/2023/09/expression-evaluation-and-fundamental-physics/">Expression Evaluation and Fundamental Physics</a>”. </p> <p>Back around 1980—when I was working on SMP—I’d become interested in the theory of expression evaluation. And finally, now, with the Physics Project—and my work on combinators and metamathematics—four decades later I had a principled way to study it (potentially with immediate application in distributed computing and computational language design around that). And I could check off progress on another long-pending project.</p> <p>I give many talks, and do many podcasts and livestreams—essentially all unprepared. But in October 2023 I agreed to give a TED talk. And I just didn’t see any way to fit a reasonable snapshot of my activities into 18 minutes without preparation. 
How was I to coherently explain the Physics Project, the ruliad and computational language in such a short time? I called the talk “<a href="https://writings.stephenwolfram.com/2023/10/how-to-think-computationally-about-ai-the-universe-and-everything/">How to Think Computationally about AI, the Universe and Everything</a>”. And I began with what for me was a new condensation: “Human language. Mathematics. Logic. These are all ways to formalize the world. And in our century there’s a new and yet more powerful one: computation.” </p> <p>Over the years I’d done all sorts of seemingly very different projects in science and in technology. But somehow it seemed like they were now all converging. Back in 1979, for example, I’d invented the idea of transformations for symbolic expressions as a foundation for computational language. But now—more than four decades later—our Physics Project was saying that those kinds of transformations (specifically on hypergraphs) were just what the “machine code of the universe” was made of. </p> <p>Since the 1980s I’d thought that computation was a useful paradigm with which to think about the world. But now our Physics Project and the ruliad were saying that it wasn’t just useful; it was the underlying paradigm of the world. For some time I’d been viewing our whole Wolfram Language effort as a way to formalize computation for the purposes of both humans and machines. Four hundred years ago mathematical notation had streamlined mathematical thinking, allowing what became the mathematical sciences to develop. I saw what we were doing with our <a href="https://writings.stephenwolfram.com/2019/05/what-weve-built-is-a-computational-language-and-thats-very-important/">computational language</a> as a way to streamline computational thinking, and allow “computational X” for all fields “X” to develop. 
</p> <p>I began to see computational thinking as a way to “humanize” the ruliad; to pick out those parts that are meaningful to humans. And I began to see computational language as the bridge between the power of raw computation, and the kinds of things we humans think about. </p> <p>But how did AI fit in? At the beginning of 2024, lots of people were still asking in effect “Can AI Solve Science?” So I <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">decided to analyze that</a>. I certainly didn’t expect AI to be able to “break computational irreducibility”. And it didn’t. Yes, it could automate much of what humans could do in a quick look. But formalized, irreducible computation: that was going to need computational language, not AI. </p> <p>It’s easy to be original in the computational universe: if you pick a rule at random, it’s overwhelmingly likely <a href="https://x.com/stephen_wolfram/status/1420206573096513539/photo/1" target="_blank" rel="noopener">nobody’s ever looked at it before</a>. But will anyone care? They’ll care if in effect that part of the ruliad has been “colonized”; if there’s already a human connection to it. But what if you define some attribute that you want, then just “search out there” for a rule that exhibits it? That’s basically what biological evolution—or machine learning training—seems to do.</p> <p>And as a kind of <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#exploring-spaces-of-systems">off-hand note</a> I decided to just see if I could make a minimal model for that. <a href="https://content.wolfram.com/sw-publications/2020/07/approaches-complexity-engineering.pdf" target="_blank" rel="noopener">I’d tried before—in the mid-1980s</a>. 
And in the 1990s when I was writing <em>A New Kind of Science</em> I’d become convinced that computational irreducibility was in a sense a <a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-5--fundamental-issues-in-biology">stronger force than adaptive evolution</a>, and that when complex behavior was seen in biology, it was computational irreducibility that should take most of the credit.</p> <p>But I decided to just do the experiment and see. And although computational irreducibility in a sense tells one to always “expect the unexpected”, in all these years I’ve never fully come to terms with that—and I’m still regularly surprised by what simple systems somehow “cleverly” manage to do. And so it was with <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">my minimal model of biological evolution</a>.</p> <p>I’d always wondered why biological evolution managed to work at all, why it didn’t “get stuck”, and how it managed to come up with the ornate “solutions” it did. Well, now I knew: and it turned out it was, once again, a story of computational irreducibility. And I’d managed to finish another project that I started in the 1980s.</p> <p>But then there was machine learning. And despite all the energy around it—as well as <a href="https://www.wolfram.com/language/core-areas/machine-learning/">practical experience with it</a>—it didn’t seem like there was a good foundational understanding of what it was doing or why it worked. For a couple of years I’d been asking all the machine learning experts I ran into what they knew. But mostly they confirmed that, yes, it wasn’t well understood. And in fact several of them suggested that I’d be the best person to figure it out. 
</p> <p>So just a few weeks ago, starting with ideas from the biological evolution project, and mixing in some things I tried back in 1985, I decided to embark on exploring minimal models of machine learning. I just <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/">posted the results last week</a>. And, yes, one seems to be able to see the essence of machine learning in systems vastly simpler than neural nets. In these systems one can visualize what’s going on—and it’s basically a story of finding ways to put together lumps of irreducible computation to do the tasks we want. Like stones one might pick up off the ground to put together into a stone wall, one gets something that works, but there’s no reason for there to be any understandable structure to it. </p> <p>Like so many of the projects I’ve done in the past five years, I could in principle have done this project much earlier—even in the 1980s. But back then I didn’t have the intuition, the tools or the intellectual confidence to actually dive in and get the project done. And what’s been particularly exciting over the past five years is that I can feel—and even very tangibly see—how what I can do has grown. With every project I’ve done I’ve further honed my intuition, developed more tools (both conceptual and practical), and built my intellectual confidence. Could I have gotten here earlier in my life? I don’t think so. I think to get to where I am now required the kind of journey I’ve taken through science, technology and the other things I’ve done. A living example of the phenomenon of computational irreducibility. </p> <h2 id="the-process-of-getting-things-done">The Process of Getting Things Done</h2> <p>I started my career young—and usually found myself the “youngest person in the room”. But shockingly fast all those years whizzed by, and now I’m usually the “oldest person in the room”. 
But somehow I always still seem to feel like a young whippersnapper—not settled into some expected pattern, and “pushing for the future”. </p> <p>I’ve always done projects that are hard. Projects that many people thought were impossible. Projects that stretched my capabilities to the limit. And to do this has required a certain mixture of confidence and humility. Confidence that it’s worth me trying the project. Humility in not assuming that it’ll be easy for me. </p> <p>I’ve learned a lot of fields by now, and with them a lot of different ways of thinking. But somehow it’s never enough to make the projects I do easy. Somehow the projects are always far enough out on the frontier that I have to learn new things and new ways of thinking to succeed at them. And so there I am, often the only person in the room whose project isn’t somehow easy for them. And who still has to be pushing, whippersnapper style.</p> <p>At this point, a fair fraction of the projects I do are ones that I’ve thought about for a long time; a smaller fraction are opportunistic—coming into scope just now as a result of something I’ve done, or something that’s happened in the world at large. Before the past five years I had a lot of projects that had languished, often for decades. Yes, I thought they would be interesting, and I gradually collected information about them. But somehow I wasn’t quite in a place to tackle them.</p> <p>But now I feel quite differently. In the past five years, I’ve gone back and finished a fair fraction of all those languishing projects. And it’s been great. Without exception, the projects turned out to be richer and more interesting than I expected. Often I realized I really couldn’t have done them without the tools and ideas (and infrastructure) I now have. And—often to my great surprise—the projects turned out to have very direct connections to big themes around the ruliad, the Physics Project and, for that matter, computational language. 
</p> <p>Why was this happening? Partly it’s a tribute to the breadth of the computational (and now multicomputational) paradigm. But partly it has to do with the specific character of projects I was choosing—always seeking what seemed like the simplest, most foundational versions of things. </p> <p>I’ve done quite a few big projects in my life, many seemingly very different. But as I look back, I realize that all my projects have a certain overall pattern to them. They’re all about taking something that seems complicated, then drilling down to find the foundations of what’s going on, and then building up from these—often with considerable engineering-style effort. And the methods and tools I’ve developed have in a sense implicitly been optimized for this pattern of work. </p> <p>I suppose one gets used to the rhythm of it all. The time when one’s drilling down, slowly trying to understand things. The time when one’s doing all the work to build the big structure up. And yes, it’s all hard. But by now I know the signs of progress, and they’re always energizing to see. </p> <p>At any given time, I’ll have many projects gestating—often for years or decades. But once a project becomes active, it’s usually the only one I’m working on. And I’ll work on it with great intensity, pushing hard to keep going until it’s done. Often I’ll be working with other people, usually much younger than me. And I think it’s always a surprise that I’ll routinely be the one who works with the greatest intensity—every day, at all hours. </p> <p>I think I’m <a href="https://writings.stephenwolfram.com/2019/02/seeking-the-productive-life-some-details-of-my-personal-infrastructure/">pretty efficient too</a>. Of course, it helps that I have a tool—Wolfram Language—that I’ve been building for decades to support me. And it helps that I’ve developed all kinds of practices around how I organize code and notebooks I create, and how I set up my process of writing about things. 
Of course, it also helps that I have very capable people around me to make suggestions, explore additional directions, fill in details, check things, and get my write-ups produced and published.</p> <p>As I have <a href="https://writings.stephenwolfram.com/2019/02/seeking-the-productive-life-some-details-of-my-personal-infrastructure/">written about elsewhere</a>, my life is in many ways set up to be quite simple and routine. I get up at the same time every day, eat the same thing for breakfast, and so on. But in a sense this frees me to concentrate on the intellectual things I’m doing—which are different every day, often in unexpected ways. </p> <p>But how is it that I even get the time to do all these intellectual things? After all, I am—as I have been for the past 38 years—the CEO of a <a href="https://www.wolfram.com/">very active tech company</a>. Two things I think help (in addition, of course, to the fact that I have such a great long-term team at the company). First, organization. And second, resolve. Every day I’ll have <a href="https://writings.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/">tightly scheduled meetings</a> over the course of the working day. (And there are lots of details to this. I get up in the late morning, then do my first two meetings while walking, and so on.) But somehow—mostly on evenings and weekends—I find time to work intensely on my intellectual projects.</p> <p>It’s not as if I ignore everything else in the world. But I do have a certain drive—and resolve—that fills any time available with my projects, and somehow seems to succeed in getting them done. (And, yes, there are many optimizations in the details of my life, saving me all sorts of time. And it probably helps that I’ve been a work-from-home CEO now for 33 years.)</p> <p>One might have thought that CEOing would greatly detract from being able to do intellectual work. But I find the exact opposite. 
Because in my experience the discipline of strategy and decision making (as well as communicating thoughts and ideas to other people) that comes with CEOing is critical to being able to do incisive intellectual work. And, by the way, the kind of thinking that goes with intellectual work is also incredibly valuable in being an effective CEO. </p> <p>There’s another critical part to my “formula”. And that has to do with exposition. For me, the exposition of a project is an integral part of the project. Part of it is that the very definition of the question is often one of the most important parts of a project. But more than that, it’s through exposition that I find I really understand things. It takes a certain discipline. It can be easy enough to make some highfalutin technical statement. But can one grind it down into truly simple pieces that one can immediately understand? Yes, that means other people will be able to understand it too. But for me, what’s critical is that that’s the way I can tell if I’m getting things right. And for me the exposition is what in the end defines the backbone of a project. </p> <p>Normally I write quickly, and basically without revision. But whenever there’s a piece I’m finding unduly hard to write I know that’s where I’m muddled, and need to go back and understand what’s going on. Some of my projects (like creating this piece, for example) end up being essentially “pure writing”. But most are deeply computational—and full of computer experiments. And just as I put a lot of effort into making written exposition clear, I do the same for computational language, and for pictures. Indeed, many of my projects are in large measure driven by pictures. 
Usually these are what one can think of as “algorithmic diagrams”—created automatically with a <a href="https://writings.stephenwolfram.com/2017/11/what-is-a-computational-essay/">structure optimized for exposition</a>.</p> <p>And the pictures aren’t just useful for presenting what I’ve done; they’re also critical to my own efforts to figure things out. And I’ve learned that it’s important to get the presentational details of pictures right as early as possible in a project—to give myself the best chance to notice things. </p> <p>Often the projects I do require exploring large numbers of possible systems. And somehow with great regularity this leads to me ending up looking at <a href="https://www.wolframscience.com/nks/p55--more-cellular-automata/">large arrays of little pictures</a>. Yes, there’s a lot of “looking” <a href="https://www.wolframphysics.org/technical-introduction/typical-behaviors/random-rules-and-overall-classification-of-behavior/#p-144" target="_blank" rel="noopener">that can be automated</a>. But in the end computational irreducibility means there’ll always be the unexpected, that I basically have to see for myself. </p> <p>A great thing about the Wolfram Language is that it’s been very stable ever since it was first released. And that means that I can take notebooks even from the 1980s and immediately run them today. And, yes, given all the “old” projects I’ve worked on in the past five years, that’s been very important. </p> <p>But in addition to being very stable, the Wolfram Language is also very self contained—and very much <a href="https://writings.stephenwolfram.com/2019/05/what-weve-built-is-a-computational-language-and-thats-very-important/">intended to be readable by humans</a>. And the result is something that I’ve found increasingly important: every computational picture in everything I write has Wolfram Language code “behind it”, that you can get by clicking. 
All the time I find myself going back to previous things I’ve written, and picking up <a href="https://reference.wolfram.com/language/ref/ClickToCopy.html">click-to-copy</a> code to run for some new case, or use as the basis for something new I’m doing. </p> <p>And of course that click-to-copy code is open for anyone to use. Not only for its “computational content”, but also for the often-elaborate visuals it implements.</p> <p>Most of my writings over the past five years have been about new basic science. But interspersed with this—along with pieces about technology and about philosophy—are <a href="https://writings.stephenwolfram.com/category/historical-perspectives/">pieces about history</a>. And in fact many of my scientific pieces have had extensive historical sections as well. </p> <p>Why do I put such effort into history? Partly I just find it fun to figure out. But mostly it’s to contextualize my understanding of things. Particularly in the past five years I’ve ended up working on a whole sequence of projects that are in a sense about changing longstanding directions in science. And to feel confident about making such changes, one has to know why people went in those directions in the first place. And that requires studying history.</p> <p>Make no mistake: history—or at least good history—is hard. Often there’ll be a standard simple story about how some discovery was suddenly made, or how some direction was immediately defined. But the real story is usually much more complicated—and much more revealing of the true intellectual foundations of what was figured out. Almost never did someone discover something “one day”; almost always it took many years to build up the conceptual framework so that “one day” the key thing could even be noticed. </p> <p>When I do history I always make a big effort to look at the original documents. 
And often I realize that’s critical—because it’s only with whatever new understanding I’ve developed that one would stand a chance of correctly interpreting what’s in the documents. And even if one’s mainly interested in the history of ideas, I’ve always found it’s crucial to also understand the people who were involved with them. What was their motivation? What was their practical situation? What kinds of things did they know about? What was their intellectual style in thinking about things?</p> <p>It has helped me greatly that I’ve had my own experiences in making discoveries—that gives me an intuition for how the process of discovery works. And it also helps that I’ve had my fair share of “worldly” experiences. Still, often it’s at first a mystery how some idea developed or some discovery got made. But my consistent experience is that with enough effort one can almost always solve it. </p> <p>Particularly for the projects I’ve done in recent years, it often leaves me with a strange feeling of connection. For in many cases I find out that the things I’ve now done can be viewed as direct follow-ons to ideas that were thought about a century or more ago, and for one reason or another ignored or abandoned since. </p> <p>And I’m then usually left with a strong sense of responsibility. An idea that was someone’s great achievement had been buried and lost to the world. But now I have found it again, and it rests on me to bring it into the future.</p> <p>In addition to writing about “other people’s history”, I’ve also been writing quite a bit about my own history. And in the last few years I’ve made a point of explaining my personal history around the science—and technology—I describe. 
In doing this, it helps a lot that I have <a href="https://writings.stephenwolfram.com/2019/02/seeking-the-productive-life-some-details-of-my-personal-infrastructure/#archiving-and-searching">excellent personal archives</a>—that routinely let me track to within minutes <a href="https://writings.stephenwolfram.com/2024/06/ruliology-of-the-forgotten-code-10/">discoveries I made even four decades ago</a>. </p> <p>My goal in describing my own history is to help other people contextualize things I write about. But I have to say that time and time again I’ve found the effort to piece together my own history extremely valuable just for me. As I go through life, I try to build up a repertoire of patterns for how things I do fit together. But often those patterns aren’t visible at the time. And it takes going back—often years later—to see them. </p> <p>I do the projects I do first and foremost for myself. But I’ve always liked the idea that other people can get their own pleasure and benefit from my projects. And—basically starting with the Physics Project—I’ve tried to open to the world not just the results of my projects, but the process by which they’re done. </p> <p>I post my <a href="https://www.wolframphysics.org/archives/index/" target="_blank" rel="noopener">working notebooks</a>. Whenever practical I <a href="https://livestreams.stephenwolfram.com/" target="_blank" rel="noopener">livestream</a> my working meetings. And, perhaps taking things to an extreme, I record even my own solitary work, posting it in “<a href="https://livestreams.stephenwolfram.com/category/personal-video-worklogs/" target="_blank" rel="noopener">video work logs</a>”. (Except I just realized I forgot to record the writing I’m doing right now!) 
</p> <p>A couple of years before the Physics Project I actually also <a href="https://writings.stephenwolfram.com/2017/12/what-do-i-do-all-day-livestreamed-technology-ceoing/">opened up my technology development activities</a>—livestreaming our <a href="https://livestreams.stephenwolfram.com/category/live-ceoing/" target="_blank" rel="noopener">software design reviews</a>, in the past five years 692 hours of them. (And, yes, I put a lot of work and effort into designing the Wolfram Language!)</p> <p>At the beginning of the pandemic I thought: “There are all these kids out of school. Let me try to do a little bit of public service and livestream something about science and technology for them.” And that’s how I started my “<a href="https://livestreams.stephenwolfram.com/category/science-technology-qa-for-kids-and-others" target="_blank" rel="noopener">Science & Technology Q&A for Kids & Others</a>” livestreams, that I’ve now been doing for four and a half years. Along the way, I’ve added “<a href="https://livestreams.stephenwolfram.com/category/history-of-science-and-technology-qa" target="_blank" rel="noopener">History of Science & Technology Q&A</a>”, “<a href="https://livestreams.stephenwolfram.com/category/future-of-science-technology-qa/" target="_blank" rel="noopener">Future of Science & Technology Q&A</a>”, and “<a href="https://livestreams.stephenwolfram.com/category/business-innovation-and-managing-life-qa" target="_blank" rel="noopener">Business, Innovation & Managing Life Q&A</a>”. Altogether I’ve done 272 hours of these, that have generated 376 <a href="https://podcasters.spotify.com/pod/show/stephenwolfram" target="_blank" rel="noopener">podcast episodes</a>.</p> <p>Twice a week I sit down in front of a camera, watch the feed of questions, and try to answer them. It’s always off the cuff, completely unprepared. And I find it a great experience. 
I can tell that over the time I’ve been doing this, I’ve become a better and more fluent explainer, which no doubt helps my written exposition too. Often in answering questions I’ll come up with a new way to explain something, that I’ve never thought of before. And often there’ll be questions that make me think about things I’ve never thought about at all before. Indeed, several of my recent projects actually got started as a result of questions people asked. </p> <p>When I was younger I always just wanted to get on with research, create things, and so on; I wasn’t interested in education. But as I’ve gotten older I’ve come to really like education. Partly it’s because I feel I learn a lot myself from it, but mostly it’s because I find it fulfilling to use what I know and try to help people develop. </p> <p>I’ve always been interested in people—a useful attribute in running a talent-rich company for four decades. (I’m particularly interested in how people develop through their lives—leading me recently, for example, to organize a 50-year reunion for my elementary school class.) I’ve had a <a href="https://writings.stephenwolfram.com/2019/08/fifty-years-of-mentoring/">long-time “hobby” of mentoring CEOs and kids</a> (both being categories of people who tend to believe that anything is possible). </p> <p>But my main educational efforts are concentrated in a few weeks of the year when we do our <a href="https://education.wolfram.com/summer-school">Wolfram Summer School</a> (started in 2003) and our <a href="https://education.wolfram.com/summer-research-high-school/">Wolfram High School Summer Research Program</a> (started in 2012). All the students in these programs (775 of them over the past five years) do an original project, and one of my jobs is to come up with what all these projects should be. Over the course of the year I’ll accumulate ideas—though rather often when I actually meet a student I’ll invent something new. 
</p> <p>I obviously do plenty of projects myself. But it’s always an interesting—and invigorating—experience to see so many projects get done with such intensity at our summer programs. Plus, I get lots of extra practice in framing projects that helps when I come to frame my own projects.</p> <p>At this point, I’ve spent years trying to organize my life to optimize it for what I want to get out of it. I need long stretches of time when I can concentrate coherently. But I like having a diversity of activities, and I’m pretty sure I wouldn’t have the energy and effectiveness I do without that. Over the years, I’ve added in little pieces. Like my weekly virtual sessions where I “do my homework” with a group of kids, working on something that I need to get done, but that doesn’t quite fit elsewhere. Or my weekly sessions with local kids, talking about things that make me and them think. Or, for that matter, my “call while driving” list of calls it’s good to make, but wouldn’t usually quite get the priority to happen. </p> <p>Doing all the things I do is hard work. But it’s what I want to do. Yes, things can drag from time to time. But at this point I’m so used to the rhythm of projects that I don’t think I notice much. And, yes, I work basically every hour of every day I can. Do I have hobbies? Well, back when I was an academic, business was my main “hobby”. When I started CEOing, science became a “hobby”. Writing. Education. Livestreaming. These were all “hobbies” too. But somehow one of the patterns of my life is that nothing really stays quite as a “true hobby”.</p> <h2 id="whats-next">What’s Next?</h2> <p>The past five years have not only been my most productive ever, but they’ve also built more “productivity momentum” than I’ve had before. So, what’s next? I have a lot of projects currently “in motion”, or ready to “get into motion”. Then I have many more that are in gestation, for which the time may finally have come. 
But I know there’ll also be surprises: projects that suddenly occur to me, or that I suddenly realize are possible. And one of the great challenges is to be in a position to actually jump into such things. </p> <p>It has to be said that there’s always a potentially complicated tradeoff. To what extent should one “tend” the things one’s already done, and to what extent should one do new things? Of course, there are some things that are never “done”—like the Wolfram Language, which I started building 38 years ago, and still (energetically) work on every day. Or the Physics Project, where there’s just so much to figure out. But one of the things that’s worked well in most of the basic science projects I’ve done in the past five years or so is that once I’ve written my piece about the project, I can usually consider the project “done for now”. It always takes a lot of effort to get a project to the point where I can write about it. But I work hard to make sure I only have to do it once; that I’ve “picked the low-hanging fruit”, so I don’t feel I have to come back “to add a little more”.</p> <p>I put a lot of effort into the pieces I write about my projects. And I also give talks, do interviews, etc. (about 500 altogether in the past five years). But I certainly don’t “market” my efforts as much as I could. It’s a decision I’ve made: that at this point in my life—particularly with the burst of productivity I’m experiencing—I want to spend as much of my time as possible doing new things. And so I need to count on others to follow up and spread knowledge about what I’ve done, whether in the academic world, on Wikipedia, the web, etc. (And, yes, pieces I write and the pictures they contain are <a href="https://writings.stephenwolfram.com/terms/">set up to be immediately reproducible wherever appropriate</a>.)</p> <p>OK, so what specific new things are currently in my pipeline? Well, there’s lots of science (and related intellectual things). 
And there’s also lots of technology. But let’s talk about science first. </p> <p>A big story is the Physics Project—where there’s a lot to be done, in many different directions. There’s foundational theory to be developed. And there are experimental implications to be found.</p> <p>It’d be great if we could find experimental evidence of the discreteness of space, or maximum entanglement speed, or a host of other unexpected phenomena in our models. A century or so ago it was something of a stroke of luck that atoms were big enough that they could be detected. And we don’t know if the discreteness of space is something we’ll be able to detect now—or only centuries from now. </p> <p>There are phenomena—particularly associated with black holes—that might effectively serve as powerful “spacetime microscopes”. And there are phenomena like dimension fluctuations that could potentially show up in a variety of astrophysical settings. But one direction I’m particularly interested in exploring is what one might call “spacetime heat”—the effect of detailed microscopic dynamics in the hypergraph that makes up spacetime. Could “dark matter”, for example, not be “matter” at all, but instead be associated with spacetime heat?</p> <p>Part of investigating this involves building practical simulation software to investigate our models on as large a scale as possible. And part of it involves “good, old-fashioned physics”, figuring out how to go from underlying foundational effects to observable phenomena. </p> <p>And there’s a foundational piece to this too. How does one set up mathematics—and mathematical physics—when one’s starting from a hypergraph? A traditional manifold is ultimately built up from Euclidean space. But what kind of object is the limit of a hypergraph? To understand this, we need to construct what I’m calling infrageometry—and infracalculus alongside it. Infrageometry—as its name suggests—starts from something lower level than traditional geometry. 
And the challenge is in effect to build a “21st century Euclid”, then Newton, etc.—eventually finding generalizations of things like differential geometry and algebraic topology that answer questions like what 3<img style="margin-bottom: -8px" title="" src="https://content.wolfram.com/sites/43/2024/08/one-quarter-fraction.png" alt="" width="9" height="24" />-dimensional curvature tensors are like, or how we might distinguish local gauge degrees of freedom from spatial ones in a limiting hypergraph.</p> <p>Another direction has to do with particles—like electrons. The fact is that existing quantum field theory in a sense only really deals with particles indirectly, by thinking of them as perturbations in a field—which in turn is full of (usually unobservable) zero-point fluctuations. In our models, the structure of everything—from spacetime up—is determined by the “fluctuating” structure of the underlying hypergraph (or, more accurately, by the whole multiway graph of “possible fluctuations”). And what this suggests is that there’s in a sense a much lower level version of the Feynman diagrams we use in quantum field theory, one where we can discuss the “effect of particles” without ever having to say exactly what a particle “is”.</p> <p>I must say that I expected we’d have to know what particles were even to talk about energy. But it turned out there was a “bulk” way to do that. And maybe similarly there’s an indirect way to talk about interactions between particles. My guess is that in our model particles are structures a bit like black holes—but we may be able to go a very long way without having to know the details.</p> <p>One of the important features of our models is that quantum mechanics is “inevitable” in them. And one of the projects I’m hoping to do is to finally “really understand quantum mechanics”. In general terms, it’s connected to the way branching observers (like us) perceive branching universes. 
But how do we get intuition for this, and what effects can we expect? Several projects over the past years (like <a href="https://www.wolframphysics.org/bulletins/2021/02/multiway-turing-machines/" target="_blank" rel="noopener">multiway Turing machines</a>, <a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/">multiway games</a>, <a href="https://writings.stephenwolfram.com/2023/11/aggregation-and-tiling-as-multicomputational-processes/">multiway aggregation</a>, etc.) I’ve done in large part to bolster my intuition about branchial space and quantum mechanics. </p> <p>I first <a href="https://x.com/stephen_wolfram/status/1370144700003463168" target="_blank" rel="noopener">worked on quantum computers back in 1980</a>. And at the time, I thought that the measurement process (whose mechanism isn’t described in the standard formalism of quantum mechanics) would be a big problem for them. Years have gone by, and enthusiasm for quantum computers has skyrocketed. In our models there’s a rather clear picture that inside a quantum computer there are “many threads of history” that can in effect do computations in parallel. But for an observer like us to “know what the answer is” we have to knit those threads together. And in our models (particularly with my observer theory efforts) we start to be able to see how that might happen, and what the limitations might be.</p> <p>Meanwhile, in the world at large there are all sorts of experimental quantum computers being built. But what are their limitations? I have a suspicion that there’s some as-yet-unknown fundamental physics associated with these limitations. It’s like building telescopes: you polish the mirror, and keep on making engineering tweaks. But unless you know about diffraction, you won’t understand why your resolution is limited. 
And I have a slight hope that even existing results on quantum computers may be enough to see limitations perhaps associated with <a href="https://writings.stephenwolfram.com/2020/04/finally-we-may-have-a-path-to-the-fundamental-theory-of-physics-and-its-beautiful/#branchial-motion-and-the-entanglement-horizon">maximum entanglement speed</a> in our models. And the way our models work, knowing this speed, you can for example <a href="https://www.wolframphysics.org/technical-introduction/potential-relation-to-physics/units-and-scales/" target="_blank" rel="noopener">immediately deduce</a> the discreteness scale of space. </p> <p>Back in 1982, I and another physicist wrote two papers on “Properties of the Vacuum”. <a href="https://content.wolfram.com/sw-publications/2020/07/properties-vacuum-mechanical-thermodynamic.pdf" target="_blank" rel="noopener">Part 1 was mechanical properties</a>. <a href="https://content.wolfram.com/sw-publications/2020/07/properties-vacuum-electrodynamic.pdf" target="_blank" rel="noopener">Part 2 was electrodynamic</a>. We announced a part 3, on gravitational properties. But we never wrote it. Well, finally, it looks as if our Physics Project shows us how to think about such properties. So perhaps it’s time to finally write “Part 3”, and respond to all those people who sent preprint request cards for it four decades ago. </p> <p>One of the great conclusions of our Physics Project—and the concept of the ruliad—is that we have the laws of physics we do because we are observers of the kind we are. And just knowing very coarsely about us as observers seems to already imply the major laws of twentieth century physics. And to be able to say more, I think we need more characterization of us as observers. And my guess is, for example, that some feature of us that we probably consider completely obvious is what leads us to perceive space as (roughly) three dimensional. 
And indeed I increasingly suspect that the whole structure of our Physics Project can be derived—a bit like early derivations of special relativity—from certain axiomatic assumptions about our nature as observers, and fundamental features of computation.</p> <p>There’s plenty to do on our Physics Project, and I’m looking forward to making progress with all of it. But the ideas of the Physics Project—and multicomputation in general—<a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/#potential-application-areas">apply to lots of other fields too</a>. And I have many projects planned on these.</p> <p>Let’s talk first about chemistry. I never found chemistry interesting as a kid. But as we’ve added <a href="https://www.wolfram.com/language/core-areas/chemistry/">chemistry functionality in the Wolfram Language</a>, I’ve understood more about it, and why it’s interesting. And I’ve also followed molecular computing since the 1980s. And now, largely inspired by thinking about multicomputation, I’ve become very interested in what one might call the foundations of chemistry. Actually, what I’m most interested in is what I’m calling “subchemistry”. I suppose one can think of it as having a similar kind of relation to chemistry as infrageometry has to geometry. </p> <p>In ordinary chemistry, one thinks about reactions between different species of molecules. And to calculate rates of reactions, one multiplies concentrations of different species, implicitly assuming that there’s perfect randomness in which specific molecules interact. But what if one goes to a lower level, and starts talking about the interactions not of species of molecules, but individual molecules? From our Physics Project we get the idea of making causal graphs that represent the causal relations between different specific interaction events. </p> <p>In a gas the assumption of molecular-level randomness will probably be pretty good. 
But even in a liquid it’ll be more questionable. And in more exotic materials it’ll be a completely different story. And I suspect that there are “subchemical” processes that can potentially be important, perhaps in a sense finding a new “slice of computational reducibility” within the general computational irreducibility associated with the Second Law.</p> <p>But the most important potential application of subchemistry is in biology. If we look at biological tissue, a basic question might be: “What phase of matter is it?” One of the major takeaways from molecular biology in the last few decades has been that in biological systems, molecules (or at least large ones) are basically never just “bouncing around randomly”. Instead, their motion is typically carefully orchestrated. </p> <p>So when we look at biological tissue—or a biological system—we’re basically seeing the result of “bulk orchestration”. But what are the laws of bulk orchestration? We don’t know. But I want to find out. I think the “<a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/#class-4-and-the-mechanoidal-phase">mechanoidal phase</a>” that I identified in studying the Second Law is potentially a good test case. </p> <p>If we look at a microprocessor, it’s not very useful to describe it as “containing a gas of electrons”. And similarly, it’s not useful to describe a biological cell as “being liquid inside”. But just what kind of theory is needed to have a more useful description we don’t know. And my guess is that there’ll be some new level of abstraction that’s needed to think about this (perhaps a bit like the new abstraction that was needed to formulate information theory).</p> <p>Biology is not big on theory. Yes, there’s natural selection. And there’s the digital nature of biomolecules. But mostly biology has ended up just accumulating vast amounts of data (using ever better instrumentation) without any overarching theory. 
But I suspect that in fact there’s another foundational theory to be found in biology. And if we find it, a lot of the data that’s been collected will suddenly fall into place.</p> <p>There’s the “frankly molecular” level of biology. And there’s the more “functional” level. And I was surprised recently to be able to find a very minimal model that seems to capture <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">“functional” aspects of biological evolution</a>. It’s a surprisingly rich model, and there’s much more to explore with it, notably about how different “ideas” get propagated and developed in the process of adaptive evolution—and what kinds of tree-of-life-style branchings occur. </p> <p>And then there’s the question of <a href="https://www.wolframscience.com/nks/notes-12-10--self-reproduction/">self replication</a>—a core feature of biology. Just how simple a system can exhibit it in a “biologically relevant way”? I had thought that self replication was “just relevant for biology”. But in thinking about the problem of observers in the ruliad, I’ve come to realize that it’s also relevant at a foundational level there. It’s no good to just have one observer; you have to have a whole “rulial flock” of similar ones. And to get similar ones you need something like self replication. </p> <p>Talking of “societies of observers” brings me to another area I want to study: economics. How does a coherent economic system emerge from all the microscopic transactions and other events in a society? I suspect it’s a story that’s in the end similar to the theories we’ve studied in physics—from the emergence of bulk properties in fluids, to the emergence of continuum spacetime, and so on. But now in economics we’re dealing not with fluid density or metric, but instead with things like price. I don’t yet know how it will work out. 
Maybe computational reducibility will be associated with value. Maybe computational irreducibility will be what determines robustness of value. But I suspect that there’s a way of thinking about “economic observers” in the ruliad—and figuring out what “natural laws” they’ll “inevitably observe”. And maybe some of those natural laws will be relevant in thinking about the kind of questions we humans care about in economics.</p> <p>It’s rather amazing in how many different areas one seems to be able to apply the kind of approach that’s emerged from the Physics Project, the ruliad, etc. One that I’ve very <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/">recently tackled is machine learning</a>. And in my effort to understand its foundations, I’ve ended up coming up with some very minimal models. My purpose was to understand the essence of machine learning. But—somewhat to my surprise—it looks as if these minimal models can actually be practical ways to do machine learning. Their hardware-level tradeoffs are somewhat different. But—given my interest in practical technology—I want to see if one can build out a practical machine-learning framework that’s based on these (fundamentally discrete) models. </p> <p>And while I’m not currently planning to investigate this myself, I suspect that the approach I’ve used to study machine learning can also be applied to neuroscience, and perhaps to linguistics. And, yes, there’ll probably be a lot of computational irreducibility in evidence. And once again one has to hope that the pockets of computational reducibility that exist will give rise to “natural laws” that are useful for what we care about in these fields. </p> <p>In addition to these “big” projects, I’m also hoping to do a variety of “smaller” projects. Many I started decades ago, and in fact mentioned in <em>A New Kind of Science</em>. 
But now I feel I have the tools, intuition and intellectual momentum to finally finish them. <a href="https://www.wolframscience.com/nks/p130--recursive-sequences/">Nestedly recursive functions</a>. <a href="https://www.wolframscience.com/nks/p221--systems-based-on-constraints/">Deterministic random tilings</a>. <a href="https://www.wolframscience.com/nks/notes-12-8--undecidability-in-natural-systems/">Undecidability in the three-body problem</a>. <a href="https://www.wolframscience.com/nks/notes-6-8--structures-in-the-game-of-life/">“Meta-engineering” in the Game of Life</a>. These might on their own seem esoteric. But my repeated experience—particularly in the past five years—is that by solving problems like these one builds examples and intuition that have surprisingly broad application.</p> <p>And then there are history projects. Just what did happen to theories of discrete space in the early twentieth century (and how close did people like Einstein get to the ideas of our Physics Project)? What was the “ancient history” of neural nets, and why did people come to assume they should be based on continuous real numbers? I fully expect that as I investigate these things, I’ll encounter all sorts of “if only” situations—where for example some unpublished note languishing in an archive (or attic) would have changed the course of science if it had seen the light of day long ago. And when I find something like this, it’s yet more motivation to actually finish those projects of mine that have been languishing so long in the filesystem of my computer. </p> <p>There’s a lot I want to do “down in the computational trenches”, in physics, chemistry, biology, economics, etc. But there are also things at a more abstract level in the ruliad. There’s more to study about metamathematics, and about how mathematics that we humans care about can emerge from the ruliad. And there are also foundational questions in computer science. P vs. 
NP, for example, can be formulated as an essentially geometric problem in the ruliad—and conceivably there are mathematical methods (say from higher category theory) that might give insight into it. </p> <p>Then there are questions about hyperruliads and hyporuliads. In a hyperruliad that’s based on hypercomputation, there will be hyperobservers. But is there a kind of “rulial relativity” that makes their perception of things just the same as “ordinary observers” in the ordinary ruliad? A way to get some insight into this may be to study hyporuliads—versions of the ruliad in which there are only limited levels of computation possible. A bit like the way a spacelike singularity associated with a black hole supports only limited time histories, or a decidable axiomatic theory supports only proofs of limited length, there will be limitations in the hyporuliad. And by studying them, there’s a possibility that we’ll be able to see more about issues like what kinds of mathematical axioms can be compatible with observers like us. </p> <p>It’s worth commenting that our Physics Project—and the ruliad—have all sorts of connections and resonances with long-studied ideas in philosophy. “Didn’t Kant talk about that? Isn’t that similar to Leibniz?”, etc. I’ve wanted to try to understand these historical connections. But while I’ve done a lot of work on the historical development of ideas, the ideas in question have tended to be more focused, and more tied to concrete formalism than they usually are in philosophy. “Did Kant actually mean that, or something completely different?” You might have to understand all his works to know. And that’s more than I think I can do.</p> <p>I invented the concept of the ruliad as a matter of science. But it’s now clear that the ruliad has all sorts of connections and resonances not only with philosophy but also with theology. Indeed, in a great many belief systems there’s always been the idea that somehow in the end “everything is one”. 
In cases where this gets slightly more formalized, there’s often some kind of combinatorial enumeration involved (think: <em>I Ching</em>, or various versions of “counting the names of God”). </p> <p>There are all sorts of examples where long-surviving “ancient beliefs” end up having something to them, even if the specific methods of post-1600s science don’t have much to say about them. One example is the notion of a soul, which we might now see as an ancient premonition of the modern notion of abstract computation. And whenever there’s a belief that’s ancient, there’s likely to have been lots of thinking done around it over the millennia. So if we can, for example, see a connection to the ruliad, we can expect to leverage that thinking. And perhaps also be able to provide new input that can refine the belief system in interesting and valuable ways. </p> <p>I’m always interested in different viewpoints about things—whether from science, philosophy, theology, wherever. And an extreme version of this is to think about how other “alien” minds might view things. Nowadays I think of different minds as effectively being at different places in the ruliad. Humans with similar backgrounds have minds that are close in rulial space. Cats and dogs have minds that are further away. And the weather (with <a href="https://www.wolframscience.com/nks/p845--historical-perspectives/">its “mind of its own”</a>) is still further. </p> <p>Now that we have AIs we potentially have a way to study the correspondence—and communication—between “different minds”. I looked at one aspect of this in my <a href="https://writings.stephenwolfram.com/2023/07/generative-ai-space-and-the-mental-imagery-of-alien-minds/">“cats” piece</a>. 
But my <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/">recent work on the foundations of machine learning</a> suggests a broader approach, that can also potentially tell us things about the fundamental character of language, and about how it serves as a medium that can “<a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/#communicating-across-rulial-space">transport thoughts</a>” from one mind to another.</p> <p>Many non-human animals seem to have at least some form of language—though mostly in effect just a few standalone words. But pretty unquestionably the greatest single invention of our species is language—and particularly compositional language where words and phrases can fit together in an infinite number of ways. But is there something beyond compositional language? And, for example, where might we get if our brains were bigger?</p> <p>With the 100 billion neurons in our brains, we seem to be able to handle about 50,000 words. If we had a trillion neurons we’d probably be able to handle more words (though perhaps more slowly), in effect letting us describe more things more easily. But what about something fundamentally beyond compositional language? Something perhaps “higher order”? </p> <p>With a word we are in effect conflating all instances of a certain concept into a single object that we can then work with. But typically with ordinary words we’re dealing with what we might call “static concepts”. So what about “ways of thinking”, or paradigms? They’re more like active, functional concepts. And it’s a bit like dogs versus us: dogs deal with a few standalone words; we “package” those together into whole sentences and beyond. And at the next level, we could imagine in effect packaging things like generators of meaningful sentences.</p> <p>Interestingly enough, we have something of a preview of ideas like this—in computational language. 
And this is one of those places where my efforts in science—and philosophy—start to directly intersect with my efforts in technology.</p> <p>The foundation of the Wolfram Language is the idea of representing everything in computational terms, and in particular in symbolic computational terms. And one feature of such a representation is that it can encompass both “data” and “code”—i.e. both things one might think about, and ways one might think about them. </p> <p>I first started building Wolfram Language as a practical tool—though one very much informed by my foundational ideas. And now, four decades later, the Wolfram Language has emerged as the largest single project of my life, and something that, yes, I expect to always put immense effort into. It wasn’t long ago that we finally finished my 1991 to-do list for Wolfram Language—and we have many projects running now that will take years to complete. But the <a href="https://writings.stephenwolfram.com/2020/10/our-mission-and-the-opportunity-of-artifacts-from-the-future/">mission has always remained the same</a>: to take the concept of computation and apply it as broadly as possible, through the medium of computational language. </p> <p>Now, however, I have some additional context for that—viewing computational language as a bridge from what we humans think about to what’s possible in the computational universe. And this helps in framing some of the ways to expand the foundations of our computational language, for example to multicomputation, or to hypergraph-based representations. It also helps in understanding the character of current AI, and how it needs to interact with computational language.</p> <p>In the Wolfram Language we’ve been steadily trying to create a representation for everything. And when it comes to definitive, objective things we’ve gotten a long way. But there’s more than that in everyday discourse. 
For example, I might say “I’m going to drink a glass of orange juice.” Well, we do just fine at <a href="https://www.wolfram.com/language/12/food-and-nutrition-entities/">representing “a glass of orange juice”</a> in the Wolfram Language, and we can compute lots of things—like nutrition content—about it. But what about “I’m going to drink…”? For that we need something different. </p> <p>And, actually, I’ve been thinking for a shockingly long time about what one might need. I first considered the question in the early 1980s, in connection with “extending SMP to AI”. I learned about the attempts to make “philosophical languages” in the 1600s, and about some of the thinking around modern conlangs (constructed languages). Something that always held me back, though, was use cases. Yes, I could see how one could use things like this for tasks like customer service. But I wasn’t too excited about that. </p> <p>But finally there was blockchain, and with it, smart contracts. And around 2015 I started thinking about how one might represent contracts in general not in legalese but in some precise computational way. And the result was that I began to <a href="https://writings.stephenwolfram.com/2016/10/computational-law-symbolic-discourse-and-the-ai-constitution/">crispen my ideas about what I called “symbolic discourse language”</a>. I thought about how this might relate to questions like a “constitution for AIs” and so on. But I never quite got around to actually starting to design the specifics of the symbolic discourse language.</p> <p>But then along came LLMs, together with my theory that their success had to do with a “semantic grammar” of language. And finally now we’ve launched a serious project to build a symbolic discourse language. And, yes, it’s a difficult language design problem, deeply entangled with a whole range of foundational issues in philosophy. 
But as, by now at least, the world’s most experienced language designer (for better or worse), I feel a responsibility to try to do it. </p> <p>In addition to language design, there’s also the question of making all the various “symbolic calculi” that describe in appropriately coarse terms the operation of the world. Calculi of motion. Calculi of life (eating, dying, etc.). Calculi of human desires. Etc. As well as calculi that are directly supported by the computation and knowledge in the Wolfram Language.</p> <p>And just as LLMs can provide a kind of conversational linguistic interface to the Wolfram Language, one can expect them also to do this to our symbolic discourse language. So the pattern will be similar to what it is for Wolfram Language: the symbolic discourse language will provide a formal and (at least within its purview) correct underpinning for the LLM. It may lose the poetry of language that the LLM handles. But from the outset it’ll get its reasoning straight.</p> <p>The symbolic discourse language is a broad project. But in some sense breadth is what I have specialized in. Because that’s what’s needed to build out the Wolfram Language, and that’s what’s needed in my efforts to pull together the foundations of so many fields. </p> <p>And in maintaining a broad range of interests there are some where I imagine that someday there’ll be a project I can do, but there may for example be many years of “ambient technology” that are needed before that project will be feasible. Usually, though, I have some “conceptual idea” of what the project might be. For example, I’ve followed robotics, imagining that one day there’ll be a way to do “general-purpose robotics”, perhaps constructing everything out of modular elements. I’ve followed biomedicine, partly out of personal self-interest, and partly because I think it’ll relate to some of the foundational questions I’m asking in biology. 
</p> <p>But in addition to all the projects where the goal is basic research, or technology development, I’m also hoping to pursue my interests in education. Much of what I hope to do relates to content, but some of it relates to access and motivation. I don’t have perfect evidence, but I strongly believe there’s a lot of young talent out there in the world that never manages to connect for example with things like the educational programs we put on. We—and I—have tried quite hard over the years to “bridge the gap”. But with the world as it is, it’s proved remarkably difficult. But it’s still a problem I’d like to solve, and I’ll keep picking away at it, hoping to change for the better some kids’ “trajectories”. </p> <p>But about content I believe my path is clearer. With the modern Wolfram Language I think we’ve gone a long way towards being able to take computational thinking about almost anything, and being able to represent it in a formalized way, and compute from it. But how do people manage to do the computational thinking in the first place? Well, like mathematical thinking and other formalized kinds of thinking, they <a href="https://writings.stephenwolfram.com/2016/09/how-to-teach-computational-thinking/">have to learn how to do it</a>. </p> <p>For years people have been telling me I should “write the book” to teach this. And finally in January of this year I started. I’m not sure how long it will take, but I’ll soon be starting to post sections I’ve written so far.</p> <p>My goal is to create a general book—and course—that’s an introduction to computational thinking at a level suitable for typical first-year college students. Lots of college students these days say they want to study “computer science”. But really it’s computational X for some field X that they’re ultimately interested in. And neither the theoretical nor the engineering aspects of typical “computer science” are what’s most relevant to them. 
What they need to know is computational thinking as it might be applied to computational X—not “CS” but what one might call “CX”. </p> <p>So what will CX101 be like? In some ways more like a philosophy course than a CS one. Because in the end it’s about generally learning to think, albeit in the new paradigm of computation. And the point is that once someone has a clear computational conceptualization of something, then it’s our job in the Wolfram Language to make sure that <a href="https://www.wolfram.com/language/elementary-introduction/3rd-ed/">it’s easy for them to concretely implement it</a>. </p> <p>But how does one teach computational conceptualization? What I’ve concluded is that one needs to anchor it in actual things in the world. Geography. Video. Genomics. Yes, there are principles to explain. But they need practical context to make them useful, or even understandable. And what I’m finding is that framing everything computationally makes things incredibly much easier to explain than before. (A test example coming soon is whether I can easily explain math ideas like algebra and calculus this way.)</p> <p>OK, so that’s a lot of projects. But I’m excited about all of them, and can’t wait to make them happen. At an age when many of my contemporaries are retiring, I feel like I’m just getting started. And somehow the way my projects keep on connecting back to things I did decades ago makes me feel—in a computational irreducibility kind of way—that there’s something necessary about all the steps I’ve taken. I feel like the things I’ve done have let me climb some hills. But now there are many more hills that have come into view. And I look forward to being able to climb those too. 
For myself and for the world.</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/08/five-most-productive-years-what-happened-and-whats-next/feed/</wfw:commentRss> <slash:comments>7</slash:comments> </item> <item> <title>What’s Really Going On in Machine Learning? Some Minimal Models</title> <link>https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/</link> <comments>https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/#comments</comments> <pubDate>Thu, 22 Aug 2024 18:28:17 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Artificial Intelligence]]></category> <category><![CDATA[Computational Science]]></category> <category><![CDATA[New Kind of Science]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=61728</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/08/swblog-ml-icon-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>The Mystery of Machine Learning It’s surprising how little is known about the foundations of machine learning. Yes, from an engineering point of view, an immense amount has been figured out about how to build neural nets that do all kinds of impressive and sometimes almost magical things. But at a fundamental level we still […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/08/swblog-ml-icon-v2.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><p><img class="aligncenter" src="https://content.wolfram.com/sites/43/2024/08/sw-ml-hero-v3.png" max-width="650px" height="auto" alt="What's Really Going On in Machine Learning? 
Some Minimal Models" title="What's Really Going On in Machine Learning? Some Minimal Models"></p> <h2 id="the-mystery-of-machine-learning">The Mystery of Machine Learning</h2> <p>It’s surprising how little is known about the foundations of machine learning. Yes, from an engineering point of view, an immense amount has been figured out about how to build neural nets that do all kinds of impressive and <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">sometimes almost magical things</a>. But at a fundamental level we still don’t really know why neural nets “work”—and we don’t have any kind of “scientific big picture” of what’s going on inside them. </p> <p>The basic structure of neural networks can be pretty simple. But by the time they’re trained up with all their weights, etc. it’s been hard to tell what’s going on—or even to get any good visualization of it. And indeed it’s far from clear even what aspects of the whole setup are actually essential, and what are just “details” that have perhaps been “grandfathered” all the way from when computational neural nets were first invented in the 1940s.</p> <p>Well, what I’m going to try to do here is to get “underneath” this—and to “strip things down” as much as possible. I’m going to explore some very minimal models—that, among other things, are more directly amenable to visualization. At the outset, I wasn’t at all sure that these minimal models would be able to reproduce any of the kinds of things we see in machine learning. But, rather surprisingly, it seems they can.<span id="more-61728"></span></p> <p>And the simplicity of their construction makes it much easier to “see inside them”—and to get more of a sense of what essential phenomena actually underlie machine learning. 
One might have imagined that even though the training of a machine learning system might be circuitous, somehow in the end the system would do what it does through some kind of identifiable and “explainable” mechanism. But we’ll see that in fact that’s typically not at all what happens. </p> <p>Instead it looks much more as if the training manages to home in on some quite wild computation that “just happens to achieve the right results”. Machine learning, it seems, isn’t building structured mechanisms; rather, it’s basically just sampling from the typical complexity one sees in the computational universe, picking out pieces whose behavior turns out to overlap what’s needed. And in a sense, therefore, the possibility of machine learning is ultimately yet another consequence of the phenomenon of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>. </p> <p>Why is that? Well, it’s only because of computational irreducibility that there’s all that richness in the computational universe. And, more than that, it’s because of computational irreducibility that things end up being effectively random enough that the adaptive process of training a machine learning system can reach success without getting stuck. </p> <p>But the presence of computational irreducibility also has another important implication: that even though we can expect to find limited pockets of computational reducibility, we can’t expect a “general narrative explanation” of what a machine learning system does. In other words, there won’t be a traditional (say, mathematical) “general science” of machine learning (or, for that matter, probably also neuroscience). 
Instead, the story will be much closer to the fundamentally computational “<a href="https://www.wolframscience.com/nks/">new kind of science</a>” that I’ve explored for so long, and that has brought us our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a> and <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">the ruliad</a>.</p> <p>In many ways, the problem of machine learning is a version of the <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">general problem of adaptive evolution</a>, as encountered <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">for example in biology</a>. In biology we typically imagine that we want to adaptively optimize some overall “fitness” of a system; in machine learning we typically try to adaptively “train” a system to make it align with certain goals or behaviors, most often defined by examples. (And, yes, in practice this is often done by trying to minimize a quantity normally called the “loss”.)</p> <p>And while in biology there’s a general sense that “things arise through evolution”, quite how this works has always been rather mysterious. But (rather to my surprise) I recently <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">found a very simple model</a> that seems to do well at capturing at least some of the most essential features of biological evolution. And while the model isn’t the same as what we’ll explore here for machine learning, it has some definite similarities. 
And in the end we’ll find that the core phenomena of machine learning and of biological evolution appear to be remarkably aligned—and both fundamentally connected to the phenomenon of computational irreducibility.</p> <p>Most of what I’ll do here focuses on foundational, theoretical questions. But in understanding more about what’s really going on in machine learning—and what’s essential and what’s not—we’ll also be able to begin to see how in practice machine learning might be done differently, potentially with more efficiency and more generality. </p> <h2 id="traditional-neural-nets">Traditional Neural Nets</h2> <p><span></p> <div id="gpt-stripe" style="background: #f6fcff87; padding: 0.75rem 1.5rem;border: 1px solid #aeccd987;font-family: 'Source Sans Pro', sans-serif;margin-bottom: 2.5rem;max-width: 620px;/* font-size: .6rem; */"> <p style="font-size: .85rem;color: #3f5f6a;line-height: 1.5;padding-bottom: 0;display: block;"><em>Note: Click any diagram to get Wolfram Language code to reproduce it.</em></p> </div> <p>To begin the process of understanding the essence of machine learning, let’s start from a very traditional—and familiar—example: a <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#machine-learning-and-the-training-of-neural-nets">fully connected (“multilayer perceptron”) neural net</a> that’s been trained to compute a certain function <em>f</em>[<em>x</em>]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg1.png' alt='' title='' width='445' height='283'> </div> </p></div> <p>If one gives a value <em>x</em> as input at the top, then after “rippling through the layers of the network” one gets a value at the bottom that (almost exactly) corresponds to our function 
<em>f</em>[<em>x</em>]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg2.png' alt='' title='' width='546' height='321'> </div> </p></div> <p>Scanning through different inputs <em>x</em>, we see different patterns of intermediate values inside the network: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg3.png' alt='' title='' width='683' height='281'> </div> </p></div> <p>And here’s (on a linear and log scale) how each of these intermediate values changes with <em>x</em>. And, yes, the way the final value (highlighted here) emerges looks very complicated: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg4.png' alt='' title='' width='682' height='292'> </div> </p></div> <p>So how is the neural net ultimately put together? How are these values that we’re plotting determined? We’re using the <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#neural-nets">standard setup for a fully connected multilayer network</a>. Each node (“neuron”) on each layer is connected to all nodes on the layer above—and values “flow” down from one layer to the next, being multiplied by the (positive or negative) “weight” (indicated by color in our pictures) associated with the connection through which they flow. 
The value of a given neuron is found by totaling up all its (weighted) inputs from the layer before, adding a “bias” value for that neuron, and then applying to the result a certain (nonlinear) “<a href="https://reference.wolfram.com/language/ref/ElementwiseLayer.html">activation function</a>” (here ReLU or <tt><a href="http://reference.wolfram.com/language/ref/Ramp.html">Ramp</a></tt>[<em>z</em>], i.e. <tt><a href="http://reference.wolfram.com/language/ref/If.html">If</a></tt>[<em>z</em> &lt; 0, 0, <em>z</em>]).</p> <p>What overall function a given neural net will compute is determined by the collection of weights and biases that appear in the neural net (along with its overall connection architecture, and the activation function it’s using). The idea of machine learning is to find weights and biases that produce a particular function by adaptively “learning” from examples of that function. Typically we might start from a random collection of weights, then successively tweak weights and biases to <a href="https://reference.wolfram.com/language/ref/NetTrain.html">“train” the neural net</a> to reproduce the function: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg5.png' alt='' title='' width='664' height='205'> </div> </p></div> <p>We can get a sense of how this progresses (and, yes, it’s complicated) by plotting successive changes in individual weights over the course of the training process (the spikes near the end come from “neutral changes” that don’t affect the overall behavior):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy'
src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg6.png' alt='' title='' width='614' height='180'> </div> </p></div> <p>The overall objective in the training is progressively to decrease <a href="https://reference.wolfram.com/language/ref/LossFunction.html">the “loss”</a>—the average (squared) difference between true values of <em>f</em>[<em>x</em>] and those generated by the neural net. The evolution of the loss defines a “learning curve” for the neural net, with the downward glitches corresponding to points where the neural net in effect “made a breakthrough” in being able to represent the function better: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg7.png' alt='' title='' width='348' height='120'> </div> </p></div> <p>It’s important to note that typically there’s randomness injected into neural net training. So if one runs the training multiple times, one will get different networks—and different learning curves—every time: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg8.png' alt='' title='' width='600' height='136'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg9.png' alt='' title='' width='356' height='123'> </div> </p></div> <p>But what’s really going on in neural net training? 
Effectively we’re finding a way to “compile” a function (at least to some approximation) into a neural net with a certain number of (real-valued) parameters. And in the example here we happen to be using about 100 parameters.</p> <p>But what happens if we use a different number of parameters, or set up the architecture of our neural net differently? Here are a few examples, indicating that for the function we’re trying to generate, the network we’ve been using so far is pretty much the smallest that will work:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg10.png' alt='' title='' width='526' height='421'> </div> </p></div> <p>And, by the way, here’s what happens if we change our activation function from ReLU<br /> <img loading='lazy' style="margin-bottom: -7px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg11.png' alt='' title='' width='37' height='23'/> to the smoother ELU <img loading='lazy' style="margin-bottom: -8px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg12.png' alt='' title='' width='44' height='28'/>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024traditionalC2Cupdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg13.png' alt='' title='' width='525' height='420'> </div> </p></div> <p>Later we’ll talk about what happens when we do machine learning with discrete systems. 
And in anticipation of that, it’s interesting to see what happens if we take a neural net of the kind we’ve discussed here, and “quantize” its weights (and biases) in discrete levels:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg14.png' alt='' title='' width='553' height='197'> </div> </p></div> <p>The result is that (as recent experience with large-scale neural nets has also shown) the basic “operation” of the neural net does not require precise real numbers, but survives even when the numbers are at least somewhat discrete—as this 3D rendering as a function of the discreteness level <em>δ</em> also indicates:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08292024discreteCimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024traditionalimg15.png' alt='' title='' width='341' height='364'> </div> </p></div> <h2 id="simplifying-the-topology-mesh-neural-nets">Simplifying the Topology: Mesh Neural Nets</h2> <p>So far we’ve been discussing very traditional neural nets. But to do machine learning, do we really need systems that have all those details? For example, do we really need every neuron on each layer to get an input from every neuron on the previous layer? What happens if instead every neuron just gets input from at most two others—say with the neurons effectively laid out in a simple mesh? 
Quite surprisingly, it turns out that such a network is still perfectly able to generate a function like the one we’ve been using as an example: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg1.png' alt='' title='' width='403' height='270'> </div> </p></div> <p>And one advantage of such a “mesh neural net” is that—like a cellular automaton—its “internal behavior” can readily be visualized in a rather direct way. So, for example, here are visualizations of “how the mesh net generates its output”, stepping through different input values <em>x</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg2.png' alt='' title='' width='642' height='531'> </div> </p></div> <p>And, yes, even though we can visualize it, it’s still hard to understand “what’s going on inside”. Looking at the intermediate values of each individual node in the network as a function of <em>x</em> doesn’t help much, though we can “see something happening” at places where our function <em>f</em>[<em>x</em>] has jumps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg1.png' alt='' title='' width='612' height='324'> </div> </p></div> <p>So how do we train a mesh neural net? 
Basically we can use the same procedure as for a fully connected network of the kind we saw above (ReLU activation functions don’t seem to work well for mesh nets, so we’re using ELU here):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg4.png' alt='' title='' width='616' height='173'> </div> </p></div> <p>Here’s the evolution of differences in each individual weight during the training process: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg5.png' alt='' title='' width='615' height='180'> </div> </p></div> <p>And here are results for different random seeds: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg6.png' alt='' title='' width='591' height='108'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg7.png' alt='' title='' width='412' height='136'> </div> </p></div> <p>At the size we’re using, our mesh neural nets have about the same number of connections (and thus weights) as our main example of a fully connected network above. 
And we see that if we try to reduce the size of our mesh neural net, it doesn’t do well at reproducing our function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024topologyimg8.png' alt='' title='' width='593' height='459'> </div> </p></div> <h2 id="making-everything-discrete-a-biological-evolution-analog">Making Everything Discrete: A Biological Evolution Analog</h2> <p>Mesh neural nets simplify the topology of neural net connections. But, somewhat surprisingly at first, it seems as if we can go much further in simplifying the systems we’re using—and still successfully do versions of machine learning. And in particular we’ll find that we can make our systems completely discrete. </p> <p>The typical methodology of neural net training involves progressively tweaking real-valued parameters, usually using methods based on calculus, and on finding derivatives. And one might imagine that any successful adaptive process would ultimately have to rely on being able to make arbitrarily small changes, of the kind that are possible with real-valued parameters. </p> <p>But in <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">studying simple idealizations of biological evolution</a> I recently found striking examples where this isn’t the case—and where completely discrete systems seemed able to capture the essence of what’s going on. </p> <p>As an example consider a (3-color) cellular automaton. 
The rule is shown on the left, and the behavior one generates by repeatedly applying that rule (starting from a single-cell initial condition) is shown on the right:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg1.png' alt='' title='' width='476' height='349'> </div> </p></div> <p>The rule has the property that the pattern it generates (from a single-cell initial condition) survives for exactly 40 steps, and then dies out (i.e. every cell becomes white). And the important point is that this rule can be found by a discrete adaptive process. The idea is to start, say, from a null rule, and then at each step to randomly change a single outcome out of the 27 in the rule (i.e. make a “single-point mutation” in the rule). Most such changes will cause the “lifetime” of the pattern to get further from our target of 40—and these we discard. 
But gradually we can build up “beneficial mutations”</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg2.png' alt='' title='' width='213' height='186'> </div> </p></div> <p>that through “progressive adaptation” eventually get to our original lifetime-40 rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg3.png' alt='' title='' width='595' height='140'> </div> </p></div> <p>We can make a plot of all the attempts we made that eventually let us reach lifetime 40—and we can think of this progressive “fitness” curve as being directly analogous to the loss curves in machine learning that we saw before: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg4.png' alt='' title='' width='392' height='139'> </div> </p></div> <p>If we make different sequences of random mutations, we’ll get different paths of adaptive evolution, and different “solutions” for rules that have lifetime 40:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192027discreteimg5.png' alt='' title='' width='567' height='664'> </div> </p></div> <p>Two things are immediately notable about these. 
First, that they essentially all seem to be “using different ideas” to reach their goal (presumably analogous to the phenomenon of different branches in the tree of life). And second, that none of them seem to be using a clear “mechanical procedure” (of the kind we might construct through traditional engineering) to reach their goal. Instead, they seem to be finding “natural” complicated behavior that just “happens” to achieve the goal.</p> <p>It’s nontrivial, of course, that this behavior can achieve a goal like the one we’ve set here, as well as that simple selection based on random point mutations can successfully reach the necessary behavior. But <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">as I discussed in connection with biological evolution</a>, this is ultimately a story of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>—particularly in generating diversity both in behavior, and in the paths necessary to reach it. </p> <p>But, OK, so how does this model of adaptive evolution relate to systems like neural nets? In the standard language of neural nets, our model is like a discrete analog of a recurrent convolutional network. It’s “convolutional” because at any given step the same rule is applied—locally—throughout an array of elements. It’s “recurrent” because in effect data is repeatedly “passed through” the same rule. The kinds of procedures (like “backpropagation”) typically used to train traditional neural nets wouldn’t be able to train such a system. 
But it turns out that—essentially as a consequence of computational irreducibility—the very simple method of successive random mutation can be successful.</p> <h2 id="machine-learning-in-discrete-rule-arrays">Machine Learning in Discrete Rule Arrays</h2> <p>Let’s say we want to set up a system like a neural net—or at least a mesh neural net—but we want it to be completely discrete. (And I mean “born discrete”, not just discretized from an existing continuous system.) How can we do this? One approach (that, as it happens, <a href="https://content.wolfram.com/sw-publications/2020/07/approaches-complexity-engineering.pdf">I first considered in the mid-1980s</a>—but never seriously explored) is to make what we can call a “rule array”. Like in a cellular automaton there’s an array of cells. But instead of these cells always being updated according to the same rule, each cell at each place in the cellular automaton analog of “spacetime” can make a different choice of what rule it will use. (And although it’s a fairly extreme idealization, we can potentially imagine that these different rules represent a discrete analog of different local choices of weights in a mesh neural net.)</p> <p>As a first example, let’s consider a rule array in which there are two possible choices of rules: <nobr><em>k </em>= 2, <em>r </em>= 1</nobr> <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-2--more-cellular-automata">cellular automaton rules 4 and 146</a> (which are respectively <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-2--four-classes-of-behavior">class 2 and class 3</a>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg1.png' alt='' title='' width='366' height='116'> </div> </p></div> <p>A 
particular rule array is defined by which of these rules is going to be used at each (“spacetime”) position in the array. Here are a few examples. In all cases we’re starting from the same single-cell initial condition. But in each case the rule array has a different arrangement of rule choices—with cells “running” rule 4 being given a <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysaquabox.png' alt='' title='' width='15' height='15'> background, and those running rule 146 a <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024arrayspinkbox.png' alt='' title='' width='15' height='15'> one:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg2.png' alt='' title='' width='675' height='171'> </div> </p></div> <p>We can see that different choices of rule array can yield very different behaviors. But (in the spirit of machine learning) can we in effect “invert this”, and find a rule array that will give some particular behavior we want?</p> <p>A simple approach is to do the direct analog of what we did in our minimal modeling of biological evolution: progressively make random “single-point mutations”—here “flipping” the identity of just one rule in the rule array—and then keeping only those mutations that don’t make things worse. </p> <p>As our sample objective, let’s ask to find a rule array that makes the pattern generated from a single cell using that rule array “survive” for exactly 50 steps. At first it might not be obvious that we’d be able to find such a rule array. 
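</p> <p>The setup can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the figures: the names <tt>rule_table</tt>, <tt>lifetime</tt> and <tt>adapt</tt> are hypothetical, and the array size, the white cells assumed beyond the edges, and the all-rule-4 starting array are assumptions.</p>
<pre>
```python
import random
from itertools import product


def rule_table(number):
    """Outcome table of an elementary (k=2, r=1) cellular automaton rule."""
    bits = format(number, "08b")
    return {n: int(bits[7 - (4 * n[0] + 2 * n[1] + n[2])])
            for n in product((0, 1), repeat=3)}


TABLES = (rule_table(4), rule_table(146))  # choice 0 = rule 4, choice 1 = rule 146


def lifetime(rule_array):
    """Number of steps the pattern from a single black cell survives."""
    width = len(rule_array[0])
    row = [0] * width
    row[width // 2] = 1
    for t, choices in enumerate(rule_array):
        padded = [0] + row + [0]  # assumption: white cells beyond both edges
        row = [TABLES[choices[i]][tuple(padded[i:i + 3])] for i in range(width)]
        if not any(row):
            return t + 1
    return len(rule_array) + 1  # still alive at the bottom of the array


def adapt(target=50, steps=60, width=41, max_mutations=5000, seed=0):
    """Flip one rule choice at a time, undoing flips that increase the loss."""
    rng = random.Random(seed)
    array = [[0] * width for _ in range(steps)]  # assumption: start from all rule 4
    loss = abs(lifetime(array) - target)
    for _ in range(max_mutations):
        if loss == 0:
            break
        t, i = rng.randrange(steps), rng.randrange(width)
        array[t][i] ^= 1  # single-point mutation
        trial_loss = abs(lifetime(array) - target)
        if loss >= trial_loss:
            loss = trial_loss  # keep the mutation
        else:
            array[t][i] ^= 1  # undo it
    return array, loss
```
</pre>
<p>Here the loss is simply the distance of the lifetime from the target, and a pattern that is still alive at the bottom of the array is counted as living one step longer than the array.</p> <p>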
But in fact our simple adaptive procedure easily manages to do this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg3.png' alt='' title='' width='595' height='505'> </div> </p></div> <p>As the dots here indicate, many mutations don’t lead to longer lifetimes. But every so often, the adaptive process has a “breakthrough” that increases the lifetime—eventually reaching 50:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg4.png' alt='' title='' width='337' height='119'> </div> </p></div> <p>Just as in our model of biological evolution, different random sequences of mutations lead to different “solutions”, here to the problem of “living for exactly 50 steps”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg5.png' alt='' title='' width='618' height='406'> </div> </p></div> <p>Some of these are in effect “simple solutions” that require only a few mutations. But most—like most of our examples in biological evolution—seem more as if they just “happen to work”, effectively by tapping into just the right, fairly complex behavior.</p> <p>Is there a sharp distinction between these cases? 
Looking at the collection of “fitness” (AKA “learning”) curves for the examples above, it doesn’t seem so:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg6.png' alt='' title='' width='612' height='163'> </div> </p></div> <p>It’s not too difficult to see how to “construct a simple solution” just by strategically placing a single instance of the second rule in the rule array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg7.png' alt='' title='' width='309' height='57'> </div> </p></div> <p>But the point is that adaptive evolution by repeated mutation normally won’t “discover” this simple solution. And what’s significant is that the adaptive evolution can nevertheless still successfully find some solution—even though it’s not one that’s “understandable” like this.</p> <p>The cellular automaton rules we’ve been using so far take 3 inputs. But it turns out that we can make things even simpler by just putting ordinary <a href="https://www.wolframscience.com/nks/p806--implications-for-mathematics-and-its-foundations/">2-input Boolean functions</a> into our rule array. 
For example, we can make a rule array from <tt><a href="https://reference.wolfram.com/language/ref/And.html">And</a></tt> and <tt><a href="https://reference.wolfram.com/language/ref/Xor.html">Xor</a></tt> functions (<a href="https://www.wolframscience.com/nks/p806--implications-for-mathematics-and-its-foundations/"><em>r</em> = 1/2 rules 8 and 6</a>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg8.png' alt='' title='' width='309' height='40'> </div> </p></div> <p>Different <tt>And</tt>+<tt>Xor</tt> (<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysmintbox.png' alt='' title='' width='14' height='14'/> + <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg10.png' alt='' title='' width='11' height='13'/>) rule arrays show different behavior:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg11.png' alt='' title='' width='594' height='168'> </div> </p></div> <p>But are there for example <tt>And</tt>+<tt>Xor</tt> rule arrays that will compute any of the 16 possible (2-input) functions? 
We can’t get <tt><a href="https://reference.wolfram.com/language/ref/Not.html">Not</a></tt> or any of the 8 other functions with <img loading='lazy' style="margin-bottom: -3px" src="https://content.wolfram.com/sites/43/2024/08/sw08192024arrayshexes.png" alt='' title='' width="52" height="26"/>—but it turns out we can get all 8 functions with <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg13.png' alt='' title='' width="50" height="13"/> (additional inputs here are assumed to be <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg14.png' alt='' title='' width='12' height='13'/>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg15.png' alt='' title='' width='662' height='179'> </div> </p></div> <p>And in fact we can also set up <tt>And</tt>+<tt>Xor</tt> rule arrays for all other “even” Boolean functions. For example, here are rule arrays for the 3-input rule 30 and rule 110 Boolean functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg16.png' alt='' title='' width='573' height='53'> </div> </p></div> <p>It may be worth commenting that the ability to set up such rule arrays is related to <a href="https://www.wolframscience.com/nks/p807--implications-for-mathematics-and-its-foundations/">functional completeness</a> of the underlying rules we’re using—though it’s not quite the same thing. 
Functional completeness is about setting up arbitrary formulas, that can in effect allow long-range connections between intermediate results. Here, all information has to explicitly flow through the array. But for example the functional completeness of <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt> (<em>r</em> = 1/2 rule 7, <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08222024arraysorangehex.png' alt='' title='' width='15' height='15'/>) allows it to generate all Boolean functions when combined for example with <tt><a href="http://reference.wolfram.com/language/ref/First.html">First</a></tt> (<em>r</em> = 1/2 rule 12, <img loading='lazy' style="margin-bottom: -3px" src='https://content.wolfram.com/sites/43/2024/08/sw08222024arraysbluehex.png' alt='' title='' width='15' height='15'/>), though sometimes the rule arrays required are quite large: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg2.png' alt='' title='' width='632' height='165'> </div> </p></div> <p>OK, but what happens if we try to use our adaptive evolution process—say to solve the problem of finding a pattern that survives for exactly 30 steps? 
Here’s a result for <tt>And</tt>+<tt>Xor</tt> rule arrays:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08202024arraysBimg20.png' alt='' title='' width='591' height='397'> </div> </p></div> <p>And here are examples of other “solutions” (none of which in this case look particularly “mechanistic” or “constructed”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg21.png' alt='' title='' width='596' height='311'> </div> </p></div> <p>But what about learning our original <em>f</em>[<em>x</em>] = <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08192024arraysimg22.png' alt='' title='' width='51' height='13'/> function? Well, first we have to decide how we’re going to represent the numbers <em>x</em> and <em>f</em>[<em>x</em>] in our discrete rule array system. And one approach is to do this simply in terms of the position of a black cell (“one-hot encoding”). So, for example, in this case there’s an initial black cell at a position corresponding to about <em>x</em> = –1.1. 
And then the result after passing through the rule array is a black cell at a position corresponding to <em>f</em>[<em>x</em>] = 1.0:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysCimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysCimg1.png' alt='' title='' width='188' height='304'> </div> </p></div> <p>So now the question is whether we can find a rule array that successfully maps initial to final cell positions according to the mapping <em>x</em> <img style="margin-bottom: -2px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow.png" width='18' height='12'> <em>f</em>[<em>x</em>] we want. Well, here’s an example that comes at least close to doing this (note that the array is taken to be cyclic):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg24.png' alt='' title='' width='687' height='414'> </div> </p></div> <p>So how did we find this? Well, we just used a simple adaptive evolution process. In direct analogy to the way it’s usually done in machine learning, we set up “training examples”, here of the form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg3.png' alt='' title='' width='573' height='76'> </div> </p></div> <p>Then we repeatedly made single-point mutations in our rule array, keeping those mutations where the total difference from all the training examples didn’t increase. 
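</p> <p>A training loop of this kind might be sketched in Python as below. It is a sketch under stated assumptions, not the author's implementation: the names <tt>run</tt>, <tt>one_hot</tt>, <tt>total_loss</tt> and <tt>train</tt> are hypothetical, each cell is assumed to combine its own value with that of its right-hand neighbor on a cyclic array, the starting rule array is random, and the "total difference" is taken to be the number of mismatched cells summed over all training examples.</p>
<pre>
```python
import random

AND, XOR = 0, 1


def run(rule_array, row):
    """Pass a row of 0/1 cells through an And+Xor rule array (cyclic)."""
    width = len(row)
    for choices in rule_array:
        row = [(row[i] ^ row[(i + 1) % width]) if choices[i] == XOR
               else (row[i] * row[(i + 1) % width])  # product of bits = And
               for i in range(width)]
    return row


def one_hot(position, width):
    """A row with a single black cell at the given position."""
    return [1 if i == position else 0 for i in range(width)]


def total_loss(rule_array, examples, width):
    """Mismatched cells, summed over (input position, output position) pairs."""
    return sum(
        sum(a != b for a, b in zip(run(rule_array, one_hot(x, width)),
                                   one_hot(y, width)))
        for x, y in examples)


def train(examples, width, depth, mutations, seed=0):
    """Single-point mutations, kept when the total loss does not increase."""
    rng = random.Random(seed)
    array = [[rng.choice((AND, XOR)) for _ in range(width)]
             for _ in range(depth)]
    loss = total_loss(array, examples, width)
    for _ in range(mutations):
        t, i = rng.randrange(depth), rng.randrange(width)
        array[t][i] ^= 1  # switch And to Xor or Xor to And
        trial = total_loss(array, examples, width)
        if loss >= trial:
            loss = trial
        else:
            array[t][i] ^= 1  # reject: restore the previous rule
    return array, loss
```
</pre>
<p>The <tt>examples</tt> argument is a list of pairs of cell positions, so any discretized mapping from input position to output position can be supplied without changing the loop.</p> <p>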
And after 50,000 mutations this gave the final result above. </p> <p>We can get some sense of “how we got there” by showing the sequence of intermediate results where we got closer to the goal (as opposed to just not getting further from it):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg26.png' alt='' title='' width='597' height='246'> </div> </p></div> <p>Here are the corresponding rule arrays, in each case highlighting elements that have changed (and showing the computation of <em>f</em>[0] in the arrays):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg27.png' alt='' title='' width='609' height='289'> </div> </p></div> <p>Different sequences of random mutations will lead to different rule arrays. But with the setup defined here, the resulting rule arrays will almost always succeed in accurately computing <em>f</em>[<em>x</em>]. Here are a few examples—in which we’re specifically showing the computation of <em>f</em>[0]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg28.png' alt='' title='' width='642' height='109'> </div> </p></div> <p>And once again an important takeaway is that we don’t see “identifiable mechanism” in what’s going on. Instead, it looks more as if the rule arrays we’ve got just “happen” to do the computations we want. 
Their behavior is complicated, but somehow we can manage to “tap into it” to compute our <em>f</em>[<em>x</em>].</p> <p>But how robust is this computation? A key feature of typical machine learning is that it can “generalize” away from the specific examples it’s been given. It’s never been clear just how to characterize that generalization (when does an image of a cat in a dog suit start <a href="https://writings.stephenwolfram.com/2015/05/wolfram-language-artificial-intelligence-the-image-identification-project/">being identified as an image of a dog</a>?). But—at least when we’re talking about classification tasks—we can think of what’s going on in terms of <a href="https://www.wolframscience.com/nks/p624--human-thinking/">basins of attraction</a> that lead to attractors corresponding to our classes.</p> <p>It’s all considerably easier to analyze, though, in the kind of discrete system we’re exploring here. For example, we can readily enumerate all our training inputs (i.e. all initial states containing a single black cell), and then see how frequently these cause any given cell to be black:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08242024arraysupdateFimg1_copy.txt' data-c2c-type='text/html'> <img src='https://content.wolfram.com/sites/43/2024/08/sw08242024arraysupdateFimg2.png' alt='' title='' width='186' height='302'/> </div> </p></div> <p>By the way, here’s what happens to this plot at successive “breakthroughs” during training:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg30.png' alt='' title='' width='660' height='309'> </div> </p></div> <p>But what about all possible inputs, including ones that don’t just contain a single black cell? 
Well, we can enumerate all of them, and compute the overall frequency for each cell in the array to be black:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg4.png' alt='' title='' width='188' height='304'> </div> </p></div> <p>As we would expect, the result is considerably “fuzzier” than what we got purely with our training inputs. But there’s still a strong trace of the discrete values for <em>f</em>[<em>x</em>] that appeared in the training data. And if we plot the overall probability for a given final cell to be black, we see peaks at positions corresponding to the values 0 and 1 that <em>f</em>[<em>x</em>] takes on:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg32.png' alt='' title='' width='276' height='76'> </div> </p></div> <p>But because our system is discrete, we can explicitly look at what outcomes occur: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg33.png' alt='' title='' width='348' height='193'> </div> </p></div> <p>The most common overall is the “meaningless” all-white state—that basically occurs when the computation from the input “never makes it” to the output. But the next most common outcomes correspond exactly to <em>f</em>[<em>x</em>] = 0 and <em>f</em>[<em>x</em>] = 1. After that is the “superposition” outcome where <em>f</em>[<em>x</em>] is in effect “both 0 and 1”. 
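</p> <p>For a small array, this kind of exhaustive enumeration is easy to sketch in Python. The function names here are hypothetical, and the update scheme (each cell combining itself with its right-hand neighbor on a cyclic array, with 1 meaning <tt>Xor</tt> and 0 meaning <tt>And</tt>) is an assumption rather than the exact layout used in the figures.</p>
<pre>
```python
from collections import Counter
from itertools import product


def run(rule_array, row):
    """Pass a row of 0/1 cells through an And+Xor rule array (cyclic)."""
    width = len(row)
    for choices in rule_array:
        row = [(row[i] ^ row[(i + 1) % width]) if choices[i]
               else (row[i] * row[(i + 1) % width])  # product of bits = And
               for i in range(width)]
    return row


def outcome_counts(rule_array, width):
    """Tally the final states reached from all 2**width initial states."""
    return Counter(tuple(run(rule_array, list(init)))
                   for init in product((0, 1), repeat=width))


def black_frequency(counts, width):
    """Fraction of initial states that leave each final cell black."""
    total = sum(counts.values())
    return [sum(n for state, n in counts.items() if state[i]) / total
            for i in range(width)]
```
</pre>
<p>Because the number of initial states doubles with every added cell, this direct approach is only practical for fairly narrow arrays; <tt>outcome_counts(...).most_common()</tt> then lists the outcomes in order of how many initial states lead to them.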
</p> <p>But, OK, so what initial states are “in the basins of attraction of” (i.e. will evolve to) the various outcomes here? The fairly flat plots in the last column above indicate that the overall density of black cells gives little information about what attractor a particular initial state will evolve to. </p> <p>So this means we have to look at specific configurations of cells in the initial conditions. As an example, start from the initial condition </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg34.png' alt='' title='' width='195' height='12'> </div> </p></div> <p>which evolves to:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg35_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg35.png' alt='' title='' width='195' height='12'> </div> </p></div> <p>Now we can ask what happens if we look at a sequence of slightly different initial conditions. And here we show in black and white initial conditions that still evolve to the original “attractor” state, and in pink ones that evolve to some different state:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg36_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg36.png' alt='' title='' width='475' height='398'> </div> </p></div> <p>What’s actually going on inside here? 
Here are a few examples, highlighting cells whose values change as a result of changing the initial condition:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg37_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024arraysBimg37.png' alt='' title='' width='596' height='154'> </div> </p></div> <p>As is typical in machine learning, there doesn’t seem to be any simple characterization of the form of the basin of attraction. But now we have a sense of what the reason for this is: it’s another consequence of computational irreducibility. Computational irreducibility gives us the effective randomness that allows us to find useful results by adaptive evolution, but it also leads to changes having what seem like random and unpredictable effects. (It’s worth noting, by the way, that we could probably dramatically improve the robustness of our attractor basins by specifically including in our training data examples that have “noise” injected.)</p> <h2 id="multiway-mutation-graphs">Multiway Mutation Graphs</h2> <p>In doing machine learning in practice, the goal is typically to find some collection of weights, etc. that successfully solve a particular problem. But in general there will be many such collections of weights, etc. With typical continuous weights and random training steps it’s very difficult to see what the whole “ensemble” of possibilities is. But in our discrete rule array systems, this becomes more feasible.</p> <p>Consider a tiny 2×2 rule array with two possible rules. 
We can <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-multiway-graph-of-all-possible-mutation-histories">make a graph whose edges represent all possible “point mutations”</a> that can occur in this rule array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg1.png' alt='' title='' width='258' height='243'> </div> </p></div> <p>In our adaptive evolution process, we’re always moving around a graph like this. But typically most “moves” will end up in states that are rejected because they increase whatever loss we’ve defined. </p> <p>Consider the problem of generating an <tt>And</tt>+<tt>Xor</tt> rule array in which we end with lifetime-4 patterns. Defining the loss as how far we are from this lifetime, we can draw a graph that shows all possible adaptive evolution paths that always progressively decrease the loss:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg2.png' alt='' title='' width='641' height='226'> </div> </p></div> <p>The result is a multiway graph of the type we’ve now seen in a <a href="https://writings.stephenwolfram.com/2023/11/aggregation-and-tiling-as-multicomputational-processes/">great many kinds of situations</a>—notably our <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-multiway-graph-of-all-possible-mutation-histories">recent study of biological evolution</a>. 
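</p> <p>Both graphs can be sketched in a few lines of Python. This is an illustrative sketch with hypothetical names (<tt>mutation_graph</tt>, <tt>descending_edges</tt>); a rule array is flattened into a tuple of rule choices, and the loss is passed in as a function rather than tied to any particular definition of lifetime.</p>
<pre>
```python
from itertools import product


def mutation_graph(cells=4, rules=2):
    """All rule arrays with `cells` entries, joined by single-point mutations."""
    nodes = list(product(range(rules), repeat=cells))
    edges = set()
    for node in nodes:
        for i in range(cells):
            for r in range(rules):
                if r != node[i]:
                    other = node[:i] + (r,) + node[i + 1:]
                    edges.add(frozenset((node, other)))  # undirected edge
    return nodes, edges


def descending_edges(edges, loss):
    """Directed edges (a, b) along which the loss strictly decreases."""
    directed = []
    for edge in edges:
        a, b = tuple(edge)
        if loss(a) > loss(b):
            directed.append((a, b))
        elif loss(b) > loss(a):
            directed.append((b, a))
    return directed
```
</pre>
<p>For a 2×2 array with two rules, <tt>mutation_graph()</tt> gives 16 nodes and 32 edges (a 4-dimensional hypercube), and <tt>descending_edges</tt> keeps only the moves that an adaptive process requiring strict improvement could follow.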
</p> <p>And although this particular example is quite trivial, the idea in general is that different parts of such a graph represent “different strategies” for solving a problem. And—in direct analogy to our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a> and <a href="https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/">our studies of things like game graphs</a>—one can imagine such strategies being laid out in a “<a href="https://www.wolframphysics.org/technical-introduction/the-updating-process-for-string-substitution-systems/the-concept-of-branchial-graphs/" target="_blank" rel="noopener">branchial space</a>” defined by common ancestry of configurations in the multiway graph. </p> <p>And one can expect that while in some cases the branchial graph will be fairly uniform, in other cases it will have quite separated pieces—that represent fundamentally different strategies. Of course, the fact that underlying strategies may be different doesn’t mean that the overall behavior or performance of the system will be noticeably different. 
And indeed one expects that in most cases computational irreducibility will lead to enough effective randomness that there’ll be no discernible difference.</p> <p>But in any case, here’s an example starting with a rule array that contains both <tt>And</tt> and <tt>Xor</tt>—where we observe distinct branches of adaptive evolution that lead to different solutions to the problem of finding a configuration with a lifetime of exactly 4:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024mutationimg3.png' alt='' title='' width='612' height='220'> </div> </p></div> <h2 id="optimizing-the-learning-process">Optimizing the Learning Process</h2> <p>How should one actually do the learning in machine learning? In practical work with traditional neural nets, learning is normally done using systematic algorithmic methods like backpropagation. But so far, all we’ve done here is something much simpler: we’ve “learned” by successively making random point mutations, and keeping only ones that don’t lead us further from our goal. And, yes, it’s interesting that such a procedure can work at all—and (<a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">as we’ve discussed elsewhere</a>) this is presumably very relevant to understanding phenomena like biological evolution. But, as we’ll see, there are more efficient (and probably much more efficient) methods of doing machine learning, even for the kinds of discrete systems we’re studying.</p> <p>Let’s start by looking again at our earlier example of finding an <tt>And</tt>+<tt>Xor</tt> rule array that gives a “lifetime” of exactly 30. 
At each step in our adaptive (“learning”) process we make a single-point mutation (changing a single rule in the rule array), keeping the mutation if it doesn’t take us further from our goal. The mutations gradually accumulate—every so often reaching a rule array that gives a lifetime closer to 30. Just as above, here’s a plot of the lifetime achieved by successive mutations—with the “internal” red dots corresponding to rejected mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg1.png' alt='' title='' width='349' height='123'> </div> </p></div> <p>We see a series of “plateaus” at which mutations are accumulating but not changing the overall lifetime. And between these we see occasional “breakthroughs” where the lifetime jumps. Here are the actual rule array configurations for these breakthroughs, with mutations since the last breakthrough highlighted:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg2.png' alt='' title='' width='592' height='397'> </div> </p></div> <p>But in the end the process here is quite wasteful; in this example, we make a total of 1705 mutations, but only 780 of them actually contribute to generating the final rule array; all the others are discarded along the way.</p> <p>So how can we do better? One strategy is to try to figure out at each step which mutation is “most likely to make a difference”. And one way to do this is to try every possible mutation in turn at every step (as in multiway evolution)—and see what effect each of them has on the ultimate lifetime. 
From this we can construct a “change map” in which we give the change of lifetime associated with a mutation at every particular cell. The results will be different for every configuration of rule array, i.e. at every step in the adaptive evolution. But for example here’s what they are for the particular “breakthrough” configurations shown above (elements in regions that are colored gray won’t affect the result if they are changed; ones colored red will have a positive effect (with more intense red being more positive), and ones colored blue a negative one):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg3.png' alt='' title='' width='542' height='335'> </div> </p></div> <p>Let’s say we start from a random rule array, then repeatedly construct the change map and apply the mutation that it implies gives the most positive change—in effect at each step following the “path of steepest descent” to get to the lifetime we want (i.e. reduce the loss). Then the sequence of “breakthrough” configurations we get is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024otimizingUPDATEimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg4.png' alt='' title='' width='536' height='171'> </div> </p></div> <p>And this in effect corresponds to a slightly more direct “path to a solution” than our sequence of pure single-point mutations. 
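</p> <p>To make the change-map idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the rule array layout used for the pictures here: in this toy model each cell is combined with its right-hand neighbor (cyclically) using either <tt>And</tt> or <tt>Xor</tt>, the “lifetime” is taken to be the number of steps before the state becomes all zeros, and the names <tt>step</tt>, <tt>lifetime</tt>, <tt>change_map</tt> and <tt>steepest_descent</tt> are all hypothetical. The map records, for each element of the rule array, how much the loss |lifetime − target| would drop if just that element were flipped.</p>

```python
AND, XOR = 0, 1

def step(state, rule_row):
    """Apply one row of rules: cell i combines state[i] with its right neighbor (cyclic)."""
    n = len(state)
    return [
        (state[i] ^ state[(i + 1) % n]) if rule_row[i] == XOR
        else (state[i] * state[(i + 1) % n])
        for i in range(n)
    ]

def lifetime(rules, init):
    """Toy definition (an assumption): steps taken before the state is all zeros."""
    state, steps = list(init), 0
    for row in rules:
        if not any(state):
            break
        state = step(state, row)
        steps += 1
    return steps

def change_map(rules, init, target):
    """For each rule-array element, the drop in |lifetime - target| if it alone is flipped."""
    base = abs(lifetime(rules, init) - target)
    cmap = []
    for t, row in enumerate(rules):
        cmap_row = []
        for i in range(len(row)):
            trial = [list(r) for r in rules]
            trial[t][i] = 1 - trial[t][i]          # single-point mutation
            cmap_row.append(base - abs(lifetime(trial, init) - target))
        cmap.append(cmap_row)
    return cmap

def steepest_descent(rules, init, target, max_steps=100):
    """Repeatedly apply the single mutation with the largest positive gain."""
    rules = [list(r) for r in rules]
    for _ in range(max_steps):
        best_gain, best_cell = 0, None
        for t, row in enumerate(change_map(rules, init, target)):
            for i, gain in enumerate(row):
                if gain > best_gain:
                    best_gain, best_cell = gain, (t, i)
        if best_cell is None:                      # no mutation reduces the loss
            break
        t, i = best_cell
        rules[t][i] = 1 - rules[t][i]
    return rules

width, depth = 8, 6
init = [0, 0, 0, 1, 0, 0, 0, 0]
start = [[AND] * width for _ in range(depth)]      # uniform all-And rule array
found = steepest_descent(start, init, target=3)
print(lifetime(start, init), lifetime(found, init))
```

<p>In this toy setup the uniform all-<tt>And</tt> array is extended one step at a time, much like the “mechanical” path from a uniform array; a random starting array could equally well be passed in. 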
</p> <p>By the way, the particular problem of reaching a certain lifetime has a simple enough structure that this “steepest descent” method—when started from a simple uniform rule array—finds a very “mechanical” (if slow) path to a solution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg5.png' alt='' title='' width='640' height='124'> </div> </p></div> <p>What about the problem of learning <em>f</em>[<em>x</em>] = <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg6.png' alt='' title='' width='51' height='13'/>? Once again we can make a change map based on the loss we define. Here are the results for a sequence of “breakthrough” configurations. The gray regions are ones where changes will be “neutral”, so that there’s still exploration that can be done without affecting the loss. The red regions are ones that are in effect “locked in” and where any changes would be deleterious in terms of loss:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg7.png' alt='' title='' width='644' height='224'> </div> </p></div> <p>So what happens in this case if we follow the “path of steepest descent”, always making the change that would be best according to the change map? Well, the results are actually quite unsatisfactory. From almost any initial condition the system quickly gets stuck, and never finds any satisfactory solution. 
In effect it seems that deterministically following the path of steepest descent leads us to a “local minimum” from which we cannot escape. So what are we missing in just looking at the change map? Well, the change map as we’ve constructed it has the limitation that it’s separately assessing the effect of each possible individual mutation. It doesn’t deal with multiple mutations at a time—which could well be needed in general if one’s going to find the “fastest path to success”, and avoid getting stuck.</p> <p>But even in constructing the change map there’s already a problem. Because at least the direct way of computing it scales quite poorly. In an <em>n</em>×<em>n</em> rule array we have to check the effect of flipping about <em>n</em><sup>2</sup> values, and for each one we have to run the whole system—taking altogether about <em>n</em><sup>4</sup> operations. And one has to do this separately for each step in the learning process.</p> <p>So how do traditional neural nets avoid this kind of inefficiency? The answer in a sense involves a mathematical trick. And at least as it’s usually presented it’s all based on the continuous nature of the weights and values in neural nets—which allow us to use methods from calculus. 
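</p> <p>As a rough check on the scaling estimate for the direct change map, one can simply count the work. This is a back-of-the-envelope sketch that assumes an <em>n</em>×<em>n</em> rule array in which every trial flip requires rerunning all <em>n</em> rows of <em>n</em> cell updates; the function name is hypothetical.</p>

```python
def direct_change_map_updates(n):
    """Count cell updates needed to build one change map the direct way."""
    updates = 0
    for _flip in range(n * n):   # try flipping each of the n*n rule-array elements
        for _row in range(n):    # rerun the whole system: n rows...
            updates += n         # ...of n cell updates each
    return updates

for n in (8, 16, 32):
    print(n, direct_change_map_updates(n))
```

<p>Doubling <em>n</em> multiplies the count by 16, and this cost is paid again at every step of the learning process. 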
</p> <p>Let’s say we have a neural net like this</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg10.png' alt='' title='' width='94' height='173'> </div> </p></div> <p>that computes some particular function <em>f</em>[<em>x</em>]:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg11.png' alt='' title='' width='148' height='95'> </div> </p></div> <p>We can ask how this function changes as we change each of the weights in the network:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg12.png' alt='' title='' width='625' height='231'> </div> </p></div> <p>And in effect this gives us something like our “change map” above. But there’s an important difference. Because the weights are continuous, we can think about infinitesimal changes to them. And then we can ask questions like “How does <em>f</em>[<em>x</em>] change when we make an infinitesimal change to a particular weight <em>w</em><sub><em>i</em></sub>?”—or equivalently, “What is the partial derivative of <em>f</em> with respect to <em>w</em><sub><em>i</em></sub> at the point <em>x</em>?” But now we get to use a key feature of infinitesimal changes: that they can always be thought of as just “adding linearly” (essentially because ε<sup>2</sup> can always be ignored compared to ε). 
Or, in other words, we can summarize any infinitesimal change just by giving its “direction” in weight space, i.e. a vector that says how much of each weight should be (infinitesimally) changed. So if we want to change <em>f</em>[<em>x</em>] (infinitesimally) as quickly as possible, we should go in the direction of steepest descent defined by all the derivatives of <em>f</em> with respect to the weights.</p> <p>In machine learning, we’re typically trying in effect to set the weights so that the form of <em>f</em>[<em>x</em>] we generate successfully minimizes whatever loss we’ve defined. And we do this by incrementally “moving in weight space”—at every step computing the direction of steepest descent to know where to go next. (In practice, there are all sorts of tricks like “ADAM” that try to optimize the way to do this.)</p> <p>But how do we efficiently compute the partial derivative of <em>f</em> with respect to each of the weights? Yes, we could do the analog of generating pictures like the ones above, separately for each of the weights. But it turns out that a standard result from calculus gives us a vastly more efficient procedure that in effect “maximally reuses” parts of the computation that have already been done. </p> <p>It all starts with the textbook chain rule for the derivative of nested (i.e. 
composed) functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg16.png' alt='' title='' width='145' height='44'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg17.png' alt='' title='' width='239' height='44'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg18.png' alt='' title='' width='361' height='44'> </div> </p></div> <p>This basically says that the (infinitesimal) change in the value of the “whole chain” <em>d</em>[<em>c</em>[<em>b</em>[<em>a</em>[<em>x</em>]]]] can be computed as a product of (infinitesimal) changes associated with each of the “links” in the chain. But the key observation is then that when we get to the computation of the change at a certain point in the chain, we’ve already had to do a lot of the computation we need—and so long as we stored those results, we always have only an incremental computation to perform.</p> <p>So how does this apply to neural nets? Well, each layer in a neural net is in effect doing a function composition. 
So, for example, our <em>d</em>[<em>c</em>[<em>b</em>[<em>a</em>[<em>x</em>]]]] is like a trivial neural net:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg19.png' alt='' title='' width='53' height='167'> </div> </p></div> <p>But what about the weights, which, after all, are what we are trying to find the effect of changing? Well, we could include them explicitly in the function we’re computing:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg20.png' alt='' title='' width='207' height='14'> </div> </p></div> <p>And then we could in principle symbolically compute the derivatives with respect to these weights:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg21.png' alt='' title='' width='671' height='94'> </div> </p></div> <p>For our network above</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg22.png' alt='' title='' width='52' height='74'> </div> </p></div> <p>the corresponding expression (ignoring biases) is </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg23.png' alt='' title='' width='655' height='37'> </div> </p></div> <p>where ϕ denotes our activation function. Once again we’re dealing with nested functions, and once again—though it’s a bit more intricate in this case—the computation of derivatives can be done by incrementally evaluating terms in the chain rule and in effect using the standard neural net method of “backpropagation”. </p> <p>So what about the discrete case? Are there similar methods we can use there? We won’t discuss this in detail here, but we’ll give some indications of what’s likely to be involved.</p> <p>As a potentially simpler case, let’s consider ordinary cellular automata. The analog of our change map asks how the value of a particular “output” cell is affected by changes in other cells—or in effect what the “partial derivative” of the output value is with respect to changes in values of other cells. 
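</p> <p>Here is a minimal Python sketch of such a change map, computed the direct “forward” way. It assumes rule 30 with cyclic boundary conditions and a single chosen output cell in the last row; the function names are hypothetical. For each cell in the evolution it flips that one cell, reruns the cellular automaton forwards from that row, and records whether the output cell changed.</p>

```python
def rule30(left, center, right):
    """Rule 30: new cell = left XOR (center OR right)."""
    return left ^ (center | right)

def evolve(row, steps):
    """Return the list of rows from the given row through `steps` further steps (cyclic)."""
    rows = [list(row)]
    for _ in range(steps):
        prev = rows[-1]
        n = len(prev)
        rows.append([
            rule30(prev[(i - 1) % n], prev[i], prev[(i + 1) % n])
            for i in range(n)
        ])
    return rows

def change_map(rows, out_pos):
    """True where flipping that one cell flips the output cell rows[-1][out_pos]."""
    last = len(rows) - 1
    cmap = []
    for t, row in enumerate(rows):
        cmap_row = []
        for i in range(len(row)):
            flipped = list(row)
            flipped[i] ^= 1                          # flip a single cell
            final = evolve(flipped, last - t)[-1]    # rerun forwards from row t
            cmap_row.append(final[out_pos] != rows[last][out_pos])
        cmap.append(cmap_row)
    return cmap

rows = evolve([0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 4)
cmap = change_map(rows, 5)
for line in cmap:
    print("".join("#" if changed else "." for changed in line))
```

<p>Cells outside the “light cone” of the output cell can never affect it, so they always show up as unchanged in the map. 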
</p> <p>For example, consider the highlighted “output” cell in this cellular automaton evolution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg24.png' alt='' title='' width='178' height='98'> </div> </p></div> <p>Now we can look at each cell in this array, and make a change map based on seeing whether flipping the value of just that cell (and then running the cellular automaton forwards from that point) would change the value of the output cell:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg25.png' alt='' title='' width='183' height='100'> </div> </p></div> <p>The form of the change map is different if we look at different “output cells”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg26.png' alt='' title='' width='630' height='125'> </div> </p></div> <p>Here, by the way, are some larger change maps for this and a couple of other cellular automaton rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg27.png' alt='' title='' width='580' height='154'> </div> </p></div> <p>But is there a way to construct such change maps incrementally? 
One might have thought that there would immediately be one, at least for <a href="https://www.wolframscience.com/nks/chap-9--fundamental-physics#sect-9-2--the-notion-of-reversibility">cellular automata that (unlike the cases here) are fundamentally reversible</a>. But actually such reversibility doesn’t seem to help much—because although it allows us to “backtrack” whole states of the cellular automaton, it doesn’t allow us to trace the separate effects of individual cells. </p> <p>So how about using discrete analogs of derivatives and the chain rule? Let’s for example call the function computed by one step in rule 30 cellular automaton evolution <em>w</em>[<em>x</em>, <em>y</em>, <em>z</em>]. We can think of the “partial derivative” of this function with respect to <em>x</em> at the point <em>x</em> as representing whether the output of <em>w</em> changes when <em>x</em> is flipped starting from the value given:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg28.png' alt='' title='' width='631' height='184'> </div> </p></div> <p>(Note that “no change” is indicated as <tt><a href="http://reference.wolfram.com/language/ref/False.html">False</a></tt> or <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg29.png' alt='' title='' width='9' height='9'/>, while a change is indicated as <tt><a href="http://reference.wolfram.com/language/ref/True.html">True</a></tt> or <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg30.png' alt='' title='' width='9' height='9'/>. 
And, yes, one can either explicitly compute the rule outcomes here, and then deduce from them the functional form, or one can use symbolic rules to directly deduce the functional form.)</p> <p>One can compute a discrete analog of a derivative for any Boolean function. For example, we have</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg31.png' alt='' title='' width='71' height='13'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg32.png' alt='' title='' width='120' height='14'> </div> </p></div> <p>and</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg33.png' alt='' title='' width='147' height='14'> </div> </p></div> <p>which we can write as: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg34.png' alt='' title='' width='212' height='24'> </div> </p></div> <p>We also have:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg35_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg35.png' alt='' title='' width='158' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg36_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg36.png' alt='' title='' width='212' height='24'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg37_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg37.png' alt='' title='' width='192' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg38_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg38.png' alt='' title='' width='212' height='24'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg39_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg39.png' alt='' title='' width='222' height='14'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg40_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg40.png' alt='' title='' width='212' height='24'> </div> </p></div> <p>And here is a table of “Boolean derivatives” for all 2-input Boolean functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg41_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg41.png' alt='' title='' width='572' height='329'> </div> </p></div> <p>And indeed there’s a whole “Boolean calculus” one can set up for these kinds of derivatives. And in particular, there’s a direct analog of the chain rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg42_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg42.png' alt='' title='' width='262' height='14'> </div> </p></div> <p>where <tt><a href="http://reference.wolfram.com/language/ref/Xnor.html">Xnor</a></tt><tt>[x,y]</tt> is effectively the equality test <em>x</em> == <em>y</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg43_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg43.png' alt='' title='' width='114' height='32'> </div> </p></div> <p>But, OK, how do we use this to create our change maps? In our simple cellular automaton case, we can think of our change map as representing how a <a href="https://www.wolframscience.com/nks/p604--cryptography-and-cryptanalysis/">change in an output cell “propagates back”</a> to previous cells. But if we just try to apply our discrete calculus rules we run into a problem: different “chain rule chains” can imply different changes in the value of the same cell. In the continuous case this path dependence doesn’t happen because of the way infinitesimals work. But in the discrete case it does. 
And ultimately we’re doing a kind of backtracking that can really be represented faithfully only as a multiway system. (Though if we just want probabilities, for example, we can consider <a href="https://www.wolframphysics.org/bulletins/2021/02/multiway-turing-machines/" target="_blank" rel="noopener">averaging over branches of the multiway system</a>—and the change maps we showed above are effectively the result of thresholding over the multiway system.)</p> <p>But despite the appearance of such difficulties in the “simple” cellular automaton case, such methods typically seem to work better in our original, more complicated rule array case. There’s a bunch of subtlety associated with the fact that we’re finding derivatives not only with respect to the values in the rule array, but also with respect to the choice of rules (which are the analog of weights in the continuous case). </p> <p>Let’s consider the <tt>And</tt>+<tt>Xor</tt> rule array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg44_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg44.png' alt='' title='' width='171' height='162'> </div> </p></div> <p>Our loss is the number of cells whose values disagree with the row shown at the bottom. 
Now we can construct a change map for this rule array both in a direct “forward” way, and “backwards” using our discrete derivative methods (where we effectively resolve the small amount of “multiway behavior” by always picking “majority” values): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg45_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg45.png' alt='' title='' width='293' height='149'> </div> </p></div> <p>The results are similar, though in this case not exactly the same. Here are a few other examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg46_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg46.png' alt='' title='' width='596' height='374'> </div> </p></div> <p>And, yes, in detail there are essentially always local differences between the results from the forward and backward methods. But the backward method—like in the case of backpropagation in ordinary neural nets—can be implemented much more efficiently. And for purposes of practical machine learning it’s actually likely to be perfectly satisfactory—especially given that the forward method is itself only providing an approximation to the question of which mutations are best to do. 
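</p> <p>The discrete derivatives that the backward method builds on are easy to compute explicitly. Here is a minimal Python sketch, with hypothetical names: the “Boolean derivative” of a function with respect to one argument is taken to be <tt>True</tt> exactly when flipping that argument changes the output. The example covers only single derivatives; it does not attempt the chain rule or the resolution of multiway behavior.</p>

```python
from itertools import product

def boolean_derivative(f, var):
    """Return g with g(*args) True exactly when flipping argument `var` changes f(*args)."""
    def g(*args):
        flipped = list(args)
        flipped[var] = not flipped[var]
        return f(*args) != f(*flipped)
    return g

def truth_table(f, arity):
    """Outputs of f over all inputs, in the order FF..F, FF..T, ..., TT..T."""
    return [f(*args) for args in product([False, True], repeat=arity)]

AND = lambda x, y: x and y
XOR = lambda x, y: x != y
rule30 = lambda x, y, z: x != (y or z)   # one step of rule 30 as w[x, y, z]

print(truth_table(boolean_derivative(AND, 0), 2))     # equals y
print(truth_table(boolean_derivative(XOR, 0), 2))     # always True
print(truth_table(boolean_derivative(rule30, 1), 3))  # equals Not[z]
```

<p>In this sketch the derivative of <tt>And</tt> with respect to its first argument is just the second argument, while for <tt>Xor</tt> it is always <tt>True</tt>, consistent with every flip of an <tt>Xor</tt> input changing its output. 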
</p> <p>And as an example, here are the results of the forward and backward methods for the problem of learning the function <em>f</em>[<em>x</em>] = <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg47.png' alt='' title='' width='51' height='13'/>, for the “breakthrough” configurations that we showed above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg48_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024optimizingimg48.png' alt='' title='' width='639' height='222'> </div> </p></div> <h2 id="what-can-be-learned">What Can Be Learned?</h2> <p>We’ve now shown quite a few examples of machine learning in action. But a fundamental question we haven’t yet addressed is what kind of thing can actually be learned by machine learning. And even before we get to this, there’s another question: given a particular underlying type of system, what kinds of functions can it even represent?</p> <p>As a first example consider a minimal neural net of the form (essentially a single-layer perceptron):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg5.png' alt='' title='' width='318' height='139'> </div> </p></div> <p>With ReLU (AKA <tt><a href="http://reference.wolfram.com/language/ref/Ramp.html">Ramp</a></tt>) as the activation function and the first set of weights all taken to be 1, the function computed by such a neural net has the form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg2_copy.txt' data-c2c-type='text/html'> <img 
loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg2.png' alt='' title='' width='204' height='14'> </div> </p></div> <p>With enough weights and biases this form can represent any piecewise linear function—essentially just by moving around ramps using biases, and scaling them using weights. So for example consider the function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg3.png' alt='' title='' width='261' height='75'> </div> </p></div> <p>This is the function computed by the neural net above—and here’s how it’s built up by adding in successive ramps associated with the individual intermediate nodes (neurons):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg4.png' alt='' title='' width='598' height='200'> </div> </p></div> <p>(It’s similarly possible to get all smooth functions from activation functions like ELU, etc.)</p> <p>Things get slightly more complicated if we try to represent functions with more than one argument. With a single intermediate layer we can only get “piecewise (hyper)planar” functions (i.e. 
functions that change direction only at linear “fault lines”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg5.png' alt='' title='' width='291' height='241'> </div> </p></div> <p>But already with a total of two intermediate layers—and sufficiently many nodes in each of these layers—we can generate any piecewise linear function of any number of arguments. </p> <p>If we limit the number of nodes, then roughly we limit the number of boundaries between different linear regions in the values of the functions. But as we increase the number of layers with a given number of nodes, we basically increase the number of sides that polygonal regions within the function values can have:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg6.png' alt='' title='' width='492' height='298'> </div> </p></div> <p>So what happens with the mesh nets that we discussed earlier? Here are a few random examples, showing results very similar to shallow, fully connected networks with a comparable total number of nodes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg7.png' alt='' title='' width='443' height='352'> </div> </p></div> <p>OK, so how about our fully discrete rule arrays? What functions can they represent? We already saw part of the answer earlier when we generated rule arrays to represent various Boolean functions. 
It turns out that there is a fairly efficient procedure based on <a href="https://reference.wolfram.com/language/ref/SatisfiabilityInstances.html">Boolean satisfiability</a> for explicitly finding rule arrays that can represent a given function—or for determining that no rule array (say of a given size) can do this. </p> <p>Using this procedure, we can find minimal <tt>And</tt>+<tt>Xor</tt> rule arrays that represent all (“even”) 3-input Boolean functions (i.e. <em>r</em> = 1 cellular automaton rules):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg8.png' alt='' title='' width='641' height='462'> </div> </p></div> <p>It’s always possible to specify any <em>n</em>-input Boolean function by an array of 2<sup><em>n</em></sup> bits, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg10.png' alt='' title='' width='206' height='22'> </div> </p></div> <p>But we see from the pictures above that when we “compile” Boolean functions into <tt>And</tt>+<tt>Xor</tt> rule arrays, they can take different numbers of bits (i.e. different numbers of elements in the rule array). (In effect, the “algorithmic information content” of the function varies with the “language” we’re using to represent it.) 
And, for example, in the <em>n </em>= 3 case shown here, the distribution of minimal rule array sizes is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg11.png' alt='' title='' width='216' height='114'> </div> </p></div> <p>There are some functions that are difficult to represent as <tt>And</tt>+<tt>Xor</tt> rule arrays (and seem to require 15 rule elements)—and others that are easier. And this is similar to what happens if we represent Boolean functions as Boolean expressions (say in conjunctive normal form) and <a href="https://www.wolframscience.com/nks/notes-10-11--boolean-formula-sizes/">count the total number of (unary and binary) operations used</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg12.png' alt='' title='' width='366' height='116'> </div> </p></div> <p>OK, so we know that there is in principle an <tt>And</tt>+<tt>Xor</tt> rule array that will compute any (even) Boolean function. But now we can ask whether an adaptive evolution process can actually find such a rule array—say with a sequence of single-point mutations. 
Well, if we do such adaptive evolution—with a loss that counts the number of “wrong outputs” for, say, rule 254—then here’s a sequence of successive breakthrough configurations that can be produced:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg13.png' alt='' title='' width='578' height='320'> </div> </p></div> <p>The results aren’t as compact as the minimal solution above. But it seems to always be possible to find at least some <tt>And</tt>+<tt>Xor</tt> rule array that “solves the problem” just by using adaptive evolution with single-point mutations.</p> <p>Here are results for some other Boolean functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg14.png' alt='' title='' width='531' height='579'> </div> </p></div> <p>And so, yes, not only are all (even) Boolean functions representable in terms of <tt>And</tt>+<tt>Xor</tt> rule arrays, they’re also learnable in this form, just by adaptive evolution with single-point mutations.</p> <p>In what we did above, we were looking at how machine learning works with our rule arrays in specific cases like for the <img loading='lazy' style="margin-bottom: -1px" src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg15.png' alt='' title='' width='51' height='13'/> function. But now we’ve got a case where we can explicitly enumerate all possible functions, at least of a given class. 
And in a sense what we’re seeing is evidence that machine learning tends to be very broad—and capable at least in principle of learning pretty much any function. </p> <p>Of course, there can be specific restrictions. Like the <tt>And</tt>+<tt>Xor</tt> rule arrays we’re using here can’t represent (“odd”) functions where <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg16.png' alt='' title='' width='69' height='9'/>. (The <tt><a href="https://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt>+<tt><a href="https://reference.wolfram.com/language/ref/First.html">First</a></tt> rule arrays we discussed above nevertheless can.) But in general it seems to be a reflection of the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> that pretty much any setup is capable of representing any function—and also adaptively “learning” it. </p> <p>By the way, it’s a lot easier to discuss questions about representing or learning “any function” when one’s dealing with discrete (countable) functions—because one can expect to either be able to “exactly get” a given function, or not. But for continuous functions, it’s more complicated, because one’s pretty much inevitably dealing with approximations (unless one can use symbolic forms, which are basically discrete). 
So, for example, while we can say (as we did above) that (ReLU) neural nets can represent any piecewise-linear function, in general we’ll only be able to imagine successively approaching an arbitrary function, much like when you progressively add more terms in a simple Fourier series:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg17.png' alt='' title='' width='639' height='74'> </div> </p></div> <p>Looking back at our results for discrete rule arrays, one notable observation is that while we can successfully reproduce all these different Boolean functions, the actual rule array configurations that achieve this tend to look quite messy. And indeed it’s much the same as we’ve seen throughout: machine learning can find solutions, but they’re not “structured solutions”; they’re in effect just solutions that “happen to work”.</p> <p>Are there more structured ways of representing Boolean functions with rule arrays? 
Here are the two possible minimum-size <tt>And</tt>+<tt>Xor</tt> rule arrays that represent rule 30:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg18.png' alt='' title='' width='144' height='60'> </div> </p></div> <p>At the next-larger size there are more possibilities for rule 30:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg19.png' alt='' title='' width='457' height='45'> </div> </p></div> <p>And there are also rule arrays that can represent rule 110:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg20.png' alt='' title='' width='595' height='45'> </div> </p></div> <p>But in none of these cases is there obvious structure that allows us to immediately see how these computations work, or what function is being computed. But what if we try to explicitly construct—effectively by standard engineering methods—a rule array that computes a particular function? We can start by taking something like the function for rule 30 and writing it in terms of <tt>And </tt>and <tt>Xor</tt> (i.e. 
in <a href="https://reference.wolfram.com/language/ref/BooleanConvert.html">ANF, or “algebraic normal form”</a>):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg21.png' alt='' title='' width='225' height='14'> </div> </p></div> <p>We can imagine implementing this using an “evaluation graph”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg6A.png' alt='' title='' width='108' height='112'> </div> </p></div> <p>But now it’s easy to turn this into a rule array (and, yes, we haven’t gone all the way and arranged to copy inputs, etc.):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg7.png' alt='' title='' width='101' height='108'> </div> </p></div> <p>“Evaluating” this rule array for different inputs, we can see that it indeed gives rule 30:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedAimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedAimg24.png' alt='' title='' width='533' height='58'> </div> </p></div> <p>Doing the same thing for rule 110, the <tt>And</tt>+<tt>Xor</tt> expression is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg28.png' alt='' title='' width='342' height='14'> </div> </p></div> <p>the evaluation graph is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg8A.png' alt='' title='' width='112' height='114'> </div> </p></div> <p>and the rule array is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg9.png' alt='' title='' width='100' height='102'> </div> </p></div> <p>And at least with the evaluation graph as a guide, we can readily “see what’s happening” here. But the rule array we’re using is considerably larger than our minimal solutions above—or even than the solutions we found by adaptive evolution.</p> <p>It’s a typical situation that one sees in many other kinds of systems (like for example <a href="https://www.wolframscience.com/nks/notes-12-8--sorting-networks/">sorting networks</a>): it’s possible to have a “constructed solution” that has clear structure and regularity and is “understandable”. But minimal solutions—or ones found by adaptive evolution—tend to be much smaller. But they almost always look in many ways random, and aren’t readily understandable or interpretable.</p> <p>So far, we’ve been looking at rule arrays that compute specific functions. 
But in getting a sense of what rule arrays can do, we can consider rule arrays that are “<a href="https://www.wolframscience.com/nks/chap-11--the-notion-of-computation/">programmable</a>”, in that their input specifies what function they should compute. So here, for example, is an <tt>And</tt>+<tt>Xor</tt> rule array—found by adaptive evolution—that takes the “bit pattern” of any (even) Boolean function as input on the left, then applies that Boolean function to the inputs on the right:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg33.png' alt='' title='' width='269' height='250'> </div> </p></div> <p>And with this same rule array we can now compute any possible (even) Boolean function. So here, for example, it’s evaluating <tt><a href="https://reference.wolfram.com/language/ref/Or.html">Or</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024learnedimg34.png' alt='' title='' width='648' height='142'> </div> </p></div> <h2 id="other-kinds-of-models-and-setups">Other Kinds of Models and Setups</h2> <p>Our general goal here has been to set up models that capture the most essential features of neural nets and machine learning—but that are simple enough in their structure that we can readily “look inside” and get a sense of what they are doing. Mostly we’ve concentrated on rule arrays as a way to provide a minimal analog of standard “perceptron-style” feed-forward neural nets. 
But what about other architectures and setups?</p> <p>In effect, our rule arrays are “spacetime-inhomogeneous” generalizations of cellular automata—in which adaptive evolution determines which rule (say from a finite set) should be used at every (spatial) position and every (time) step. A different idealization (that in fact we already used in <a href="https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/#making-everything-discrete-a-biological-evolution-analog">one section above</a>) is to have an ordinary homogeneous cellular automaton—but with a single “global rule” determined by adaptive evolution. Rule arrays are the analog of feed-forward networks in which a given rule in the rule array is in effect used only once as data “flows through” the system. Ordinary homogeneous cellular automata are like recurrent networks in which a single stream of data is in effect subjected over and over again to the same rule.</p> <p>There are various interpolations between these cases. For example, we can imagine a “layered rule array” in which the rules at different steps can be different, but those on a given step are all the same. Such a system can be viewed as an idealization of a convolutional neural net in which a given layer applies the same kernel to elements at all positions, but different layers can apply different kernels.</p> <p>A layered rule array can’t encode as much information as a general rule array. But it’s still able to show machine-learning-style phenomena. 
And here, for example, is adaptive evolution for a layered <tt>And</tt>+<tt>Xor</tt> rule array progressively solving the problem of generating a pattern that lives for exactly 30 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg1.png' alt='' title='' width='591' height='329'> </div> </p></div> <p>One could also imagine “vertically layered” rule arrays, in which different rules are used at different positions, but any given position keeps running the same rule forever. However, at least for the kinds of problems we’ve considered here, it doesn’t seem sufficient to just be able to pick the positions at which different rules are run. One seems to either need to change rules at different (time) steps, or one needs to be able to adaptively evolve the underlying rules themselves.</p> <p>Rule arrays and ordinary cellular automata share the feature that the value of each cell depends only on the values of neighboring cells on the step before. But in neural nets it’s standard for the value at a given node to depend on the values of lots of nodes on the layer before. And what makes this straightforward in neural nets is that (weighted, and perhaps otherwise transformed) values from previous nodes are taken to be combined just by simple numerical addition—and addition (being <em>n</em>-ary and associative) can take any number of “inputs”. In a cellular automaton (or Boolean function), however, there’s always a definite number of inputs, determined by the structure of the function. In the most straightforward case, the inputs come only from nearest-neighboring cells. But there’s no requirement that this is how things need to work—and for example we can pick any “local template” to bring in the inputs for our function. 
This template could either be the same at every position and every step, or it could be picked from a certain set differently at different positions—in effect giving us “template arrays” as well as rule arrays.</p> <p>So what about having a fully connected network, as we did in our very first neural net examples above? To set up a discrete analog of this we first need some kind of discrete <em>n</em>-ary associative “accumulator” function to fill the place of numerical addition. And for this <a href="https://www.wolframscience.com/nks/notes-12-9--properties-of-logical-primitives/">we could pick a function like</a> <tt><a href="http://reference.wolfram.com/language/ref/And.html">And</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Or.html">Or</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/Xor.html">Xor</a></tt>—or <tt><a href="http://reference.wolfram.com/language/ref/Majority.html">Majority</a></tt>. And if we’re not just going to end up with the same value at each node on a given layer, we need to set up some analog of a weight associated with each connection—which we can achieve by applying either <tt><a href="http://reference.wolfram.com/language/ref/Identity.html">Identity</a></tt> or <tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt> (i.e. flip or not) to the value flowing through each connection. 
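</p> <p>As a rough illustration of the setup just described, here is a minimal Python sketch of a discrete fully connected network that uses <tt>Majority</tt> as the accumulator and <tt>Identity</tt> or <tt>Not</tt> on each connection. The function names, the list-of-lists layout for the connection types, and the convention that a tie counts as 0 are all assumptions made for this sketch, not details taken from the experiments shown here.</p>

```python
def majority(bits):
    """Return 1 if strictly more than half of the bits are 1, else 0."""
    return 1 if 2 * sum(bits) > len(bits) else 0


def majority_layer(inputs, flips):
    """Apply one fully connected layer.

    inputs: list of 0/1 values from the previous layer.
    flips:  flips[j][i] is 1 if the connection from input i to output
            node j applies Not, and 0 if it applies Identity.
    """
    outputs = []
    for row in flips:
        # Xor with the flip bit applies Not (1) or Identity (0).
        incoming = [x ^ f for x, f in zip(inputs, row)]
        outputs.append(majority(incoming))
    return outputs


def run_network(inputs, layers):
    """Feed inputs through a sequence of layers (each a flips matrix)."""
    values = inputs
    for flips in layers:
        values = majority_layer(values, flips)
    return values
```

<p>In this sketch, <tt>majority_layer([1, 0, 1], [[0, 0, 0], [1, 1, 1]])</tt> gives <tt>[1, 0]</tt>: the first node sees its inputs unchanged, while the second sees them all flipped. If every connection were <tt>Identity</tt>, all nodes on a layer would receive identical inputs and so produce identical values.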
</p> <p>Here’s an example of a network of this type, trained to compute the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg2.png' alt='' title='' width='51' height='13'/> function we discussed above:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg3.png' alt='' title='' width='591' height='138'> </div> </p></div> <p>There are just two kinds of connections here: flip and not. And at each node we’re computing the majority function—giving value 1 if the majority of its inputs are 1, and 0 otherwise. With the “one-hot encoding” of input and output that we used before, here are a few examples of how this network evaluates our function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg4.png' alt='' title='' width='613' height='126'> </div> </p></div> <p>This was trained just using 1000 steps of single-point mutation applied to the connection types. The loss systematically goes down—but the configuration of the connection types continues to look quite random even as it achieves zero loss (i.e. 
even after the function has been completely learned):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg5.png' alt='' title='' width='501' height='200'> </div> </p></div> <p>In what we’ve just done we assume that all connections continue to be present, though their types (or effectively signs) can change. But we can also consider a network where connections can end up being zeroed out during training—so that they are effectively no longer present. </p> <p>Much of what we’ve done here with machine learning has centered around trying to learn transformations of the form <em>x </em><img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > <em>f</em>[<em>x</em>]. But another typical application of machine learning is autoencoding—or in effect learning how to compress data representing a certain set of examples. And once again it’s possible to do such a task using rule arrays, with learning achieved by a series of single-point mutations. </p> <p>As a starting point, consider training a rule array (of cellular automaton rules 4 and 146) to reproduce unchanged a block of black cells of any width. One might have thought this would be trivial. But it’s not, because in effect the initial data inevitably gets “ground up” inside the rule array, and has to be reconstituted at the end. 
But, yes, it’s nevertheless possible to train a rule array to at least roughly do this—even though once again the rule arrays we find that manage to do this look quite random:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg6.png' alt='' title='' width='668' height='261'> </div> </p></div> <p>But to set up a nontrivial autoencoder let’s imagine that we progressively “squeeze” the array in the middle, creating an increasingly narrow “bottleneck” through which the data has to flow. At the bottleneck we effectively have a compressed version of the original data. And we find that at least down to some width of bottleneck, it’s possible to create rule arrays that—with reasonable probability—can act as successful autoencoders of the original data:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg7B.png' alt='' title='' width='679' height='700'> </div> </p></div> <p>The success of LLMs has highlighted the use of machine learning for sequence continuation—and the effectiveness of transformers for this. But just as with other neural nets, the forms of transformers that are used in practice are typically very complicated. 
But can one find a minimal model that nevertheless captures the “essence of transformers”?</p> <p>Let’s say that we have a sequence that we want to continue, like:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg8.png' alt='' title='' width='315' height='87'> </div> </p></div> <p>We want to encode each possible value by a vector, as in</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg9.png' alt='' title='' width='198' height='57'> </div> </p></div> <p>so that, for example, our original sequence is encoded as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg10.png' alt='' title='' width='304' height='32'> </div> </p></div> <p>Then we have a “head” that reads a block of consecutive vectors, picking off certain values and feeding pairs of them into <tt>And</tt> and <tt>Xor</tt> functions, to get a vector of Boolean values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg11.png' alt='' title='' width='224' height='147'> </div> </p></div> <p>Ultimately this head is going to “slide” along our sequence, “predicting” what the next element in the sequence will be. 
But somehow we have to go from our vector of Boolean values to (probabilities of) sequence elements. Potentially we might be able to do this just with a rule array. But for our purposes here we’ll use a fully connected single-layer <tt><a href="https://reference.wolfram.com/language/ref/Identity.html">Identity</a></tt>+<tt><a href="https://reference.wolfram.com/language/ref/Not.html">Not</a></tt> network in which at each output node we just find the sum of the number of values that come to it—and treat this as determining (through a <a href="https://reference.wolfram.com/language/ref/SoftmaxLayer.html">softmax</a>) the probability of the corresponding element:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg12.png' alt='' title='' width='548' height='115'> </div> </p></div> <p>In this case, the element with the maximum value is 5, so at “zero temperature” this would be our “best prediction” for the next element. </p> <p>To train this whole system we just make a sequence of random point mutations to everything, keeping mutations that don’t increase the loss (where the loss is basically the difference between predicted next values and actual next values, or, more precisely, the “<a href="https://reference.wolfram.com/language/ref/CrossEntropyLossLayer.html">categorical cross-entropy</a>”). 
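</p> <p>To make the last two steps concrete, here is a minimal Python sketch of the output stage and of mutation-based training. It is a simplification under stated assumptions: only the <tt>Identity</tt>/<tt>Not</tt> connections of the output network are mutated (the text mutates everything, including encodings and head), the loss is averaged over examples, and all function names and the data layout are hypothetical.</p>

```python
import math
import random


def softmax(scores):
    """Convert a list of scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_scores(head_bits, flips):
    """Sum, at each output node, the head bits after Identity (0) or Not (1)."""
    return [sum(b ^ f for b, f in zip(head_bits, row)) for row in flips]


def cross_entropy(examples, flips):
    """Mean categorical cross-entropy over (head_bits, target_index) pairs."""
    total = 0.0
    for head_bits, target in examples:
        probs = softmax(output_scores(head_bits, flips))
        total -= math.log(probs[target])
    return total / len(examples)


def train(examples, flips, steps, rng):
    """Single-point mutation: flip one connection, keep it unless loss rises."""
    loss = cross_entropy(examples, flips)
    for _ in range(steps):
        j = rng.randrange(len(flips))
        i = rng.randrange(len(flips[j]))
        flips[j][i] ^= 1  # mutate one connection type in place
        new_loss = cross_entropy(examples, flips)
        if new_loss > loss:
            flips[j][i] ^= 1  # revert a mutation that increased the loss
        else:
            loss = new_loss
    return loss
```

<p>Because mutations that leave the loss unchanged are kept, the configuration can drift across plateaus while the loss itself never increases.</p> <p>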
Here’s how this loss progresses in a typical such training:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024MLupdatesAimg10.png' alt='' title='' width='358' height='142'> </div> </p></div> <p>At the end of this training, here are the components of our minimal transformer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg14.png' alt='' title='' width='616' height='444'> </div> </p></div> <p>First come the encodings of the different possible elements in the sequence. Then there’s the head, here shown applied to the encoding of the first elements of the original sequence. Finally there’s a single-layer discrete network that takes the output from the head, and deduces relative probabilities for different elements to come next. In this case the highest-probability prediction for the next element is that it should be element 6.</p> <p>To do the analog of an LLM we start from some initial “prompt”, i.e. an initial sequence that fits within the width (“context window”) of the head. Then we progressively apply our minimal transformer, for example at each step taking the next element to be the one with the highest predicted probability (i.e. operating “at zero temperature”). 
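</p> <p>The continuation procedure itself can be sketched in a few lines of Python. This is only an illustration: <tt>predict</tt> stands in for a trained minimal transformer, elements are assumed to be represented by integer indices, and <tt>cyclic_predictor</tt> is a made-up toy model rather than anything trained here.</p>

```python
def continue_sequence(prompt, predict, window, steps):
    """Greedy ("zero temperature") continuation of a sequence.

    predict: hypothetical function mapping a tuple of the last `window`
             elements to a list of scores, one per possible element.
    """
    sequence = list(prompt)
    for _ in range(steps):
        context = tuple(sequence[-window:])
        scores = predict(context)
        # Zero temperature: always take the highest-scoring element.
        best = max(range(len(scores)), key=lambda k: scores[k])
        sequence.append(best)
    return sequence


def cyclic_predictor(context):
    """Toy stand-in for a trained model: favors (last element + 1) mod 4."""
    scores = [0.0, 0.0, 0.0, 0.0]
    scores[(context[-1] + 1) % 4] = 1.0
    return scores
```

<p>In this sketch, <tt>continue_sequence([0, 1], cyclic_predictor, window=2, steps=4)</tt> returns <tt>[0, 1, 2, 3, 0, 1]</tt>. Each predicted element is appended to the sequence and becomes part of the context for the next prediction.</p> <p>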
With this setup the collection of “prediction strengths” is shown in gray, with the “best prediction” shown in red:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg15.png' alt='' title='' width='671' height='86'> </div> </p></div> <p>Running this even far beyond our original training data, we see that we get a “prediction” of a continued sine wave:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg16.png' alt='' title='' width='665' height='42'> </div> </p></div> <p>As we might expect, the fact that our minimal transformer can make such a plausible prediction relies on the simplicity of our sine curve. 
If we use “more complicated” training data, such as the “mathematically defined” (<span class='InlineFormula'><img style="margin-top: -2px" src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg17.png' width= '128' height='23' align='absmiddle'></span>) blue curve in</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg18.png' alt='' title='' width='659' height='93'> </div> </p></div> <p>the result of training and running a minimal transformer is now:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg19.png' alt='' title='' width='671' height='86'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg20.png' alt='' title='' width='665' height='42'> </div> </p></div> <p>And, not surprisingly, it can’t “figure out the computation” to correctly continue the curve. 
By the way, different training runs will involve different sequences of mutations, and will yield different predictions (often with periodic “hallucinations”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08212024modelsimg21.png' alt='' title='' width='648' height='307'> </div> </p></div> <p>In looking at “perceptron-style” neural nets we wound up using rule arrays<tt>—</tt>or, in effect, spacetime-inhomogeneous cellular automata<tt>—</tt>as our minimal models. Here we’ve ended up with a slightly more complicated minimal model for transformer neural nets. But if we were to simplify it further, we would end up not with something like a cellular automaton but instead with something like a <a href="https://www.wolframscience.com/nks/p93--tag-systems/">tag system</a>, in which one has a sequence of elements, and at each step removes a block from the beginning, and<tt>—</tt>depending on its form<tt>—</tt>adds a certain block at the end, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08222024modelsAimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08222024modelsAimg22.png' alt='' title='' width='632' height='402'> </div> </p></div> <p>And, yes, such systems <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/">can generate extremely complex behavior</a><tt>—</tt>reinforcing the idea (that we have repeatedly seen here) that machine learning works by selecting complexity that aligns with goals that have been set.</p> <p>And along these lines, one can consider all sorts of different computational systems as 
foundations for machine learning. Here we’ve been looking at cellular-automaton-like and tag-system-like examples. But for example our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a> has shown us the power and flexibility of systems based on <a href="https://www.wolframphysics.org/technical-introduction/basic-form-of-models/" target="_blank" rel="noopener">hypergraph rewriting</a>. And from what we’ve seen here, it seems very plausible that something like hypergraph rewriting can serve as a yet more powerful and flexible substrate for machine learning.</p> <h2 id="so-in-the-end-whats-really-going-on-in-machine-learning">So in the End, What’s Really Going On in Machine Learning?</h2> <p>There are, I think, several quite striking conclusions from what we’ve been able to do here. The first is just that models much simpler than traditional neural nets seem capable of capturing the essential features of machine learning—and indeed these models may well be the basis for a new generation of practical machine learning.</p> <p>But from a scientific point of view, one of the things that’s important about these models is that they are simple enough in structure that it’s immediately possible to produce visualizations of what they’re doing inside. And studying these visualizations, the most immediately striking feature is how complicated they look. </p> <p>It could have been that machine learning would somehow “crack systems”, and find simple representations for what they do. But that doesn’t seem to be what’s going on at all. Instead what seems to be happening is that machine learning is in a sense just “hitching a ride” on the <a href="https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs/">general richness of the computational universe</a>. 
It’s not “specifically building up behavior one needs”; rather what it’s doing is to harness behavior that’s “already out there” in the computational universe.</p> <p>That this could possibly work relies on the crucial—and at first unexpected—fact that in the computational universe even very simple programs can ubiquitously produce all sorts of complex behavior. And the point then is that this behavior has enough richness and diversity that it’s possible to find instances of it that align with machine learning objectives one has defined. In some sense what machine learning is doing is to “mine” the computational universe for programs that do what one wants. </p> <p>It’s not that machine learning nails a specific precise program. Rather, it’s that in typical successful applications of machine learning there are lots of programs that “do more or less the right thing”. If what one’s trying to do involves something computationally irreducible, machine learning won’t typically be able to “get well enough aligned” to correctly “get through all the steps” of the irreducible computation. But it seems that <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#models-for-human-like-tasks">many “human-like tasks”</a> that are the particular focus of modern machine learning can successfully be done. </p> <p>And by the way, one can expect that with the minimal models explored here, it becomes more feasible to get a real characterization of what kinds of objectives can successfully be achieved by machine learning, and what cannot. Critical to the operation of machine learning is not only that there exist programs that can do particular kinds of things, but also that they can realistically be found by adaptive evolution processes.</p> <p>In what we’ve done here we’ve often used what’s essentially the very simplest possible process for adaptive evolution: a sequence of point mutations. 
And what we’ve discovered is that even this is usually sufficient to lead us to satisfactory machine learning solutions. It could have been that our paths of adaptive evolution would always get stuck—and never reach any solution. But the fact that this doesn’t happen seems <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-fitness-landscape">crucially connected to the computational irreducibility</a> that’s ubiquitous in the systems we’re studying, and that leads to effective randomness that with overwhelming probability will “give us a way out” of anywhere we get stuck.</p> <p>In some sense computational irreducibility “levels the playing field” for different processes of adaptive evolution, and lets even simple ones be successful. Something similar seems to happen for the whole framework we’re using. Any of a wide class of systems seems capable of successful machine learning, even if they don’t have the detailed structure of traditional neural nets. We can see this as a typical reflection of the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a>: that even though systems may differ in their details, they are ultimately all equivalent in the computations they can do.</p> <p>The phenomenon of computational irreducibility leads to a fundamental tradeoff, of <a href="https://writings.stephenwolfram.com/2023/03/will-ais-take-all-our-jobs-and-end-human-history-or-not-well-its-complicated/">particular importance in thinking about things like AI</a>. If we want to be able to know in advance—and broadly guarantee—what a system is going to do or be able to do, we have to set the system up to be computationally reducible. But if we want the system to be able to make the richest use of computation, it’ll inevitably be capable of computationally irreducible behavior. 
And it’s the same story with machine learning. If we want machine learning to be able to do the best it can, and perhaps give us the impression of “achieving magic”, then we have to allow it to show computational irreducibility. And if we want machine learning to be “understandable” it has to be computationally reducible, and not able to access the full power of computation.</p> <p>At the outset, though, it’s not obvious whether machine learning actually has to access such power. It could be that there are computationally reducible ways to solve the kinds of problems we want to use machine learning to solve. But what we’ve discovered here is that even in solving very simple problems, the adaptive evolution process that’s at the heart of machine learning will end up sampling—and using—what we can expect to be computationally irreducible processes. </p> <p>Like biological evolution, machine learning is fundamentally about finding things that work—without the constraint of “understandability” that’s forced on us when we as humans explicitly engineer things step by step. Could one imagine constraining machine learning to make things understandable? To do so would effectively prevent machine learning from having access to the power of computationally irreducible processes, and from the evidence here it seems unlikely that with this constraint the kind of successes we’ve seen in machine learning would be possible.</p> <p>So what does this mean for the “science of machine learning”? One might have hoped that one would be able to “look inside” machine learning systems and get detailed narrative explanations for what’s going on; that in effect one would be able to “explain the mechanism” for everything. But what we’ve seen here suggests that in general nothing like this will work. All one will be able to say is that somewhere out there in the computational universe there’s some (typically computationally irreducible) process that “happens” to be aligned with what we want. 
</p> <p>Yes, we can make general statements—strongly based on computational irreducibility—about things like the findability of such processes, say by adaptive evolution. But if we ask “How in detail does the system work?”, there won’t be much of an answer to that. Of course we can trace all its computational steps and see that it behaves in a certain way. But we can’t expect what amounts to a “global human-level explanation” of what it’s doing. Rather, we’ll basically just be reduced to looking at some computationally irreducible process and observing that it “happens to work”—and we won’t have a high-level explanation of “why”.</p> <p>But there is one important loophole to all this. Within any computationally irreducible system, there are always inevitably pockets of computational reducibility. And—as I’ve <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">discussed at length particularly in connection with our Physics Project</a>—it’s these pockets of computational reducibility that allow computationally bounded observers like us to identify things like “laws of nature” from which we can build “human-level narratives”.</p> <p>So what about machine learning? What pockets of computational reducibility show up there, from which we might build “human-level scientific laws”? Much as with the emergence of “simple continuum behavior” from computationally irreducible processes happening at the level of molecules in a gas or ultimate discrete elements of space, we can expect that at least certain computationally reducible features will be more obvious when one’s dealing with larger numbers of components. 
And indeed in sufficiently large machine learning systems, it’s routine to see smooth curves and apparent regularity when one’s looking at the kind of aggregated behavior that’s probed by things like training curves.</p> <p>But the question about pockets of reducibility is always whether they end up being aligned with things we consider interesting or useful. Yes, it could be that machine learning systems would exhibit some kind of collective (“EEG-like”) behavior. But what’s not clear is whether this behavior will tell us anything about the actual “information processing” (or whatever) that’s going on in the system. And if there is to be a “science of machine learning” what we have to hope for is that we can find in machine learning systems pockets of computational reducibility that are aligned with things we can measure, and care about.</p> <p>So given what we’ve been able to explore here about the foundations of machine learning, what can we say about the <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">ultimate power of machine learning systems</a>? A key observation has been that machine learning works by “piggybacking” on computational irreducibility—and in effect by finding “natural pieces of computational irreducibility” that happen to fit with the objectives one has. But what if those objectives involve computational irreducibility—as they often do when one’s dealing with a process that’s been successfully formalized in computational terms (as in math, exact science, computational X, etc.)? Well, it’s not enough that our machine learning system “uses some piece of computational irreducibility inside”. To achieve a particular computationally irreducible objective, the system would have to do something closely aligned with that actual, specific objective. 
</p> <p>It has to be said, however, that by laying bare more of the essence of machine learning here, it becomes easier to at least define the issues of merging typical “formal computation” with machine learning. Traditionally there’s been a tradeoff between the computational power of a system and its trainability. And indeed in terms of what we’ve seen here this seems to reflect the sense that “larger chunks of computational irreducibility” are more difficult to fit into something one’s incrementally building up by a process of adaptive evolution.</p> <p>So how should we ultimately think of machine learning? In effect its power comes from leveraging the “natural resource” of computational irreducibility. But when it uses computational irreducibility it does so by “foraging” pieces that happen to advance its objectives. Imagine one’s building a wall. One possibility is to fashion bricks of a particular shape that one knows will fit together. But another is just to look at stones one sees lying around, then to build the wall by fitting these together as best one can. </p> <p>And if one then asks “Why does the wall have such-and-such a pattern?” the answer will end up being basically “Because that’s what one gets from the stones that happened to be lying around”. There’s no overarching theory to it in itself; it’s just a reflection of the resources that were out there. Or, in the case of machine learning, one can expect that what one sees will be to a large extent a reflection of the raw characteristics of computational irreducibility. In other words, the foundations of machine learning are as much as anything rooted in the <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/#the-pure-basic-science-of-ruliology">science of ruliology</a>. 
And it’s in large measure to that science we should look in our efforts to understand more about “what’s really going on” in machine learning, and quite possibly also in neuroscience.</p> <h2 id="historical--personal-notes" style='font-size:1.2rem'>Historical & Personal Notes</h2> <p style='font-size:90%'>In some ways it seems like a quirk of intellectual history that the kinds of foundational questions I’ve been discussing here weren’t already addressed long ago—and in some ways it seems like an inexorable consequence of the only rather recent development of certain intuitions and tools.</p> <p style='font-size:90%'>The idea that the brain is fundamentally made of connected nerve cells was considered in the latter part of the nineteenth century, and took hold in the first decades of the twentieth century—with the <a href="https://www.wolframscience.com/nks/notes-10-12--history-of-ideas-about-thinking/">formalized concept of a neural net</a> that operates in a computational way emerging in full form in the work of Warren McCulloch and Walter Pitts in 1943. By the late 1950s there were hardware implementations of neural nets (typically for image processing) in the form of “perceptrons”. But despite early enthusiasm, practical results were mixed, and at the end of the 1960s it was announced that simple cases amenable to mathematical analysis had been “solved”—leading to a general belief that “neural nets couldn’t do anything interesting”.</p> <p style='font-size:90%'>Ever since the 1940s there had been a trickle of general analyses of neural nets, particularly using methods from physics. But typically these analyses ended up with things like continuum approximations—that could say little about the information-processing aspects of neural nets. Meanwhile, there was an ongoing undercurrent of belief that somehow neural networks would both explain and reproduce how the brain works—but no methods seemed to exist to say quite how. 
Then at the beginning of the 1980s there was a resurgence of interest in neural networks, coming from several directions. Some of what was done concentrated on very practical efforts to get neural nets to do particular “human-like” tasks. But some was more theoretical, typically using methods from statistical physics or dynamical systems. </p> <p style='font-size:90%'>Before long, however, the buzz died down, and for several decades only a few groups were left working with neural nets. Then in 2011 came a surprise breakthrough in using neural nets for image analysis. It was an important practical advance. But it was driven by technological ideas and development—not any significant new theoretical analysis or framework. </p> <p style='font-size:90%'>And this was also the pattern for almost all of what followed. People spent great effort to come up with neural net systems that worked—and <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#the-practice-and-lore-of-neural-net-training">all sorts of folklore</a> grew up about how this should best be done. But there wasn’t really even an attempt at an underlying theory; this was a domain of engineering practice, not basic science. </p> <p style='font-size:90%'>And it was in this tradition that <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">ChatGPT burst onto the scene</a> in late 2022. Almost everything about LLMs seemed to be complicated. Yes, there were empirically some large-scale regularities (like scaling laws). And I quickly suspected that the success of LLMs was a <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#what-really-lets-chatgpt-work">strong hint of general regularities in human language</a> that hadn’t been clearly identified before. But beyond a few outlier examples, almost nothing about “what’s going on inside LLMs” has seemed easy to decode. 
And efforts to put “strong guardrails” on the operation of the system—in effect so as to make it in some way “predictable” or “understandable”—typically seem to substantially decrease its power (a point that now makes sense in the context of computational irreducibility).</p> <p style='font-size:90%'>My own interaction with machine learning and neural nets <a href="https://writings.stephenwolfram.com/2015/05/wolfram-language-artificial-intelligence-the-image-identification-project/#personal-backstory">began in 1980</a> when I was developing my <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">SMP symbolic computation system</a>, and wondering whether it might be possible to generalize the symbolic pattern-matching foundations of the system to some kind of “fuzzy pattern matching” that would be closer to human thinking. I was aware of neural nets but thought of them as semi-realistic models of brains, not for example as potential sources of algorithms of the kind I imagined might “solve” fuzzy matching. </p> <p style='font-size:90%'>And it was partly as a result of trying to understand the essence of systems like neural nets that <a href="https://www.wolframscience.com/nks/p17--the-personal-story-of-the-science-in-this-book/">in 1981 I came up with</a> what I later learned could be thought of as one-dimensional cellular automata. Soon I was deeply involved in studying cellular automata and developing a new intuition about how complex behavior could arise even from simple rules. But when I learned about recent efforts to make idealized models of neural nets using ideas from statistical mechanics, I was at least curious enough to set up simulations to try to understand more about these models.</p> <p style='font-size:90%'>But what I did wasn’t a success. I could neither get the models to do anything of significant practical interest—nor did I manage to derive any good theoretical understanding of them. 
I kept wondering, though, what relationship there might be between cellular automata that “just run”, and systems like neural nets that can also “learn”. And in fact <a href="https://content.wolfram.com/sw-publications/2020/07/approaches-complexity-engineering.pdf">in 1985 I tried to make a minimal cellular-automaton-based model</a> to explore this. It was what I’m now calling a “vertically layered rule array”. And while in many ways I was already asking the right questions, this was an unfortunate specific choice of system—and my experiments on it didn’t reveal the kinds of phenomena we’re now seeing.</p> <p style='font-size:90%'>Years went by. I wrote a <a href="https://www.wolframscience.com/nks/chap-10--processes-of-perception-and-analysis/#sect-10-12--human-thinking">section on “Human Thinking”</a> in <em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em> that discussed the possibility of simple foundational rules for the essence of thinking, and even included a minimal discrete analog of a neural net. At the time, though, I didn’t develop these ideas. But by 2017, 15 years after the book was published—and knowing about the breakthroughs in deep learning—I had begun to <a href="https://writings.stephenwolfram.com/2017/05/a-new-kind-of-science-a-15-year-view/#machine-learning-and-the-neural-net-renaissance">think more concretely about neural nets as getting their power</a> by sampling programs from across the computational universe. But still I didn’t see quite how this would work. </p> <p style='font-size:90%'>Meanwhile, there was a new intuition emerging from practical experience with machine learning: that if you “bashed” almost any system “hard enough”, it would learn. Did that mean that perhaps one didn’t need all the details of neural networks to successfully do machine learning? And could one perhaps make a system whose structure was simple enough that its operation would for example be accessible to visualization? 
I particularly wondered about this when I was writing <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">an exposition of ChatGPT and LLMs in early 2023</a>. And I kept talking about “LLM science”, but didn’t have much of a chance to work on it.</p> <p style='font-size:90%'>But then, a few months ago, as part of an <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">effort to understand the relation between what science does and what AI does</a>, I tried a kind of <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#exploring-spaces-of-systems">“throwaway experiment”</a>—which, to my considerable surprise, seemed to successfully capture some of the essence of what <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">makes biological evolution possible</a>. But what about other adaptive evolution—and in particular, machine learning? The models that seemed to be needed were embarrassingly close to <a href="https://content.wolfram.com/sw-publications/2020/07/approaches-complexity-engineering.pdf">what I’d studied in 1985</a>. But now I had a new intuition—and, thanks to <a href="https://www.wolfram.com/language/">Wolfram Language</a>, vastly better tools. And the result has been my effort here. </p> <p style='font-size:90%'>Of course this is only a beginning. But I’m excited to be able to see what I consider to be the beginnings of foundational science around machine learning. Already there are clear directions for practical applications (which, needless to say, I plan to explore). 
And there are signs that perhaps we may finally be able to understand just why—and when—the “magic” of machine learning works.</p> <h2 id="thanks" style='font-size:1.2rem'>Thanks</h2> <p style='font-size:90%'>Thanks to Richard Assar of the <a href="https://www.wolframinstitute.org/">Wolfram Institute</a> for extensive help. Thanks also to Brad Klee, Tianyi Gu, Nik Murzin and Max Niederman for specific results, to George Morgan and others at <a href="https://www.symbolica.ai/" target="_blank" rel="noopener">Symbolica</a> for their early interest, and to Kovas Boguta for suggesting many years ago that machine learning be linked to the ideas in <em>A New Kind of Science</em>.</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item> <title>Yet More New Ideas and New Functions: Launching Version 14.1 of Wolfram Language &amp; Mathematica</title> <link>https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/</link> <pubDate>Wed, 31 Jul 2024 21:53:02 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Mathematica]]></category> <category><![CDATA[New Technology]]></category> <category><![CDATA[Wolfram Language]]></category> <category><![CDATA[Recent Release]]></category> <category><![CDATA[Version Release]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=60695</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/07/swblog-v14.1-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>For the 36th Time… the Latest from Our R&D Pipeline There’s Now a Unified Wolfram App Vector Databases and Semantic Search RAGs and Dynamic Prompting for LLMs Connect to Your Favorite LLM Symbolic 
Arrays and Their Calculus Binomials and Pitchforks: Navigating Mathematical Conventions Fixed Points and Stability for Differential and Difference Equations The Steady Advance […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/07/swblog-v14.1-icon.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><style type="text/css"> #blog a.bottomstripe { max-width: 100%; } #blog .post_content .inline-table-of-contents { background-color: #f6fcff; border: solid 1px #bbdbe8; padding: 24px 20px 10px 20px; } #blog .post_content .inline-table-of-contents div a { color: #19749a; font-size: 14px; line-height: 1.25; margin-bottom: 12px; } #blog .post_content .inline-table-of-contents div a:active, #blog .post_content .inline-table-of-contents div a:hover { color:#D76A00; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1) { width: 0; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1)::before { color: #85b7cb; content: '\25FC'; font-size: 10px; position: relative; top: -2px; } #blog .post_content .inline-table-of-contents div a > span:nth-of-type(1):after { margin: 0; opacity: 0; } #blog .post_content .inline-table-of-contents div a span:nth-of-type(2) { font-weight: 400; } #blog .post_content .inline-table-of-contents div.left { padding-right: 0.75rem; } #blog .post_content .inline-table-of-contents div.right { padding-left: 0.75rem; } </style> <div class="inline-table-of-contents"> <div class="grid cols-2 heirs-width-1-2 cols-1__600 heirs-width-full__600"> <div class="left"> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#for-the-36th-time--the-latest-from-our-rd-pipeline"><span></span><span>For the 36th Time… the Latest from Our R&D Pipeline</a></div> <div><a 
href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#theres-now-a-unified-wolfram-app"><span></span><span>There’s Now a Unified Wolfram App</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#vector-databases-and-semantic-search"><span></span><span>Vector Databases and Semantic Search</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#rags-and-dynamic-prompting-for-llms"><span></span><span>RAGs and Dynamic Prompting for LLMs</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#connect-to-your-favorite-llm"><span></span><span>Connect to Your Favorite LLM</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#symbolic-arrays-and-their-calculus"><span></span><span>Symbolic Arrays and Their Calculus</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#binomials-and-pitchforks-navigating-mathematical-conventions"><span></span><span>Binomials and Pitchforks: Navigating Mathematical Conventions</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#fixed-points-and-stability-for-differential-and-difference-equations"><span></span><span>Fixed Points and Stability for Differential and Difference Equations</a></div> <div><a 
href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#the-steady-advance-of-pdes"><span></span><span>The Steady Advance of PDEs</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#symbolic-biomolecules-and-their-visualization"><span></span><span>Symbolic Biomolecules and Their Visualization</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#optimizing-neural-nets-for-gpus-and-npus"><span></span><span>Optimizing Neural Nets for GPUs and NPUs</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#the-statistics-of-dates"><span></span><span>The Statistics of Dates</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#building-videos-with-programs"><span></span><span>Building Videos with Programs</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#optimizing-the-speech-recognition-workflow"><span></span><span>Optimizing the Speech Recognition Workflow</a></div> </div> <div class="right"> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#historical-geography-becomes-computable"><span></span><span>Historical Geography Becomes Computable</a></div> <div><a 
href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#astronomical-graphics-and-their-axes"><span></span><span>Astronomical Graphics and Their Axes</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#when-is-earthrise-on-mars-new-level-of-astronomical-computation"><span></span><span>When Is Earthrise on Mars? New Level of Astronomical Computation</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#geometry-goes-color-and-polar"><span></span><span>Geometry Goes Color, and Polar</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#new-computation-flow-in-notebooks-introducing-cell-linked"><span></span><span>New Computation Flow in Notebooks: Introducing Cell-Linked %</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#the-ux-journey-continues-new-typing-affordances-and-more"><span></span><span>The UX Journey Continues: New Typing Affordances, and More</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#syntax-for-natural-language-input"><span></span><span>Syntax for Natural Language Input</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#diff---for-notebooks-and-more"><span></span><span>Diff[ ] … for Notebooks and More!</a></div> <div><a 
href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#lots-of-little-language-tune-ups"><span></span><span>Lots of Little Language Tune-Ups</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#making-the-wolfram-compiler-easier-to-use"><span></span><span>Making the Wolfram Compiler Easier to Use</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#even-smoother-integration-with-external-languages"><span></span><span>Even Smoother Integration with External Languages</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#standalone-wolfram-language-applications"><span></span><span>Standalone Wolfram Language Applications!</a></div> <div><a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#and-yet-more"><span></span><span>And Yet More…</a></div> </div> </div> </div> <h2 id="for-the-36th-time--the-latest-from-our-rd-pipeline">For the 36th Time… the Latest from Our R&D Pipeline</h2> <p>Today we celebrate the arrival of the 36th (<em>x</em>.<em>x</em>) version of the <a href="https://www.wolfram.com/language/">Wolfram Language</a> and <a href="https://www.wolfram.com/mathematica/">Mathematica</a>: Version 14.1. We’ve been doing this <a href="https://www.wolfram.com/mathematica/scrapbook/">since 1986</a>: continually inventing new ideas and implementing them in our larger and larger tower of technology. And it’s always very satisfying to be able to deliver our latest achievements to the world. 
</p> <p>We released <a href="https://reference.wolfram.com/legacy/language/v14/">Version 14.0</a> just half a year ago. And—following our modern version scheduling—we’re now releasing Version 14.1. For most technology companies a .1 release would contain only minor tweaks. But for us it’s a snapshot of what our whole R&D pipeline has delivered—and it’s full of significant new features and new enhancements.</p> <p>If you’ve been following <a href="https://livestreams.stephenwolfram.com/">our livestreams</a>, you may have already seen many of these features and enhancements being discussed as part of our open software design process. And we’re grateful as always to members of the Wolfram Language community who’ve made suggestions—and requests. And in fact Version 14.1 contains a particularly large number of long-requested features, some of which involved development that has taken many years and required many intermediate achievements.<span id="more-60695"></span></p> <p>There’s lots of both extension and polishing in Version 14.1. There are a total of 89 entirely new functions—more than in any other version for the past couple of years. And there are also 137 existing functions that have been substantially updated. Along with more than 1300 distinct bug fixes and specific improvements. </p> <p>Some of what’s new in Version 14.1 relates to AI and LLMs. And, yes, we’re riding the leading edge of these kinds of capabilities. But the vast majority of what’s new has to do with our continued mission to bring computational language and computational knowledge to everything. And today that mission is even more important than ever, supporting not only human users, but also rapidly proliferating AI “users”—who are beginning to be able to routinely make even broader and deeper use of our technology than humans. 
</p> <p>Each new version of Wolfram Language represents a large amount of R&D by our team, and the encapsulation of a surprisingly large number of ideas about what should be implemented, and how it should be implemented. So, today, here it is: the latest stage in our four-decade journey to bring the superpower of the computational paradigm to everything. </p> <h2 id="theres-now-a-unified-wolfram-app">There’s Now a Unified Wolfram App</h2> <p>In the beginning we just had “<a href="https://reference.wolfram.com/legacy/v1/">Mathematica</a>”—that we described as “A System for Doing Mathematics by Computer”. But the core of “Mathematica”—based on the very general concept of transformations for symbolic expressions—was always much broader than “mathematics”. And it didn’t take long before “mathematics” was an increasingly small part of the system we had built. We <a href="https://writings.stephenwolfram.com/2013/02/what-should-we-call-the-language-of-mathematica/">agonized for years about how to rebrand things</a> to better reflect what the system had become. And eventually, just over a decade ago, we did the obvious thing, and named what we had “the Wolfram Language”. </p> <p>But when it came to actual software products and executables, so many people were familiar with having a “Mathematica” icon on their desktop that we didn’t want to change that. Later we introduced <a href="https://www.wolfram.com/wolfram-one/">Wolfram|One</a> as a general product supporting Wolfram Language across desktop and cloud—with Wolfram Desktop being its desktop component. But, yes, it’s all been a bit confusing. Ultimately there’s just one “bag of bits” that implements the whole system we’ve built, even though there are different usage patterns, and differently named products that the system supports. Up to now, each of these different products has been a different executable, that’s separately downloaded. 
</p> <p>But starting with Version 14.1 we’re unifying all these things—so that now there’s just a single unified Wolfram app, that can be configured and activated in different ways corresponding to different products.</p> <p>So now you just go to <a href="https://www.wolfram.com/download-center/">wolfram.com/download-center</a> and download the Wolfram app:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07262024appimg1.png' alt='Wolfram app' title='Wolfram app' width='48' height='48'/></p> <p>After you’ve installed the app, you activate it as whatever product(s) you’ve got: Wolfram|One, Mathematica, <a href="https://www.wolfram.com/wolfram-alpha-notebook-edition/">Wolfram|Alpha Notebook Edition</a>, etc. Why have separate products? Each one has a somewhat different usage pattern, and provides a somewhat different interface optimized for that usage pattern. But now the actual downloading of bits has been unified; you just have to download the unified Wolfram app and you’ll get what you need. </p> <h2 id="vector-databases-and-semantic-search">Vector Databases and Semantic Search</h2> <p>Let’s say you’ve got a million documents (or webpages, or images, or whatever) and you want to find the ones that are “closest” to something. Version 14.1 now has a function—<tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt>—for doing this. How does <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> work? Basically it uses machine learning methods to find “vectors” (i.e. lists) of numbers that somehow represent the “meaning” of each of your documents. 
Then when you want to know which documents are “closest” to something, <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> computes the vector for the something, and then sees which of the document vectors are closest to this vector.</p> <p>In principle one could use <tt><a href="http://reference.wolfram.com/language/ref/Nearest.html">Nearest</a></tt> to find closest vectors. And indeed this works just fine for small examples where one can readily store all the vectors in memory. But <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> uses a full industrial-strength approach based on the new vector database capabilities of Version 14.1—which can work with huge collections of vectors stored in external files. </p> <p>There are lots of ways to use both <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> and vector databases. You can use them to find documents, snippets within documents, images, sounds or anything else whose “meaning” can somehow be captured by a vector of numbers. Sometimes the point is to retrieve content directly for human consumption. But a particularly strong modern use case is to set up “retrieval-augmented generation” (RAG) for LLMs—in which relevant content found with a vector database is used to provide a “dynamic prompt” for the LLM. And indeed in Version 14.1—as we’ll discuss later—we now have <tt><a href="http://reference.wolfram.com/language/ref/LLMPromptGenerator.html">LLMPromptGenerator</a></tt> to implement exactly this pipeline.</p> <p>But let’s come back to <tt><a href="https://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> on its own. Its basic design is modeled after <tt><a href="http://reference.wolfram.com/language/ref/TextSearch.html">TextSearch</a></tt>, which does keyword-based searching of text. 
(Note, though, that <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> also works on many things other than text.)</p> <p>In direct analogy to <tt><a href="http://reference.wolfram.com/language/ref/CreateSearchIndex.html">CreateSearchIndex</a></tt> for <tt><a href="http://reference.wolfram.com/language/ref/TextSearch.html">TextSearch</a></tt>, there’s now a <tt><a href="http://reference.wolfram.com/language/ref/CreateSemanticSearchIndex.html">CreateSemanticSearchIndex</a></tt> for <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt>. Let’s do a tiny example to see how it works. Essentially we’re going to make an (extremely restricted) “inverse dictionary”. We set up a list of definition <img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' > word elements:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg1.png' alt='' title='' width='603' height='161'> </div> </p></div> <p>Now create a semantic search index from this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg2.png' alt='' title='' width='553' height='75'> </div> </p></div> <p>Behind the scenes this is a vector database. 
But we can access it with <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg3.png' alt='' title='' width='405' height='44'> </div> </p></div> <p>And since “whale” is considered closest, it comes first. </p> <p>What about a more realistic example? Instead of just using 3 words, let’s set up definitions for all words in the dictionary. It takes a little while (like a few minutes) to do the machine learning feature extraction for all the definitions. But in the end you get a new semantic search index:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg4.png' alt='' title='' width='560' height='111'> </div> </p></div> <p>This time it has 39,186 entries—but <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> picks out the (by default) 10 that it considers closest to what you asked for (and, yes, there’s an archaic definition of “seahorse” as “walrus”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg5.png' alt='' title='' width='634' height='44'> </div> </p></div> <p>We can see a bit more detail about what’s going on by asking <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> to explicitly show us distances:</p> <div> 
<div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07303034vectorSSimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' style="margin-left: 5px" src='https://content.wolfram.com/sites/43/2024/07/sw07303034vectorSSimg1A.png' alt='' title='' width='570' height='31'> <img style="margin-left: 5px" src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg6A.png' alt='SemanticSearch distances' title='SemanticSearch distances' width='595' height='81'/></div> </p></div> <p>And plotting these we can see that “whale” is the winner by a decent margin:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg7.png' alt='' title='' width='409' height='248'> </div> </p></div> <p>One subtlety when dealing with semantic search indices is where to store them. When they’re sufficiently small, you can store them directly in memory, or in a notebook. But usually you’ll want to store them in a separate file, and if you want to share an index you’ll want to put this file in the cloud. 
You can do this either interactively from within a notebook</p> <p> <img loading='lazy' style="margin-left: 8px" src='https://content.wolfram.com/sites/43/2024/08/sw07302024vectorimg8B.png' alt='SemanticSearchIndex' title='SemanticSearchIndex' width='559' height='148'> </p> <p>or programmatically:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg9.png' alt='' title='' width='554' height='158'> </div> </p></div> <p>And now the <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearchIndex.html">SemanticSearchIndex</a></tt> object you have can be used by anyone, with its data being accessed in the cloud.</p> <p>In most cases <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> will be what you need. But sometimes it’s worthwhile to “go underneath” and directly work with vector databases. Here’s a collection of small vectors:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg10.png' alt='' title='' width='708' height='44'> </div> </p></div> <p>We can use <tt><a href="http://reference.wolfram.com/language/ref/Nearest.html">Nearest</a></tt> to find the nearest vector to one we give:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg11.png' alt='' title='' width='216' height='43'> </div> </p></div> <p>But we can also do this with a vector database. 
First we create the database:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg12.png' alt='' title='' width='570' height='75'> </div> </p></div> <p>And now we can search for the nearest vector to the one we give:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg13.png' alt='' title='' width='302' height='44'> </div> </p></div> <p>In this case we get exactly the same answer as from <tt><a href="http://reference.wolfram.com/language/ref/Nearest.html">Nearest</a></tt>. But whereas the mission of <tt><a href="http://reference.wolfram.com/language/ref/Nearest.html">Nearest</a></tt> is to give us the mathematically precise nearest vector, <tt><a href="http://reference.wolfram.com/language/ref/VectorDatabaseSearch.html">VectorDatabaseSearch</a></tt> is doing something less precise—but is able to do it for extremely large numbers of vectors that don’t need to be stored directly in memory.</p> <p>Those vectors can come from anywhere. For example, here they’re coming from extracting features from some images:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg14.png' alt='' title='' width='635' height='163'> </div> </p></div> <p>Now let’s say we’ve got a specific image. 
Then we can search our vector database to get the image whose feature vector is closest to the one for the image we provided: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg15.png' alt='' title='' width='310' height='144'> </div> </p></div> <p>And, yes, this works for other kinds of objects too:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg16A.png' alt='' title='' width='550' height='50'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08082024vectorBCimg17A.png' alt='' title='' width='295' height='110'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/CreateSemanticSearchIndex.html">CreateSemanticSearchIndex</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/CreateVectorDatabase.html">CreateVectorDatabase</a></tt> create vector databases from scratch using data you provide. But—just like with text search indices—an important feature of vector databases is that you can incrementally add to them. So, for example, <tt><a href="http://reference.wolfram.com/language/ref/UpdateSemanticSearchIndex.html">UpdateSemanticSearchIndex</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/AddToVectorDatabase.html">AddToVectorDatabase</a></tt> let you efficiently add individual entries or lists of entries to vector databases. 
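</p> <p>As a rough sketch of that incremental workflow (the random vectors here are purely illustrative stand-ins for real feature vectors, and the variable names are hypothetical), one might create a small vector database, query it, and then add to it:</p> <pre>
(* Illustrative data: five random 3D vectors standing in for real feature vectors *)
vectors = RandomReal[1, {5, 3}];
db = CreateVectorDatabase[vectors];

(* Look up the stored vector closest to a query vector *)
VectorDatabaseSearch[db, {0.5, 0.5, 0.5}]

(* Later, grow the database without rebuilding it from scratch *)
AddToVectorDatabase[db, RandomReal[1, {2, 3}]]
</pre> <p>The same pattern should carry over to semantic search indices, with <tt>CreateSemanticSearchIndex</tt> and <tt>UpdateSemanticSearchIndex</tt> in place of the vector database functions.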
</p> <p>In addition to providing capabilities for building (and growing) your own vector databases, there are several pre-built vector databases that are now available in the <a href="https://datarepository.wolframcloud.com/" target="_blank" rel="noopener">Wolfram Data Repository</a>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg18A.png' alt='Vector Databases' title='Vector Databases' width='270' height='215'/></p> <p>So now we can use a pre-built vector database of Wolfram Language function documentation to do a semantic search for snippets that are “semantically close” to being about iterating functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024vectorSemanticS.png' alt='' title='' width='670' height='144'> </div> </p></div> <p>(<a href="https://writings.stephenwolfram.com/2024/07/yet-more-new-ideas-and-new-functions-launching-version-14-1-of-wolfram-language-mathematica/#rags-and-dynamic-prompting-for-llms">In the next section</a>, we’ll see how to actually “synthesize a report” based on this.)</p> <p>The basic function of <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> is to determine what “chunks of content” are closest to what you are asking about. But given a semantic search index (AKA vector database) there are also other important things you can do. 
One of them is to use <tt><a href="http://reference.wolfram.com/language/ref/TextSummarize.html">TextSummarize</a></tt> to ask not for specific chunks but rather for some kind of overall summary of what can be said about a given topic from the content in the semantic search index:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024vectorimg20.png' alt='' title='' width='685' height='115'> </div> </p></div> <h2 id="rags-and-dynamic-prompting-for-llms">RAGs and Dynamic Prompting for LLMs</h2> <p>How does one tell an LLM what one wants it to do? Fundamentally, one provides a prompt, and then the LLM generates output that “continues” that prompt. Typically the last part of the prompt is the specific question (or whatever) that a user is asking. But before that, there’ll be “pre-prompts” that prime the LLM in various ways to determine how it should respond. </p> <p>In <a href="https://reference.wolfram.com/legacy/language/v13.3/">Version 13.3</a> in mid-2023 (i.e. a long time ago in the world of LLMs!) we introduced <tt><a href="http://reference.wolfram.com/language/ref/LLMPrompt.html">LLMPrompt</a></tt> as a symbolic way to specify a prompt, and we launched the <a href="https://resources.wolframcloud.com/PromptRepository">Wolfram Prompt Repository</a> as a broad source for pre-built prompts. 
Here’s an example of using <tt><a href="http://reference.wolfram.com/language/ref/LLMPrompt.html">LLMPrompt</a></tt> with a prompt from the Wolfram Prompt Repository:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg1.png' alt='' title='' width='570' height='44'> </div> </p></div> <p>In its simplest form, <tt><a href="http://reference.wolfram.com/language/ref/LLMPrompt.html">LLMPrompt</a></tt> just adds fixed text to “pre-prompt” the LLM. <tt><a href="http://reference.wolfram.com/language/ref/LLMPrompt.html">LLMPrompt</a></tt> is also set up to take arguments that modify the text it’s adding:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg2.png' alt='' title='' width='607' height='43'> </div> </p></div> <p>But what if one wants the LLM to be pre-prompted in a way that depends on information that’s only available once the user actually asks their question (like, for example, the text of the question itself)? In Version 14.1 we’re adding <tt><a href="http://reference.wolfram.com/language/ref/LLMPromptGenerator.html">LLMPromptGenerator</a></tt> to dynamically generate pre-prompts. And it turns out that this kind of “dynamic prompting” is remarkably powerful, and—particularly together with tool calling—opens up a whole new level of capabilities for LLMs. 
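</p> <p>A minimal sketch of the idea, assuming a prompt generator can be supplied through the <tt>"Prompts"</tt> element of the <tt>LLMEvaluator</tt> setting and that the generator function receives the user’s input as its argument (the name <tt>dateBriefing</tt> and the question are hypothetical, and a connected LLM service is required):</p> <pre>
(* Hypothetical generator: builds a pre-prompt stating the current date *)
(* Assumption: the function is applied to the user's input, which is ignored here *)
dateBriefing = LLMPromptGenerator[
  Function[input, StringJoin["Today's date is ", DateString["ISODate"], "."]]];

(* Assumption: generators are passed via the "Prompts" element of LLMEvaluator *)
LLMSynthesize["How many days are left in this month?",
  LLMEvaluator -> Association["Prompts" -> {dateBriefing}]]
</pre> <p>Because the generator is evaluated when the question is asked, the pre-prompt reflects the date at that moment rather than the date when the code was written.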
</p> <p>For example, we can set up a prompt generator that produces a pre-prompt that gives the registered name of the user, so the LLM can use this information when generating its answer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg3.png' alt='' title='' width='599' height='115'> </div> </p></div> <p>Or for example here the prompt generator is producing a pre-prompt about sunrise, sunset and the current time:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg4.png' alt='' title='' width='499' height='138'> </div> </p></div> <p>And, yes, if the pre-prompt contains extra information (like about the Moon) the LLM will (probably) ignore it:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg5.png' alt='' title='' width='479' height='161'> </div> </p></div> <p>As another example, we can take whatever the user asks, and first do a web search on it, then include as a pre-prompt snippets we get from the web. 
The result is that we can get answers from the LLM that rely on specific “web knowledge” that we can’t expect will be “known in detail” by the raw LLM:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg6.png' alt='' title='' width='582' height='208'> </div> </p></div> <p>But often one doesn’t want to just “search at random on the web”; instead one wants to systematically retrieve information from some known source to give as “briefing material” to the LLM to help it in generating its answer. And a typical way to implement this kind of “retrieval-augmented generation (RAG)” is to set up an <tt><a href="http://reference.wolfram.com/language/ref/LLMPromptGenerator.html">LLMPromptGenerator</a></tt> that uses the <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> and vector database capabilities that we introduced in Version 14.1.</p> <p>So, for example, here’s a semantic search index generated from my (rather voluminous) writings:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg7.png' alt='' title='' width='604' height='75'> </div> </p></div> <p>By setting up a prompt generator based on this, I can now ask the LLM “personal questions”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg8.png' alt='' title='' width='513' height='68'> </div> </p></div> <p>How did the LLM “know that”? 
Internally the prompt generator used <tt><a href="http://reference.wolfram.com/language/ref/SemanticSearch.html">SemanticSearch</a></tt> to generate a collection of snippets, which the LLM then “trawled through” to produce a specific answer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg9.png' alt='' title='' width='440' height='14'><img loading='lazy' style="margin-left: -3px" src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg9B.png' alt='' title='' width='500' height='204'> </div> </p></div> <p>It’s already often very useful just to “retrieve static text” to “brief” the LLM. But even more powerful is to brief the LLM with what it needs to call tools that can do further computation, etc. So, for example, if you want the LLM to write and run Wolfram Language code that uses functions you’ve created, you can do that by having it first “read the documentation” for those functions.</p> <p>As an example, this uses a prompt generator that uses a semantic search index built from the <a href="https://resources.wolframcloud.com/FunctionRepository/">Wolfram Function Repository</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg10A_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024RAGsimg10A.png' alt='' title='' width='674' height='242'> </div> </p></div> <h2 id="connect-to-your-favorite-llm">Connect to Your Favorite LLM</h2> <p>There are now many ways to use LLM functionality from within the Wolfram Language, and <a href="https://www.wolfram.com/notebooks/">Wolfram Notebooks</a>. 
You can do it programmatically, with <tt><a href="http://reference.wolfram.com/language/ref/LLMFunction.html">LLMFunction</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/LLMSynthesize.html">LLMSynthesize</a></tt>, etc. You can do it interactively through <a href="https://writings.stephenwolfram.com/2023/06/introducing-chat-notebooks-integrating-llms-into-the-notebook-paradigm/">Chat Notebooks</a> and related chat capabilities.</p> <p>But (at least for now) there’s no full-function LLM built directly into the Wolfram Language. So that means that (at least for now) you have to choose your “flavor” of external LLM to power Wolfram Language LLM functionality. And in Version 14.1 we have support for basically all major available foundation-model LLMs. </p> <p>We’ve made it as straightforward as possible to set up connections to external LLMs. Once you’ve done it, you can select what you want directly in any Chat Notebook</p> <p><img src='https://content.wolfram.com/sites/43/2024/08/sw08012024connectimg1-v2.png' alt='Choose your LLM' title='Choose your LLM' width='570' height='406'/></p> <p>or from your global <tt>Preferences</tt>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg2.png' alt='LLM global preferences' title='LLM global preferences' width='500' height='369'/></p> <p>When you’re using a function you specify the “model” (i.e. 
service and specific model name) as part of the setting for <tt><a href="http://reference.wolfram.com/language/ref/LLMEvaluator.html">LLMEvaluator</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg3.png' alt='' title='' width='472' height='68'> </div> </p></div> <p>In general you can use <tt><a href="http://reference.wolfram.com/language/ref/LLMConfiguration.html">LLMConfiguration</a></tt> to define the whole configuration of an LLM you want to connect to, and you can make a particular configuration your default either interactively using <tt>Preferences</tt>, or by explicitly setting the value of <tt><a href="http://reference.wolfram.com/language/ref/$LLMEvaluator.html">$LLMEvaluator</a></tt>.</p> <p>So how do you initially set up a connection to a new LLM? You can do it interactively by pressing <span class="kbd"><kbd>Connect</kbd></span> in the <tt>AI Settings</tt> pane of <tt>Preferences</tt>. Or you can do it programmatically using <tt><a href="http://reference.wolfram.com/language/ref/ServiceConnect.html">ServiceConnect</a></tt>:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg4.png' alt='ServiceConnect' title='ServiceConnect' width='376' height='397'></p> <p>At the “<tt><a href="http://reference.wolfram.com/language/ref/ServiceConnect.html">ServiceConnect</a></tt> level” you have very direct access to the features of LLM APIs, though unless you’re studying LLM APIs you probably won’t need to use these. 
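</p> <p>As a rough sketch of the settings just described (the service and model names here are placeholders, assumed for illustration rather than taken from the examples above), choosing a model for a single call, setting a session-wide default, and opening a connection might look something like this:</p> <pre>
(* Per-call choice: service and model name passed through LLMEvaluator *)
LLMSynthesize["Name three prime numbers.",
  LLMEvaluator -&gt; &lt;|"Model" -&gt; {"OpenAI", "gpt-4o"}|&gt;]

(* Session-wide default, set by assigning to $LLMEvaluator *)
$LLMEvaluator = LLMConfiguration[&lt;|"Model" -&gt; {"OpenAI", "gpt-4o"}|&gt;];

(* Programmatic connection; this may prompt for credentials on first use *)
ServiceConnect["OpenAI"]
</pre> <p>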
But talking of LLM APIs, one of the things that’s now easy to do with Wolfram Language is to compare LLMs, for example programmatically sending the same question to multiple LLMs:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg5A.png' alt='' title='' width='659' height='146'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07262024connectBimg5B.png' alt='' title='' width='620' height='612'> </div> </p></div> <p>And in fact we’ve recently started <a href="https://www.wolfram.com/llm-benchmarking-project/">posting weekly results</a> that we get from a full range of LLMs on the task of writing Wolfram Language code (conveniently, the exercises in my book <em><a href="https://www.wolfram.com/language/elementary-introduction/3rd-ed/">An Elementary Introduction to the Wolfram Language</a></em> have textual “prompts”, and we have a well-developed system that we’ve used for many years in assessing code for the online course based on the book):</p> <p><a href="https://www.wolfram.com/llm-benchmarking-project/" ><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024benchmarking.png' alt='Wolfram LLM Benchmarking Project' title='Wolfram LLM Benchmarking Project' width='568' height='441'/></a></p> <h2 id="symbolic-arrays-and-their-calculus">Symbolic Arrays and Their Calculus</h2> <p>I want A to be an <em>n</em>×<em>n</em> matrix. I don’t want to say what its elements are, and I don’t even want to say what <em>n</em> is. I just want to have a way to treat the whole thing symbolically. Well, in Version 14.1 we’ve introduced <tt><a href="http://reference.wolfram.com/language/ref/MatrixSymbol.html">MatrixSymbol</a></tt> to do that. 
</p> <p>A <tt><a href="http://reference.wolfram.com/language/ref/MatrixSymbol.html">MatrixSymbol</a></tt> has a name (just like an ordinary symbol)—and it has a way to specify its dimensions. We can use it, for example, to set up a symbolic representation for our matrix A:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg1.png' alt='' title='' width='263' height='46'> </div> </p></div> <p>Hovering over this in a notebook, we’ll get a tooltip that explains what it is:</p> <p><img style="margin-left: 6px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg2A.png' alt='Matrix dimensions tooltip' title='Matrix dimensions tooltip' width='175' height='78'/></p> <p>We can ask for its dimensions as a tensor:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg3.png' alt='' title='' width='196' height='44'> </div> </p></div> <p>Here’s its inverse, again represented symbolically:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg4.png' alt='' title='' width='120' height='47'> </div> </p></div> <p>That also has dimensions <em>n</em>×<em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg5.png' alt='' title='' width='269' height='44'> </div> </p></div> <p>In Version 14.1 you can not only have symbolic matrices, you can also have symbolic vectors and, for that matter, symbolic arrays of any rank. Here’s a length-<em>n</em> symbolic vector (and, yes, we can have a symbolic vector named <em>v</em> that we assign to a symbol <em>v</em>): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg6.png' alt='' title='' width='223' height='44'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg7.png' alt='' title='' width='209' height='44'> </div> </p></div> <p>So now we can construct something like the quadratic form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg8.png' alt='' title='' width='80' height='44'> </div> </p></div> <p>A classic thing to compute from this is its gradient with respect to the vector <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg9.png' alt='' title='' width='143' height='46'> </div> </p></div> <p>And actually this is just the same as the “vector derivative”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg10.png' alt='' title='' width='120' height='46'> </div> </p></div> <p>If we do a second derivative we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg11.png' alt='' title='' width='159' height='48'> </div> </p></div> <p>What happens if we differentiate <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'> with respect to <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'>? 
Well, then we get a symbolic identity matrix</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg12.png' alt='' title='' width='95' height='43'> </div> </p></div> <p>which again has dimensions <em>n</em>×<em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg13.png' alt='' title='' width='201' height='44'> </div> </p></div> <p><img loading='lazy' style="margin-bottom: -4px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgLN.png' alt='' title='' width='15' height='17'> is a rank-2 example of a symbolic identity array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg14.png' alt='' title='' width='247' height='43'> </div> </p></div> <p>If we give <em>n</em> an explicit value, we can get an explicit componentwise array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg15.png' alt='' title='' width='249' height='44'> </div> </p></div> <p>Let’s say we have a function of <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'>, like <tt><a 
href="http://reference.wolfram.com/language/ref/Total.html">Total</a></tt>. Once again we can find the derivative with respect to <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg16.png' alt='' title='' width='153' height='43'> </div> </p></div> <p>And now we see another symbolic array construct: <a href="https://reference.wolfram.com/language/ref/SymbolicOnesArray.html"><tt>SymbolicOnesArray</tt></a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg17.png' alt='' title='' width='220' height='57'> </div> </p></div> <p>This is simply an array whose elements are all 1:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg18.png' alt='' title='' width='257' height='44'> </div> </p></div> <p>Differentiating a second time gives us a <a href="https://reference.wolfram.com/language/ref/SymbolicZerosArray.html"><tt>SymbolicZerosArray</tt></a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg19.png' alt='' title='' width='191' height='45'> </div> 
</p></div> <p>Although we’re not defining explicit elements for <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'>, it’s sometimes important to specify, for example, that all the elements are reals:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg20.png' alt='' title='' width='275' height='44'> </div> </p></div> <p>For a vector whose elements are reals, it’s straightforward to find the derivative of the norm:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg21.png' alt='' title='' width='167' height='68'> </div> </p></div> <p>The third derivative, though, is a bit more complicated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg22.png' alt='' title='' width='515' height='70'> </div> </p></div> <p>The ⊗ here is <tt><a href="http://reference.wolfram.com/language/ref/TensorProduct.html">TensorProduct</a></tt>, and the T:(1,3,2) represents <tt><a href="http://reference.wolfram.com/language/ref/Transpose.html">Transpose</a></tt><tt>[..., {1, 3, 2}]</tt>. </p> <p>In the Wolfram Language, a symbol, say <em>s</em>, can stand on its own, and represent a “variable”. It can also appear as a head—as in <tt>s[x]</tt>—and represent a function. 
And the same is true for vector and matrix symbols:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg23.png' alt='' title='' width='252' height='47'> </div> </p></div> <p>Importantly, the chain rule also works for matrix and vector functions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg24.png' alt='' title='' width='187' height='49'> </div> </p></div> <p>Things get a bit trickier when one’s dealing with functions of matrices:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg25.png' alt='' title='' width='192' height='54'> </div> </p></div> <p>The <img loading='lazy' style="margin-bottom: -6px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgdot2.png' alt='' title='' width='9' height='22'/> here represents <tt><a href="http://reference.wolfram.com/language/ref/ArrayDot.html">ArrayDot</a></tt><tt>[..., ..., 2]</tt>, which is a generalization of <tt><a href="http://reference.wolfram.com/language/ref/Dot.html">Dot</a></tt>. 
Given two arrays <em>u</em> and <em>v</em>, <tt><a href="http://reference.wolfram.com/language/ref/Dot.html">Dot</a></tt> will contract the last index of <em>u</em> with the first index of <em>v</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg27.png' alt='' title='' width='502' height='44'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/ArrayDot.html">ArrayDot</a></tt><tt>[u, v, n]</tt>, on the other hand, contracts the last <em>n</em> indices of <em>u</em> with the first <em>n</em> of <em>v</em>. <nobr><tt><a href="http://reference.wolfram.com/language/ref/ArrayDot.html">ArrayDot</a></tt><tt>[u, v, 1]</tt></nobr> is just the same as <tt><a href="http://reference.wolfram.com/language/ref/Dot.html">Dot</a></tt><tt>[u, v]</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg28.png' alt='' title='' width='556' height='44'> </div> </p></div> <p>But now in this particular example all the indices get “contracted out”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg29.png' alt='' title='' width='556' height='44'> </div> </p></div> <p>We’ve talked about symbolic vectors and matrices. But—needless to say—what we have is completely general, and will work for arrays of any rank. 
Here’s an example of a <em>p</em>×<em>q</em>×<em>r</em> array:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg30_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg30.png' alt='' title='' width='272' height='55'> </div> </p></div> <p>The overscript <img loading='lazy' style="margin-bottom: -11px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgbox3.png' alt='' title='' width='21' height='29'> indicates that this is an array of rank 3. </p> <p>When one takes derivatives, it’s very easy to end up with high-rank arrays. Here’s the result of differentiating with respect to a matrix:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg31.png' alt='' title='' width='161' height='51'> </div> </p></div> <p><img loading='lazy' style="margin-bottom: -4px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgLLN.png' alt='' title='' width='24' height='18'> is a rank-4 <em>n</em>×<em>n</em>×<em>n</em>×<em>n</em> identity array.</p> <p>When one’s dealing with higher-rank objects, there’s one more construct that appears—that we call <tt><a href="http://reference.wolfram.com/language/ref/SymbolicDeltaProductArray.html">SymbolicDeltaProductArray</a></tt>. 
Let’s set up a rank-3 array with dimensions 3×3×3:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg32.png' alt='' title='' width='245' height='55'> </div> </p></div> <p>Now let’s compute a derivative:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg33.png' alt='' title='' width='147' height='67'> </div> </p></div> <p>The result is a rank-5 array that’s effectively a combination of two <tt><a href="http://reference.wolfram.com/language/ref/KroneckerDelta.html">KroneckerDelta</a></tt> objects for indices 1,4 and 2,5, respectively:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg34_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg34.png' alt='' title='' width='450' height='168'> </div> </p></div> <p>We can visualize this with <tt><a href="http://reference.wolfram.com/language/ref/ArrayPlot3D.html">ArrayPlot3D</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg35_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg35.png' alt='' title='' width='345' height='194'> </div> </p></div> <p>The most common way to deal with arrays in the Wolfram Language has always been in terms of explicit lists of elements. 
And in this representation it’s extremely convenient that operations are normally done elementwise:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg36_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg36.png' alt='' title='' width='181' height='44'> </div> </p></div> <p>Non-lists are then by default treated as scalars—and for example here added into every element:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg37_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg37.png' alt='' title='' width='173' height='43'> </div> </p></div> <p>But now there’s something new, namely symbolic arrays—which in effect implicitly contain multiple list elements, and thus can’t be “added into every element”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg38_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg38.png' alt='' title='' width='134' height='49'> </div> </p></div> <p>This is what happens when we have an “ordinary scalar” together with a symbolic vector:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg39_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg39.png' alt='' title='' width='200' height='49'> </div> </p></div> <p>How does this work? 
“Under the hood” there’s a new attribute <tt><a href="http://reference.wolfram.com/language/ref/NonThreadable.html">NonThreadable</a></tt> which specifies that certain heads (like <tt><a href="http://reference.wolfram.com/language/ref/ArraySymbol.html">ArraySymbol</a></tt>) shouldn’t be threaded by <tt><a href="http://reference.wolfram.com/language/ref/Listable.html">Listable</a></tt> functions (like <tt><a href="http://reference.wolfram.com/language/ref/Plus.html">Plus</a></tt>).</p> <p>By the way, ever since <a href="https://reference.wolfram.com/legacy/v9/guide/Mathematica.html">Version 9</a> a dozen years ago we’ve had a limited mechanism for assuming that symbols represent vectors, matrices or arrays—and now that mechanism interoperates with all our new symbolic array functionality:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg40_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg40.png' alt='' title='' width='457' height='44'> </div> </p></div> <p>When you’re doing explicit computations there’s often no choice but to deal directly with individual array elements. But it turns out that there are all sorts of situations where it’s possible to work instead in terms of “whole” vectors, matrices, etc. And indeed in the literature of fields like machine learning, optimization, statistics and control theory, it’s become quite routine to write down formulas in terms of symbolic vectors, matrices, etc. And what Version 14.1 now adds is a streamlined way to compute in terms of these symbolic array constructs.</p> <p>The results are often very elegant. So, for example, here’s how one might set up a general linear least-squares problem using our new symbolic array constructs. 
First we define a symbolic <em>n</em>×<em>m</em> matrix A, and symbolic vectors <em>b</em> and <em>x</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg41_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBC2img41.png' alt='' title='' width='320' height='66'> </div> </p></div> <p>Our goal is to find a vector <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimgvector.png' alt='' title='' width='13' height='17'> that minimizes <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg42.png' width= '70' height='23' align='absmiddle'></span>. And with our definitions we can now immediately write down this quantity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg43_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBC2img43.png' alt='' title='' width='305' height='66'> </div> </p></div> <p>To extremize it, we compute its derivative</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg44_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg44.png' alt='' title='' width='195' height='58'> </div> </p></div> <p>and to ensure we get a minimum, we compute the second derivative:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg45_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg45.png' alt='' title='' width='234' 
height='61'> </div> </p></div> <p>These are standard textbook formulas, but the cool thing is that in Version 14.1 we’re now in a position to generate them completely automatically. By the way, if we take another derivative, the result will be a zero tensor:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg46_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg46.png' alt='' title='' width='234' height='50'> </div> </p></div> <p>We can look at other norms too:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg47_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg47.png' alt='' title='' width='209' height='81'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024arrayimg48_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024arrayBCimg48.png' alt='' title='' width='610' height='94'> </div> </p></div> <h2 id="binomials-and-pitchforks-navigating-mathematical-conventions">Binomials and Pitchforks: Navigating Mathematical Conventions</h2> <p>Binomial coefficients have been around for at least a thousand years, and one might not have thought there could possibly be anything shocking or controversial about them anymore (notwithstanding the fictional <em><a href="https://en.wikipedia.org/wiki/A_Treatise_on_the_Binomial_Theorem" target="_blank" rel="noopener">Treatise on the Binomial Theorem</a></em> by Sherlock Holmes’s nemesis Professor Moriarty). 
But in fact we have recently been mired in an intense debate about binomial coefficients—which has caused us in Version 14.1 to introduce a new function <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt> alongside our existing <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt>.</p> <p>When one’s dealing with positive integer arguments, there’s no issue with binomials. And even when one extends to generic complex arguments, there’s again a unique way to do this. But negative integer arguments are a special degenerate case. And that’s where there’s trouble—because there are two different definitions that have historically been used.</p> <p>In early versions of Mathematica, we picked one of these definitions. But over time we realized that it led to some subtle inconsistencies, and so for <a href="https://reference.wolfram.com/legacy/v7/guide/Mathematica.html">Version 7</a>—in 2008—we changed to the other definition. Some of our users were happy with the change, but some were definitely not. A notable (vociferous) example was my friend <a href="https://www.wolframalpha.com/input?i=don+knuth">Don Knuth</a>, who has written several well-known books that make use of binomial coefficients—always choosing what amounts to our pre-2008 definition. </p> <p>So what could we do about this? For a while we thought about adding an option to <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt>, but to do this would have broken our normal conventions for mathematical functions. And somehow we kept on thinking that there was ultimately a “right answer” to how binomial coefficients should be defined. But after a lot of discussion—and historical research—we finally concluded that, at least since before 1950, there have just been two possible definitions, each with its own advantages and disadvantages, with no obvious “winner”. 
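</p> <p>To make the disagreement a little more concrete, here is a hedged sketch (not one of the original examples). The convention used in Knuth’s books sets the coefficient to 0 whenever the lower argument is a negative integer, which keeps Pascal’s recurrence C(<em>n</em>, <em>k</em>) = C(<em>n</em> − 1, <em>k</em> − 1) + C(<em>n</em> − 1, <em>k</em>) valid for all integers. The alternative instead keeps the symmetry C(<em>n</em>, <em>k</em>) = C(<em>n</em>, <em>n</em> − <em>k</em>) for negative integer <em>n</em>. Assuming that <tt>PascalBinomial</tt> follows the first convention and <tt>Binomial</tt> the second, checks along these lines should expose the difference (<tt>pascalQ</tt> and <tt>symmetricQ</tt> are hypothetical helper names introduced only for this illustration):</p> <pre>
(* Sketch only: compare the two functions at a negative-integer point *)
{Binomial[-1, -1], PascalBinomial[-1, -1]}

(* Pascal's recurrence at n = 0, k = 0, tested with a given function f *)
pascalQ[f_] := f[0, 0] == f[-1, -1] + f[-1, 0]
{pascalQ[Binomial], pascalQ[PascalBinomial]}

(* Symmetry C(n, k) == C(n, n - k) at n = -1, k = -1, where n - k = 0 *)
symmetricQ[f_] := f[-1, -1] == f[-1, 0]
{symmetricQ[Binomial], symmetricQ[PascalBinomial]}
</pre> <p>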
And so in Version 14.1 we decided just to introduce a new function <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt> to cover the “other definition”. </p> <p>And—though at first it might not seem like much—here’s a big difference between <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg1.png' alt='' title='' width='170' height='43'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg2.png' alt='' title='' width='215' height='43'> </div> </p></div> <p>Part of why things get complicated is the relation to symbolic computation. 
<tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt> has a symbolic simplification rule, valid for any <em>n</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg3.png' alt='' title='' width='149' height='43'> </div> </p></div> <p>But there isn’t a corresponding generic simplification rule for <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg4.png' alt='' title='' width='193' height='44'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/FunctionExpand.html">FunctionExpand</a></tt> shows us the more nuanced result in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg5.png' alt='' title='' width='419' height='61'> </div> </p></div> <p>To see a bit more of what’s going on, we can compute arrays of nonzero results for <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg6.png' alt='' title='' width='661' height='297'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt><tt>[n, k]</tt> has the “nice feature” that it’s symmetric in <em>k</em> even when <em>n</em> &lt; 0. But this has the “bad consequence” that <a href="https://artofproblemsolving.com/wiki/index.php/Pascal%27s_Identity" target="_blank" rel="noopener">Pascal’s identity</a> (that says a particular binomial coefficient is the sum of two coefficients “above it”) isn’t always true. <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt>, on the other hand, always satisfies the identity, and it’s in recognition of this that we put “Pascal” in its name.</p> <p>And, yes, this is all quite subtle. And, remember, the differences between <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt> only show up at negative integer values. Away from such values, they’re both given by the same expression, involving gamma functions. 
But at negative integer values, they correspond to different limits, respectively:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg7.png' alt='' title='' width='667' height='69'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg8.png' alt='' title='' width='667' height='69'> </div> </p></div> <p>The story of <tt><a href="http://reference.wolfram.com/language/ref/Binomial.html">Binomial</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/PascalBinomial.html">PascalBinomial</a></tt> is a complicated one that mainly affects only the upper reaches of discrete mathematics. 
But there’s another, much more elementary convention that we’ve also tackled in Version 14.1: the convention of what the arguments of trigonometric functions mean.</p> <p>We’ve always taken the “fundamentally mathematical” point of view that the <em>x</em> in <tt><a href="http://reference.wolfram.com/language/ref/Sin.html">Sin</a></tt><tt>[x]</tt> is in radians:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg9.png' alt='' title='' width='117' height='70'> </div> </p></div> <p>You’ve always been able to explicitly give the argument in degrees (using <tt><a href="http://reference.wolfram.com/language/ref/Degree.html">Degree</a></tt>—or after <a href="https://reference.wolfram.com/legacy/v3/">Version 3</a> in 1996—using °): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg10.png' alt='' title='' width='108' height='70'> </div> </p></div> <p>But a different convention would just say that the argument to <tt><a href="http://reference.wolfram.com/language/ref/Sin.html">Sin</a></tt> should always be interpreted as being in degrees, even if it’s just a plain number. Calculators would often have a physical switch that globally toggles to this convention. And while that might be OK if you are just doing a small calculation and can physically see the switch, nothing like that would make any sense at all in our system. But still, particularly in elementary mathematics, one might want a “degrees version” of trigonometric functions. 
And in Version 14.1 we’ve introduced these:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg11.png' alt='' title='' width='154' height='70'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg12.png' alt='' title='' width='173' height='43'> </div> </p></div> <p>One might think this was somewhat trivial. But what’s nontrivial is that the “degrees trigonometric functions” are consistently integrated throughout the system. Here, for example, is the period in <tt><a href="http://reference.wolfram.com/language/ref/SinDegrees.html">SinDegrees</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg13.png' alt='' title='' width='290' height='43'> </div> </p></div> <p>You can take the integral as well</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024binomialsimg14.png' alt='' title='' width='248' height='63'> </div> </p></div> <p>and the messiness of this form shows why for more than three decades we’ve just dealt with <tt><a href="http://reference.wolfram.com/language/ref/Sin.html">Sin</a></tt><tt>[x]</tt> and radians.</p> <h2 
id="fixed-points-and-stability-for-differential-and-difference-equations">Fixed Points and Stability for Differential and Difference Equations</h2> <p>All sorts of differential equations have the feature that their solutions exhibit fixed points. It’s always in principle been possible to find these by looking for points where derivatives vanish. But in Version 14.1 we now have a general, robust function that takes the same form of input as <tt><a href="http://reference.wolfram.com/language/ref/DSolve.html">DSolve</a></tt> and finds all fixed points:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg1.png' alt='' title='' width='609' height='95'> </div> </p></div> <p>Here’s a stream plot of the solutions to our equations, together with the fixed points we’ve found:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg2.png' alt='' title='' width='652' height='412'> </div> </p></div> <p>And we can see that there are two different kinds of fixed points here. The ones on the left and right are “stable” in the sense that solutions that start near them always stay near them. But it’s a different story for the fixed points at the top and bottom; for these, solutions that start nearby can diverge. 
The function <tt><a href="http://reference.wolfram.com/language/ref/DStabilityConditions.html">DStabilityConditions</a></tt> computes fixed points, and specifies whether they are stable or not:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg3.png' alt='' title='' width='661' height='95'> </div> </p></div> <p>As another example, here are the <a href="https://mathworld.wolfram.com/LorenzEquations.html">Lorenz equations</a>, which have one unstable fixed point, and two stable ones:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg4.png' alt='' title='' width='657' height='77'> </div> </p></div> <p>If your equations have parameters, the stability of their fixed points can depend on those parameters:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg5.png' alt='' title='' width='641' height='63'> </div> </p></div> <p>Extracting the conditions here, we can now plot the region of parameter space where this fixed point is stable:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg6.png' alt='' title='' width='357' height='237'> </div> </p></div> <p>This kind of 
stability analysis is important in all sorts of fields, including dynamical systems theory, control theory, celestial mechanics and computational ecology.</p> <p>And just as one can find fixed points and do stability analysis for differential equations, one can also do it for difference equations—and this is important for discrete dynamical systems, digital control systems, and for iterative numerical algorithms. Here’s a classic example in Version 14.1 for the logistic map:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024differentialimg7.png' alt='' title='' width='462' height='63'> </div> </p></div> <h2 id="the-steady-advance-of-pdes">The Steady Advance of PDEs</h2> <p>Five years ago—in <a href="https://reference.wolfram.com/legacy/language/v11.3/">Version 11.3</a>—we introduced our <a href="https://writings.stephenwolfram.com/2024/01/the-story-continues-announcing-version-14-of-wolfram-language-and-mathematica/#industrial-strength-multidomain-pdes">framework for symbolically representing physical systems using PDEs</a>. And in every version since we’ve been steadily adding more and more capabilities. At this point we’ve now covered the basics of heat transfer, mass transport, acoustics, solid mechanics, fluid mechanics, electromagnetics and (one-particle) quantum mechanics. And with our underlying symbolic framework, it’s easy to mix components of all these different kinds. </p> <p>Our goal now is to progressively cover what’s needed for more and more kinds of applications. 
So in Version 14.1 we’re adding <a href="https://en.wikipedia.org/wiki/Von_Mises_yield_criterion" target="_blank" rel="noopener">von Mises stress analysis</a> for solid mechanics, electric current density models for electromagnetics and anisotropic effective masses for quantum mechanics. </p> <p>So as an example of what’s now possible, here’s a piece of geometry representing a spiral inductor of the kind that might be used in a modern MEMS device:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg1.png' alt='' title='' width='191' height='88'> </div> </p></div> <p>Let’s define our variables—voltage and position:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg2.png' alt='' title='' width='251' height='14'> </div> </p></div> <p>And let’s specify parameters—here just that the material we’re going to deal with is copper:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg3.png' alt='' title='' width='322' height='21'> </div> </p></div> <p>Now we’re in a position to set up the PDE for this system, making use of the new constructs <tt><a href="http://reference.wolfram.com/language/ref/ElectricCurrentPDEComponent.html">ElectricCurrentPDEComponent</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/ElectricCurrentDensityValue.html">ElectricCurrentDensityValue</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper 
writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg4.png' alt='' title='' width='622' height='62'> </div> </p></div> <p>All it takes to solve this PDE for the voltage is then:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg5.png' alt='' title='' width='513' height='75'> </div> </p></div> <p>From the voltage we can compute the current density</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg6.png' alt='' title='' width='239' height='14'> </div> </p></div> <p>and then plot it (and, yes, the current tends to avoid the corners):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08012024PDEsAimg7.png' alt='' title='' width='607' height='323'> </div> </p></div> <h2 id="symbolic-biomolecules-and-their-visualization">Symbolic Biomolecules and Their Visualization</h2> <p>Ever since <a href="https://reference.wolfram.com/legacy/language/v12.2/">Version 12.2</a> we’ve had the ability to represent and manipulate bio sequences of the kind that appear in DNA, RNA and proteins. We’ve also been able to do things like import PDB (Protein Data Bank) files and generate graphics from them. 
But now in Version 14.1 we’re adding a symbolic <tt><a href="http://reference.wolfram.com/language/ref/BioMolecule.html">BioMolecule</a></tt> construct, to represent the full structure of biomolecules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg1.png' alt='' title='' width='285' height='92'> </div> </p></div> <p>Ultimately this is “just a molecule” (and in this case its data is so big it’s not by default stored locally in your notebook):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg2.png' alt='' title='' width='392' height='102'> </div> </p></div> <p>But what <tt><a href="http://reference.wolfram.com/language/ref/BioMolecule.html">BioMolecule</a></tt> does is also to capture the “higher-order structure” of the molecule, for example how it’s built up from distinct chains, where structures like <em>α</em>-helices occur in these, and so on. 
For example, here are the two (bio sequence) chains that appear in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg3.png' alt='' title='' width='470' height='172'> </div> </p></div> <p>And here is where the <em>α</em>-helices occur:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg4.png' alt='' title='' width='459' height='104'> </div> </p></div> <p>What about visualization? Well, there’s <tt><a href="http://reference.wolfram.com/language/ref/BioMoleculePlot3D.html">BioMoleculePlot3D</a></tt> for that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg5.png' alt='' title='' width='446' height='282'> </div> </p></div> <p>There are different “themes” you can use for this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg6.png' alt='' title='' width='444' height='281'> </div> </p></div> <p>Here’s a raw-atom-level view:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' 
src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg7.png' alt='' title='' width='605' height='353'> </div> </p></div> <p>You can combine the views—and for example add coordinate values (specified in angstroms):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg8.png' alt='' title='' width='556' height='575'> </div> </p></div> <p>You can also specify “color rules” that define how particular parts of the biomolecule should be rendered:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg9.png' alt='' title='' width='444' height='289'> </div> </p></div> <p>But the structure here isn’t just something you can make graphics out of; it’s also something you can compute with. 
For example, here’s a geometric region formed from the biomolecule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg10.png' alt='' title='' width='506' height='259'> </div> </p></div> <p>And this computes its surface area (in square angstroms):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024biomoleculeC2Cimg1_copy.txt' data-c2c-type='text/html'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg11.png' alt='' title='' width='178' height='99'> </div> </p></div> <p>The Wolfram Language has built-in data on a certain number of proteins. But you can get data on many more proteins from external sources—specifying them with external identifiers:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg12.png' alt='' title='' width='366' height='55'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg13.png' alt='' title='' width='389' height='395'> </div> </p></div> <p>When you get a protein—say from an external source—it’ll often come with a 3D structure specified, for example as deduced from experimental measurements. 
But even without that, Version 14.1 will attempt to find at least an approximate structure—by using machine-learning-based protein-folding methods. As an example, here’s a random bio sequence:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg14.png' alt='' title='' width='641' height='98'> </div> </p></div> <p>If you make a <tt><a href="http://reference.wolfram.com/language/ref/BioMolecule.html">BioMolecule</a></tt> out of this, a “predicted” 3D structure will be generated:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg15.png' alt='' title='' width='351' height='80'> </div> </p></div> <p>Here’s a visualization of this structure—though more work would be needed to determine how it’s related to what one might actually observe experimentally:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024biomoleculesimg16.png' alt='' title='' width='266' height='308'> </div> </p></div> <h2 id="optimizing-neural-nets-for-gpus-and-npus">Optimizing Neural Nets for GPUs and NPUs</h2> <p>Many computers now come with GPU and NPU hardware accelerators for machine learning, and in Version 14.1 we’ve added more support for these. 
Specifically, on macOS (Apple Silicon) and Windows machines, built-in functions like <tt><a href="http://reference.wolfram.com/language/ref/ImageIdentify.html">ImageIdentify</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> now automatically use CoreML (Neural Engine) and DirectML capabilities—and the result is typically 2X to 10X faster performance. </p> <p>We’ve always supported explicit CUDA GPU acceleration, for both training and inference. But in Version 14.1 we now support CoreML and DirectML acceleration for inference tasks with explicitly specified neural nets. But whereas this acceleration is now the default for built-in machine-learning-based functions, for explicitly specified models it isn’t yet the default. </p> <p>So, for example, this doesn’t use GPU acceleration:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg1.png' alt='' title='' width='585' height='153'> </div> </p></div> <p>But you can explicitly request it—and then (assuming all features of the model can be accelerated) things will typically run significantly faster:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg2.png' alt='' title='' width='633' height='188'> </div> </p></div> <p>We’re continually sprucing up our infrastructure for machine learning. 
And as part of that, in Version 14.1 we’ve enhanced our diagrams for neural nets to make layers more visually distinct—and to immediately produce diagrams suitable for publication:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024neuralimg3.png' alt='' title='' width='662' height='446'> </div> </p></div> <h2 id="the-statistics-of-dates">The Statistics of Dates</h2> <p>We’ve been releasing versions of what’s now the Wolfram Language for 36 years. And looking at that whole collection of release dates, we can ask statistical questions. Like “What’s the median date for all the releases so far?” Well, in Version 14.1 there’s a direct way to answer that—because statistical functions like <tt><a href="http://reference.wolfram.com/language/ref/Median.html">Median</a></tt> now just immediately work on dates:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg1B.png' alt='' title='' width='616' height='175'> </div> </p></div> <p>What if we ask about all 7000 or so functions in the Wolfram Language? 
Here’s a histogram of when they were introduced:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg2.png' alt='' title='' width='494' height='183'> </div> </p></div> <p>And now we can compute the median, showing quantitatively that, yes, Wolfram Language development has speeded up:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg3.png' alt='' title='' width='440' height='47'> </div> </p></div> <p>Dates are a bit like numbers, but not quite. For example, their “zero” shifts around depending on the calendar. And their granularity is more complicated than precision for numbers. In addition, a single date can have multiple different representations (say in different calendars or with different granularities). But it nevertheless turns out to be possible to define many kinds of statistics for dates. To understand these statistics—and to compute them—it’s typically convenient to make one’s whole collection of dates have the same form. 
And in Version 14.1 this can be achieved with the new function <a href="https://reference.wolfram.com/language/ref/ConformDates.html"><tt>ConformDates</tt></a> (which here converts all dates to the format of the first one listed):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg4.png' alt='' title='' width='570' height='58'> </div> </p></div> <p>By the way, in Version 14.1 the whole pipeline for handling dates (and times) has been dramatically speeded up, most notably conversion from strings, as needed in the import of dates. </p> <p>The concept of doing statistics on dates introduces another new idea: date (and time) distributions. And in Version 14.1 there are two new functions <a href="http://reference.wolfram.com/language/ref/DateDistribution.html"><tt>DateDistribution</tt></a> and <a href="http://reference.wolfram.com/language/ref/TimeDistribution.html"><tt>TimeDistribution</tt></a> for defining such distributions. 
Unlike for numerical (or quantity) distributions, date and time distributions require the specification of an origin, like <tt><a href="http://reference.wolfram.com/language/ref/Today.html">Today</a></tt>, as well as of a scale, like <tt>"Days"</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg5.png' alt='' title='' width='466' height='88'> </div> </p></div> <p>But given this symbolic specification, we can now do operations just like for any other distribution, say generating some random variates: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024datesimg6.png' alt='' title='' width='734' height='48'> </div> </p></div> <h2 id="building-videos-with-programs">Building Videos with Programs</h2> <p>Introduced in <a href="https://reference.wolfram.com/legacy/v6/guide/Mathematica.html">Version 6</a> back in 2007, <tt><a href="http://reference.wolfram.com/language/ref/Manipulate.html">Manipulate</a></tt> provides an immediate way to create an interactive “manipulable” interface. And it’s been possible for a long time to export <tt><a href="http://reference.wolfram.com/language/ref/Manipulate.html">Manipulate</a></tt> objects to video. But just what should happen in the video? What sliders should move in what way? In <a href="https://reference.wolfram.com/legacy/language/v12.3/">Version 12.3</a> we introduced <tt><a href="http://reference.wolfram.com/language/ref/AnimationVideo.html">AnimationVideo</a></tt> to let you make a video in which one parameter is changing with time. 
But now in Version 14.1 we have <tt><a href="http://reference.wolfram.com/language/ref/ManipulateVideo.html">ManipulateVideo</a></tt>, which lets you create a video in which many parameters can be varied simultaneously. One way to specify what you want is to say, for each parameter, what value it should get at a sequence of times (by default measured in seconds from the beginning of the video). <tt><a href="http://reference.wolfram.com/language/ref/ManipulateVideo.html">ManipulateVideo</a></tt> then produces a smooth video by interpolating between these values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg1_copy.txt' data-c2c-type='text/html'> <video autoplay width="567" height="495" loop><source src="https://content.wolfram.com/sites/43/2024/07/manipulatevideo.mp4" type="video/mp4"></video></div> <div> <p>(An alternative is to specify complete “keyframes” by giving operations to be done at particular times.) </p> <p><tt><a href="http://reference.wolfram.com/language/ref/ManipulateVideo.html">ManipulateVideo</a></tt> in a sense provides a “holistic” way to create a video by controlling a <tt><a href="http://reference.wolfram.com/language/ref/Manipulate.html">Manipulate</a></tt>. And in the last several versions we’ve introduced many functions for creating videos from “existing structures” (for example, <tt><a href="http://reference.wolfram.com/language/ref/FrameListVideo.html">FrameListVideo</a></tt> assembles a video from a list of frames). But sometimes you want to build up videos one frame at a time. And in Version 14.1 we’ve introduced <tt><a href="http://reference.wolfram.com/language/ref/SowVideo.html">SowVideo</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/ReapVideo.html">ReapVideo</a></tt> for doing this. 
They’re basically the analog of <tt><a href="http://reference.wolfram.com/language/ref/Sow.html">Sow</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/Reap.html">Reap</a></tt> for video frames. <tt><a href="http://reference.wolfram.com/language/ref/SowVideo.html">SowVideo</a></tt> will “sow” one or more frames, and all frames you sow will then be collected and assembled into a video by <tt><a href="http://reference.wolfram.com/language/ref/ReapVideo.html">ReapVideo</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg2_copy.txt' data-c2c-type='text/html'> <video autoplay width="501" height="427" loop><source src="https://content.wolfram.com/sites/43/2024/07/reapvideo.mp4" type="video/mp4"></video></div> <div> <p>One common application of <tt><a href="http://reference.wolfram.com/language/ref/SowVideo.html">SowVideo</a></tt>/<tt><a href="http://reference.wolfram.com/language/ref/ReapVideo.html">ReapVideo</a></tt> is to assemble a video from frames that are programmatically picked out by some criterion from some other video. 
So, for example, this “sows” frames that contain a bird, then “reaps” them to assemble a new video:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg3_copy.txt' data-c2c-type='text/html'> <video autoplay width="560" height="539" loop><source src="https://content.wolfram.com/sites/43/2024/07/hummingbird.mp4" type="video/mp4"></video></div> <div> <p>Another way to programmatically create one video from another is to build up a new video by progressively “folding in” frames from an existing video, which is what the new function <tt><a href="http://reference.wolfram.com/language/ref/VideoFrameFold.html">VideoFrameFold</a></tt> does:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg4_copy.txt' data-c2c-type='text/html'> <video autoplay width="594" height="498" loop><source src="https://content.wolfram.com/sites/43/2024/07/butterfly.mp4" type="video/mp4"></video></div> <div> <p>Version 14.1 also has a variety of new “convenience functions” for dealing with videos. 
One example is <tt><a href="http://reference.wolfram.com/language/ref/VideoSummaryPlot.html">VideoSummaryPlot</a></tt>, which generates various “at-a-glance” summaries of videos (and their audio):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg5.png' alt='' title='' width='402' height='266'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg6.png' alt='' title='' width='426' height='232'> </div> </p></div> <p>Another new feature in Version 14.1 is the ability to apply audio processing functions directly to videos:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg7.png' alt='' title='' width='402' height='279'> </div> </p></div> <p>And, yes, it’s a bird:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024videoimg8.png' alt='' title='' width='333' height='173'> </div> </p></div> <h2 id="optimizing-the-speech-recognition-workflow">Optimizing the Speech Recognition Workflow</h2> <p>We first introduced <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> in 2019 in <a href="https://reference.wolfram.com/legacy/language/v12/">Version 12.0</a>. 
And now in Version 14.1 <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> is getting a makeover. </p> <p>The most dramatic change is speed. In the past, <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> would typically take at least as long to recognize a piece of speech as the duration of the speech itself. But now in Version 14.1, <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> runs many tens of times faster, so you can recognize speech much faster than real time.</p> <p>And what’s more, <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> now produces full, written text, complete with capitalization, punctuation, etc. So here, for example, is a transcription of <a href="https://www.wolframcloud.com/obj/5f070490-8074-441c-beba-9490d95e125a">a little video</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024speechNewC2Cimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024speechBimg1.png' alt='' title='' width='527' height='166'> </div> </p></div> <p>There’s also a new function, <tt><a href="http://reference.wolfram.com/language/ref/VideoTranscribe.html">VideoTranscribe</a></tt>, that will take a video, transcribe its audio, and insert the transcription back into the subtitle track of the video.</p> <p>And, by the way, <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> runs entirely locally on your computer, without having to access a server (except maybe for updates to the neural net it’s using). </p> <p>In the past <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> could only handle English. 
In Version 14.1 it can handle 100 languages—and can automatically produce translated transcriptions. (By default it produces transcriptions in the language you’ve specified with <tt><a href="http://reference.wolfram.com/language/ref/$Language.html">$Language</a></tt>.) And if you want to identify what language a piece of audio is in, <tt><a href="http://reference.wolfram.com/language/ref/LanguageIdentify.html">LanguageIdentify</a></tt> now works directly on audio.</p> <p><tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> by default produces a single string of text. But it now also has the option to break up its results into a list, say of sentences:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024speechNewC2Cimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024speechBimg2.png' alt='' title='' width='548' height='166'> </div> </p></div> <p>And in addition to producing a transcription, <tt><a href="http://reference.wolfram.com/language/ref/SpeechRecognize.html">SpeechRecognize</a></tt> can give time intervals or audio fragments for each element:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024speechNewC2Cimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024speechBimg3A.png' alt='' title='' width='662' height='283'> </div> </p></div> <h2 id="historical-geography-becomes-computable">Historical Geography Becomes Computable</h2> <p>History is complicated. But that doesn’t mean there isn’t much that can be made computable about it. And in Version 14.1 we’re taking a major step forward in making historical geography computable. 
We’ve had extensive geographic computation capabilities in the Wolfram Language for well over a decade. And in Version 14.1 we’re extending that to historical geography.</p> <p>So now you can not only ask for a map of where the current country of Italy is, you can also ask to make a map of the Roman Empire in 100 AD:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg1.png' alt='' title='' width='558' height='338'> </div> </p></div> <p>And “the Roman Empire in 100 AD” is now a computable entity. So you can ask for example what its approximate area was:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg2.png' alt='' title='' width='535' height='58'> </div> </p></div> <p>And you can even make a plot of how the area of the Roman Empire changed over the period from 0 AD to 200 AD:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg3A.png' alt='' title='' width='662' height='283'> </div> </p></div> <p>We’ve been building our knowledgebase of historical geography for many years. Of course, country borders may be disputed, and—particularly in the more distant past—may not have been well defined. But by now we’ve accumulated computable data on basically all of the few thousand known historical countries. 
Still, with history being complicated, it’s not surprising that there are all sorts of often subtle issues.</p> <p>Let’s start by asking what historical countries the location that’s now Mexico City has been in. <tt><a href="http://reference.wolfram.com/language/ref/GeoIdentify.html">GeoIdentify</a></tt> gives the answer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg4.png' alt='' title='' width='606' height='102'> </div> </p></div> <p>And already we see subtlety. For example, our historical country entities are labeled by their overall beginning and ending dates. But most of them covered Mexico City only for part of their existence. And here we can see what’s going on:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg6.png' alt='' title='' width='483' height='25'><img style="margin-left: -3px" src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg5A.png' alt='' title='' width='542' height='225'/> </div> </p></div> <p>Often there’s subtlety in identifying what should count as a “different country”. If there was just an “acquisition” or a small “change of management”, maybe it’s still the same country. But if there was a “dramatic reorganization”, maybe it’s a different country. Sometimes the names of countries (if they even had official names) give clues. But in general it’s taken lots of case-by-case curation, trying to follow the typical conventions used by historians of particular times and places. 
</p> <p>For London we see several “close-but-we-consider-it-a-different-country” issues—along with various confusing repeated conquerings and reconquerings:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg7.png' alt='' title='' width='651' height='244'> </div> </p></div> <p>Here’s a timeline plot of the countries that have contained London:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg8.png' alt='' title='' width='504' height='711'> </div> </p></div> <p>And because everything is computable, it’s easy to identify the longest contiguous segment here:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg9.png' alt='' title='' width='609' height='121'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/GeoIdentify.html">GeoIdentify</a></tt> can tell us what entities something like a city is inside. <tt><a href="http://reference.wolfram.com/language/ref/GeoEntities.html">GeoEntities</a></tt>, on the other hand, can tell us what entities are inside something like a country. 
So, for example, this tells us what historical countries were inside (or at least overlapped with) the current boundaries of the UK in 800 AD:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg10.png' alt='' title='' width='703' height='102'> </div> </p></div> <p>This then makes a map (the extra list makes these countries be rendered separately):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg11.png' alt='' title='' width='545' height='326'> </div> </p></div> <p>In the Wolfram Language we have data on quite a few kinds of historical entities beyond countries. For example, we have extensive data on military conflicts. 
Here we’re asking what military conflicts occurred within the borders of what’s now France between 200 BC and 200 AD:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg12.png' alt='' title='' width='700' height='102'> </div> </p></div> <p>Here’s a map of their locations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg13.png' alt='' title='' width='353' height='230'> </div> </p></div> <p>And here are conflicts in the Atlantic Ocean in the period 1939–1945:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg14.png' alt='' title='' width='545' height='348'> </div> </p></div> <p>And—combining several things—here’s a map of conflicts that, at the time when they occurred, were within the region of what was then Carthage:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg15.png' alt='' title='' width='632' height='227'> </div> </p></div> <p>There are all sorts of things that we can compute from historical geography. 
For example, this asks for the (minimum) geo distance between the territory of the Roman Empire and the Han Dynasty in 100 AD:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg16.png' alt='' title='' width='528' height='93'> </div> </p></div> <p>But what about the overall minimum distance across all years when these historical countries existed? This gives the result for that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg17.png' alt='' title='' width='438' height='93'> </div> </p></div> <p>Let’s compare this with a plot of these two entities:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg18.png' alt='' title='' width='562' height='253'> </div> </p></div> <p>But there’s a subtlety here. What version of the Roman Empire is it that we’re showing on the map here? Our convention is by default to show historical countries “at their zenith”, i.e. at the moment when they had their maximum extent. </p> <p>But what about other choices? <tt><a href="http://reference.wolfram.com/language/ref/Dated.html">Dated</a></tt> gives us a way to specify a particular date. But another possibility is to include in what we consider to be a particular historical country any territory that was ever part of that country, at any time in its history. 
And you can do this using <tt><a href="http://reference.wolfram.com/language/ref/GeoVariant.html">GeoVariant</a></tt><tt>[…, "UnionArea"]</tt>. In the particular case we’re showing here, it doesn’t make much difference, except that there’s more territory in Germany and Scotland included in the Roman Empire:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg19.png' alt='' title='' width='629' height='253'> </div> </p></div> <p>By the way, you can combine <tt><a href="http://reference.wolfram.com/language/ref/Dated.html">Dated</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/GeoVariant.html">GeoVariant</a></tt>, to get things like “the zenith within a certain period” or “any territory that was included at any time within a period”. And, yes, it can get quite complicated. 
In a rather physics-like way you can think of the extent of a historical country as defining a region in spacetime—and indeed <tt><a href="http://reference.wolfram.com/language/ref/GeoVariant.html">GeoVariant</a></tt><tt>[…, "TimeSeries"]</tt> in effect represents a whole “stack of spacelike slices” in this spacetime region:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg20.png' alt='' title='' width='652' height='89'> </div> </p></div> <p>And—though it takes a little while—you can use it to make a video of the rise and fall of the Roman Empire:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07292024geographyimg21_copy.txt' data-c2c-type='text/html'> <video autoplay width="620" height="458" loop><source src="https://content.wolfram.com/sites/43/2024/07/risefallromanempire.mp4" type="video/mp4"></video></div> <div> <h2 id="astronomical-graphics-and-their-axes">Astronomical Graphics and Their Axes</h2> <p>It’s complicated to define where things are in the sky. There are four main coordinate systems that get used in doing this: horizon (relative to local horizon), equatorial (relative to the Earth’s equator), ecliptic (relative to the orbit of the Earth around the Sun) and galactic (relative to the plane of the galaxy). 
And when we draw a diagram of the sky (here on white for clarity) it’s typical to show the “axes” for all these coordinate systems:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg1.png' alt='' title='' width='667' height='674'> </div> </p></div> <p>But here’s a tricky thing: how should those axes be labeled? Each one is different: horizon is most naturally labeled by things like cardinal directions (N, E, S, W, etc.), equatorial by hours in the day (in sidereal time), ecliptic by months in the year, and galactic by angle from the center of the galaxy. </p> <p>In ordinary plots axes are usually straight, and labeled uniformly (or perhaps, say, logarithmically). But in astronomy things are much more complicated: the axes are intrinsically circular, and then get rendered through whatever projection we’re using.</p> <p>And we might have thought that such axes would require some kind of custom structure. But not in the Wolfram Language. Because in the Wolfram Language we try to make things general. And axes are no exception: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg2.png' alt='' title='' width='581' height='248'> </div> </p></div> <p>So in <tt><a href="http://reference.wolfram.com/language/ref/AstroGraphics.html">AstroGraphics</a></tt> all our various axes are just <tt><a href="http://reference.wolfram.com/language/ref/AxisObject.html">AxisObject</a></tt> constructs—that can be computed with. 
And so, for example, here’s a <a href="https://www.wolframalpha.com/input?i=Mollweide+projection">Mollweide projection</a> of the sky:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg3.png' alt='' title='' width='596' height='318'> </div> </p></div> <p>If we insist on “seeing the whole sky”, the bottom half is just the Earth (and, yes, the Sun isn’t shown because I’m writing this after it’s set for the day…):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg4.png' alt='' title='' width='619' height='340'> </div> </p></div> <p>Things get a bit wild if we start adding grid lines, here for galactic coordinates:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg5.png' alt='' title='' width='619' height='340'> </div> </p></div> <p>And, yes, the galactic coordinate axis is indeed aligned with the plane of the Milky Way (i.e. our galaxy):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07272024astroimg6.png' alt='' title='' width='619' height='340'> </div> </p></div> <h2 id="when-is-earthrise-on-mars-new-level-of-astronomical-computation">When Is Earthrise on Mars? 
New Level of Astronomical Computation</h2> <p>When will the Earth next rise above the horizon from where the Perseverance rover is on Mars? In Version 14.1 we can now compute this (and, yes, this is an “Earth time” converted from Mars time using the standard barycentric celestial reference system (BCRS) solar-system-wide spacetime coordinate system):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg1.png' alt='' title='' width='525' height='57'> </div> </p></div> <p>This is a fairly complicated computation that takes into account not only the motion and rotation of the bodies involved, but also various other physical effects. A more “down to Earth” example that one might readily check by looking out of one’s window is to compute the rise and set times of the Moon from a particular point on the Earth:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg2.png' alt='' title='' width='464' height='48'> </div> </p></div> <p>There’s a slight variation in the times between moonrises:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg3.png' alt='' title='' width='588' height='53'> </div> </p></div> <p>Over the course of a year we see systematic variations associated with the periods of different kinds of lunar months:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg4.png' alt='' title='' width='616' height='279'> </div> </p></div> <p>There are all sorts of subtleties here. For example, when exactly does one define something (like the Sun) to have “risen”? Is it when the top of the Sun first peeks out? When the center appears? Or when the “whole Sun” is visible? In Version 14.1 you can ask about any of these:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg5.png' alt='' title='' width='656' height='48'> </div> </p></div> <p>Oh, and you could compute the same thing for the rise of Venus, but now to see the differences, you’ve got to go to millisecond granularity (and, by the way, granularities of milliseconds down to picoseconds are new in Version 14.1): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024earthriseimg6.png' alt='' title='' width='664' height='96'> </div> </p></div> <p>By the way, particularly for the Sun, the concept of <tt><a href="http://reference.wolfram.com/language/ref/ReferenceAltitude.html">ReferenceAltitude</a></tt> is useful in specifying the various kinds of sunrise and sunset: for example, “civil twilight” corresponds to a reference altitude of –6°.</p> <h2 id="geometry-goes-color-and-polar">Geometry Goes Color, and Polar</h2> <p>Last year we introduced the function <tt><a href="http://reference.wolfram.com/language/ref/ARPublish.html">ARPublish</a></tt> to 
provide a streamlined way to take 3D geometry and publish it for viewing in augmented reality. In Version 14.1 we’ve now extended this pipeline to deal with color:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryARpublishimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryARpublishimg1.png' alt='' title='' width='214' height='122'> </div> </p></div> <p>(Yes, the color is a little different on the phone because the phone tries to make it look “more natural”.)</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg2.png' alt='Augmented reality via QR code' title='Augmented reality via QR code' width='263' height='284'></p> <p>And now it’s easy to view this not just on a phone, but also, for example, <a href="https://reference.wolfram.com/language/workflow/Visualize3DObjectsWithAppleVisionPro.html">on the Apple Vision Pro</a>:</p> <p><video autoplay width="480" height="270" loop><source src="https://content.wolfram.com/sites/43/2024/07/visionpro1-big.mp4" type="video/mp4"></video></p> <p>Graphics have always had color. 
But now in Version 14.1 symbolic geometric regions can have color too:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg3.png' alt='' title='' width='244' height='101'> </div> </p></div> <p>And constructive geometric operations on regions preserve color:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg4.png' alt='' title='' width='394' height='168'> </div> </p></div> <p>Two other new functions in Version 14.1 are <tt><a href="http://reference.wolfram.com/language/ref/PolarCurve.html">PolarCurve</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/FilledPolarCurve.html">FilledPolarCurve</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg5.png' alt='' title='' width='286' height='152'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg6.png' alt='' title='' width='105' height='62'> </div> </p></div> <p>And while at this level this may look simple, what’s going on underneath is actually seriously complicated, with all sorts of symbolic analysis needed in order to determine what the “inside” of the parametric curve should be.</p> <p>Talking about 
geometry and color brings up another enhancement in Version 14.1: plot themes for diagrams in synthetic geometry. Back in <a href="https://reference.wolfram.com/legacy/language/v12/">Version 12.0</a> we introduced symbolic synthetic geometry—in effect finally providing a streamlined computable way to do <a href="https://writings.stephenwolfram.com/2020/09/the-empirical-metamathematics-of-euclid-and-beyond/">the kind of geometry that Euclid did two millennia ago</a>. In the past few versions we’ve been steadily expanding our synthetic geometry capabilities, and now in Version 14.1 one notable thing we’ve added is the ability to use plot themes—and explicit graphics options—to style geometric diagrams. Here’s the default version of a geometric diagram:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024geometryBCimg7.png' alt='' title='' width='620' height='548'> </div> </p></div> <p>Now we can “theme” this for the web:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024geometryimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024geometryBCimg8.png' alt='' title='' width='359' height='324'> </div> </p></div> <h2 id="new-computation-flow-in-notebooks-introducing-cell-linked">New Computation Flow in Notebooks: Introducing Cell-Linked %</h2> <p>In building up computations in notebooks, one very often finds oneself wanting to take a result one just got and then do something with it. And ever since <a href="https://reference.wolfram.com/legacy/v1/">Version 1.0</a> one’s been able to do this by referring to the result one just got as <tt>%</tt>. It’s very convenient. 
But there are some subtle and sometimes frustrating issues with it, the most important of which has to do with what happens when one reevaluates an input that contains <tt>%</tt>.</p> <p>Let’s say you’ve done this:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg1.png' alt='Range' title='Range' width='377' height='138'/></p> <p>But now you decide that actually you wanted <tt><a href="http://reference.wolfram.com/language/ref/Median.html">Median</a>[ <a href="http://reference.wolfram.com/language/ref/Out.html">%</a> <a href="http://reference.wolfram.com/language/ref/Power.html">^</a> 2 ]</tt> instead. So you edit that input and reevaluate it:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg2.png' alt='Edit and reevaluate' title='Edit and reevaluate' width='377' height='187'/></p> <p>Oops! Even though what’s right above your input in the notebook is a list, the value of <tt>%</tt> is the latest result that was computed, which you can’t now see, but which was <tt>3</tt>.</p> <p>OK, so what can one do about this? We’ve thought about it for a long time (and by “long” I mean decades). And finally now in Version 14.1 we have a solution—that I think is very nice and very convenient. The core of it is a new notebook-oriented analog of <tt>%</tt>, that lets one refer not just to things like “the last result that was computed” but instead to things like “the result computed in a particular cell in the notebook”.</p> <p>So let’s look at our sequence from above again. Let’s start typing another cell—say to “try to get it right”. In Version 14.1 as soon as we type <tt>%</tt> we see an autosuggest menu:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/diffMedianA.png' alt='Autosuggest menu' title='Autosuggest menu' width='377' height='155'/></p> <p>The menu is giving us a choice of (output) cells that we might want to refer to. 
Let’s pick the last one listed:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg4.png' alt='Last menu option' title='Last menu option' width='377' height='94'/></p> <p>The <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg6.png' alt='' title='' width='20' height='13'/> object is a reference to the output from the cell that’s currently labeled <tt>In[1]</tt>—and using <img loading='lazy' style="margin-bottom: -3px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg6.png' alt='' title='' width='20' height='13'/> now gives us what we wanted.</p> <p>But let’s say we go back and change the first (input) cell in the notebook—and reevaluate it:</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/GIFflow1.gif' alt='Reevaluate Range' title='Reevaluate Range' width='377' height='82'></p> <p>The cell now gets labeled <tt>In[5]</tt>—and the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg6.png' alt='' title='' width='20' height='13'/> (in <tt>In[4]</tt>) that refers to that cell will immediately change to <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg9.png' alt='' title='' width='19' height='12'/>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg10.png' alt='Median' title='Median' width='377' height='52'/></p> <p>And if we now evaluate this cell, it’ll pick up the value of the output associated with <tt>In[5]</tt>, and give us a new answer:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg11.png' alt='New answer' title='New answer' width='377' height='84'/></p> <p>So what’s really going on here? 
The key idea is that <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg12.png' alt='' title='' width='13' height='14'/> signifies a new type of notebook element that’s a kind of cell-linked analog of <tt>%</tt>. It represents the latest result from evaluating a particular cell, wherever the cell may be, and whatever the cell may be labeled. (The <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg13.png' alt='' title='' width='13' height='14'/> object always shows the current label of the cell it’s linked to.) In effect <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg14.png' alt='' title='' width='17' height='14'/> is “notebook front end oriented”, while ordinary <tt>%</tt> is kernel oriented. <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg15.png' alt='' title='' width='17' height='14'/> is linked to the contents of a particular cell in a notebook; <tt>%</tt> refers to the state of the Wolfram Language kernel at a certain time. </p> <p><img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg16.png' alt='' title='' width='17' height='14'/> gets updated whenever the cell it’s referring to is reevaluated. 
So its value can change either through the cell being explicitly edited (as in the example above) or because reevaluation gives a different value, say because it involves generating a random number:</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/GlFflow2.gif' alt='RandomInteger' title='RandomInteger' width='377' height='167'></p> <p>OK, so <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg18.png' alt='' title='' width='17' height='14'/> always refers to “a particular cell”. But what makes a cell a particular cell? It’s defined by a unique ID that’s assigned to every cell. When a new cell is created it’s given a universally unique ID, and it carries that same ID wherever it’s placed and whatever its contents may be (and even across different sessions). If the cell is copied, then the copy gets a new ID. And although you won’t explicitly see cell IDs, <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg19.png' alt='' title='' width='17' height='14'/> works by linking to a cell with a particular ID. </p> <p>One can think of <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg20.png' alt='' title='' width='17' height='14'/> as providing a “more stable” way to refer to outputs in a notebook. And actually, that’s true not just within a single session, but also across sessions. Say one saves the notebook above and opens it in a new session. Here’s what you’ll see:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg21.png' alt='Saving across sessions' title='Saving across sessions' width='377' height='180'/></p> <p>The <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg22.png' alt='' title='' width='17' height='14'/> is now grayed out. 
So what happens if we try to reevaluate it? Well, we get this:</p> <p><img src='https://content.wolfram.com/sites/43/2024/08/sw07302023flowimg23A.png' alt='Reconstruct or reevaluate' title='Reconstruct or reevaluate' width='614' height='146'/></p> <p>If we press <span class="kbd"><kbd>Reconstruct from output cell</kbd></span> the system will take the contents of the first output cell that was saved in the notebook, and use this to get input for the cell we’re evaluating:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg25.png' alt='Reconstruct from output cell' title='Reconstruct from output cell' width='377' height='91'/></p> <p>In almost all cases the contents of the output cell will be sufficient to allow the expression “behind it” to be reconstructed. But in some cases—like when the original output was too big, and so was elided—there won’t be enough in the output cell to do the reconstruction. And in such cases it’s time to take the <span class="kbd"><kbd>Go to input cell</kbd></span> branch, which in this case will just take us back to the first cell in the notebook, and let us reevaluate it to recompute the output expression it gives.</p> <p>By the way, whenever you see a “positional <tt>%</tt>” you can hover over it to highlight the cell it’s referring to:</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/GIFflow3.gif' alt='Positional % highlighting' title='Positional % highlighting' width='377' height='180'></p> <p>Having talked a bit about “cell-linked <tt>%</tt>” it’s worth pointing out that there are still cases when you’ll want to use “ordinary <tt>%</tt>”. A typical example is if you have an input line that you’re using a bit like a function (say for post-processing) and that you want to repeatedly reevaluate to see what it produces when applied to your latest output. </p> <p>In a sense, ordinary <tt>%</tt> is the “most volatile” in what it refers to. Cell-linked <tt>%</tt> is “less volatile”. 
But sometimes you want no volatility at all in what you’re referring to; you basically just want to burn a particular expression into your notebook. And in fact the <tt>%</tt> autosuggest menu gives you a way to do just that. </p> <p>Notice the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg27.png' alt='' title='' width='21' height='15'/> that appears in whatever row of the menu you’re selecting:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/MedianPointer.png' alt='Iconize option' title='Iconize option' width='440' height='155'/></p> <p>Press this and you’ll insert (in iconized form) the whole expression that’s being referred to:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg29.png' alt='Iconized expression' title='Iconized expression' width='377' height='70'/></p> <p>Now—for better or worse—whatever changes you make in the notebook won’t affect the expression, because it’s right there, in literal form, “inside” the icon. And yes, you can explicitly “uniconize” to get back the original expression:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/diffMedianswooshA.png' alt='Uniconize' title='Uniconize' width='663' height='140'/></p> <p>Once you have a cell-linked <tt>%</tt> it always has a contextual menu with various actions:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg31.png' alt='Contextual menu' title='Contextual menu' width='407' height='186'/></p> <p>One of those actions is to do what we just mentioned, and replace the positional <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg32.png' alt='' title='' width='17' height='14'/> by an iconized version of the expression it’s currently referring to. 
You can also highlight the output and input cells that the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg32.png' alt='' title='' width='17' height='14'/> is “linked to”. (Incidentally, another way to replace a <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg33.png' alt='' title='' width='17' height='14'/> by the expression it’s referring to is simply to “evaluate in place” <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg34.png' alt='' title='' width='17' height='14'/>, which you can do by selecting it and pressing <span class="kbd"><kbd>CMD</kbd><kbd>Return</kbd></span> or <span class="kbd"><kbd>Shift</kbd><kbd>Control</kbd><kbd>Enter</kbd></span>.)</p> <p>Another item in the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg35.png' alt='' title='' width='13' height='14'/> menu is <span class="kbd"><kbd>Replace With Rolled-Up Inputs</kbd></span>. What this does is—as it says—to “roll up” a sequence of “<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg36.png' alt='' title='' width='13' height='14'/> references” and create a single expression from them:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg37.png' alt='Replace with rolled-up inputs' title='Replace with rolled-up inputs' width='499' height='525'/></p> <p>What we’ve talked about so far one can think of as being “normal and customary” uses of <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg38.png' alt='' title='' width='13' height='14'/>. But there are all sorts of corner cases that can show up. 
For example, what happens if you have a <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg39.png' alt='' title='' width='17' height='14'/> that refers to a cell you delete? Well, within a single (kernel) session that’s OK, because the expression “behind” the cell is still available in the kernel (unless you reset your <tt><a href="http://reference.wolfram.com/language/ref/$HistoryLength.html">$HistoryLength</a></tt> etc.). Still, the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg40.png' alt='' title='' width='17' height='14'/> will show up with a “red broken link” to indicate that “there could be trouble”:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg41.png' alt='Red broken link' title='Red broken link' width='91' height='36'/></p> <p>And indeed if you go to a different (kernel) session there will be trouble—because the information you need to get the expression to which the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg42.png' alt='' title='' width='17' height='14'/> refers is simply no longer available, so it has no choice but to show up in a kind of everything-has-fallen-apart “surrender state” as:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg43.png' alt='Surrender state' title='Surrender state' width='87' height='36'/></p> <p><img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg44.png' alt='' title='' width='13' height='14'/> is primarily useful when it refers to cells in the notebook you’re currently using (and indeed the autosuggest menu will contain only cells from your current notebook). But what if it ends up referring to a cell in a different notebook, say because you copied the cell from one notebook to another? 
It’s a precarious situation. But if all relevant notebooks are open, <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg45.png' alt='' title='' width='17' height='14'/> can still work, though it’s displayed in purple with an action-at-a-distance “wi-fi icon” to indicate its precariousness:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg46.png' alt='Wi-fi icon' title='Wi-fi icon' width='74' height='18'/></p> <p>And if, for example, you start a new session, and the notebook containing the “source” of the <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg47.png' alt='' title='' width='17' height='14'/> isn’t open, then you’ll get the “surrender state”. (If you open the necessary notebook it’ll “unsurrender” again.)</p> <p>Yes, there are lots of tricky cases to cover (in fact, many more than we’ve explicitly discussed here). And indeed seeing all these cases makes us not feel bad about how long it’s taken for us to conceptualize and implement <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg48.png' alt='' title='' width='13' height='14'/>. </p> <p>The most common way to access <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg49.png' alt='' title='' width='17' height='14'/> is to use the <tt>%</tt> autosuggest menu. But if you know you want a <img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/07/sw07302023flowimg50.png' alt='' title='' width='17' height='14'/>, you can always get it by “pure typing”, using for example <span class="kbd"><kbd>ESC</kbd><kbd><tt>%</tt></kbd><kbd>ESC</kbd></span>. 
(And, yes, <span class="kbd"><kbd>ESC</kbd><kbd><tt>%%</tt></kbd><kbd>ESC</kbd></span> or <span class="kbd"><kbd>ESC</kbd><kbd><tt>%</tt>5</kbd><kbd>ESC</kbd></span> etc. also work, so long as the necessary cells are present in your notebook.)</p> <h2 id="the-ux-journey-continues-new-typing-affordances-and-more">The UX Journey Continues: New Typing Affordances, and More</h2> <p>We invented <a href="https://www.wolfram.com/notebooks/">Wolfram Notebooks</a> more than 36 years ago, and we’ve been improving and polishing them ever since. And in Version 14.1 we’re implementing several new ideas, particularly around making it even easier to type Wolfram Language code.</p> <p>It’s worth saying at the outset that good UX ideas quickly become essentially invisible. They just give you hints about how to interpret something or what to do with it. And if they’re doing their job well, you’ll barely notice them, and everything will just seem “obvious”. </p> <p>So what’s new in UX for Version 14.1? First, there’s a story around brackets. We first introduced syntax coloring for unmatched brackets back in the late 1990s, and gradually polished it over the following two decades. Then in 2021 we started “automatching” brackets (and other delimiters), so that as soon as you type “f[” you immediately get <tt>f[ ]</tt>.</p> <p>But how do you keep on typing? You could use an <span class="kbd"><kbd><img style="margin-bottom: -1px" class='' src="https://content.wolfram.com/uploads/sites/32/2022/10/rightarrow2.png" width='15' height='11' ></kbd></span> to “move through” the <tt>]</tt>. But we’ve set it up so you can just “type through” <tt>]</tt> by typing <span class="kbd"><kbd>]</kbd></span>. In one of those typical pieces of UX subtlety, however, “type through” doesn’t always make sense. For example, let’s say you typed <tt>f[x]</tt>. Now you click right after <tt>[</tt> and you type <tt>g[</tt>, so you’ve got <tt>f[g[x]</tt>. 
You might think there should be an autotyped <tt>]</tt> to go along with the <tt>[</tt> after <tt>g</tt>. But where should it go? Maybe you want to get <tt>f[g[x]]</tt>, or maybe you’re really trying to type <tt>f[g[],x]</tt>. We definitely don’t want to autotype <tt>]</tt> in the wrong place. So the best we can do is not autotype anything at all, and just let you type the <tt>]</tt> yourself, where you want it. But remember that with <tt>f[x]</tt> on its own, the <tt>]</tt> is autotyped, and so if you type <span class="kbd"><kbd>]</kbd></span> yourself in this case, it’ll just type through the autotyped <tt>]</tt> and you won’t explicitly see it.</p> <p>So how can you tell whether a <tt>]</tt> you type will explicitly show up, or will just be “absorbed” as type-through? In Version 14.1 there’s now different syntax coloring for these cases: yellow if it’ll be “absorbed”, and pink if it’ll explicitly show up. </p> <p>This is an example of type-through, so <tt><a href="http://reference.wolfram.com/language/ref/Range.html">Range</a></tt> is colored yellow and the <span class="kbd"><kbd>]</kbd></span> you type is “absorbed”:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/syntaxGIF1.gif' alt='Range highlighted yellow' title='Range highlighted yellow' width='77' height='19'></p> <p>And this is an example of non-type-through, so <tt><a href="http://reference.wolfram.com/language/ref/Round.html">Round</a></tt> is colored pink and the <span class="kbd"><kbd>]</kbd></span> you type is explicitly inserted:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/syntaxGIF2.gif' alt='Round highlighted pink' title='Round highlighted pink' width='124' height='16'></p> <p>This may all sound very fiddly and detailed, and for us in developing it, it is. But the point is that you don’t explicitly have to think about it. 
You quickly learn to just “take the hint” from the syntax coloring about when your closing delimiters will be “absorbed” and when they won’t. And the result is that you’ll have an even smoother and faster typing experience, with even less chance of unmatched (or incorrectly matched) delimiters.</p> <p>The new syntax coloring we just discussed helps in typing code. In Version 14.1 there’s also something new that helps in reading code. It’s an enhanced version of something that’s actually common in IDEs: when you click (or select) a variable, every instance of that variable immediately gets highlighted:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/syntaxGIF3A.gif' alt='Highlighted variable' title='Highlighted variable' width='592' height='234'></p> <p>What’s subtle in our case is that we take account of the scoping of localized variables—putting a more colorful highlight on instances of a variable that are in scope:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302024UXimg4.png' alt='Multiple instances of a variable' title='Multiple instances of a variable' width='537' height='85'/></p> <p>One place this tends to be particularly useful is in understanding nested pure functions that use <tt>#</tt>. By clicking a <tt>#</tt> you can see which other instances of <tt>#</tt> are in the same pure function, and which are in different ones (the highlight is bluer inside the same function, and grayer outside):</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302024UXimg5.png' alt='Highlighting in nested functions' title='Highlighting in nested functions' width='304' height='14'/></p> <p>On the subject of finding variables, another change in Version 14.1 is that fuzzy name autocompletion now also works for contexts. 
So if you have a symbol whose full name is <tt>context1`subx`var2</tt> you can type <tt>c1x</tt> and you’ll get a completion for the context; then accept this and you get a completion for the symbol.</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07302024UXimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024UXBCimg6.png' alt='' title='' width='183' height='43'> </div> </p></div> <p>There are also several other notable UX “tune-ups” in Version 14.1. For many years, there’s been an “information box” that comes up whenever you hover over a symbol. Now that’s been extended to entities—so (alongside their explicit form) you can immediately get to information about them and their properties:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302024UXimg7.png' alt='Entity information box' title='Entity information box' width='352' height='45'/></p> <p>Next there’s something that, yes, I personally have found frustrating in the past. Say you’ve a file, or an image, or something else somewhere on your computer’s desktop. Normally if you want it in a Wolfram Notebook you can just drag it there, and it will very beautifully appear. But what if the thing you’re dragging is very big, or has some other kind of issue? In the past, the drag just failed. Now what happens is that you get the explicit <tt><a href="http://reference.wolfram.com/language/ref/Import.html">Import</a></tt> that the dragging would have done, so that you can run it yourself (getting progress information, etc.), or you can modify it, say adding relevant options. </p> <p>Another small piece of polish that’s been added in Version 14.1 has to do with <tt>Preferences</tt>. There are a lot of things you can set in the notebook front end. And they’re explained, at least briefly, in the many <tt>Preferences</tt> panels. 
But in Version 14.1 there are now <span class="kbd"><kbd>(i)</kbd></span> buttons that give direct links to the relevant workflow documentation:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07302024UXimg8.png' alt='Direct link to workflow documentation' title='Direct link to workflow documentation' width='619' height='167'/></p> <h2 id="syntax-for-natural-language-input">Syntax for Natural Language Input</h2> <p>Ever since shortly after <a href="https://www.wolframalpha.com/">Wolfram|Alpha</a> was released in 2009, there’ve been ways to access its natural language understanding capabilities in the Wolfram Language. Foremost among these has been <span class="kbd"><kbd>CTRL</kbd><kbd>=</kbd></span>, which lets you type free-form natural language and immediately get a Wolfram Language version, often in terms of entities, etc.:</p> <p> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg1.png' alt='Wolfram|Alpha entities' title='Wolfram|Alpha entities' width='453' height='24'></p> <p>Generally this is a very convenient and elegant capability. But sometimes one may want to just use plain text to specify natural language input, for example so that one doesn’t interrupt one’s textual typing of input.</p> <p>In Version 14.1 there’s a new mechanism for this: syntax for directly entering free-form natural language input. The syntax is a kind of “textified” version of <span class="kbd"><kbd>CTRL</kbd><kbd>=</kbd></span>: <tt>=[…]</tt>. When you type <tt>=[…]</tt> as input, nothing immediately happens. 
It’s only when you evaluate your input that the natural language gets interpreted—and then whatever it specifies is computed.</p> <p>Here’s a very simple example, where each <tt>=[…]</tt> just turns into an entity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg2.png' alt='' title='' width='227' height='54'> </div> </p></div> <p>But when the result of interpreting the natural language is an expression that can be further evaluated, what will come out is the result of that evaluation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg3.png' alt='' title='' width='418' height='55'> </div> </p></div> <p>One feature of using <tt>=[…]</tt> instead of <span class="kbd"><kbd>CTRL</kbd><kbd>=</kbd></span> is that <tt>=[…]</tt> is something anyone can immediately see how to type:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg4.png' alt='' title='' width='518' height='267'> </div> </p></div> <p>But what actually is <tt>=[…]</tt>? 
Well, it’s just input syntax for the new function <tt><a href="http://reference.wolfram.com/language/ref/FreeformEvaluate.html">FreeformEvaluate</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg5.png' alt='' title='' width='219' height='51'> </div> </p></div> <p>You can use <tt><a href="http://reference.wolfram.com/language/ref/FreeformEvaluate.html">FreeformEvaluate</a></tt> inside a program—here, rather whimsically, to see what interpretations are chosen by default for “a” followed by each letter of the alphabet:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg6.png' alt='' title='' width='732' height='164'> </div> </p></div> <p>By default, <tt><a href="http://reference.wolfram.com/language/ref/FreeformEvaluate.html">FreeformEvaluate</a></tt> interprets your input, then evaluates it. But you can also specify that you want to hold the result of the interpretation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024syntaximg7.png' alt='' title='' width='383' height='55'> </div> </p></div> <h2 id="diff---for-notebooks-and-more">Diff[ ] … for Notebooks and More!</h2> <p>It’s been a very long-requested capability: give me a way to tell what changed, particularly in a notebook. It’s fairly easy to do “diffs” for plain text. But for notebooks—as structured symbolic documents—it’s a much more complicated story. 
But in Version 14.1 it’s here! We’ve got a function <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> for doing diffs in notebooks, and actually also in many other kinds of things. </p> <p>Here’s an example, where we’re requesting a “side-by-side view” of the diff between two notebooks:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024diffUPDATEimg1.png' alt='' title='' width='438' height='184'/> </div> </p></div> <p>And here’s an “alignment chart view” of the diff:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024diffUPDATEimg2.png' alt='Click to enlarge' title='Click to enlarge' width='415' height='176'/> </div> </p></div> <p>Like everything else in the Wolfram Language, a “diff” is a symbolic expression. Here’s an example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024diffUPDATEimg3.png' alt='' title='' width='447' height='137'/> </div> </p></div> <p>There are lots of different ways to display a diff object; many of them one can select interactively with the menu:</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw08142024diffUPDATEimg4.png' alt='Diff object viewing options' title='Diff object viewing options' width='352' height='450'/></p> <p>But the most important thing about diff objects is that they can be used programmatically. 
And in particular <tt><a href="http://reference.wolfram.com/language/ref/DiffApply.html">DiffApply</a></tt> applies the diffs from a diff object to an existing object, say a notebook. </p> <p>What’s the point of this? Well, let’s imagine you’ve made a notebook, and given a copy of it to someone else. Then both you and the person to whom you’ve given the copy make changes. You can create a diff object of the diffs between the original version of the notebook, and the version with your changes. And if the changes the other person made don’t overlap with yours, you can just take your diffs and use <tt><a href="http://reference.wolfram.com/language/ref/DiffApply.html">DiffApply</a></tt> to apply your diffs to their version, thereby getting a “merged notebook” with both sets of changes made.</p> <p>But what if your changes might conflict? Well, then you need to use the function <tt><a href="http://reference.wolfram.com/language/ref/Diff3.html">Diff3</a></tt>. <tt><a href="http://reference.wolfram.com/language/ref/Diff3.html">Diff3</a></tt> takes your original notebook and two modified versions, and does a “three-way diff” to give you a diff object in which any conflicts are explicitly identified. (And, yes, three-way diffs are familiar from source control systems in which they provide the back end for making the merging of files as automated as possible.)</p> <p>Notebooks are an important use case for <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> and related functions. But they’re not the only one. 
<tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> can perfectly well be applied, for example, just to lists:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg6.png' alt='' title='' width='467' height='102'> </div> </p></div> <p>There are many ways to display this diff object; here’s a side-by-side view: </p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg7A.png' alt='Side-by-side diff view' title='Side-by-side diff view' width='718' height='245'/></p> <p>And here’s a “unified view” reminiscent of how one might display diffs for lines of text in a file:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg8A.png' alt='Unified diff view' title='Unified diff view' width='718' height='259'/></p> <p>And, speaking of files, <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt>, etc. can immediately be applied to files:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg9.png' alt='' title='' width='591' height='268'> </div> </p></div> <p><tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt>, etc. can also be applied to cells, where they can analyze changes in both content and styles or metadata. 
Here we’re creating two cells and then diffing them—showing the result in a side-by-side view:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg10.png' alt='' title='' width='722' height='291'> </div> </p></div> <p>In “Combined” view the “pure insertions” are highlighted in green, the “pure deletions” in red, and the “edits” are shown as deletion/insertion stacks:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg11A.png' alt='Combined diff view highlighting' title='Combined diff view highlighting' width='601' height='146'/></p> <p>Many uses of diff technology revolve around content development—editing, software engineering, etc. But in the Wolfram Language <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt>, etc. are also set up to be convenient for information visualization and for various kinds of algorithmic operations. 
For example, to see what letters differ between the Spanish and Polish alphabets, we can just use <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg12.png' alt='' title='' width='608' height='102'> </div> </p></div> <p>Here’s the “pure visualization”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg13.png' alt='' title='' width='574' height='44'> </div> </p></div> <p>And here’s an alternate “unified summary” form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg14.png' alt='' title='' width='737' height='75'> </div> </p></div> <p>Another use case for <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> is bioinformatics. 
We retrieve two genome sequences—as strings—then use <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg15.png' alt='' title='' width='719' height='223'> </div> </p></div> <p>We can take the resulting diff object and show it in a different form—here character alignment: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg16.png' alt='' title='' width='736' height='272'> </div> </p></div> <p>Under the hood, by the way, <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> is finding the differences using <tt><a href="http://reference.wolfram.com/language/ref/SequenceAlignment.html">SequenceAlignment</a></tt>. 
But while <tt><a href="http://reference.wolfram.com/language/ref/Diff.html">Diff</a></tt> is giving a “high-level symbolic diff object”, <tt><a href="http://reference.wolfram.com/language/ref/SequenceAlignment.html">SequenceAlignment</a></tt> is giving a direct low-level representation of the sequence alignment:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg17.png' alt='' title='' width='736' height='305'> </div> </p></div> <p>Information visualization isn’t restricted to two-way diffs; here’s an example with a three-way diff:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg18.png' alt='' title='' width='545' height='391'> </div> </p></div> <p>And here it is as a “unified summary”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg19A.png' alt='' title='' width='543' height='200'> </div> </p></div> <p>There are all sorts of options for diffs. One that is sometimes important is <tt><a href="http://reference.wolfram.com/language/ref/DiffGranularity.html">DiffGranularity</a></tt>. 
By default the granularity for diffs of strings is <tt>"Characters"</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg21.png' alt='' title='' width='514' height='102'> </div> </p></div> <p>But it’s also possible to set it to be <tt>"Words"</tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg22.png' alt='' title='' width='511' height='126'> </div> </p></div> <p>Coming back to notebooks, the most “interactive” form of diff is a “report”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg23A.png' alt='' title='' width='404' height='327'> </div> </p></div> <p>In such a report, you can open cells to see the details of a specific change, and you can also click to jump to where the change occurred in the underlying notebooks.</p> <p>When it comes to analyzing notebooks, there’s another new feature in Version 14.1: <tt><a href="http://reference.wolfram.com/language/ref/NotebookCellData.html">NotebookCellData</a></tt>. <tt><a href="http://reference.wolfram.com/language/ref/NotebookCellData.html">NotebookCellData</a></tt> gives you direct programmatic access to lots of properties of notebooks. 
By default it generates a dataset of some of them, here for the notebook in which I’m currently authoring this:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg24A.png' alt='' title='' width='391' height='239'> </div> </p></div> <p>There are properties like the word count in each cell, the style of each cell, the memory footprint of each cell, and a thumbnail image of each cell. </p> <p>Ever since <a href="https://reference.wolfram.com/legacy/v6/guide/Mathematica.html">Version 6</a> in 2007 we’ve had the <tt><a href="http://reference.wolfram.com/language/ref/CellChangeTimes.html">CellChangeTimes</a></tt> option which records when cells in notebooks are created or modified. And now in Version 14.1 <tt><a href="http://reference.wolfram.com/language/ref/NotebookCellData.html">NotebookCellData</a></tt> provides direct programmatic access to this data. So, for example, here’s a date histogram of when the cells in the current notebook were last changed:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024diffimg26A.png' alt='' title='' width='613' height='202'> </div> </p></div> <h2 id="lots-of-little-language-tune-ups">Lots of Little Language Tune-Ups</h2> <p>It’s part of a journey of almost four decades. Steadily discovering—and inventing—new “lumps of computational work” that make sense to implement as functions or features in the Wolfram Language. The Wolfram Language is of course very much strong enough that one can build essentially any functionality from the primitives that already exist in it. 
But part of the point of the language is to define the best “elements of computational thought”. And particularly as the language progresses, there’s a continual stream of new opportunities for convenient elements that get exposed. And in Version 14.1 we’ve implemented quite a diverse collection of them.</p> <p>Let’s say you want to nestedly compose a function. Ever since <a href="https://reference.wolfram.com/legacy/v1/">Version 1.0</a> there’s been <tt><a href="http://reference.wolfram.com/language/ref/Nest.html">Nest</a></tt> for that:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg1.png' alt='' title='' width='290' height='44'> </div> </p></div> <p>But what if you want the abstract nested function, not yet applied to anything? Well, in Version 14.1 there’s now an operator form of <tt><a href="http://reference.wolfram.com/language/ref/Nest.html">Nest</a></tt> (and <tt><a href="http://reference.wolfram.com/language/ref/NestList.html">NestList</a></tt>) that represents an abstract nested function that can, for example, be composed with other functions, as in</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg2.png' alt='' title='' width='399' height='44'> </div> </p></div> <p>or equivalently:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg3.png' alt='' title='' width='399' height='44'> </div> </p></div> 
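<p>As a minimal sketch of the operator form described above (the symbols <tt>f</tt>, <tt>g</tt> and <tt>x</tt> are placeholders introduced here purely for illustration, and the two-argument form <tt>Nest[f, 3]</tt> is assumed to behave as the text describes):</p>
<pre>
(* long-standing form: apply f three times, starting from x *)
Nest[f, x, 3]

(* operator form described for Version 14.1: the nested function on its own, then applied to x *)
Nest[f, 3][x]

(* the operator form composed with another placeholder function g *)
(g @* Nest[f, 3])[x]
</pre>
<p>The point of the sketch is that the nested function exists as an expression before it is given an argument, so it can be composed or passed to other functions like any other function.</p>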
<div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg4.png' alt='' title='' width='502' height='44'> </div> </p></div> <p>A decade ago we introduced functions like <tt><a href="http://reference.wolfram.com/language/ref/AllTrue.html">AllTrue</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/AnyTrue.html">AnyTrue</a></tt> that effectively “in one gulp” do a whole collection of separate tests. If one wants to test whether there are any primes in a list, one can always do:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg5.png' alt='' title='' width='332' height='43'> </div> </p></div> <p>But it’s better to “package” this “lump of computational work” into the single function <tt><a href="http://reference.wolfram.com/language/ref/AnyTrue.html">AnyTrue</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg6.png' alt='' title='' width='274' height='43'> </div> </p></div> <p>In Version 14.1 we’re extending this idea by introducing <tt><a href="http://reference.wolfram.com/language/ref/AllMatch.html">AllMatch</a></tt>, <tt><a href="http://reference.wolfram.com/language/ref/AnyMatch.html">AnyMatch</a></tt> and <tt><a href="http://reference.wolfram.com/language/ref/NoneMatch.html">NoneMatch</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg7.png' alt='' title='' width='370' height='43'> </div> </p></div> <p>Another somewhat related new function is <tt><a href="http://reference.wolfram.com/language/ref/AllSameBy.html">AllSameBy</a></tt>. <tt><a href="http://reference.wolfram.com/language/ref/SameQ.html">SameQ</a></tt> tests whether a collection of expressions are immediately the same. <tt><a href="http://reference.wolfram.com/language/ref/AllSameBy.html">AllSameBy</a></tt> tests whether expressions are the same by the criterion that the value of some function applied to them is the same:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg8.png' alt='' title='' width='317' height='43'> </div> </p></div> <p>Talking of tests, another new feature in Version 14.1 is a second argument to <tt><a href="http://reference.wolfram.com/language/ref/QuantityQ.html">QuantityQ</a></tt> (and <tt><a href="http://reference.wolfram.com/language/ref/KnownUnitQ.html">KnownUnitQ</a></tt>), which lets you test not only whether something is a quantity, but also whether it’s a specific type of physical quantity:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/sw07282024tuneupsimg9A.png' alt='' title='' width='320' height='55'> </div> </p></div> <p>And now talking about “rounding things out”, Version 14.1 does that in a very literal way by enhancing the <tt><a 
href="http://reference.wolfram.com/language/ref/RoundingRadius.html">RoundingRadius</a></tt> option. For a start, you can now specify a different rounding radius for particular corners:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg10.png' alt='' title='' width='461' height='55'> </div> </p></div> <p>And, yes, that’s useful if you’re trying to fit button-like constructs together:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg11.png' alt='' title='' width='571' height='79'> </div> </p></div> <p>By the way, <tt><a href="http://reference.wolfram.com/language/ref/RoundingRadius.html">RoundingRadius</a></tt> now also works for rectangles inside <tt><a href="http://reference.wolfram.com/language/ref/Graphics.html">Graphics</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg12.png' alt='' title='' width='561' height='95'> </div> </p></div> <p>Let’s say you have a string, like “hello”. There are many functions that operate directly on strings. But sometimes you really just want to use a function that operates on lists—and apply it to the characters in a string. 
Now in Version 14.1 you can do this using <tt><a href="http://reference.wolfram.com/language/ref/StringApply.html">StringApply</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg13.png' alt='' title='' width='327' height='43'> </div> </p></div> <p>Another little convenience in Version 14.1 is the function <tt><a href="http://reference.wolfram.com/language/ref/BitFlip.html">BitFlip</a></tt>, which, yes, flips a bit in the binary representation of a number:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg14.png' alt='' title='' width='158' height='43'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg15.png' alt='' title='' width='305' height='44'> </div> </p></div> <p>When it comes to Boolean functions, a detail that’s been improved in Version 14.1 is the conversion to NAND representation. By default, functions like <tt><a href="http://reference.wolfram.com/language/ref/BooleanConvert.html">BooleanConvert</a></tt> have allowed <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt><tt>[p]</tt> (which is equivalent to <tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt><tt>[p]</tt>). 
But in Version 14.1 there’s now <tt>"BinaryNAND"</tt> which yields for example <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt><tt>[p, p]</tt> instead of just <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt><tt>[p]</tt> (i.e. <tt><a href="http://reference.wolfram.com/language/ref/Not.html">Not</a></tt><tt>[p]</tt>). So here’s a representation of <tt><a href="http://reference.wolfram.com/language/ref/Or.html">Or</a></tt> in terms of <tt><a href="http://reference.wolfram.com/language/ref/Nand.html">Nand</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07282024tuneupsimg16.png' alt='' title='' width='344' height='44'> </div> </p></div> <h2 id="making-the-wolfram-compiler-easier-to-use">Making the Wolfram Compiler Easier to Use</h2> <p>Let’s say you have a piece of Wolfram Language code that you know you’re going to run a zillion times—so you want it to run absolutely as fast as possible. Well, you’ll want to make sure you’re doing the best algorithmic things you can (and making the best possible use of Wolfram Language superfunctions, etc.). And perhaps you’ll find it helpful to use things like <tt><a href="http://reference.wolfram.com/language/ref/DataStructure.html">DataStructure</a></tt> constructs. But ultimately if you really want your code to run absolutely as fast as your computer can make it, you’ll probably want to set it up so that it can be compiled using the <a href="https://reference.wolfram.com/language/Compile/tutorial/Overview.html">Wolfram Compiler</a>, directly to LLVM code and then machine code. </p> <p>We’ve been developing the Wolfram Compiler for many years, and it’s becoming steadily more capable (and efficient). 
And for example it’s become increasingly important in our own internal development efforts. In the past, when we wrote critical inner-loop internal code for the Wolfram Language, we did it in C. But in the past few years we’ve almost completely transitioned instead to writing pure Wolfram Language code that we then compile with the Wolfram Compiler. And the result of this has been a dramatically faster and more reliable development pipeline for writing inner-loop code.</p> <p>Ultimately what the Wolfram Compiler needs to do is to take the code you write and align it with the low-level capabilities of your computer, figuring out what low-level data types can be used for what, etc. Some of this can be done automatically (using all sorts of fancy symbolic and theorem-proving-like techniques). But some needs to be based on collaboration between the programmer and the compiler. And in Version 14.1 we’re adding several important ways to enhance that collaboration.</p> <p>The first thing is that it’s now easy to get access to information the compiler has. 
For example, here’s the type declaration the compiler has for the built-in function <tt><a href="http://reference.wolfram.com/language/ref/Dimensions.html">Dimensions</a></tt>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg1.png' alt='' title='' width='654' height='121'> </div> </p></div> <p>And here’s the source code of the actual implementation the compiler is using for <tt><a href="http://reference.wolfram.com/language/ref/Dimensions.html">Dimensions</a></tt>, calling its intrinsic low-level internal functions like <tt>CopyTo</tt>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg2.png' alt='Compiler source code' title='Compiler source code' width='610' height='366'/></p> <p>A function like <tt><a href="http://reference.wolfram.com/language/ref/Map.html">Map</a></tt> has a vastly more complex set of type declarations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg3.png' alt='' title='' width='662' height='323'> </div> </p></div> <p>For types themselves, <tt><a href="http://reference.wolfram.com/language/ref/CompilerInformation.html">CompilerInformation</a></tt> lets you see their type hierarchy:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg4.png' alt='' title='' width='378' height='514'> </div> </p></div> <p>And for data structure types, you can do things 
like see the fields they contain, and the operations they support:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg5.png' alt='' title='' width='431' height='307'> </div> </p></div> <p>And, by the way, something new in Version 14.1 is the function <tt><a href="http://reference.wolfram.com/language/ref/OperationDeclaration.html">OperationDeclaration</a></tt> which lets you declare operations to add to a data structure type you’ve defined. </p> <p>Once you actually start running the compiler, a convenient new feature in Version 14.1 is a detailed progress monitor that lets you see what the compiler is doing at each step:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg6.png' alt='' title='' width='560' height='201'> </div> </p></div> <p>As we said, the key to compilation is figuring out how to align your code with the low-level capabilities of your computer. The Wolfram Language can do arbitrary symbolic operations. But many of those don’t align with low-level capabilities of your computer, and can’t meaningfully be compiled. Sometimes those failures to align are the result of sophistication that’s possible only with symbolic operations. But sometimes the failures can be avoided if you “unpack” things a bit. And sometimes the failures are just the result of programming mistakes. And now in Version 14.1 the Wolfram Compiler is starting to be able to annotate your code to show where the misalignments are happening, so you can go through and figure out what to do with them. 
(It’s something that’s uniquely possible because of the symbolic structure of the Wolfram Language and even more so of Wolfram Notebooks.)</p> <p>Here’s a very simple example:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg7.png' alt='Misalignment error message' title='Misalignment error message' width='604' height='95'/></p> <p>In compiled code, <tt><a href="http://reference.wolfram.com/language/ref/Sin.html">Sin</a></tt> expects a numerical argument, so a Boolean argument won’t work. Clicking the <span class="kbd"><kbd>Source</kbd></span> button lets you see where specifically something went wrong:</p> <p><img src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg8.png' alt='Error source' title='Error source' width='619' height='116'/></p> <p>If you have several levels of definitions, the <span class="kbd"><kbd>Source</kbd></span> button will show you the whole chain:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg9AA.png' alt='' title='' width='595' height='144'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg9B.png' alt='' title='' width='624' height='143'> </div> </p></div> <p>Here’s a slightly more complicated piece of code, in which the specific place where there’s a problem is highlighted:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg10A.png' alt='' title='' width='589' height='91'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg10B.png' alt='' title='' width='622' 
height='321'> </div> </p></div> <p>In a typical workflow you might start from pure Wolfram Language code, without <tt><a href="http://reference.wolfram.com/language/ref/Typed.html">Typed</a></tt> and other compilation information. Then you start adding such information, repeatedly trying the compilation, seeing what issues arise, and fixing them. And, by the way, because it’s completely efficient to call small pieces of compiled code within ordinary Wolfram Language code, it’s common to start by annotating and compiling the “innermost inner loops” in your code, and gradually “working outwards”. </p> <p>But, OK, let’s say you’ve successfully compiled a piece of code. Most of the time it’ll handle certain cases, but not others (for example, it might work fine with machine-precision numbers, but not be capable of handling arbitrary precision). By default, compiled code that’s running is set up to generate a message and revert to ordinary Wolfram Language evaluation if it can’t handle something:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerAimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg11A.png' alt='' title='' width='627' height='144'> </div> </p></div> <p>But in Version 14.1 there’s a new option <tt><a href="http://reference.wolfram.com/language/ref/CompilerRuntimeErrorAction.html">CompilerRuntimeErrorAction</a></tt> that lets you specify an action to take (or, in general, a function to apply) whenever a runtime error occurs. 
A setting of <tt><a href="http://reference.wolfram.com/language/ref/None.html">None</a></tt> aborts the whole computation if there’s a runtime error:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerAimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024compilerimg12A.png' alt='' title='' width='630' height='200'> </div> </p></div> <h2 id="even-smoother-integration-with-external-languages">Even Smoother Integration with External Languages</h2> <p>Let’s say there’s some functionality you want to use, but the only implementation you have is in a package in some external language, like Python. Well, it’s now basically seamless to work with such functionality directly in the Wolfram Language—plugging into the whole symbolic framework and functionality of the Wolfram Language.</p> <p>As a simple example, here’s a function that uses the Python package faker to produce a random sentence (which of course would also be straightforward to do directly in Wolfram Language):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg1.png' alt='' title='' width='587' height='62'> </div> </p></div> <p>The first time you run <tt>RandomSentence</tt>, the progress monitor will show you all sorts of messy things happening under the hood, as Python versions get loaded, dependencies get set up, and so on. But the point is that it’s all automatic, and so you don’t have to worry about it. 
And in the end, out pops the answer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg2.png' alt='' title='' width='478' height='119'> </div> </p></div> <p>And if you run the function again, all the setup will already have been done, and the answer will pop out immediately:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg3.png' alt='' title='' width='361' height='44'> </div> </p></div> <p>An important piece of automation here is the conversion of data types. One of the great things about the Wolfram Language is that it has fully integrated symbolic representations for a very wide range of things—from videos to molecules to IP addresses. And when there are standard representations for these things in a language like Python, we’ll automatically convert to and from them.</p> <p>But particularly with more sophisticated packages, there’ll be a need to let the package deal with its own “external objects” that are basically opaque to the Wolfram Language, but can be handled as atomic symbolic constructs there. 
</p> <p>For example, let’s say we’ve started a Python external package <tt>chess</tt> (and, yes, there’s <a href="https://resources.wolframcloud.com/PacletRepository/resources/Wolfram/Chess/">a paclet in the Wolfram Paclet Repository</a> that has considerably more chess functionality):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg4.png' alt='' title='' width='892' height='169'> </div> </p></div> <p>Now the state of a chessboard can be represented by an external object:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg5A.png' alt='' title='' width='426' height='78'> </div> </p></div> <p>We can define a function to plot the board:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg6A.png' alt='' title='' width='644' height='18'> </div> </p></div> <p>And now in Version 14.1 you can just pass your external object to the external function:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg7A.png' alt='' title='' width='170' height='158'> </div> </p></div> <p>You can also directly extract attributes of the external object:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg8A.png' alt='' title='' width='240' height='46'> </div> </p></div> <p>And you can call methods (here to make a chess move), changing the state of the external object:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg9A.png' alt='' title='' width='426' height='78'> </div> </p></div> <p>Here’s a plot of a new board configuration:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg10A.png' alt='' title='' width='170' height='159'> </div> </p></div> <p>This computes all legal moves from the current position, representing them as external objects:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg11A.png' alt='' title='' width='481' height='136'> </div> </p></div> <p>Here are UCI string representations of these:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg12A.png' alt='' title='' width='463' height='72'> </div> </p></div> <p>In what we’re doing here we’re immediately 
performing each external operation. But Version 14.1 introduces the construct <a href="http://reference.wolfram.com/language/ref/ExternalOperation.html"><tt>ExternalOperation</tt></a> which lets you symbolically represent an external operation, and for example build up collections of such operations that can all be performed together in a single external evaluation. <tt><a href="http://reference.wolfram.com/language/ref/ExternalObject.html">ExternalObject</a></tt> supports various built-in operations for each environment. So, for example, in Python we can use <span class="computer-voice">Call</span> and <span class="computer-voice">GetAttribute</span> to get the symbolic representation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg13A.png' alt='' title='' width='648' height='263'> </div> </p></div> <p>If we evaluate this, all these operations will get done together in the external environment:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/sw07312024externalimg14.png' alt='' title='' width='464' height='68'> </div> </p></div> <h2 id="standalone-wolfram-language-applications">Standalone Wolfram Language Applications!</h2> <p>Let’s say you’re writing an application in pretty much any programming language—and inside it you want to call Wolfram Language functionality. Well, you could always do that by using a web API served from the <a href="https://www.wolfram.com/cloud">Wolfram Cloud</a>. And you could also do it locally by running the Wolfram Engine. 
But in Version 14.1 there’s something new: a way of integrating a standalone Wolfram Language runtime right into your application. The Wolfram Language runtime is a dynamic library that you link into your program, and then call using a C-based API. How big is the runtime? Well, it depends on what you want to use in the Wolfram Language. Because we now have the technology to prune a runtime to include only capabilities needed for particular Wolfram Language code. And the result is that adding the Wolfram Language will often increase the disk requirements of your application only by a remarkably small amount—like just a few hundred megabytes or even less. And, by the way, you can distribute the Wolfram runtime as an integrated part of an application, with its users not needing their own licenses to run it.</p> <p>OK, so how does creating a standalone Wolfram-enabled application actually work? There’s a lot of software engineering (associated with the Wolfram Language runtime, how it’s called, etc.) under the hood. But at the level of the application programmer you only have to deal with our Standalone Applications SDK—whose interface is rather simple. </p> <p>As an example, here’s the C code part of a standalone application that uses the Wolfram Language to identify what (human) language a piece of text is in. The program here takes a string of text on its command line, then runs the Wolfram Language <tt><a href="http://reference.wolfram.com/language/ref/LanguageIdentify.html">LanguageIdentify</a></tt> function on it, and then prints a string giving the result: </p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/Ccode1B.png' alt='C code using Wolfram Language' title='C code example using Wolfram Language' width='583' height='563'></p> <p>If we ignore issues of pruning, etc. 
we can compile this program just with (and, yes, the file paths are necessarily a bit long):</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/08/Ccode2C.png' alt='Compiled C program' title='Compiled C program' width='660' height='77'></p> <p>Now we can run the resulting executable directly from the command line—and it’ll act just like any other executable, even though inside it’s got all the power of a Wolfram Language runtime:</p> <p><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/07/Ccode3C.png' alt='Command-line executable' title='Command-line executable' width='218' height='30'></p> <p>If we look at the C program above, it basically begins just by starting the Wolfram Language runtime (using <span class="computer-voice">WLR_SDK_START_RUNTIME()</span>). But then it takes the string <span class="computer-voice">(argv[1])</span> from the command line, embeds it in a Wolfram Language expression <tt><a href="http://reference.wolfram.com/language/ref/LanguageIdentify.html">LanguageIdentify</a></tt><tt>[</tt><em>string</em><tt>]</tt>, evaluates this expression, and extracts a raw string from the result.</p> <p>The functions, etc. that are involved here are part of the new Expression API supported by the Wolfram Language runtime dynamic library. The Expression API provides very clean capabilities for building up and taking apart Wolfram Language expressions from C. There are functions like <tt>wlr_</tt><tt>Symbol</tt><tt>("</tt><em>string</em><tt>")</tt> that form symbols, as well as macros like <nobr><tt>wlr_</tt><tt>List</tt><tt>(elem<sub>1</sub>, elem<sub>2</sub>, …)</tt></nobr> and <tt>wlr_</tt><tt>E</tt><tt>(</tt><em>head</em><tt>, arg<sub>1</sub>, arg<sub>2</sub>, …)</tt> that build up lists and general expressions. Then there’s the function <tt>wlr_Eval(expr)</tt> that calls the Wolfram Language evaluator. 
With functions like <tt>wlr_StringData(expr, &result, …)</tt> you can then extract content from expressions (here the characters in a string) and put it into C data structures. </p> <p>How does the Expression API relate to <a href="https://www.wolfram.com/wstp/">WSTP</a>? WSTP (“Wolfram Symbolic Transfer Protocol”) is our protocol for transferring symbolic expressions between processes. The Expression API, on the other hand, operates within a single process, providing the “glue” that connects C code to expressions in the Wolfram Language runtime. </p> <p>One example of a real-world use of our new Standalone Applications technology is the LSPServer application that will soon be in full distribution. LSPServer started from a pure (though somewhat lengthy) Wolfram Language paclet that provides Language Server Protocol services for annotating Wolfram Language code in programs like Visual Studio Code. To build the LSPServer standalone application we just wrote a tiny C program that calls the paclet, then compiled this and linked it against our Standalone Applications SDK. Along the way (using tools that we’re planning to soon make available)—and based on the fact that only a small part of the full functionality of the Wolfram Language is needed to support LSPServer—we pruned the Wolfram Language runtime, in the end getting a complete LSPServer application that’s only about 170 MB in size, and that shows no outside signs of having Wolfram Language functionality inside.</p> <h2 id="and-yet-more">And Yet More…</h2> <p>Is that all? Well, no. There’s more. Like new formatting of <tt><a href="http://reference.wolfram.com/language/ref/Root.html">Root</a></tt> objects (yes, I was frustrated with the old one). Or like a new drag-and-drop-to-answer option for <tt><a href="http://reference.wolfram.com/language/ref/QuestionObject.html">QuestionObject</a></tt> quizzes. Or like all the documentation we’ve added for new types of entities and interpreters. 
</p> <p>In addition, there’s also the continual stream of new data that we’ve curated, or that’s flowed in real time into the <a href="https://www.wolfram.com/language/core-areas/knowledgebase/">Wolfram Knowledgebase</a>. And beyond the core Wolfram Language itself, there’ve also been lots of functions added to the <a href="https://resources.wolframcloud.com/FunctionRepository">Wolfram Function Repository</a>, lots of paclets added to the <a href="https://resources.wolframcloud.com/PacletRepository/">Wolfram Language Paclet Repository</a>, not to mention new entries in the <a href="https://resources.wolframcloud.com/NeuralNetRepository/">Wolfram Neural Net Repository</a>, <a href="https://datarepository.wolframcloud.com/" target="_blank" rel="noopener">Wolfram Data Repository</a>, etc. </p> <p>Yes, as always it’s been a lot of work. But today it’s here, and we’re proud of it: Version 14.1!</p> <p></p> <p style="font-style: italic; color: #555;"> <style type="text/css"> div.bottomstripe { max-width:620px; margin-bottom:10px; background-color: #fff39a; border: solid 2px #ffd400; padding: 7px 10px 7px 10px; line-height: 1.2;} #blog .post_content .bottomstripe a, #blog .post_content .bottomstripe a:link, #blog .post_content .bottomstripe a:visited { font-family:"Source Sans Pro",Arial,Sans Serif; font-size:11pt; color:#aa0d00;} </style> <div class="bottomstripe"> <a href="https://www.wolfram.com/download-center/"><strong>Download your 14.1 now! 
» </strong> (It’s already live in the Wolfram Cloud!)</a> </div> ]]></content:encoded> <enclosure url="https://content.wolfram.com/sites/43/2024/07/manipulatevideo.mp4" length="587461" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/07/reapvideo.mp4" length="220541" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/07/hummingbird.mp4" length="548924" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/07/butterfly.mp4" length="190836" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/07/risefallromanempire.mp4" length="299030" type="video/mp4" /> <enclosure url="https://content.wolfram.com/sites/43/2024/07/visionpro1-big.mp4" length="3016749" type="video/mp4" /> </item> <item> <title>Ruliology of the “Forgotten” Code 10</title> <link>https://writings.stephenwolfram.com/2024/06/ruliology-of-the-forgotten-code-10/</link> <comments>https://writings.stephenwolfram.com/2024/06/ruliology-of-the-forgotten-code-10/#comments</comments> <pubDate>Sat, 01 Jun 2024 15:21:39 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Historical Perspectives]]></category> <category><![CDATA[New Kind of Science]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=59703</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/05/tile-Code10-v1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>My All-Time Favorite Science Discovery June 1, 1984—forty years ago today—is when it would be fair to say I made my all-time favorite science discovery. Like with basically all significant science discoveries (despite the way histories often present them) it didn’t happen without several long years of buildup. 
But June 1, 1984, was when I […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/05/tile-Code10-v1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><h2 id="my-all-time-favorite-science-discovery">My All-Time Favorite Science Discovery</h2> <p>June 1, 1984—forty years ago today—is when it would be fair to say I made my <a href="https://www.wolframscience.com/nks/p27--how-do-simple-programs-behave/">all-time favorite science discovery</a>. Like with <a href="https://wolfram-media.com/products/idea-makers/">basically all significant science discoveries</a> (despite the way histories often present them) it didn’t happen without several long years of buildup. But June 1, 1984, was when I finally had my “aha” moment—even though in retrospect the discovery had actually been hiding in plain sight for more than two years.</p> <p>My diary from 1984 has a cryptic note that shows what happened on June 1, 1984:</p> <p><img src="https://content.wolfram.com/sites/43/2024/05/sw053024aimg1.png" alt="Ruliology of the “Forgotten” Code 10" title="Ruliology of the “Forgotten” Code 10" width="480" height="537"></p> <p>There’s a part that says “BA 9 pm → LDN”, recording the fact that at 9pm that day I took a (British Airways) flight to London (from New York; I lived in Princeton at that time). “Sent vega monitor → SUN” indicates that I had sent the broken display of a computer I called “vega” to Sun Microsystems. But what’s important for our purposes here is the little “side” note:<br /> <em style="margin-left: 15px;">Take C10 pict.</em><br /> <em style="margin-left: 15px;">R30</em><br /> <em style="margin-left: 15px;">R110</em></p> <p>What did that mean? <em>C10</em>, <em>R30</em> and <em>R110</em> were my shorthand designations for particular, very simple programs of types I’d been studying: “code 10”, “rule 30” and “rule 110”. 
And my note reminded me that I wanted to take pictures of those programs with me that evening, making them on the laser printer I’d just got (laser printers were rare and expensive devices at the time). <span id="more-59703"></span></p> <p>I’d actually made (and <a href="https://files.wolframcdn.com/pub/www.stephenwolfram.com/pdf/statistical-mechanics-cellular-automata.pdf">even published</a>) pictures of all these programs before, but at least for rule 30 and rule 110 those pictures were very low resolution:</p> <p><a class="magnific image sourced" data-url="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#statistical-mechanics-and-simple-programs" href="https://content.wolfram.com/sites/43/2024/05/sw053124img2.png"><img alt="Click to enlarge" title="Click to enlarge" src="https://content.wolfram.com/sites/43/2024/05/sw053124img2.png" width="282" height="188"></a></p> <p>But on June 1, 1984, my picture was much better:</p> <p><a class="magnific image sourced" data-url="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/" href="https://content.wolfram.com/sites/43/2024/05/sw053124img3.png"><img alt="Click to enlarge" title="Click to enlarge" src="https://content.wolfram.com/sites/43/2024/05/sw053124img3.png" width='423' height='226'></a></p> <p>For several years I’d been <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/">studying the question of “where complexity comes from”</a>, for example in nature. 
I’d realized there was something very computational about it (and that had even led me to the <a href="https://www.wolframscience.com/nks/p737--computational-irreducibility/">concept of computational irreducibility</a>—a term I <a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#computational-irreducibility-and-rule-30">coined just a few days before</a> June 1, 1984). But somehow I had imagined that “true complexity” must come from something already complex or at least random. Yet here in this picture, plain as anything, complexity was just being “created”, basically from nothing. And all it took was following a very simple rule, starting from a single black cell. </p> <p>Our usual intuition that to make something complex required “complex effort” <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment/#sect-2-2--the-need-for-a-new-intuition">was, I realized, simply wrong</a>. In the computational universe one needed a new intuition. And the picture of rule 30 I generated that day was what finally made me understand that. Still, although I hadn’t internalized it before, several years of work had prepared me for this. And just days later I <a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#computational-irreducibility-and-rule-30">was at a conference</a> already talking confidently about the implications of what I’d seen in rule 30.</p> <p>Over the years that followed, rule 30 became basically the face of the phenomenon I had discovered. 
By 1985 I <a href="https://content.wolfram.com/sw-publications/2020/07/random-sequence-generation-cellular-automata.pdf">devoted a whole paper to it</a>, in <em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em> it was my <a href="https://content.wolfram.com/sw-publications/2020/07/random-sequence-generation-cellular-automata.pdf">initial and quintessential example</a>, for the past quarter century a picture of rule 30 has adorned my <a href="https://www.stephenwolfram.com/scrapbook/2017-rule-30-business-cards/">personal business cards</a>, and in 2019 we <a href="https://writings.stephenwolfram.com/2019/10/announcing-the-rule-30-prizes/">launched the Rule 30 Prizes</a> to promote the rich basic science of rule 30:</p> <p><a href='https://rule30prize.org'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img4.png' alt='Rule 30 Prize' title='Rule 30 Prize' width='300' height='auto'/></a></p> <p>But what about “<em>C10</em>”—the first item in my cryptic note? What was that? And what became of it?</p> <h2 id="first-sightings-of-code-10">First Sightings of Code 10</h2> <p>Well, <em>C10</em> was “code 10”, or, more fully, “<em>k</em> = 2, <em>r</em> = 2 <a href="https://www.wolframscience.com/nks/p60--more-cellular-automata/">totalistic code 10 cellular automaton</a>”. (I used the term “code” as a way to indicate a totalistic, rather than general, “rule”.) And, actually, I had looked at code 10 several times before, never really paying much attention to it.</p> <p>The first explicit mention I find in my archives is from February 1983 (apparently reporting on something I’d done in January of that year). I had been doing all sorts of computer experiments on cellular automata, recording the results in a lab notebook. One page has observations about what I then called “summational rules” (I would soon rename these “totalistic”). 
And there’s code 10:</p> <p><a class="magnific image sourced" data-url="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/" href="https://content.wolfram.com/sites/43/2024/05/sw053124img5-min.png"><img alt="Click to enlarge" title="Click to enlarge" src="https://content.wolfram.com/sites/43/2024/05/sw053124img5-min.png" width='528' height='441'></a></p> <p>Mostly I had been studying the behavior starting from random initial conditions, but for code 10 I noted: “very irregular, even from simple initial state”. Within a couple of months I had even made (on an electrostatic printer) a high-resolution picture of code 10 starting from a single black cell—and here it is, prepared for publication, Scotch tape and all:</p> <p><a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw053124img7.png'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img7.png' alt='Click to enlarge' title='Click to enlarge' width='465' height='220'/></a></p> <p>It appeared in a paper I wrote in May 1983. 
But the paper (entitled “<a href="https://content.wolfram.com/sw-publications/2020/07/universality-complexity-cellular-automata.pdf">Universality and Complexity in Cellular Automata</a>”) was mostly about other things (for example, introducing my <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-2--four-classes-of-behavior">four general classes of cellular automaton behavior</a> and talking quite a lot about <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-8--structures-in-class-4-systems">code 20 as an example of a class 4 rule</a>), and it contained only a passing comment about code 10:</p> <p><a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw053124img8.png'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img8.png' alt='Click to enlarge' title='Click to enlarge' width='576' height='394'/></a></p> <p>Code 10 is a range-2 rule, which means that the patterns it generates can grow by 2 cells on each side at each step. And the result is that the patterns quickly get quite wide, so that if one cuts them off when they “hit the edge of the page” (as my early programs “conveniently” tended to do) they don’t go very far, and one doesn’t get to see much of code 10’s behavior. 
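</p> <p>To make the width issue concrete: growing by 2 cells on each side per step means a pattern started from a single cell can be up to 4<em>t</em> + 1 cells wide after <em>t</em> steps, against 2<em>t</em> + 1 for a range-1 rule like rule 30. Here is a minimal Wolfram Language sketch of that comparison. The names <tt>code10</tt>, <tt>steps</tt> and <tt>evolution</tt> are purely illustrative, and this is an assumption-laden sketch rather than the code behind any of the pictures here:</p> <pre><code>(* code 10 as a totalistic rule: k = 2 colors, range r = 2 *)
code10 = {10, {2, 1}, 2};
steps = 100;

(* evolve from a single black cell on a white background *)
evolution = CellularAutomaton[code10, {{1}, 0}, steps];

(* number of rows and columns in the array of cell values *)
Dimensions[evolution]

(* the same measurement for the range-1 rule 30, for comparison *)
Dimensions[CellularAutomaton[30, {{1}, 0}, steps]]

(* picture of the code 10 evolution *)
ArrayPlot[evolution]</code></pre> <p>As I understand it, the <tt>{{1}, 0}</tt> initial condition gives a single black cell on a white background, with the array made wide enough to hold every cell that could be affected, so nothing gets cut off at an “edge of the page”.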
</p> <p>And it was this piece of “ergonomics” that caused me to basically ignore code 10—and not to recognize the “rule 30 phenomenon” until I happened to produce that high-resolution image of rule 30 on June 1, 1984.</p> <p>I didn’t entirely forget code 10, for example <a href="https://www.wolframscience.com/nks/p882--why-these-discoveries-were-not-made-before/">mentioning it in a note</a> to “<a href="https://www.wolframscience.com/nks/p882--why-these-discoveries-were-not-made-before/">Why These Discoveries Were Not Made Before</a>” in my 2002 book <em>A New Kind of Science</em>:</p> <p><a class="magnific image sourced" data-url="https://www.wolframscience.com/nks/p882--why-these-discoveries-were-not-made-before/" href="https://content.wolfram.com/sites/43/2024/05/sw053124img9.png"><img alt="Click to enlarge" title="Click to enlarge" src="https://content.wolfram.com/sites/43/2024/05/sw053124img9.png" width='508' height='364'></a></p> <p>But now that forty years have passed since I made—and basically ignored—that “<em>C10</em>” picture, I thought it would be nice to go back and see what I missed, and to use our modern <a href="https://www.wolfram.com/language">Wolfram Language</a> tools to spend a few hours checking out the story of code 10. </p> <p>It’s an exercise in what I <a href="https://writings.stephenwolfram.com/2021/09/charting-a-course-for-complexity-metamodeling-ruliology-and-more/">now call “ruliology”</a>—the basic science of studying what simple rules do. And whenever one does ruliology there are certain standard things one can look at—that I showed many examples of in <em>A New Kind of Science</em>. But in a quintessential reflection of computational irreducibility there are also always “surprises”, and special phenomena one did not expect. 
And so it is with code 10.</p> <h2 id="code-10:-the-basic-story">Code 10: The Basic Story</h2> <p><span></p> <div id="gpt-stripe" style="background: #f6fcff87; padding: 0.75rem 1.5rem;border: 1px solid #aeccd987;font-family: 'Source Sans Pro', sans-serif;margin-bottom: 2.5rem;max-width: 620px;/* font-size: .6rem; */"> <p style="font-size: .85rem;color: #3f5f6a;line-height: 1.5;padding-bottom: 0;display: block;"><em>Note: Click any diagram to get Wolfram Language code to reproduce it.</em></p> </div> <p>Code 10 is a cellular automaton operating on a line of black and white cells, at each step adding up the values of the 5 cells up to distance 2 from any given cell (black is 1, white is 0). If the total is 1 or 3, the cell is black on the next step; otherwise it’s white (in base 2 the number 10 is 001010):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img10.png' alt='' title='' width='346' height='33'> </div> </p></div> <p>And, yes, in many ways this rule is even simpler to describe—at least in words—than rule 30. And if one thinks of it <a href="https://www.wolframscience.com/nks/notes-3-2--rule-expressions-for-cellular-automata/">in terms of Boolean expressions</a>, it can also be written in a very simple form:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img11.png' alt='' title='' width='318' height='14'> </div> </p></div> <p>(By the way, as a general <em>k</em> = 2, <em>r</em> = 2 rule, code 10 is rule 376007062.)</p> <p>So what does code 10 do? 
Here are a few steps of its evolution starting from a single black cell:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img12.png' alt='' title='' width='647' height='166'> </div> </p></div> <p>And here are 2000 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/2000steps053124.png' alt='' title='' width='620' height='auto'> </div> </p></div> <p>And, yes, even though it’s a simple rule, its behavior looks highly complex, and in many ways quite random. One immediate observation is that—unlike rule 30—code 10 is symmetric, so the pattern it generates is left-right symmetric. The center column isn’t interesting: after having black cells for 2 steps, it’s white thereafter. 
(And by substituting values <em>yx</em>0<em>xy</em> into the Boolean expression above, it’s easy to prove this.)</p> <p>Filling the white region around the center column with red we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img14.png' alt='' title='' width='646' height='282'> </div> </p></div> <p>There doesn’t seem to be any long-range regularity to the way the width of this region changes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img18.png' alt='' title='' width='619' height='164'> </div> </p></div> <p>And indeed the (even) widths seem at least close to exponentially distributed:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img19a_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img19.png' alt='' title='' width='331' height='115'> </div> </p></div> <p>What if one goes one column to the left or right of the center? Here’s the beginning of the sequence one gets:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img20.png' alt='' title='' width='640' height='7'> </div> </p></div> <p>And, yes, every other cell is white. 
Picking only “even-numbered positions” we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img21.png' alt='' title='' width='640' height='7'> </div> </p></div> <p>Looking at the accumulated mean for 100,000 steps suggests that this sequence isn’t “uniformly random”, and that slightly fewer than 50% of the cells end up being black: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img22.png' alt='' title='' width='483' height='125'> </div> </p></div> <p>Going away from the center line, every other column has white cells every two steps. Sampling the pattern only at “odd positions” in both “space and time” we get a pattern that looks similar—though not identical—to our original one:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg13.png' alt='' title='' width='620' height='auto'> </div> </p></div> <p>Looking at every cell, the overall density of the pattern seems to approach about 0.361. Looking only at “odd positions” the overall density seems to be about 0.49. And, yes, the fact that it doesn’t seem to become exactly 1/2 is one of those <a href="https://www.wolframscience.com/nks/notes-7-6--circularity-in-code-746/">typical “not-quite-as-expected” things</a> that one routinely finds in doing ruliology.</p> <p>There are some aspects of the code 10 pattern, though, that inevitably work in particular ways. 
For example, if we “rotate” the pattern so that its boundary is vertical, we can see that close to the boundary the pattern is periodic:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053024img24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img24a.png' alt='' title='' width='571' height='181'> </div> </p></div> <p>The <a href="https://www.wolframscience.com/nks/notes-2-1--rule-30/">period progressively doubles</a> at depths separated by 1, 1, 4, 6, 8, 14, 124, …—yielding what may perhaps ultimately be a logarithmic growth of period with depth:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img25-v2.png' alt='' title='' width='377' height='auto'> </div> </p></div> <h2 id="other-initial-conditions,-and-a-surprise">Other Initial Conditions, and a Surprise</h2> <p>We’ve looked at what happens with an initial condition consisting of a single black cell. But what about other initial conditions? Here are a few examples:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg2.png' alt='' title='' width='570' height='250'> </div> </p></div> <p>We might have thought that the “strength of randomness” would be large enough that we’d get patterns that look basically the same in all cases. But then what’s going on in the <img src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg27.png' style="margin-bottom: -4px" width='20' height='16'/> case? 
Running twice and five times as long reveals it’s actually nothing special; there just happen to be a few large triangles near the top:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg4.png' alt='' title='' width='569' height='137'> </div> </p></div> <p>So will nothing else notable happen with larger initial conditions? </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg5.png' alt='' title='' width='628' height='208'> </div> </p></div> <p>What about <img src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg30.png' width='30' height='7' style="margin-bottom: 1px"/>? Let’s run that a little longer:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img31_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img31.png' alt='' title='' width='657' height='329'> </div> </p></div> <p>And OMG! It’s not random and unpredictable at all. It’s a <a href="https://www.wolframscience.com/nks/p26--how-do-simple-programs-behave/">nested pattern</a>!</p> <p>Even in the midst of all that randomness and computational irreducibility, here is a dash of computational reducibility—and a reminder that there are always pockets of reducibility to be found in any computationally irreducible system, though there’s no guarantee how difficult they will be to find in any given case. 
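</p> <p>For readers who want to poke at these patterns outside Wolfram Language, here is a minimal Python sketch of the code 10 rule as it was described in words earlier: total the 5 cells within distance 2, then read off bit number “total” of the code. The function names (<code>totalistic_step</code>, <code>evolve</code>) and the list-of-rows representation are illustrative assumptions, not the code behind the diagrams, and an infinite white background is assumed.</p>

```python
def totalistic_step(cells, code, r=2):
    """Apply one step of a 2-color totalistic rule on a white background.

    The returned row is r cells wider on each side than the input row.
    """
    padded = [0] * (2 * r) + list(cells) + [0] * (2 * r)
    new_row = []
    for i in range(r, len(padded) - r):
        total = sum(padded[i - r:i + r + 1])  # black cells in the window
        new_row.append((code >> total) & 1)   # bit `total` of the code
    return new_row


def evolve(seed, code, steps, r=2):
    """Return rows 0..steps of the evolution from the given seed."""
    rows = [list(seed)]
    for _ in range(steps):
        rows.append(totalistic_step(rows[-1], code, r))
    return rows


if __name__ == "__main__":
    rows = evolve([1], code=10, steps=8)
    width = len(rows[-1])
    for row in rows:
        # Center each row so the growing pattern lines up as a triangle.
        print("".join("#" if c else "." for c in row).center(width))
```

<p>With a single-cell seed, row <em>t</em> in this sketch has 4<em>t</em> + 1 cells and its center sits at index 2<em>t</em>, so the left-right symmetry and the behavior of the center column noted earlier can be checked directly.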
</p> <p>The particular nested pattern we get here is a bit like the one from the additive <a href="https://www.wolframscience.com/nks/p58--more-cellular-automata/">elementary rule 150</a>, that simply computes the total mod 2 of the three cells in each neighborhood:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img32_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img32.png' alt='' title='' width='332' height='167'> </div> </p></div> <p>And it turns out to be almost exactly the <em>r</em> = 2 analog of this—the additive rule (code 42) that takes the total mod 2 of the five cells in the <em>r</em> = 2 neighborhood:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img33_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img33.png' alt='' title='' width='435' height='218'> </div> </p></div> <p>The <a href="https://www.wolframscience.com/nks/notes-6-6--fractal-dimensions-of-additive-cellular-automata/">limiting fractal dimension of this pattern is</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img37_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img37.png' alt='' title='' width='388' height='19'> </div> </p></div> <p>Is <img src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg38.png' width='30' height='7'/> unique, or does this same phenomenon happen with other “seeds”? 
It turns out to happen again for:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img39_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img39.png' alt='' title='' width='558' height='36'> </div> </p></div> <p>So what’s going on here? Comparing the detailed pattern in the code 10 <img src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg40.png' width='30' height='7' style="margin-bottom: -1px"/> case with the additive rule <img src='https://content.wolfram.com/sites/43/2024/05/sw053024aimg41.png' style="margin-bottom: -5px" width='8' height='16'/> case, there’s no immediate obvious correspondence:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img42_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img42.png' alt='' title='' width='474' height='251'> </div> </p></div> <p>But if we look at the rules for code 10 and code 42 respectively:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img43_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img43.png' alt='' title='' width='298' height='69'> </div> </p></div> <p>We notice that there’s really only one difference. In code 10, <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img44.png' style="margin-bottom: -6px" width='36' height='16'/> gives <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img45.png' width='8' height='8'/> while in code 42, it gives <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img46.png' width='8' height='8'/>. 
In other words, if code 10 avoids ever generating any <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img47.png' style="margin-bottom: -5px" width='36' height='16'/> block, it will inevitably behave just like code 42—and show nested behavior. And that’s what happens for the initial conditions above; they can for example lead to <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img48.png' style="margin-bottom: -6px" width='30' height='16'/> blocks, but never <img src='https://content.wolfram.com/sites/43/2024/05/sw053124img49.png' style="margin-bottom: -5px" width='36' height='16'/>.</p> <p>Another notable and at first unexpected phenomenon concerns the overall density of black cells in patterns from different initial conditions: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img50_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img50.png' alt='' title='' width='600' height='209'> </div> </p></div> <p>And what we find is that for even-length initial blocks the density is about 0.47, while for odd ones it’s about 0.36. At first it might seem very strange that something as global as overall density could be affected by the initial conditions. But once again, it’s a story of what blocks can occur: in the odd-length case, there’s a checkerboard of guaranteed-white cells, which just doesn’t exist in the even-length case.</p> <h2 id="other-things-to-study">Other Things to Study</h2> <p>We’ve been looking at what code 10 does with specific, simple initial conditions. What about with <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness/">random initial conditions</a>? Well, it’s not terribly exciting. 
It basically just looks random all the way through—which, by the way, is part of the reason I didn’t pay much attention to code 10 back in 1983:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img51_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img51.png' alt='' title='' width='628' height='210'> </div> </p></div> <p>But even though this looks quite random, it’s for example not the case that every single possible block of values can occur. Though it’s very close. Let’s say we start from all possible sequences of 0s and 1s in the initial conditions. Then—using <a href="https://content.wolfram.com/sw-publications/2020/07/computation-theory-cellular-automata.pdf">methods I developed in 1984</a> based <a href="https://www.wolframscience.com/nks/p278--the-notion-of-attractors/">on finite automata</a>—it’s possible to determine that even after 1 step there are some blocks of values that can’t occur. But it turns out that one has to go all the way to blocks of length 36 before one finds an example:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img52_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img52.png' alt='' title='' width='343' height='11'> </div> </p></div> <p>Although the patterns generated by code 10 generally look quite random, if we look closely we can see at least patches that are fairly regular. The most obvious examples are white triangles. 
But there are other examples, most notably associated with regions consisting of <a href="https://www.wolframscience.com/nks/p268--special-initial-conditions/">repetitions of blocks with periodic behavior</a>: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img53_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg6.png' alt='' title='' width='666' height='250'> </div> </p></div> <p>Complementary to this is the question of what code 10 <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-4--systems-of-limited-size-and-class-2-behavior">does in regions of limited size</a>—say with cyclic boundary conditions, starting from a single black cell. The result is quite different for regions of different sizes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img54_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg7A.png' alt='' title='' width='566' height='342'> </div> </p></div> <p>For a region of size <em>n</em>, a symmetric rule like code 10 <a href="https://www.wolframscience.com/nks/p260--systems-of-limited-size-and-class-2-behavior/">must repeat with a period</a> of at most 2<sup><em>n</em>/2</sup>. Here are the actual repetition periods as a function of size, shown on a log plot:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img56_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw053124img56.png' alt='' title='' width='622' height='164'> </div> </p></div> <p>These are results specifically for the single-cell initial condition. 
We can also generate <a href="https://www.wolframscience.com/nks/notes-6-7--state-networks-for-systems-of-limited-size/">state transition diagrams</a> for all 2<sup><em>n</em></sup> possible states in a size <em>n</em> code 10 system:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw053124img58a_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/06/sw060324updatesimg9.png' alt='' title='' width='537' height='552'> </div> </p></div> <p>And mostly what we see is highly contractive behavior, with many different initial states evolving to the same final state—even though “eventually” we should start seeing larger cycles of the kind we picked up above when we looked at evolution from a single-cell initial condition.</p> <p>And, yes, I could go on, for example repeating <a href="https://writings.stephenwolfram.com/2019/10/announcing-the-rule-30-prizes/">analyses I’ve done in the past for rule 30</a>. A lot of what we’d see would be at least qualitatively much the same as for rule 30—and essentially the result of the appearance of computational irreducibility in both cases. But it’s a feature of the computational universe—and indeed one of the many consequences of computational irreducibility—that different computational systems will inevitably have different “idiosyncrasies”. And so it is for rule 30 and code 10. Rule 30 has an “<a href="https://www.wolframscience.com/nks/p604--cryptography-and-cryptanalysis/">Xor on one side</a>” which gives it special surjectivity properties. Code 10 on the other hand has its block emulations, which lead, for example, to the surprise of nesting.</p> <p>I’ve now spent many years studying the ruliology of simple programs, and if there’s one thing that still amazes me after all that time it’s that there are always surprises. 
Even with very simple underlying rules one can never be sure what will happen; there’s no choice but to just do the experiments and see. And, in my experience, pretty much whenever one thinks one’s “got to the end” and “seen everything there is to see”, something completely unexpected will pop out—a reminder that, as the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> tells us, these simple computational systems are in some sense a microcosm of everything that’s possible.</p> <p>Ruliology is in many ways the ultimate foundational science—a science concerned with pure abstract rules not set up with any particular reference either to nature or to human choice. In a sense ruliology is our best path to ultimate pure abstraction—and unfettered exploration of the ruliad. And at least for me, it’s also something very satisfying to do. These days, with modern Wolfram Language, it’s all very streamlined and fast. Sitting at one’s computer, one can immediately start visiting vast uncharted areas of the computational universe, seeing things—and often very beautiful things—that have never been seen before, and discovering new but everlasting things anchored in the bedrock of computation and of simple programs.</p> <p>It’s been fun spending a few hours studying the ruliology of code 10. Essentially everything I’ve done here I could have done (though not nearly as efficiently) back in 1983 when I first came up with code 10. But as it was, code 10 in a sense had to “wait patiently” for someone to come and look at it. The form of the <a href="https://writings.stephenwolfram.com/2017/06/oh-my-gosh-its-covered-in-rule-30s/">rule 30 pattern</a> is in some ways more “human-scaled” than code 10. But, as we’ve seen here, code 10 still manifests the same core phenomenon as rule 30. 
And now, forty years after printing that “<em>C10</em>” picture, I’m happy to be able to say that I think I’ve finally gotten at least a passing acquaintance with another remarkable “computational world” out there in the computational universe: the world of code 10. </p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/06/ruliology-of-the-forgotten-code-10/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item> <title>Why Does Biological Evolution Work? A Minimal Model for Biological Evolution and Other Adaptive Processes</title> <link>https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/</link> <comments>https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#comments</comments> <pubDate>Fri, 03 May 2024 18:40:24 +0000</pubDate> <dc:creator><![CDATA[Stephen Wolfram]]></dc:creator> <category><![CDATA[Artificial Intelligence]]></category> <category><![CDATA[Computational Science]]></category> <category><![CDATA[Life Science]]></category> <category><![CDATA[New Kind of Science]]></category> <category><![CDATA[Ruliology]]></category> <guid isPermaLink="false">https://writings.stephenwolfram.com/?p=58916</guid> <description><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/05/bioevo-tile-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span>See also: Foundations of Biological Evolution: More Results & More Surprises [December 5, 2024]. The Model Why does biological evolution work? And, for that matter, why does machine learning work? Both are examples of adaptive processes that surprise us with what they manage to achieve. So what’s the essence of what’s going on? 
I’m going […]]]></description> <content:encoded><![CDATA[<span class="thumbnail"><img width="128" height="108" src="https://content.wolfram.com/sites/43/2024/05/bioevo-tile-1.png" class="attachment-thumbnail size-thumbnail wp-post-image" alt="" /></span><p style="font-size:14px;background:#e5f2f85c;padding:5px 15px;border:1px solid #cfdde3c7;max-width:620px;margin:25px 0px;"><em>See also: <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/">Foundations of Biological Evolution: More Results & More Surprises</a> <br />[December 5, 2024].</em></p> <h2 id="the-model">The Model</h2> <p>Why does biological evolution work? And, for that matter, why does machine learning work? Both are examples of adaptive processes that surprise us with what they manage to achieve. So what’s the essence of what’s going on? I’m going to concentrate here on biological evolution, though much of what I’ll discuss is also relevant to machine learning—but I’ll plan to explore that in more detail elsewhere.</p> <p>OK, so what is an appropriate minimal model for biology? My core idea here is to think of biological organisms as computational systems that develop by following simple underlying rules. These underlying rules in effect correspond to the genotype of the organism; the result of running them is in effect its phenotype. <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment#sect-2-1--how-do-simple-programs-behave">Cellular automata</a> provide a convenient example of this kind of setup. 
Here’s an example involving cells with 3 possible colors; the rules are shown on the left, and the behavior they generate is shown on the right:</p> <div id="gpt-stripe" style="background: #f6fcff87;padding: 0.75rem 1.5rem;border: 1px solid #aeccd987;font-family: 'Source Sans Pro', sans-serif;margin-bottom: 2.5rem;max-width: 620px;/* font-size: .6rem; */"> <p style="font-size: .85rem;color: #3f5f6a;line-height: 1.5;padding-bottom: 0;display: block;"><em>Note: Click any diagram to get Wolfram Language code to reproduce it.</em></p> </div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' style="margin-top: -15px" src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg1.png' alt='' title='' width='395' height='362'> </div> </p></div> <p>We’re starting from a single (<svg style="width:10px;height:10px;" version="1.1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 40 40"><rect x="0.6" y="0.7" style="fill:#FF5D01;" width="38.7" height="38.7"></rect></svg>) cell, and we see that from this “seed” a structure is grown—which in this case dies out after 51 steps. And in a sense it’s already remarkable that we can generate a structure that neither goes on forever nor dies out quickly—but instead manages to live (in this case) for exactly 51 steps.<span id="more-58916"></span></p> <p>But let’s say we start from the trivial (“null”) rule that makes any pattern die out immediately. Can we end up “adaptively evolving” to the rule above? 
Imagine making a sequence of randomly chosen “point mutations”—each changing just one outcome in the rule, as in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg2.png' alt='' title='' width='262' height='68'> </div> </p></div> <p>Then suppose that at each step—in a minimal analog of natural selection—we “accept” any mutation that makes the lifetime longer (though not infinite), or at least the same as before, and we reject any mutation that makes the lifetime shorter, or infinite. It turns out that with this procedure we can indeed “adaptively evolve” to the rule above (where here we’re showing only “waypoints” of progressively greater lifetime): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg3.A.png' alt='' title='' width='579' height='329'> </div> </p></div> <p>Different sequences of random mutations give different sequences of rules. But the remarkable fact is that in almost all cases it’s possible to “make progress”—and routinely reach rules that give long-lived patterns (here with lifetimes 107, 162 and 723) with elaborate morphological structure:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050324tallimageimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224modelimg4X-v3.png' alt='' title='' width='607' height='1200'></div> </div> <p>Is it “obvious” that our simple process of adaptive evolution will be able to successfully “wrangle” things to achieve this? No. 
But the fact that it can seems to be at the heart of why biological evolution manages to work. </p> <p>Looking at the sequences of pictures above we see that there are often in effect “different mechanisms” for producing long lifetimes that emerge in different sequences of rules. Typically we first see the mechanism in simple form, then as the adaptive process continues, the mechanism gets progressively more developed, elaborated and built on—not unlike what we often appear to see in the fossil record of biological evolution.</p> <p>But let’s drill down and look in a little more detail at what’s happening in the simple model we’re using. In the 3-color nearest-neighbor (<em>k</em> = 3, <em>r</em> = 1) cellular automata we’re considering, there are 26 (= 3<sup>3</sup> – 1) relevant cases in the rule (there’d be 27 if we didn’t insist that<br /> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg6.png' width='72' height='14'>). “Point mutations” affect a single case, changing it to one of two (= 3 – 1) possible alternative outcomes—so that there are altogether 52 (= 26 × 2) possible distinct “point mutations” that can be made to a given rule. 
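</p> <p>The counting here can be checked with a short sketch. It represents a <em>k</em> = 3, <em>r</em> = 1 rule as a Python dictionary from 3-cell neighborhoods to outcomes, and it assumes that the one pinned case is the all-white neighborhood staying white (the constraint itself appears above only as an image). The names <code>null_rule</code> and <code>point_mutations</code> are hypothetical, chosen just for this illustration.</p>

```python
from itertools import product

K = 3  # number of colors; 0 is taken to be white

# All 27 nearest-neighbor cases (left, center, right).
NEIGHBORHOODS = list(product(range(K), repeat=3))
PINNED = (0, 0, 0)  # assumed fixed case: all white stays white


def null_rule():
    """The trivial rule in which every case gives white."""
    return {nb: 0 for nb in NEIGHBORHOODS}


def point_mutations(rule):
    """Yield each rule that differs from `rule` in exactly one unpinned case."""
    for nb in NEIGHBORHOODS:
        if nb == PINNED:
            continue
        for value in range(K):
            if value != rule[nb]:
                yield {**rule, nb: value}


if __name__ == "__main__":
    mutable_cases = len(NEIGHBORHOODS) - 1
    mutants = list(point_mutations(null_rule()))
    print(mutable_cases, len(mutants))
```

<p>Whatever rule one starts from, the generator yields 26 × 2 = 52 mutants, each differing from the original in exactly one case.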
</p> <p>For example, starting from the rule</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg9.png' alt='' title='' width='230' height='10'> </div> </p></div> <p>the results of possible single point mutations are:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg10-v2.png' alt='' title='' width='561' height='85'> </div> </p></div> <p>And even with such point mutations there’s usually considerable diversity in the behavior they generate: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg11A.png' alt='' title='' width='567' height='290'> </div> </p></div> <p>In quite a few cases the pattern generated is exactly the same as the one for the original rule. In other cases it dies out more quickly—or it doesn’t die out at all (either becoming periodic, or growing forever). And in this particular example, in just one case it achieves “higher fitness”, surviving longer. 
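</p> <p>To make this sorting of mutants concrete, here is a rough sketch. It makes several assumptions not fixed by the text: the seed is a single cell of color 1, the lifetime counts steps until the row is entirely white, and any pattern still alive after <code>max_steps</code> (a hypothetical cutoff) is treated as infinite, which could misclassify very long-lived finite patterns. All function names are illustrative.</p>

```python
from itertools import product

K = 3  # number of colors; color 0 is the white background
CASES = [nb for nb in product(range(K), repeat=3) if nb != (0, 0, 0)]


def step(cells, rule):
    """One step of a nearest-neighbor rule; the row grows by 1 cell per side."""
    padded = [0, 0] + cells + [0, 0]
    return [rule[tuple(padded[i - 1:i + 2])] for i in range(1, len(padded) - 1)]


def lifetime(rule, seed=(1,), max_steps=200):
    """Steps until the pattern is all white, or None if it outlives max_steps."""
    cells = list(seed)
    for t in range(1, max_steps + 1):
        cells = step(cells, rule)
        if not any(cells):
            return t
    return None  # treated as "infinite" (an assumption tied to the cutoff)


def outcome(base_life, new_life):
    """Label a mutant relative to the lifetime of the rule it came from."""
    if new_life is None:
        return "infinite"
    if new_life == base_life:
        return "same"
    return "longer" if new_life > base_life else "shorter"


def tally_mutants(rule):
    """Count outcomes over all 52 single point mutations of `rule`."""
    base_life = lifetime(rule)
    if base_life is None:
        raise ValueError("the starting rule must itself die out")
    counts = {"same": 0, "shorter": 0, "longer": 0, "infinite": 0}
    for nb in CASES:
        for value in range(K):
            if value != rule[nb]:
                mutant = {**rule, nb: value}
                counts[outcome(base_life, lifetime(mutant))] += 1
    return counts


if __name__ == "__main__":
    null_rule = {nb: 0 for nb in product(range(K), repeat=3)}
    print(lifetime(null_rule))       # the null rule kills the seed in one step
    print(tally_mutants(null_rule))
```

<p>Under the acceptance criterion described earlier, a mutant would be kept when its outcome is “same” or “longer”, and rejected when it is “shorter” or “infinite”.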
</p> <p>If we make a sequence of random mutations, many will produce shorter-lived or infinite lifetime (“tumor”) patterns, and these we’ll reject (or, in biological terms, we can imagine they’re “selected out”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg12A.png' alt='' title='' width='591' height='289'> </div> </p></div> <p>But still there can be many “neutral mutations” that don’t change the final pattern (or at least give a pattern of the same length). And at first we might think that these don’t achieve anything. But actually they’re critical in allowing single point mutations to build up to larger mutations that can eventually give longer-lived patterns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg13.png' alt='' title='' width='241' height='271'> </div> </p></div> <p>Tracing our whole adaptive evolution process above, the total number of point mutations involved in getting from one (increasingly long-lived) “fitness waypoint” to another is: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg14_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg14A.png' alt='' title='' width='667' height='242'> </div> </p></div> <p>Here are the underlying rules associated with these fitness waypoints (where the numbers count cumulative “accepted mutations”, ignoring ones that go “back and forth”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg15.png' alt='' title='' width='244' height='341'> </div> </p></div> <p>One way to get a sense of what’s going on is to take the whole sequence of (“accepted”) rules in the adaptive evolution process, and plot them in a <a href="https://reference.wolfram.com/language/ref/DimensionReduce.html">dimension-reduced</a> rendering of the (27-dimensional) rule space:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg16.png' alt='' title='' width='522' height='522'> </div> </p></div> <p>There are periods when there’s a lot of “wandering around” going on, with many mutations needed to “make progress”. And there are other periods when things go much faster, and fewer mutations are needed. </p> <p>As another way to see what’s going on, we can plot the maximum lifetime achieved so far against the total number of mutation steps made:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg17.png' alt='' title='' width='359' height='183'> </div> </p></div> <p>We see plateaus (including an extremely long one) in which “no progress” is made, punctuated by sometimes-quite-large, sudden changes, often brought on by just a single mutation. 
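</p> <p>The accept-or-reject procedure behind this kind of plot can be sketched generically. In the sketch below the genotype, the <code>flip_one_bit</code> mutation and the fitness are stand-ins (a bit string scored by its number of leading 1s), chosen only because they produce neutral mutations and plateaus cheaply; they are not the cellular automaton model itself, and every name is hypothetical:</p>

```python
import random
from typing import Callable, List, Optional, Tuple

Genotype = Tuple[int, ...]


def adaptive_walk(
    start: Genotype,
    mutate: Callable[[Genotype, random.Random], Genotype],
    fitness: Callable[[Genotype], Optional[int]],
    steps: int,
    seed: int = 0,
) -> List[int]:
    """Record the best fitness after each attempted mutation.

    A mutant is accepted when its fitness is finite (not None) and not lower
    than the current one, so neutral mutations are kept.
    """
    rng = random.Random(seed)
    current, best = start, fitness(start)
    if best is None:
        raise ValueError("the starting genotype needs a finite fitness")
    trace = [best]
    for _ in range(steps):
        candidate = mutate(current, rng)
        value = fitness(candidate)
        if value is not None and value >= best:
            current, best = candidate, value
        trace.append(best)
    return trace


def flip_one_bit(genotype: Genotype, rng: random.Random) -> Genotype:
    """Stand-in point mutation: flip one randomly chosen bit."""
    i = rng.randrange(len(genotype))
    return genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def leading_ones(genotype: Genotype) -> int:
    """Stand-in fitness: bits after the first 0 are effectively unused."""
    count = 0
    for bit in genotype:
        if bit == 0:
            break
        count += 1
    return count


trace = adaptive_walk((0,) * 12, flip_one_bit, leading_ones, steps=300)
print(trace[0], trace[-1])
```

<p>In such a trace the value stays flat while mutations are rejected or merely neutral, and it jumps when a single mutation happens to complete a longer run. 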
</p> <p>If we include “rejected mutations” we see that there’s a lot of activity going on even in the plateaus; it just doesn’t manage to make progress (one can think of each red dot that lies below a plateau as being like a mutation—or an organism—that “doesn’t make it”, and is selected out):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg18.png' alt='' title='' width='548' height='280'> </div> </p></div> <p>It’s worth noting that there can be multiple different (“phenotype”) patterns that occur across a plateau. Here’s what one sees in the particular example we’re considering:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg19A.png' alt='' title='' width='651' height='247'> </div> </p></div> <p>But even between these “phenotypically different” cases, there can be many “genotypically different” rules. And in a sense this isn’t surprising, because usually only parts of the underlying rule are “coding”; other parts are “noncoding”, in the sense that they’re not sampled during the generation of the pattern from that rule. 
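</p> <p>One way to make the coding/noncoding distinction concrete is to record which cases of a rule are ever looked up while its pattern is generated. The sketch below assumes a 27-entry outcome list indexed by 9<em>l</em> + 3<em>c</em> + <em>r</em>, a single nonzero cell as the initial condition and a step cutoff; the function name and these conventions are hypothetical, chosen for illustration:</p>

```python
from typing import Sequence, Set

K = 3                    # colors
FREE_CASES = K ** 3 - 1  # 26: every neighborhood except the pinned (0, 0, 0)


def sampled_cases(rule: Sequence[int], max_steps: int = 1000) -> Set[int]:
    """Indices of the free cases that are looked up while the pattern runs."""
    used: Set[int] = set()
    cells = [1]  # single nonzero cell; the value 1 is an arbitrary choice
    for _ in range(max_steps):
        padded = [0, 0] + cells + [0, 0]
        cells = []
        for i in range(len(padded) - 2):
            case = K * K * padded[i] + K * padded[i + 1] + padded[i + 2]
            used.add(case)
            cells.append(rule[case])
        if not any(cells):
            break
    used.discard(0)  # the all-0 case is fixed, so it carries no information
    return used


rule = [0] * 27
rule[3] = 2  # (0, 1, 0) -> 2: the lone cell changes color, then dies
coding = sampled_cases(rule)
noncoding = FREE_CASES - len(coding)
print(sorted(coding), K ** noncoding)
```

<p>If a pattern samples <em>n</em> of the 26 free cases, the remaining 26 – <em>n</em> outcomes can be set arbitrarily, giving 3<sup>26 – <em>n</em></sup> rules that produce the same pattern from this initial condition. 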
</p> <p>And for example this highlights for each “fitness waypoint rule” which cells make use of a “fresh” case in the rule that hasn’t so far been sampled during the generation of the pattern:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg20A.png' alt='' title='' width='516' height='341'> </div> </p></div> <p>And we see that even in the last rule shown here, only 18 of the 26 relevant cases in the rule are actually ever sampled during the generation of the pattern (from the particular, single-red-cell initial condition used). So this means that 8 cases in the rule are “undetermined” from the phenotype, implying that there are 3<sup>8</sup> = 6561 possible genotypes (i.e. rules) that will give the same result. </p> <p>So far we’ve mostly been talking about one particular random sequence of mutations. But what happens if we look at many possible such sequences? 
Here’s how the longest lifetime (or, in effect, “fitness”) increases for 100 different sequences of random mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg22.png' alt='' title='' width='615' height='164'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg23.png' alt='' title='' width='615' height='163'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg24.png' alt='' title='' width='615' height='163'> </div> </p></div> <p>And what’s perhaps most notable here is that it seems as if these adaptive processes indeed don’t “get stuck”. 
It may take a while (with the result that there are long plateaus) but these pictures suggest that eventually “adaptive evolution will find a way”, and one will get to rules that show longer lifetimes—as the progressive development of the distribution of lifetimes reflects: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124modelimg25.png' alt='' title='' width='523' height='127'> </div> </p></div> <h2 id="the-multiway-graph-of-all-possible-mutation-histories">The Multiway Graph of All Possible Mutation Histories</h2> <p>In what we’ve done so far we’ve always been discussing particular paths of adaptive evolution, determined by particular sequences of random mutations. But a powerful way to get a more global view of the process of adaptive evolution is to look—in the spirit of our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>, the <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">ruliad</a>, etc.—not just at individual paths of adaptive evolution, but instead at the <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/">multiway graph</a> of all possible paths. (And in making a correspondence with biology, multiway graphs give us a way to talk about adaptive evolution not just of individual sequences of organisms, but also populations.)</p> <p>To start our discussion, let’s consider not the 3-color cellular automata of the previous section, but instead <a href="https://www.wolframscience.com/nks/p55--more-cellular-automata/">(nearest-neighbor) 2-color cellular automata</a>—for which there are just 128 possible relevant rules. How are these rules related by point mutations? 
We can construct a graph of every possible way that one rule from this set can be transformed to another by a single point mutation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg1.png' alt='' title='' width='440' height='405'> </div> </p></div> <p>If we imagine 5-bit rather than 8-bit rules, there are only 16 relevant ones, and we can readily see that the graph of possible mutations has the form of a <a href="https://reference.wolfram.com/language/ref/GridGraph.html">Boolean hypercube</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg2.png' alt='' title='' width='282' height='243'> </div> </p></div> <p>Let’s say we start from the “null rule” <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg3.png' alt='' title='' width='68' height='9'/>. 
Then we enumerate the rules obtained by a single point mutation (and therefore directly connected to the null rule in the graph above)—then we see what behavior they produce, say from the initial condition …<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg4A.png' alt='' title='' width='90' height='12'/>…:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg5.png' alt='' title='' width='580' height='134'> </div> </p></div> <p>Some of these rules we can view as “making progress”, in the sense that they yield patterns with longer lifetimes (not impressively longer, just 2 rather than 1). But other rules “make no progress” or generate patterns that “live forever”. Keeping only mutations that don’t lead to shorter or infinite lifetimes, we can construct a multiway graph that shows all possible mutation paths:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg6.png' alt='' title='' width='583' height='354'> </div> </p></div> <p>Although this is a very small graph (with just 15 rules appearing), we can already see hints of some important phenomena. There are “fitness-neutral” mutations that can “go both ways”. But there are also plenty of mutations that only “go one way”—because the other way they would decrease fitness. And a notable feature of the graph is that once one’s “committed” to a particular part of the graph, one often can’t reach a different one—suggesting an analogy to the existence of distinct branches in the tree of life. 
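</p> <p>The construction of such a multiway graph can be sketched as a breadth-first search over rules. This is only an illustration under stated assumptions: rules are 8-bit rule numbers with the 000 case pinned to 0, the initial condition is taken to be two adjacent black cells (a guess, since the actual initial condition is shown only as a picture above), and a step cutoff stands in for an infinite lifetime. The names <code>lifetime</code> and <code>multiway_graph</code> are hypothetical:</p>

```python
from collections import deque
from typing import Dict, List, Optional, Sequence


def lifetime(rule: int, init: Sequence[int] = (1, 1), max_steps: int = 50) -> Optional[int]:
    """Steps until all cells are 0, or None if still alive at the cutoff."""
    cells = list(init)
    for step in range(1, max_steps + 1):
        padded = [0, 0] + cells + [0, 0]
        cells = [
            (rule >> (4 * padded[i] + 2 * padded[i + 1] + padded[i + 2])) % 2
            for i in range(len(padded) - 2)
        ]
        if not any(cells):
            return step
    return None


def multiway_graph(start: int = 0) -> Dict[int, List[int]]:
    """Map each reachable rule to the rules one accepted mutation away."""
    graph: Dict[int, List[int]] = {}
    queue = deque([start])
    while queue:
        rule = queue.popleft()
        if rule in graph:
            continue
        current = lifetime(rule)
        graph[rule] = []
        for bit in range(1, 8):          # bit 0 (case 000) stays fixed at 0
            mutant = rule ^ (1 << bit)   # single point mutation
            value = lifetime(mutant)
            # keep only mutants with a finite lifetime that is not shorter
            if value is not None and value >= current:
                graph[rule].append(mutant)
                queue.append(mutant)
    return graph


graph = multiway_graph()
print(len(graph), sorted(graph[0]))
```

<p>With this assumed initial condition the null rule has lifetime 1, and its accepted mutants include both neutral ones and ones with lifetime 2. The number of rules reached depends on the initial condition, so it need not match the 15 in the graph above. 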
</p> <p>Moving beyond 2-color, nearest-neighbor (<em>k</em> = 2, <em>r</em> = 1) cellular automata, we can consider <em>k</em> = 2, <em>r </em>= <span class='InlineFormula'><img style="margin-top: -1px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg10.png' width= '9' height='33' align='absmiddle'></span> ones. A typical such cellular automaton is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg8.png' alt='' title='' width='519' height='86'> </div> </p></div> <p>For <em>k</em> = 2, <em>r</em> = 1 there were a total of 128 (= 2<sup>2<sup>3</sup> – 1</sup>) relevant rules. For <em>k</em> = 2, <em>r</em> = <span class='InlineFormula'><img style="margin-top: -1px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg10.png' width= '9' height='33' align='absmiddle'></span>, there are a total of 32,768 (= 2<sup>2<sup>4</sup> – 1</sup>). 
Starting with the null rule, and again using initial condition <nobr>…<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg4A.png' alt='' title='' width='90' height='12'/>…,</nobr> here are a few specific examples of adaptive evolution paths for such cellular automata:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg13A_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg13.png' alt='' title='' width='497' height='210'> </div> </p></div> <p>And here is the beginning of the multiway graph for <em>k</em> = 2,<em> r</em> = <span class='InlineFormula'><img style="margin-top: -1px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg14.png' width= '9' height='33' align='absmiddle'></span> rules—showing rules reached by up to two mutations starting from the null rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg15.png' alt='' title='' width='636' height='239'> </div> </p></div> <p>This graph contains many examples of “fitness-neutral sets”—rules that have the same fitness and that can be transformed into each other by mutations. 
A few examples of such fitness-neutral sets:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg16.png' alt='' title='' width='619' height='443'> </div> </p></div> <p>In the first case here, the “morphology of the phenotypic patterns” is the same for all “genotypic rules” in the fitness-neutral set. But in the other cases there are multiple morphologies within a single fitness-neutral set. </p> <p>If we included all individual rules we’d get a complete <em>k</em> = 2, <em>r</em> = <span class='InlineFormula'><img style="margin-top: -1px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg17.png' width= '9' height='33' align='absmiddle'></span> multiway graph with a total of 1884 nodes. But if we just include one representative from every fitness-neutral set, we get a more manageable multiway graph, with a total of 86 nodes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg18.png' alt='' title='' width='617' height='419'> </div> </p></div> <p>Keeping only one representative from pairs of patterns that are related by left-right symmetry, we get a still-simpler graph, now with a total of 49 nodes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg19.png' alt='' title='' width='646' height='340'> </div> </p></div> <p>There’s quite a lot of structure in this graph, with both divergence and convergence 
of possible paths. But overall, there’s a certain sense that different sections of the graph separate into distinct branches in which adaptive evolution in effect “pursues different ideas” about how to increase fitness (i.e. lifetime of patterns).</p> <p>We can think of fitness-neutral sets as representing a certain kind of equivalence class of rules. There’s quite a range of possible structures to these sets—from ones with a single element to ones with many elements but few distinct morphologies, to ones with different morphologies for every element:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg20.png' alt='' title='' width='652' height='387'> </div> </p></div> <p>What about larger spaces of rules? For <em>k</em> = 2, <em>r</em> = 2 there are altogether about 2 billion (2<sup>2<sup>5</sup> – 1</sup>) relevant rules. But if we choose to look only at left-right symmetric ones, this number is reduced to 524,288 (= 2<sup>19</sup>). 
Here are some examples of sequences of rules produced by adaptive evolution in this case, starting from the null rule, and allowing only mutations that preserve symmetry (and now using a single black cell as the initial condition):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg22_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg22.png' alt='' title='' width='374' height='471'> </div> </p></div> <p>Once again we can identify fitness-neutral sets—though this time, in the vast majority of cases, the patterns generated by all members of a given set are the same:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg23.png' alt='' title='' width='569' height='345'> </div> </p></div> <p>Reducing out fitness-neutral sets, we can then compute the complete (transitively reduced) multiway graph for symmetric <em>k</em> = 2, <em>r</em> = 2 rules (containing a total of 60 nodes):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg24_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg24A.png' alt='' title='' width='689' height='550'> </div> </p></div> <p>By reducing out fitness-neutral sets, we’re creating a multiway graph in which every edge represents a mutation that “makes progress” in increasing fitness. 
But actual paths of adaptive evolution based on random sequences of mutations can do any amount of “rattling around” within fitness-neutral sets—not to mention “trying” mutations that decrease fitness—before reaching mutations that “make progress”. So this means that even though the reduced multiway graph we’ve drawn suggests that the maximum number of steps (i.e. mutations) needed to adaptively evolve from the null rule to any other is 9, it can actually take any number of steps because of the “rattling around” within fitness-neutral sets.</p> <p>Here’s an example of a sequence of accepted mutations in a particular adaptive evolution process—with the mutations that “make progress” highlighted, and numbers indicating rejected mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg25.png' alt='' title='' width='601' height='145'> </div> </p></div> <p>We can see “rattling around” in a fitness-neutral set, with a cycle of morphologies being generated. But while this represents one way to reach the final pattern, there are also plenty of others, potentially involving many fewer mutations. 
And indeed one can determine from the multiway graph that an absolutely shortest path is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg26.png' alt='' title='' width='447' height='100'> </div> </p></div> <p>This involves the sequence of rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg27_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg27.png' alt='' title='' width='185' height='145'> </div> </p></div> <p>We’re starting from the null rule, and at each step making a single point mutation (though because of symmetry two bits can sometimes be changed). The first few mutations don’t end up changing the “phenotypic behavior”. But after a while, enough mutations (here 6) have built up that we get morphologically different behavior. And after just 3 more mutations, we end up with our final pattern.</p> <p>Our original random sequence of mutations gets to the same result, but in a much more tortuous way, doing a total of 169 mutations which often cancel each other out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg28.png' alt='' title='' width='105' height='308'> </div> </p></div> <p>In drawing a multiway graph, we’re defining what evolutionary paths are possible. But what about probabilities? 
If we assume that every point mutation is equally likely, we can in effect “analyze the flow” in the multiway graph, and determine the ultimate probability that each rule will be reached (with higher probabilities here shown redder):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg29_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg29.png' alt='' title='' width='396' height='234'> </div> </p></div> <h2 id="the-fitness-landscape">The Fitness Landscape</h2> <p>Multiway graphs give a very global view of adaptive evolution. But in understanding the process of adaptive evolution, it’s also often useful to think somewhat more locally. We can imagine that all the possible rules are laid out in a certain space, and that adaptive evolution is trying to find appropriate paths in this space. Potentially we can suppose that there’s a “fitness landscape” defined in the space, and that adaptive evolution is trying to follow a path that progressively ascends to higher peaks of fitness. </p> <p>Let’s consider again the very first example we gave above—of adaptive evolution in the space of 3-color cellular automata. At each step in this adaptive evolution, there are 52 possible point mutations that can be made to the rule. And one can think of each of these mutations as corresponding to making an “elementary move” in a different direction in the (26-dimensional) space of rules. 
</p> <p>Here’s a visual representation of what’s going on, based on the particular path of adaptive evolution from our very first example above: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg1.png' alt='' title='' width='557' height='356'> </div> </p></div> <p>What we’re showing here is in effect the sequence of “decisions” that are being made to get us from one “fitness waypoint” to another. Different possible mutations are represented by different radial directions, with the length of each line being proportional to the fitness achieved by doing that mutation. At each step the gray disk represents the previous fitness. And what we see is that many possible mutations lead to lower fitness outcomes, shown “within the disk”. But there are at least some mutations that have higher fitness, and “escape the disk”. </p> <p>In the multiway graph, we’d trace every mutation that leads to higher fitness. But for a particular path of adaptive evolution as we’ve discussed it so far, we imagine we always just pick at random one mutation from this set—as indicated here by a red line. (Later we’ll discuss different strategies.) </p> <p>Our radial icons can be thought of as giving a representation of the “local derivative” at each point in the space of rules, with longer lines corresponding to directions with larger slopes “up the fitness landscape”. </p> <p>But what happens if we want to “knit together” these local derivatives to form a picture of the whole space? Needless to say, it’s complicated. And as a first example, consider <em>k</em> = 2, <em>r</em> = 1 cellular automaton rules. </p> <p>There are a total of 128 relevant such rules, that (as we discussed above) can be thought of as connected by point mutations to form a 7-dimensional Boolean hypercube. 
As also discussed above, of all 128 relevant rules, only 15 appear in adaptive evolution processes (the others are in effect never selected because they represent lower fitness). But now we can ask where these rules lie on the whole hypercube:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg1.png' alt='' title='' width='347' height='342'> </div> </p></div> <p>Each node here represents a rule, with the size of the highlighted nodes indicating their corresponding fitness (computed from lifetime with initial condition …<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg4A.png' alt='' title='' width='90' height='12'/>…). The node shown in green corresponds to the null rule. </p> <p>Rendering this in 3D, with fitness shown as height, we get what we can consider a “fitness landscape”:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg4.png' alt='' title='' width='401' height='342'> </div> </p></div> <p>And now we can think of our adaptive evolution as proceeding along paths that never go to nodes with lower height on this landscape. 
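</p> <p>The “never go down” constraint is simple to state on the hypercube. In this sketch a rule is identified with a 7-bit integer (one bit per free case), neighbors differ by one bit flip, and <code>height</code> is any table of fitness values. The heights used in the example are made-up placeholders rather than measured lifetimes, and all names are hypothetical:</p>

```python
from typing import Dict, List, Sequence

DIMS = 7  # free cases of a k = 2, r = 1 rule


def neighbors(node: int) -> List[int]:
    """Rules one point mutation (one bit flip) away on the hypercube."""
    return [node ^ (1 << bit) for bit in range(DIMS)]


def is_never_go_down(path: Sequence[int], height: Dict[int, int]) -> bool:
    """True if each step is a single mutation to an equal or greater height."""
    for a, b in zip(path, path[1:]):
        if b not in neighbors(a) or height[b] < height[a]:
            return False
    return True


nodes = range(2 ** DIMS)
edges = {frozenset((n, m)) for n in nodes for m in neighbors(n)}
print(len(nodes), len(edges))  # 128 nodes, 7 * 128 / 2 = 448 edges

# placeholder heights: number of 1 bits, NOT real lifetimes
height = {n: bin(n).count("1") for n in nodes}
print(is_never_go_down([0, 1, 3, 7], height))  # each step raises the height
print(is_never_go_down([0, 1, 0], height))     # the last step goes down
```

<p>Replacing the placeholder heights with lifetimes computed for each rule would turn this into a check on actual paths of adaptive evolution. 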
</p> <p>We get a more filled-in “fitness landscape” when we look at <em>k</em> = 2, <em>r </em>= <span class='InlineFormula'><img style="margin-top: -1px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg17.png' width= '9' height='33' align='absmiddle'></span> rules (here with initial condition …<img loading='lazy' style="margin-bottom: -2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124multiwayimg4A.png' alt='' title='' width='90' height='12'/>…):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg7.png' alt='' title='' width='521' height='349'> </div> </p></div> <p>Adaptive evolution must trace out a “never-go-down” path on this landscape: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg8.png' alt='' title='' width='457' height='306'> </div> </p></div> <p>Along this path, we can make “derivative” pictures like the ones above to represent “local topography” around each point—indicating which of the possible upwards-on-the-landscape directions is taken: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg9.png' alt='' title='' width='634' height='60'> </div> </p></div> <p>The rule space over which our “fitness landscape” is defined is ultimately discrete and effectively very high-dimensional (15-dimensional for <em>k</em> = 2, <em>r</em> = <span class='InlineFormula'><img style="margin-top: 
-2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg10.png' width= '9' height='33' align='absmiddle'></span> rules)—and it’s quite challenging to produce an interpretable visualization of it in 3D. We’d like it if we could lay out our rendering of the rule space so that rules which differ just by one mutation are a fixed (“elementary”) 2D distance apart. In general this won’t be possible, but we’re trying to at least approximate this by finding a good layout for the underlying “mutation graph”.</p> <p>Using this layout we can in principle make a “fitness landscape surface” by interpolating between discrete points. It’s not clear how meaningful this is, but it’s perhaps useful in engaging our spatial intuition:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg11.png' alt='' title='' width='571' height='251'> </div> </p></div> <p>We can try machine learning and dimension reduction, operating on the set of “rule vectors” (i.e. 
outcome lists) that won’t be rejected in our adaptive evolution process—and the results are perhaps slightly better:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124fitnessimg12.png' alt='' title='' width='571' height='251'> </div> </p></div> <p>By the way, if we use this dimension reduction for rule space, here’s how the behavior of rules lays out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg1.png' alt='' title='' width='675' height='418'> </div> </p></div> <p>And here, for comparison, is a <a href="https://reference.wolfram.com/language/ref/FeatureSpacePlot.html">feature space plot</a> based on the visual appearance of these patterns:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg2.png' alt='' title='' width='679' height='419'> </div> </p></div> <h2 id="the-whole-space-exhaustive-search-vs-adaptive-evolution">The Whole Space: Exhaustive Search vs. Adaptive Evolution</h2> <p>In adaptive evolution, we start, say, from the null rule and then make random mutations to try to reach rules with progressively larger fitness. But what about just exhaustively searching the complete space of possible rules? 
The <a href="https://www.wolframscience.com/nks/notes-3-2--numbers-of-cellular-automaton-rules/">number of rules</a> rapidly becomes unmanageably big—but some cases are definitely accessible:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg1X.png' alt='' title='' width='334' height='288'> </div> </p></div> <p>For example, there are just 524,288 symmetric <em>k</em> = 2, <em>r</em> = 2 rules—of which 77,624 generate patterns with finite lifetimes. Ultimately, though, there are just 77 distinct phenotypic patterns that appear—with varying lifetimes and varying multiplicity (where at least in this case the multiplicity is always associated with “unused bits” in the rule):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg2X.png' alt='' title='' width='669' height='530'> </div> </p></div> <p>How do these exhaustive results compare with what’s generated in the multiway graph of adaptive evolutions? 
They’re almost the same, but for the addition of the two extra cases</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg3A.png' alt='' title='' width='164' height='85'> </div> </p></div> <p>which are generated by rules of the form (where the gray entries don’t matter):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg4.png' alt='' title='' width='305' height='33'> </div> </p></div> <p>Why don’t such rules ever appear in our adaptive evolution? The reason is that there isn’t a chain of point mutations starting from the null rule that can reach these rules without going through rules that would be rejected by our adaptive evolution process. 
If we draw a multiway graph that includes every possible “acceptable” rule, then we’ll see a separate part in the graph, with its own root, that contains rules that can’t be reached by our adaptive evolution from the null rule: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw052324NikUpdateimg3.png' alt='' title='' width='659' height='512'> </div> </p></div> <p>So now if we look at all (symmetric <em>k</em> = 2, <em>r</em> = 2) rules, here’s the distribution of lifetimes we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg6.png' alt='' title='' width='356' height='202'></div> </div> <p>The maximum, as seen above, is 65. The overall distribution roughly follows a power law, with an exponent around –3:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg7A.png' alt='' title='' width='294' height='153'></div> </div> <p>As we saw above, not all rules make use of all their bits (i.e. outcomes) in generating phenotypic patterns. 
But what we see is that the larger the lifetime achieved, the more bits tend to be needed:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg8.png' alt='' title='' width='370' height='201'> </div> </p></div> <p>And in a sense this isn’t surprising: as we’ll discuss later, we can expect to need “more bits in the program” to specify more elaborate behavior—or, in particular, behavior that embodies a larger number for lifetime.</p> <p>So what about general (i.e. not necessarily symmetric) <em>k</em> = 2, <em>r</em> = 2 rules? There are <nobr>2<sup>31</sup> ≅ 2 billion</nobr> of these. If we exhaustively search through them, we find about 75 million that have finite lifetimes. The distribution of these lifetimes is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424NikC2Cimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg10.png' alt='' title='' width='356' height='197'></div> </div> <p>Again it’s roughly a power law, but now with exponent around –3.5:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424NikC2Cimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg11A.png' alt='' title='' width='294' height='153'></div> </div> <p>Here are the actual 100 patterns produced that have the longest lifetimes (in all asymmetric cases there are also rules giving left-right flipped patterns):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg4_copy.txt' 
data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg12A.png' alt='' title='' width='667' height='1441'></div> </div> <p>It’s interesting to see here a variety of “qualitatively different ideas” being used by different rules. Some (like the one with lifetime 151) we might somehow imagine could have been constructed <a href="https://www.wolframscience.com/nks/p832--intelligence-in-the-universe/">specifically “for the purpose”</a> of having their particular, long lifetime. But others (like the one with lifetime 308) somehow seem more “coincidental”—behaving in an apparently random way, and then just “happening to die out” after a certain number of steps.</p> <p>Since we found these rules by exhaustive search, we know they’re the only possible ones with such long lifetimes (at least with <em>k</em> = 2, <em>r</em> = 2). We can then infer that the ornate structures we see are in some sense necessary to achieve the objective of, say, having a finite lifetime of more than 100 steps. That means that if we go through a process of adaptive evolution and achieve a lifetime above 100 steps, we see a complex pattern of behavior not because of “complicated choices in our process of adaptive evolution”, but rather because to achieve such a lifetime one has no choice but to use such a complex pattern. Or, in other words, the complexity we see is a reflection of “computational necessity”, not historical accidents of adaptive evolution.</p> <p>Note also that (as we’ll discuss in <a href="https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#what-can-adaptive-evolution-achieve">more detail below</a>) there are certain behaviors we can get, and others that we cannot. So, for example, there is a rule that gives lifetime 308, but none that gives lifetime 300. 
(Though, yes, if we used more complicated initial conditions or a more complicated family of rules we could find such a rule.)</p> <p>Much as we saw in the symmetric <em>k</em> = 2, <em>r</em> = 2 case above, almost any long lifetimes require using all the available bits in the rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424NikC2Cimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg13.png' alt='' title='' width='366' height='189'></div> </div> <p>But, needless to say, there’s an exception—a pair of rules with lifetime 84 where the outcome for the <img loading='lazy' style="margin-bottom: -4px" src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg14.png' alt='' title='' width='54' height='16'/> case doesn’t matter: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg15_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg15.png' alt='' title='' width='257' height='9'> </div> </p></div> <p>But, OK, so can these long-lifetime rules be reached by single-mutation adaptive evolution from the null rule? 
Rather than trying to construct the whole multiway graph for the general <em>k</em> = 2, <nobr><em>r</em> = 2</nobr> case starting from the null rule, we can instead construct what’s <a href="https://www.wolframscience.com/metamathematics/relations-to-automated-theorem-proving/">in effect an inverse multiway graph</a> in which we start from a given rule, then successively make all single point mutations that reach rules with progressively shorter lifetimes: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg16.png' alt='' title='' width='538' height='491'> </div> </p></div> <p>And what we see is that at least in this case such a procedure never reaches the null rule. The “furthest” it gets is to lifetime-2 rules, and among these rules the closest to the null rule are: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg17.png' alt='' title='' width='310' height='95'> </div> </p></div> <p>But it turns out that there’s no way to reach these 2-bit rules by a single point mutation from any of the 26 1-bit rules that aren’t rejected by our adaptive evolution process. And in fact this isn’t just an issue for this particular long-lifetime rule; it’s something quite general among <em>k</em> = 2 rules. 
And indeed, constructing the “forward” multiway graph starting from the null rule, we find we can only ever reach lifetime-1 rules.</p> <p>Ultimately this is a particular feature of rules with just 2 colors—and it’s specific to starting with something like the null rule that has lifetime 1—but it’s an illustration of the fact that there can even be large swaths of rule space that can’t be reached by adaptive evolution with point mutations.</p> <p>What about symmetric <em>k</em> = 2, <em>r</em> = 2 rules? Well, to maintain symmetry we have to deal with mutations that change not just one but two bits. And this turns out to mean that (except in the cases we discovered above) the inverse multiway system starting from long-lifetime rules always successfully reaches the null rule:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg18.png' alt='' title='' width='388' height='411'> </div> </p></div> <p>There’s something else to notice here, however. 
Looking at this graph, we see that there’s a way to get with just one 2-bit mutation from a lifetime-1 to a lifetime-65 rule: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg19.png' alt='' title='' width='300' height='39'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg20.png' alt='' title='' width='200' height='158'> </div> </p></div> <p>We didn’t see this in our multiway graph above because we had applied transitive reduction to it. But if we don’t do that, we find that a few large lifetime jumps are possible—as we can see on this plot of possible lifetimes before and after a single point mutation:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg21_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg21.png' alt='' title='' width='333' height='332'> </div> </p></div> <p>Going beyond <em>k</em> = 2, <em>r</em> = 2 rules, we can consider <span id="symmetric">symmetric</span> <em>k</em> = 3, <em>r</em> = 1 rules, of which there are 3<sup>17</sup>, or about 129 million. 
The distribution of lifetimes in this case is</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg23_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg23.png' alt='' title='' width='356' height='198'> </div> </p></div> <p>which again roughly fits a power law, again with exponent around –3.5:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424markC2Cimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg24A.png' alt='' title='' width='294' height='153'> </div> </p></div> <p>But now the maximum lifetime found is not just 308, but 2194:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg25_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg25A.jpg' alt='' title='' width='577' height='1843'> </div> </p></div> <p>Once again, there are some different “ideas” on display, with a few curious examples of convergence—such as the rules we see with lifetimes 989 and 990 (as well as 1068 and 1069) which give essentially the same patterns after just exchanging colors, and adding one “prefatory” step.</p> <p>What about general <em>k</em> = 3, <em>r</em> = 1 rules? There are too many to easily search exhaustively. 
But directed random sampling reveals plenty of long-lifetime examples, such as:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg26_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg26A.png' alt='' title='' width='528' height='859'> </div> </p></div> <p>And now the tail of very long lifetimes extends further, for example with: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051424SWc2cupdateimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg27-v3.png' alt='' title='' width='402' height='850'> </div> </p></div> <p>It’s a little easier to see what the lifetime-10863 rule does if one visualizes it in sections (and adjusts colors to get more contrast): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg28_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg28.png' alt='' title='' width='598' height='783'> </div> </p></div> <p>Sampling 100 steps out every 2000 (as well as at the very end), we see elaborate alternation between periodic and seemingly random behavior—but none of it gives any obvious clue of the remarkable fact that after 10863 steps the whole pattern will die out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spaceimg29A.png' alt='' title='' width='547' height='261'></div> </div> <h2 id="the-issue-of-undecidability">The Issue of Undecidability</h2> <p>As our example criterion 
for the “fitness” of cellular automaton rules, we’ve used the lifetimes of the patterns they generate—always assuming that if the patterns don’t terminate at all they should be considered to have fitness zero. </p> <p>But how can we tell if a pattern is going to terminate? In the previous section, for example, we saw patterns that live a very long time—but do eventually terminate. </p> <p>Here are some examples of the first 100 steps of patterns generated by a few <em>k</em> = 3, <em>r</em> = 1 symmetric rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg1A.png' alt='' title='' width='638' height='404'> </div> </p></div> <p>What will happen with these patterns? We know from what we see here that none of them have lifetimes less than 100 steps. But what would allow us to say more? In a few cases we can see that the patterns are periodic, or have obvious repeating structures, which means they’ll never terminate. But in the other cases there’s no obvious way to predict what will happen. Explicitly running the rules for another 100 steps we discover some more outcomes:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg2A.png' alt='' title='' width='647' height='404'> </div> </p></div> <p>Going to 500 steps there are some surprises. 
Rule (a) becomes periodic after 388 steps; rules (o) and (v) terminate after 265 and 377 steps, respectively:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg3A.png' alt='' title='' width='623' height='371'> </div> </p></div> <p>But is there a way to systematically say what will happen “in the end” with all the remaining rules? The answer is that in general there is not; it’s something that must be considered <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-8--undecidability-and-intractability">undecidable by any finite computation</a>. </p> <p>Given how comparatively simple the cellular automaton rules we’re considering are, we might have assumed that with all our sophisticated mathematical and computational methods we’d always be able to “jump ahead of them”—and figure out their outcome without the computational effort of explicitly running each step. </p> <p>But the <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> suggests that pretty much whenever the behavior of these rules isn’t obviously simple, it will in effect be of equal computational sophistication to any other system, and in particular to any methods that we might use to predict it. And the result is the phenomenon of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a> that implies that in many systems—presumably including most of the cellular automata here—there isn’t any way to figure out their outcome much more efficiently than by explicitly tracing each of their steps. 
So this means that to know what will happen “in the end”—after an infinite number of steps—might take an unlimited amount of computational effort. Or, in other words, it must be considered effectively undecidable by any finite computation.</p> <p>As a practical matter we might look at the observed distribution of lifetimes for a particular type of cellular automaton, and become pretty confident that there won’t be longer finite lifetimes for that type of cellular automaton. But for the <em>k</em> = 3, <em>r</em> = 1 rules from the previous section, we might have been fairly confident that a few thousand steps was the longest lifetime that would ever occur—until we discovered the 10,863-step example.</p> <p>So let’s say we run a particular rule for 10,000 steps and it hasn’t died out. How can we tell if it never will? Well, we have to <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/#so-does-it-always-halt">construct a proof of some kind</a>. And that’s easy to do if we can see that the pattern becomes, say, completely periodic. But in general, computational irreducibility implies we won’t be able to do it. Might there, though, still be special cases where we can? 
In effect, those would have to correspond to “pockets of computational reducibility” where we manage to find a compressed description of the cellular automaton behavior.</p> <p>There are cases like this where there isn’t strict periodicity, but where in the end there’s basically repetitive behavior (here with period 480):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg4A.png' alt='' title='' width='580' height='219'> </div> </p></div> <p>And there are cases of nested behavior, which is never periodic, but is nevertheless simple enough to be predictable:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg5A.png' alt='' title='' width='210' height='209'> </div> </p></div> <p>But there are always surprises. Like this example—which eventually resolves to have period 6, but only after 7129 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124issueimg6A.png' alt='' title='' width='606' height='295'> </div> </p></div> <p>So what does all this mean for our adaptive evolution process? It implies that in principle we could miss a very long finite lifetime for a particular rule, assuming it to be infinite. 
In a biological analogy we might have a genome that seems to lead to unbounded perhaps-tumor-like growth—but where actually the growth in the end “unexpectedly” stops.</p> <h2 id="computation-theoretic-perspectives-and-busy-beavers">Computation Theoretic Perspectives and Busy Beavers</h2> <p>What we’re asking about the dying out of patterns in cellular automata is directly analogous to the classic <a href="https://www.wolframscience.com/nks/notes-12-8--halting-problems/">halting problem for Turing machines</a>, or the <a href="https://writings.stephenwolfram.com/2020/12/combinators-a-centennial-view/#combinators-in-the-wild-some-zoology">termination problem for term rewriting</a>, <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more">Post tag systems</a>, etc. And in looking for cellular automata that have the longest-lived patterns, we’re studying a cellular automaton analog of the so-called <a href="https://www.wolframscience.com/nks/notes-3-4--history-of-turing-machines/">busy beaver problem</a> for Turing machines.</p> <p>We can summarize the results we’ve found so far (all for single-cell initial conditions): </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg1.png' alt='' title='' width='573' height='195'> </div> </p></div> <p>The profiles (i.e. 
widths of nonzero cells) for the patterns generated by these rules are</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg2A.png' alt='' title='' width='619' height='274'> </div> </div> <p>and the “integrals” of these curves are what give the “areas” in the table above. </p> <p>For the reasons described in the previous section, we can only be certain that we’ve found lower bounds on the actual maximum lifetime—though except in the last few cases listed it seems very likely that we do in fact have the maximum lifetime.</p> <p>It’s somewhat sobering, though, to compare with <a href="https://datarepository.wolframcloud.com/resources/The-Busy-Beaver-Competition">known results</a> for maximum (“busy beaver”) lifetimes for Turing machines (where now <em>s</em> is the number of Turing machine states, the Turing machines are started from blank tapes, and they are taken to “halt” when they reach a particular halt state):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg3.png' alt='' title='' width='331' height='252'> </div> </p></div> <p>Sufficiently small Turing machines can have only modest lifetimes. But even slightly bigger Turing machines can have vastly larger lifetimes. 
And in fact it’s a consequence of the undecidability of the halting problem for Turing machines that the maximum lifetime grows with the size of the Turing machine <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/#so-does-it-always-halt">faster than any computable function</a> (i.e. any function that can be computed in finite time by a Turing machine, or whose value can be proved by a finite proof in a finite axiom system).</p> <p>But, OK, the maximum lifetime increases with the “size of the rule” for a Turing machine, or a cellular automaton. But what defines the “size of a rule”? Presumably it should be roughly the number of <a href="https://www.wolframscience.com/nks/notes-10-3--algorithmic-information-theory/">independent bits needed to specify the rule</a> (which we can also think of as an approximate measure of its “information content”)—or something like log<sub>2</sub> of the number of possible rules of its type. </p> <p>At the outset, we might imagine that all 2<sup>32</sup> <em>k</em> = 2, <em>r</em> = 2 rules would need 32 bits to specify them. But as we discussed above, in some cases some of the bits in the rule don’t matter when it comes to determining the patterns they produce. And what we see is that the more bits that matter (and so have to be specified), the longer the lifetimes that are possible:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424NikC2Cimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg6.png' alt='' title='' width='357' height='190'></div> </div> <p>So far we’ve only been discussing cellular automata with single-cell initial conditions. 
But if we use more complicated initial conditions what we’re effectively doing is adding more information content into the system—with the result that maximum lifetimes can potentially get larger. And as an example, here are possible lifetimes for <em>k</em> = 2, <em>r</em> = <span class='InlineFormula'><img style="margin-top: -2px" src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg7.png' width= '9' height='33' align='absmiddle'></span> rules with a sequence of possible initial conditions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124beaversimg8.png' alt='' title='' width='610' height='245'> </div> </p></div> <h2 id="probabilistic-approximations">Probabilistic Approximations?</h2> <p>Cellular automata are at their core deterministic systems: given a particular cellular automaton rule and a particular initial condition, every aspect of the behavior that is generated is completely determined. But is there any way that we can approximate this behavior by some probabilistic model? Or might we at least usefully be able to use such a model if we look at the aggregate properties of large numbers of different rules?</p> <p>One hint along these lines comes from the power-law distributions we found above for the frequencies of different possible lifetimes for cellular automata of given types. And we might wonder whether such distributions—and perhaps even their exponents—could be found from some probabilistic model.</p> <p>One possible approach is to approximate a cellular automaton by a probabilistic process—say one in which a cell becomes black with probability <em>p</em> if it or either of its neighbors were black on the step before. 
Here are some examples of what can happen with <a href="https://www.wolframscience.com/nks/p591--statistical-analysis/">this (“directed percolation”) setup</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg1.png' alt='' title='' width='608' height='93'></div> </div> <p>The behavior varies greatly with <em>p</em>; for small <em>p</em> everything dies out, while for large <em>p</em> it fills in:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg2.png' alt='' title='' width='618' height='115'> </div> </p></div> <p>And indeed the final density—starting from random initial conditions—has a <a href="https://writings.stephenwolfram.com/2021/05/the-problem-of-distributed-consensus">sharp (phase) transition</a> at around <em>p</em> = 0.54 as one varies <em>p</em>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg3.png' alt='' title='' width='325' height='107'></div> </div> <p>If instead one starts from a single initial black cell one sees a slightly different transition:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg4.png' alt='' title='' width='328' 
height='108'></div> </div> <p>One can also plot the probabilities for different “survival times” or “lifetimes” for the pattern:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051224frombradimg6.png' alt='' title='' width='349' height='143'></div> </div> <p>And right around the transition the distribution of lifetimes follows a power law—that’s roughly τ<sup>–1</sup> (which happens to be what one gets from a <a href="https://www.wolframscience.com/nks/notes-6-5--probabilistic-estimates-of-cellular-automaton-properties/">mean field theory estimate</a>). </p> <p>So how does this relate to cellular automata? Let’s say we have a <em>k</em> = 2 rule, and we suppose that the colors of cells can be approximated as somehow random. Then we might suppose that the patterns we get could be like in our probabilistic model. And a potential source for the value of <em>p</em> to use would be the fraction of cases in the rule that give a black cell as output. 
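</p> <p>As a rough illustration, the probabilistic process described above (a cell becomes black with probability <em>p</em> if it or either of its neighbors was black on the step before), together with the “fraction of black outputs” estimate for <em>p</em>, can be sketched in Python. This is only a minimal sketch and not the code behind the plots: the function names, the cyclic boundary conditions, and the default grid size and number of steps are all assumptions made for the example.</p>

```python
import random

def dp_step(cells, p, rng):
    """One step of the probabilistic process: a cell becomes black (1) with
    probability p if it or either neighbor was black on the previous step.
    Cyclic boundaries are an assumption of this sketch."""
    n = len(cells)
    new = []
    for i in range(n):
        any_black = cells[i - 1] or cells[i] or cells[(i + 1) % n]
        new.append(1 if any_black and p > rng.random() else 0)
    return new

def final_density(p, width=100, steps=100, seed=0):
    """Density of black cells after `steps` steps from a random initial row."""
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(width)]
    for _ in range(steps):
        cells = dp_step(cells, p, rng)
    return sum(cells) / width

def black_fraction(rule_outputs):
    """Fraction of cases in a rule table whose output is nonwhite; a possible
    (hypothetical) source for the value of p to use for a given rule."""
    return sum(1 for c in rule_outputs if c != 0) / len(rule_outputs)
```

<p>Scanning <code>p</code> between 0 and 1 with <code>final_density</code> gives a rough picture of the transition, though where it appears to sit will depend on the grid size and number of steps chosen.</p> <p>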
</p> <p>Plotting the lifetimes for <em>k</em> = 2, <em>r</em> = 2 rules against these fractions, we see that the longest lifetimes do occur when a little under half the outcomes are black (though notice this is also where the binomial distribution implies the largest number of rules are concentrated):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg7.png' alt='' title='' width='391' height='249'></div> </div> <p>If we don’t try thinking about the details of cellular automaton evolution, but instead just consider the boundaries of finite-lifetime patterns we generate, we can imagine approximating these (say for symmetric rules) just by random walks—that when they collide correspond to the pattern dying out:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224approximationsimg8.png' alt='' title='' width='469' height='200'></div> </div> <p>The <a href="https://writings.stephenwolfram.com/2021/03/after-100-years-can-we-finally-crack-posts-problem-of-tag-a-story-of-computational-irreducibility-and-more/#are-they-like-random-walks">standard theory of random walks</a> then tells us that the probability to survive τ steps is proportional to τ<sup>–3/2</sup> for large τ—a power law, though not immediately one of the same ones that we’ve observed for our cellular automaton lifetimes.</p> <h2 id="other-adaptive-evolution-strategies">Other Adaptive Evolution Strategies</h2> <p>In what we’ve done so far, we’ve always taken each step of our process of adaptive evolution to pick an outcome of equal or greater fitness. 
But what if we adopt a “more impatient” procedure in which at each step we insist on an outcome that has strictly greater fitness? </p> <p>For <em>k</em> = 2 it’s simply not possible with this procedure (at least with a null initial condition) to “escape” the null rule; everything that can be reached with 1 mutation still has lifetime 1. With <nobr><em>k</em> = 3</nobr> it’s possible to go one step, but only one, as captured by this multiway graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg1.png' alt='' title='' width='232' height='136'> </div> </p></div> <p>But we’re assuming here that we have to reach greater fitness with just one mutation. What if we allow two mutations at a time? Well, then we can “make progress”. And here’s the multiway graph in this case for symmetric<em> k</em> = 2, <em>r</em> = 2 rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg2.png' alt='' title='' width='238' height='430'> </div> </p></div> <p>We don’t reach as many phenotypic patterns as by using single mutations and allowing “fitness-neutral moves”, but where we do get, we get much quicker, without any “back and forth” in fitness-neutral spaces. 
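</p> <p>The “strictly greater fitness” procedure with one or two mutations at a time can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used for the multiway graphs: it works in the full (not just symmetric) <em>k</em> = 2, <em>r</em> = 2 rule space, starts from a single cell of color 1, and treats any pattern still alive after a hypothetical cutoff <code>max_steps</code> as having no finite lifetime. The function names are invented for the example.</p>

```python
from itertools import combinations, product

def lifetime(rule, k, r, max_steps=30):
    """Steps until a single nonwhite cell dies out, or None if the pattern
    has not died within max_steps (treated here as 'no finite lifetime')."""
    if rule[0] != 0:
        return None  # the white background would not stay white
    cells = [1]
    for step in range(1, max_steps + 1):
        padded = [0] * (2 * r) + cells + [0] * (2 * r)
        new = []
        for i in range(len(padded) - 2 * r):
            index = 0
            for c in padded[i:i + 2 * r + 1]:
                index = index * k + c  # neighborhood read as a base-k number
            new.append(rule[index])
        while new and new[0] == 0:   # trim white cells; position is irrelevant
            new.pop(0)
        while new and new[-1] == 0:
            new.pop()
        if not new:
            return step
        cells = new
    return None

def strict_improvements(rule, k, r, max_mutations, max_steps=30):
    """All rules within max_mutations point changes of `rule` whose finite
    lifetime is strictly greater than the lifetime of `rule`."""
    base = lifetime(rule, k, r, max_steps)
    if base is None:
        base = 0
    found = []
    for m in range(1, max_mutations + 1):
        for positions in combinations(range(len(rule)), m):
            choices = [[v for v in range(k) if v != rule[p]] for p in positions]
            for values in product(*choices):
                new = list(rule)
                for p, v in zip(positions, values):
                    new[p] = v
                lt = lifetime(tuple(new), k, r, max_steps)
                if lt is not None and lt > base:
                    found.append((tuple(new), lt))
    return found

null_rule = (0,) * 2 ** 5   # k = 2, r = 2: 32 neighborhoods, all mapped to white
one_step = strict_improvements(null_rule, 2, 2, 1)   # single mutations only
two_step = strict_improvements(null_rule, 2, 2, 2)   # up to two mutations
```

<p>In this sketch <code>one_step</code> comes out empty, since every single mutation of the null rule either still dies after one step or never dies at all, while <code>two_step</code> contains rules that survive for two steps before dying out.</p> <p>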
</p> <p>If we allow up to 3 mutations, we get still further: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg3.png' alt='' title='' width='484' height='496'> </div> </p></div> <p>And indeed we seem to get a pretty good representative sampling of “what’s out there” in this rule space, even though we reach only 37 rules, compared to the 77,624 (albeit with many duplicated phenotypic patterns) from our standard approach allowing neutral moves.</p> <p>For <em>k</em> = 3, <em>r</em> = 1 symmetric rules single mutations can get 2 steps:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg4.png' alt='' title='' width='92' height='158'> </div> </p></div> <p>But now if we allow up to 2 mutations, we can go much further—and the fact that we now don’t have to deal with neutral moves means we can explicitly construct at least the first few steps of the multiway graph in this case:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg5.png' alt='' title='' width='550' height='326'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg6.png' alt='' title='' width='736' height='314'> </div> 
</p></div> <p>We can go further if at each step we just pick a random higher-fitness rule reached with two or fewer mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg7-v2.png' alt='' title='' width='561' height='332'> </div> </p></div> <p>The adaptive evolution histories we just showed can be generated in effect by randomly trying a series of possibilities at each step, then picking the first one that exhibits increased fitness. Another approach is to use what amounts to “local exhaustive search”: at each step, look at results from all possible mutations, and pick one that gives the largest fitness. At least in smaller rule spaces, it’s common that there will be several results with the same fitness, and as an example we’ll just pick among these at random:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg8-v2.png' alt='' title='' width='332' height='198'> </div> </p></div> <p>One might think that this approach would in effect always be an optimization of the adaptive evolution process. But in practice its systematic character can end up making it get stuck, in some sense repeatedly “trying to do the same thing” even if it “isn’t working”. 
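</p> <p>The difference between the two selection strategies just described can be sketched in a few lines. This is a hedged illustration only: genomes are plain tuples, only single mutations are considered, and <code>toy_fitness</code> is a hypothetical stand-in for the lifetime of a cellular automaton pattern. Staying put when no mutation increases fitness is also an assumption of the sketch.</p>

```python
import random

def neighbors(genome, k):
    """All genomes that differ from `genome` at exactly one position."""
    for i, g in enumerate(genome):
        for v in range(k):
            if v != g:
                yield genome[:i] + (v,) + genome[i + 1:]

def first_improvement_step(genome, fitness, k, rng):
    """Try mutations in random order; take the first that increases fitness."""
    candidates = list(neighbors(genome, k))
    rng.shuffle(candidates)
    current = fitness(genome)
    for candidate in candidates:
        if fitness(candidate) > current:
            return candidate
    return genome  # stuck: no single mutation increases fitness

def best_improvement_step(genome, fitness, k, rng):
    """Local exhaustive search: look at all mutations and pick at random
    among those that give the largest fitness."""
    candidates = list(neighbors(genome, k))
    best = max(fitness(c) for c in candidates)
    if not best > fitness(genome):
        return genome  # stuck
    return rng.choice([c for c in candidates if fitness(c) == best])

def run(step, genome, fitness, k, steps, seed=0):
    """Apply a step function repeatedly with a seeded random generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        genome = step(genome, fitness, k, rng)
    return genome

def toy_fitness(genome):
    """Hypothetical smooth landscape: the number of nonzero entries."""
    return sum(genome)

start = (0,) * 8
```

<p>On a smooth landscape like <code>toy_fitness</code> both strategies simply climb to the top; differences between them can only show up on bumpier fitness functions, such as lifetimes in an actual rule space.</p> <p>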
</p> <p>Something of an opposite approach involves loosening our criteria for which paths can be chosen—and for example allowing paths that temporarily reduce fitness, say by one step of lifetime:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051024finalC2Cimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg7A-v4.png' alt='' title='' width='732' height='536'></div> </div> <p>In effect here we’re allowing less-than-maximally-fit organisms to survive. And we can represent the overall structure of what’s happening by a multiway graph—which now includes “backtracking” to lower fitnesses:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124strategiesimg10.png' alt='' title='' width='660' height='452'></div> </div> <p>But although the details are different, in the end it doesn’t seem as if allowing this kind of backtracking has any dramatic effect. Somehow the basic phenomena around the process of adaptive evolution are strong enough that most of the details of how the adaptive evolution is done don’t ultimately matter much. </p> <h2 id="an-aside-sexual-reproduction">An Aside: Sexual Reproduction</h2> <p>In everything we’ve done so far, we’ve been making mutations only to individual rules. But there’s another mechanism that exists in many biological organisms: sexual reproduction, in which in effect a pair of rules (i.e. genomes) get mixed to produce a new rule. 
As a simple model of the crossover that typically happens with actual genomes, we can take two rules, and splice together the beginning of one with the end of the other: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424redkubeimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg2B-v2.png' alt='' title='' width='313' height='94'></div> </div> <p>In general there will be many ways to combine pairs of rules like this. In a direct analogy to our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>, we can represent such “recombinations” as “events” that take two rules and produce one:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg3A_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg3B.png' alt='' title='' width='600' height='162'></div> </div> <p>The analog of our multiway graph for all possible paths of adaptive evolution by mutations is now what we call in our Physics Project a <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/#the-formal-structure-of-multicomputation">token-event graph</a>:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg4.png' alt='' title='' width='535' height='359'> </div> </p></div> <p>In dealing just with mutations we were able to take a single rule and progressively modify it. Now we always have to work with a “population” of rules, combining them two at a time to generate new rules. 
We can represent conceivable combinations among one set of rules as follows:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg4.png' alt='' title='' width='176' height='165'> </div> </p></div> <p>There are at this point many different choices we could make about how to set up our model. The particular approach we’ll use selects just <em>n</em> of the <span class='InlineFormula'><img src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg6.png' width= '26' height='36' align='absmiddle'></span> = <em>n</em> (<em>n</em> – 1)/2 possible combinations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg6_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg6.png' alt='' title='' width='164' height='153'> </div> </p></div> <p>Then for each of these selected combinations we attempt a crossover, keeping those “children” (drawn here between their parents) that are not rejected as a result of having lower fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw051324fC2Cfinalimg8.png' alt='' title='' width='165' height='154'> </div> </p></div> <p>Finally, to “maintain our gene pool”, we carry forward parents selected at random, so that we still end up with <em>n</em> rules. 
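</p> <p>The procedure just outlined can be sketched roughly as follows. This is one possible reading, not the actual implementation: rules are tuples of outputs, the cut point is chosen at random, and a child is taken to be “rejected as a result of having lower fitness” when its fitness falls below that of its less fit parent. That rejection criterion, like the function names, is an assumption, since the text does not pin it down.</p>

```python
import random

def crossover(a, b, cut):
    """Splice the beginning of rule `a` onto the end of rule `b`."""
    return a[:cut] + b[cut:]

def generation(population, fitness, rng):
    """One step: select n of the n(n-1)/2 pairs, attempt a crossover for
    each, keep children that are not of lower fitness, then carry forward
    randomly chosen parents so the population size stays at n."""
    n = len(population)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    chosen = rng.sample(pairs, min(n, len(pairs)))
    children = []
    for i, j in chosen:
        a, b = population[i], population[j]
        child = crossover(a, b, rng.randrange(1, len(a)))
        # Assumed criterion: reject a child less fit than its less fit parent
        if fitness(child) >= min(fitness(a), fitness(b)):
            children.append(child)
    survivors = rng.sample(population, n - len(children))
    return children + survivors
```

<p>Calling <code>generation</code> repeatedly on a list of rules, with a fitness function such as the lifetime of the pattern each rule produces, gives a sequence of populations of constant size.</p> <p>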
(And, yes, even though we’ve attempted to make this whole procedure as clean as possible, it’s still a mess—which seems to be inevitable, and which has, as we’ll discuss below, bedeviled computational studies of evolution in the past.)</p> <p>OK, so what happens when we apply this procedure, say to <em>k</em> = 3, <em>r</em> = 1 rules? We’ll pick 4 rules at random as our initial population (and, yes, two happen to produce the same pattern):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg9.png' alt='' title='' width='374' height='179'> </div> </p></div> <p>Then in a sequence of steps we’ll successively pick various combinations: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424markC2Cimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg11.png' alt='' title='' width='535' height='349'></div> </p></div> <p>And here are the distinct “phenotype patterns” produced in this process (note that even though there can be multiple copies of the same phenotype pattern, the underlying genotype rules are always distinct):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg12.png' alt='' title='' width='447' height='379'> </div> </p></div> <p>As a final form of summarization we can just plot the successive fitnesses of the patterns we generate (with the size of each dot reflecting the number of times a particular fitness occurs):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' 
data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg13.png' alt='' title='' width='382' height='197'> </div> </p></div> <p>In this case we reach a steady state after 9 steps. The larger the population the longer the adaptive evolution will typically keep going. Here are a couple of examples with population 10, showing all the patterns obtained: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224spacingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224spacingimg1.png' alt='' title='' width='694' height='607'> </div> </p></div> <p>Showing in each case only the longest-lifetime rule found so far we get:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg17.png' alt='' title='' width='533' height='305'> </div> </p></div> <p>The results aren’t obviously different from what we were finding with mutation alone—even though now we’ve got a much more complicated model, with a whole population of rules rather than a single rule. 
(One obvious difference, though, is that here we can end up with overall cycles of populations of rules, whereas in the pure-mutation case that can only happen among fitness-neutral rules.)</p> <p>Here are some additional examples—now obtained after 500 steps with population 25</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg18A.png' alt='' title='' width='558' height='206'> </div> </p></div> <p>and with population 50:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224reproimg19A.png' alt='' title='' width='625' height='278'> </div> </p></div> <p>And so far as one can tell, even here there are no substantial differences from what we saw with mutation. Certainly there are detailed features introduced by sexual reproduction and crossover, but for our purposes in understanding the big picture of what’s happening in adaptive evolution it seems sufficient to do as we have done so far, and consider only mutation.</p> <h2 id="an-even-more-minimal-model">An Even More Minimal Model</h2> <p>By investigating adaptive evolution in cellular automata we’re already making dramatic simplifications relative, say, to actual biology. 
But in the effort to understand the essence of phenomena we see, it’s helpful to go even further—and instead of thinking about computational rules and their behavior, just think about vertices on a “mutation graph”, each assigned a certain fitness.</p> <p>As an example, we can set up a 2D grid, assigning each point a certain random fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg1.png' alt='' title='' width='279' height='281'> </div> </p></div> <p>And then, starting from a minimum-fitness point, we can follow the same kind of adaptive evolution procedure as above, at each step going to a neighboring point with an equal or greater fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg2.png' alt='' title='' width='579' height='131'> </div> </p></div> <p>Typically we don’t manage to go far before we get stuck, though with the uniform distribution of fitness values used here, we still usually end on a fairly large fitness value.</p> <p>We can summarize the possible paths we can take by the multiway graph:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg4.png' alt='' title='' width='266' height='144'> </div> </p></div> <p>In our cellular automaton rule space—and, for that matter, in biology—neighboring points don’t just have independent random fitnesses; instead, the fitnesses are determined by a definite 
computational procedure. So as a simple approximation, we can just take the fitness of each point to be a particular function of its graph coordinates. If the function forms something like a “uniform hill”, then the adaptive evolution procedure will just climb it:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg5_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg5.png' alt='' title='' width='579' height='132'> </div> </p></div> <p>But as soon as the function has “systematic bumpiness” there’s a tremendous tendency to quickly get stuck:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg6.png' alt='' title='' width='579' height='131'></div> </div> <p>And if there’s some “unexpected spot of high fitness”, adaptive evolution typically won’t find it (and it certainly won’t if it’s surrounded by a lower-fitness “moat”):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424C2CupdatesAimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg7.png' alt='' title='' width='579' height='132'></div> </div> <p>So what happens if we increase the dimensionality of the “mutation space” in which we’re operating? 
Basically it becomes easier to find a path that increases fitness:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg8.png' alt='' title='' width='261' height='283'> </div> </p></div> <p>And we can see this, for example, if we look at Boolean hypercubes in increasing numbers of dimensions:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224minimalimg9.png' alt='' title='' width='579' height='133'> </div> </p></div> <p>But ultimately this relies on the fact that in the neighborhood reachable by mutations from a given point, there’ll be a “sufficiently random” collection of fitness values that it’ll (likely) be possible to find a “direction” that’s “going up” in fitness. Yet this alone won’t in general be enough, because we also need it to be the case that there’s enough regularity in the fitness landscape that we can systematically navigate it to find its maximum—and that the maximum is not somehow “unexpected and isolated”.</p> <h2 id="what-can-adaptive-evolution-achieve">What Can Adaptive Evolution Achieve? </h2> <p>We’ve seen that adaptive evolution can be surprisingly successful at finding cellular automata that produce patterns with long but finite lifetimes. But what about other types of “traits”? What can (and cannot) adaptive evolution ultimately manage to do?</p> <p>For example, what if we’re trying to find cellular automata whose patterns don’t just live “as long as possible” but instead die after a specific number of steps? 
It’s clear that within any finite set of rules (say with particular <em>k</em> and <em>r</em>) there’ll only be a limited collection of possible lifetimes. For symmetric <em>k</em> = 2, <em>r</em> = 2 rules, for example, the possible lifetimes are: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050424NikC2Cimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg1.png' alt='' title='' width='523' height='31'></div> </div> <p>But as soon as we’re dealing even with <em>k</em> = 3, <em>r</em> = 1 symmetric rules it’s already in principle possible to get every lifetime up to 100. But what about adaptive evolution? How well does it do at reaching rules with all those lifetimes? Let’s say we do single point mutation as before, but now we “accept” a mutation if it leads not specifically to a larger finite lifetime, but to a lifetime that is closer in absolute magnitude to some desired lifetime. (Strictly, and importantly, in both cases we also allow “fitness-neutral” mutations that leave the lifetime the same.)</p> <p>Here are examples of what happens if we try to adaptively evolve to get lifetime exactly 50 in <nobr><em>k</em> = 3,</nobr> <em>r</em> = 1 rules:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg2.png' alt='' title='' width='629' height='681'> </div> </p></div> <p>It gets close—and sometimes it overshoots—but, at least in these particular examples, it never quite makes it. 
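</p> <p>The acceptance criterion described above, for evolving towards a specific lifetime rather than the longest one, can be written as a small predicate. This is a minimal sketch: the function name is hypothetical, <code>None</code> stands for a pattern with no finite lifetime, and rejecting a mutation that lands at the same distance on the other side of the target is an assumption, since the text only mentions mutations that leave the lifetime the same.</p>

```python
def accept(new_lifetime, old_lifetime, target):
    """Accept a mutation if the new finite lifetime is strictly closer to
    `target`, or if the lifetime is unchanged (a fitness-neutral move)."""
    if new_lifetime is None:          # no finite lifetime: always rejected
        return False
    if new_lifetime == old_lifetime:  # fitness-neutral mutation
        return True
    return abs(old_lifetime - target) > abs(new_lifetime - target)
```

<p>With a target of 50, for example, a move from lifetime 45 to lifetime 51 is accepted, which is how overshooting can occur, while a later move from 51 to 53 is not.</p> <p>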
Here’s what we see if we look at the lifetimes achieved with 100 different random sequences of mutations:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224daptiveAimg3_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224daptiveAimg3.png' alt='' title='' width='540' height='193'> </div> </p></div> <p>Basically they mostly get stuck at lifetimes close to 50, but not exactly 50. It’s not that <em>k</em> = 3, <nobr><em>r</em> = 1</nobr> rules can’t yield lifetime 50; exhaustive search shows that even many symmetric such rules can: </p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg4_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg4.png' alt='' title='' width='554' height='312'> </div> </p></div> <p>It’s just that our adaptive evolution process usually gets stuck before it reaches rules like these. Even though there’s usually enough “room to maneuver” in <em>k</em> = 3, <em>r</em> = 1 rule space to get to generally longer lifetimes, there’s not enough to specifically get to lifetime 50. </p> <p>But what about <em>k</em> = 4, <em>r</em> = 1 rule space? There are now not 10<sup>12</sup> but about 10<sup>38</sup> possible rules. 
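</p> <p>These sizes follow from a standard count: a rule assigns one of <em>k</em> colors to each of the <em>k</em><sup>2<em>r</em>+1</sup> possible neighborhoods. A short check, with a hypothetical helper name, for general (not necessarily symmetric) rules:</p>

```python
def rule_count(k, r):
    """Number of k-color, range-r cellular automaton rules: one of k outputs
    for each of the k**(2r+1) possible neighborhoods."""
    return k ** (k ** (2 * r + 1))

print(rule_count(3, 1))  # 3**27 = 7625597484987, about 7.6e12
print(rule_count(4, 1))  # 4**64 = 2**128, about 3.4e38
```

<p>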
And in this rule space it becomes quite routine to be able to reach lifetime 50 through adaptive evolution:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg7_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg7.png' alt='' title='' width='616' height='819'> </div> </p></div> <p>It can sometimes take a while, but most of the time in this rule space it’s possible to get exactly to lifetime 50:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg8_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg8.png' alt='' title='' width='552' height='199'> </div> </p></div> <p>What happens with other “lifetime goals”? Even symmetric <em>k</em> = 3, <em>r</em> = 1 rules can achieve many lifetime values:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg9_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg9.png' alt='' title='' width='578' height='31'> </div> </p></div> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg10_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg10.png' alt='' title='' width='582' height='32'> </div> </p></div> <p>Indeed, the first “missing” values are 129, 132, 139, etc. 
And, for example, many multiples of 50 can be achieved:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg11_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg11.png' alt='' title='' width='501' height='494'> </div> </p></div> <p>But it becomes increasingly difficult for adaptive evolution to reach these specific goals. Increasing the size of the rule space always seems to help; so for example with <em>k</em> = 4, <em>r</em> = 1, if one’s aiming for lifetime 100, the actual distribution of lifetimes reached is:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg12_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg12.png' alt='' title='' width='360' height='103'> </div> </p></div> <p>In general the distribution gets broader as the lifetime sought gets larger:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg13_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg13-v2.png' alt='' title='' width='358' height='350'> </div> </p></div> <p>We saw above that across the whole space of, say, <em>k</em> = 4, <em>r</em> = 1 rules, the frequency of progressively larger lifetimes falls roughly according to a power law. So this means that the fractional region in rule space that achieves a given lifetime gets progressively smaller—with the result that typically the paths followed by adaptive evolution are progressively more likely to get stuck before they reach it.</p> <p>OK, so what about other kinds of objectives? Say ones more related to the morphologies of patterns? 
As a simple example, let’s consider the objective of maximizing the “widths” of finite-lifetime patterns. We can try to achieve this by adaptive evolution in which we reject any mutations that lead to decreased width (where “width” is defined as the maximum horizontal extent of the pattern). And once again this process manages to “discover” all sorts of “mechanisms” for achieving larger widths (here each pattern is labeled by its height—i.e. lifetime—and width):</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224caspacingimg1_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224caspacingimg1.png' alt='' title='' width='694' height='1052'> </div> </p></div> <p>There are certain structural constraints here. For example, the width can’t be too large relative to the height—because if it’s too large, patterns tend to grow forever. </p> <p>But what if we specifically try to select for maximal “pattern aspect ratio” (i.e. ratio of width to height)? In essentially every case so far, adaptive evolution has in effect “invented many different mechanisms” to achieve whatever objective we’ve defined. But here it turns out we essentially see “the same idea” being used over and over again—presumably because this is the only way to achieve our objective given the overall structure of how the underlying rules we’re using work:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg16_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg16.png' alt='' title='' width='569' height='223'> </div> </p></div> <p>What if we ask something more specific? Like, say, that the aspect ratio be as close to 3 as possible. 
Much of the time the “solution” that adaptive evolution finds is correct, if trivial:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg17_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg17.png' alt='' title='' width='57' height='57'> </div> </p></div> <p>But sometimes it finds another solution—and often a surprisingly elaborate and complicated one:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg18_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg18.png' alt='' title='' width='535' height='634'> </div> </p></div> <p>How about if our goal is an aspect ratio of π ≈ 3.14? It turns out adaptive evolution can still do rather well here even just with the symmetric <em>k</em> = 4, <em>r</em> = 1 rules that we’re using:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg19_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg19.png' alt='' title='' width='677' height='463'> </div> </p></div> <p>We can also ask about properties of the “inside” of the pattern. For example, we can ask to maximize the lengths of uniform runs of nonwhite cells in the center column of the pattern. 
And, once again, adaptive evolution can successfully lead us to rules (like these random examples) where this is large:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg20_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224adaptiveimg20.png' alt='' title='' width='634' height='511'> </div> </p></div> <p>We can go on and get still more detailed, say asking about runs of particular lengths, or the presence or number of particular subpatterns. And eventually—just like when we asked for too long a lifetime—we’ll find that the cases we’re looking for are “too sparse”, and adaptive evolution (at least in a given rule space) won’t be able to find them, even if exhaustive search could still identify at least a few examples.</p> <p>But just what kinds of objectives (or fitness functions) can be handled how well by adaptive evolution, operating for example on the “raw material” of cellular automata? It’s an important question—an analog of which is also central to the <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/#surely-a-network-thats-big-enough-can-do-anything">investigation of machine learning</a>. But as of now we don’t really have the tools to address it. It’s somehow reminiscent of asking <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#finding-whats-interesting">what kinds of functions can be approximated</a> how well by different methods or basis functions. But it’s more complicated. 
Solving it, though, would tell us a lot about the “reach” of adaptive evolution processes, not only for biology but also for machine learning.</p> <h2 id="what-it-means-for-whats-going-on-in-biology">What It Means for What’s Going On in Biology</h2> <p>How do biological organisms manage to be the way they are, with all their complex and seemingly clever solutions to such a wide range of challenges? Is it just natural selection that does it, or is there in effect more going on? And if “natural selection does it”, how does it actually manage it? </p> <p>From the point of view of traditional engineering what we see in biology is often very surprising, and much more complex and “clever” than we’d imagine ever being able to create ourselves. But is the secret of biology in a sense just natural selection? Well, actually, there’s often an analog of natural selection going on even in engineering, as different designs get tried and only some get selected. But at least in traditional engineering a key feature is that one always tries to come up with designs where one can foresee their consequences. </p> <p>But biology is different. Mutations to genomes just happen, without any notion that their consequences can be foreseen. But still one might assume that—when guided by natural selection—the results wouldn’t be too different to what we’d get in traditional engineering.</p> <p>But there’s a crucial piece of intuition missing here. And it has to do with how randomly chosen programs behave. We might have assumed (based on our typical experience with programs we explicitly construct for particular purposes) that at least a simple random program wouldn’t ever do anything terribly interesting or complicated. </p> <p>But the <a href="https://www.wolframscience.com/nks/chap-2--the-crucial-experiment/">surprising discovery I made in the early 1980s</a> is that this isn’t true. 
And instead, it’s a ubiquitous phenomenon that in the computational universe of possible programs, one can get immense complexity even from very simple programs. So this means that as mutation operates on a genome, it’s essentially inevitable that it’ll end up sampling programs that show highly complex behavior. At the outset, one might have imagined that such complexity could only be achieved by careful design and would inevitably be at best rare. But the surprising fact is that—because of how things fundamentally work in the computational universe—it’s instead easy to get. </p> <p>But what does complexity have to do with creating “successful organisms”? To create a “successful organism” that can prosper in a particular environment there fundamentally has to be some way to get to a genome that will “solve the necessary problems”. And this is where natural selection comes in. But the fact that it can work is something that’s not at all obvious.</p> <p>There are really two issues. The first is whether a program (i.e. genome) even exists that will “solve the necessary problems”. And the second is whether such a program can be found by a “thread” of adaptive evolution that goes only through intermediate states that are “fit enough” to survive. As it turns out, both these issues are related to the same fundamental features of computation—which are also responsible for the ubiquitous appearance of complexity.</p> <p>Given some underlying framework—like cellular automata, or like the basic apparatus of life—is there some rule that can be implemented in that framework that will achieve some particular (computational) objective? The <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence/">Principle of Computational Equivalence</a> says that generically the answer will be yes. In effect, given almost any “underlying hardware”, it’ll ultimately be possible to come up with “software” (i.e. 
a rule) that achieves almost any (“physically possible”) given objective—like growing an organism of at least some kind that can survive in a particular environment. But how can we actually find a rule that achieves this?</p> <p>In principle we could do exhaustive search. But that will be exponentially difficult—and in all but toy cases will be utterly infeasible in practice. So what about adaptive evolution? Well, that’s the big question. And what we’ve seen here is that—rather surprisingly—simple mutation and selection (i.e. the mechanisms of natural selection) very often provide a dramatic shortcut for finding rules that do what we want.</p> <p>So why is this? In effect, adaptive evolution is finding a path through rule space that gets to where we want to go. But the surprising part is that it’s managing to do this one step at a time. It’s just trying random variations (i.e. mutations) and as soon as it finds one that’s not a “step down in fitness”, it’ll “take it”, and keep going. At the outset it’s certainly not obvious that this will work. In particular, it could be that at some point there just won’t be any “way forward”. All “directions” will lead only to lower fitness, and in effect the adaptive evolution will get stuck.</p> <p>But the key observation from the experiments in our simple model here is that this typically doesn’t happen. And there seem to be basically two things going on. The first is that rule space is in effect very high-dimensional. So that means there are “many directions to choose from” in trying to find one that will allow one to “take a step forward”. But on its own this isn’t enough. Because there could be correlations between these directions that would mean that if one’s blocked in one direction one would inevitably be blocked in all others. </p> <p>So why doesn’t this happen? 
Well, it seems to be the result of the fundamental computational phenomenon of <a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility">computational irreducibility</a>. A traditional view based on experience with mathematical science had been that if one knew the underlying rule for a system then this would immediately let one predict what the system would do. But what became clear from my explorations in the 1980s and 1990s is that in the computational universe this generically isn’t true. And instead, that the only way one can systematically find out what most computational systems will do is explicitly to run their rules, step by step, doing in effect the same irreducible amount of computational work that they do.</p> <p>So if one’s just presented with behavior from the system one won’t be in a position to “decode it” and “see its simple origins”. Unless one’s capable of doing as much computational work as the system itself, one will just have to consider what it’s doing as (more or less) “random”. And indeed this seems to be at the root of many important phenomena, such as the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">Second Law of thermodynamics</a>. And I also suspect it’s at the root of the effectiveness of adaptive evolution, notably in biology. </p> <p>Because what computational irreducibility implies is that around every point in rule space there’ll be a certain “effective randomness” to fitnesses one sees. And if there are many dimensions to rule space that means it’s overwhelmingly likely that there’ll be “paths to success” in some directions from that point. </p> <p>But will the adaptive evolution find them? We’ve assumed that there are a series of mutations to the rule, all happening “at random”. 
And the point is that if there are <em>n</em> elements in the rule, then after some fraction of <em>n</em> mutations we should find our “success direction”. (If we were doing exhaustive search, we’d instead have to try about <em>k<sup>n</sup></em> possible rules.)</p> <p>At the outset it might seem conceivable that the sequence of mutations could somehow “cleverly probe” the structure of rule space, “knowing” what directions would or would not be successful. But the whole point is that going from a rule (i.e. genotype) to its behavior (i.e. phenotype) is generically a computationally irreducible process. So assuming that mutations are generated in a computationally bounded way it’s inevitable that they can’t “break computational irreducibility” and so will “experience” the fitness landscape in rule space as “effectively random”. </p> <p>OK, but what about “achieving the characteristics an organism needs”? What seems to be critical is that these characteristics are in a sense computationally simple. We want an organism to live long enough, or be tall enough, or whatever. It’s not that we need the organism to perform some specific computationally irreducible task. Yes, there are all sorts of computationally irreducible processes happening in the actual development and behavior of an organism. But as far as biological evolution is concerned all that matters is ultimately some computationally simple measure of fitness. It’s as if biological evolution is—in the sense of my recent <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">observer theory</a>—a computationally bounded observer of underlying computationally irreducible processes. </p> <p>And to the observer what emerges is the “simple law” of biological evolution, and the idea that, yes, it is possible just by natural selection to successfully generate all sorts of characteristics.</p> <p>There are all sorts of consequences of this for thinking about biology. 
For example, in thinking about <a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-5--fundamental-issues-in-biology">where complexity in biology “comes from”</a>. Is it “generated by natural selection”, perhaps reflecting the complicated sequence of historical accidents embodied in the particular collection of mutations that occurred? Or is it from somewhere else? </p> <p>In the picture we’ve developed here it’s basically from somewhere else—because it’s essentially a reflection of computational irreducibility. Having said that, we should remember that the very possibility of being able to have organisms with such a wide range of different forms and functions is a consequence of the universal computational character of their underlying setup, which in turn is closely tied to computational irreducibility. </p> <p>And it’s in effect because natural selection is so coarse in its operation that it does not somehow avoid the ubiquitous computational irreducibility that exists in rule space, with the result that when we “look inside” biological systems we tend to see computational irreducibility and the complexity associated with it. </p> <p>Something that we’ve seen over and over again here is that, yes, adaptive evolution manages to “solve a problem”. But its solution looks very complex to us. There might be some “simple engineering solution”—involving, say, a very regular pattern of behavior. But that’s not what adaptive evolution finds; instead it finds something that to us is typically very surprising—very often an “unexpectedly clever” solution in which lots of pieces fit together just right, in a way that our usual “understand-what’s-going-on” engineering practices would never let us invent.</p> <p>We might not have expected anything like this to emerge from the simple process of adaptive evolution. 
But—as the models we’ve studied here highlight—it seems to be an inevitable formal consequence of core features of computational systems. And as soon as we recognize that biological systems can be viewed as computational, then it also becomes something inevitable for them—and something we can view as in a sense formally derivable for them.</p> <p>At the outset we might not have been able to say “what matters” in the emergence of complexity in biology. But from the models we’ve studied, and the arguments we’ve made, we seem to have quite firmly established that it’s a fundamentally computational phenomenon, that relies only on certain general computational features of biological systems, and doesn’t depend on their particular detailed components and structure.</p> <p>But in the end, how “generic” is the complexity that comes out of adaptive evolution? In other words, if we were to pick programs, say completely at random, how different would the complexity they produce be from the complexity we see in programs that have been adaptively evolved <a href="https://writings.stephenwolfram.com/2018/01/showing-off-to-the-universe-beacons-for-the-afterlife-of-our-civilization/#aliens-and-the-philosophy-of-purpose">“for a purpose”</a>? The answer isn’t clear—though to know it would provide important foundational input for theoretical biology.</p> <p>One has the general impression that computational irreducibility is a strong enough phenomenon that it’s the “dominant force” that determines behavior and produces complexity. But there’s still usually something a bit different about the patterns we see from rules we’ve found by adaptive evolution, compared to rules we pick at random. Often there seems to be a certain additional level of “apparent mechanism”. 
The details still look complicated and in some ways quite random, but there seems to be a kind of “overall orchestration” to what’s going on.</p> <p>And whenever we can identify such regularities it’s a sign of some kind of computational reducibility. There’s still plenty of computational irreducibility at work. But “high fitness” rules that we find through adaptive evolution typically seem to exhibit traces of their specialness—that manifests in at least a certain amount of computational reducibility. </p> <p>Whenever we manage to come up with a <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">“narrative explanation”</a> or a “natural law” for something, it’s a sign that we’ve found a pocket of computational reducibility. If we say that a cellular automaton manages to live long because it generates certain robust geometric patterns—or, for that matter, that an organism lives long because it proofreads its DNA—we’re giving a narrative that’s based on computational reducibility. </p> <p>And indeed whenever we can successfully identify a “mechanism” in our cellular automaton behavior, we’re in effect seeing computational reducibility. But what can we say about the aggregate of a whole collection of mechanisms? </p> <p>In a different context I’ve discussed the concept of a <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/#class-4-and-the-mechanoidal-phase">“mechanoidal phase”</a>, distinguished, say, from solids and liquids by the presence of a “bulk orchestration” of underlying components. It’s something closely related to <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-2--four-classes-of-behavior">class 4 behavior</a>. 
And it’s interesting to note that if we look, for example, at the rules we found from adaptive evolution at the end of the previous section, their evolution from random initial conditions mostly shows characteristic class 4 behavior:</p> <div> <div class='wolfram-c2c-wrapper writtings-c2c_above' data-c2c-file='https://content.wolfram.com/sites/43/2024/05/sw050224biologyimg2_copy.txt' data-c2c-type='text/html'> <img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224biologyimg2.png' alt='' title='' width='671' height='406'> </div> </p></div> <p>In other words, adaptive evolution is potentially bringing us to “characteristically special” places in rule space—perhaps suggesting that there’s something “characteristically special” about the kind of structures that are produced in biological systems. And if we could find a way to make general statements about that “characteristic specialness” it would potentially lead us to a framework for constructing a new broad formal theory in biology.</p> <h2 id="correspondence-with-biological-phenomena">Correspondence with Biological Phenomena</h2> <p>The models we’ve studied here are extremely simple in their basic construction. And at some level it’s remarkable that—without for example including any biophysics or biochemistry—they can get anywhere at all in capturing features of biological systems and biological evolution.</p> <p>In a sense this is ultimately a reflection of the fundamentally computational character of biology—and the generality of computational phenomena. But it’s very striking that even the patterns of cellular automaton behavior we see look very “lifelike and organic”. </p> <p>In actual biology even the shortest genomes are vastly longer than the tiny cellular automaton rules we’ve considered. 
But even by the time we’re looking at the length-27 “genomic sequences” in <em>k</em> = 3, <em>r</em> = 1 cellular automata, there are already 3 trillion possible sequences, which seems to be enough to see many core “combinatorially driven” biology-like phenomena.</p> <p>The running of a cellular automaton rule might also at first seem far away from the actual processes that create biological organisms—involving as they do things like the construction of proteins and the formation of <a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-6--growth-of-plants-and-animals">elaborate functional and spatial structures</a>. But there are more analogies than one might at first imagine. For example, it’s common for only particular cases in the cellular automaton rule to be used in a given region of the pattern that’s formed, much as particular genes are typically turned on in different tissues in biological organisms. </p> <p>And, for example, the “geometrical restriction” to a simple 1D array of “cells” doesn’t seem to matter much as soon as there’s sophisticated computation going on; we still get lots of structures that are actually surprisingly reminiscent of typical patterns of biological growth.</p> <p>One of the defining features of biological organisms is their capability for <a href="https://www.wolframscience.com/nks/p824--intelligence-in-the-universe/">self-reproduction</a>. And indeed if it wasn’t for this kind of “copying” there wouldn’t be anything like adaptive evolution to discuss. Our models don’t attempt to derive self-reproduction; they just introduce it as something built into the models.</p> <p>And although we’ve considered several variants, we’re basically also just building into our models the idea of mutations. And what we find is that it seems as if single point mutations made one at a time are enough to capture basic features of adaptive evolution. 
</p> <p>We’ve also primarily considered what amounts to a single lineage—in which there’s just a single rule (or genome) at a given step. We do mutations, and we mostly “implement natural selection” just by keeping only rules that lead to patterns whose fitness is no less than what we had before.</p> <p>If we had a whole population of rules it probably wouldn’t be so significant, but in the simple setup we’re using, it turns out to be important that we don’t reject “fitness-neutral mutations”. And indeed we’ve seen many examples where the system wanders around fitness-neutral regions of rule space before finally “discovering” some “innovation” that allows it to increase fitness. The way our models are set up, that “wandering” always involves changes in the “genotype”—but usually at most minor changes in the phenotype. So it’s very typical to see long periods of “apparent equilibrium” in which the phenotype changes rather little, followed by a “jump” to a new fitness level and a rather different phenotype.</p> <p>And this observation seems quite aligned with the phenomenon of “punctuated equilibrium” often reported in the fossil record of actual biological evolution.</p> <p>Another key feature of biological organisms and biological evolution is the formation of distinct species, as well as distinct phyla, etc. And indeed we ubiquitously see something that seems to be directly analogous in our multiway graphs of all possible paths of adaptive evolution. Typically we see distinct branches forming, based on what seem like “different mechanisms” for achieving fitness. </p> <p>No doubt in actual biology there are all sorts of detailed phenomena related to reproductive or spatial isolation. But in our models the core phenomenon that seems to lead to the analog of “branching in the tree of life” is the existence of “distinctly different computational mechanisms” in different parts of rule space. 
It’s worth noting that at least with our finite rule spaces, branches can die out, with no “successor species” appearing in the multiway graph.</p> <p>And indeed looking at the actual patterns produced by rules in different parts of the multiway graph it’s easy to imagine morphologically based taxonomic classifications—that would be somewhat, though not perfectly, aligned with the phylogenetic tree defined by actual rule mutations. (At a morphological level we quite often see some level of “convergent evolution” in our multiway graphs; in small examples we sometimes also see actual “genomic convergence”—which will typically be astronomically rare in actual biological systems.)</p> <p>One of the remarkable features of our models is that they allow quite global investigation of the “overall history” of adaptive evolution. In many of the simple cases we’ve discussed, the rule space we’re using is small enough that in a comparatively modest number of mutation steps we get to the “highest fitness we can reach”. But (as the examples we saw in Turing machines suggest) expanding the size of the rules we’re using even just a little can be expected to be sufficient to allow us to get astronomically further. </p> <p>And the further we go, the more “mechanisms” will be “invented”. It’s an inevitable feature of systems that involve computational irreducibility that there are new and unpredictable things that will go on showing up forever—along with new pockets of computational reducibility. 
So even after a few billion years—and the <a href="https://www.wolframscience.com/nks/p386--fundamental-issues-in-biology/">trillion generations</a> and <a href="https://writings.stephenwolfram.com/2018/01/showing-off-to-the-universe-beacons-for-the-afterlife-of-our-civilization/#the-molecular-version">10<sup>40</sup> or so organisms</a> that have ever lived—there’s still infinitely further for biological evolution to go, and more and more branches to be initiated in the tree of life, involving more and more “new mechanisms”. </p> <p>I suppose one might imagine that at some point biological organisms would reach a “maximum fitness”, and go no further. But even in our simple model with fitness measured in terms of pattern lifetime, there’ll be no upper limit on fitness; given any particular lifetime, it’s a feature of the fundamental theory of computation that there’ll always be a program that will yield a larger lifetime. Still, one might think, at some point enough is enough: the giraffe’s neck is long enough, etc. But if nothing else, competition between organisms will always drive things forward: yes, a particular lineage of organisms achieved a certain fitness, but then another lineage can come along and get to that fitness too, forcing the first lineage to go even further so as to not lose out.</p> <p>Of course, in our simple model we’re not explicitly accounting for interactions with other organisms—or for detailed properties of the environment, as well as countless other effects. And no doubt there are many biological phenomena that depend on these effects. But the key point here is that even without explicitly accounting for any of these effects, our simple model still seems to capture many core features of biological evolution. 
Biological evolution—and, indeed, adaptive evolution in general—is, it seems, fundamentally a computational phenomenon that robustly emerges quite independent of the details of systems.</p> <p>In the past few years our Physics Project has given strong evidence that the foundations of physics are fundamentally computational—with the core laws of physics arising as inevitable consequences of the way <a href="https://writings.stephenwolfram.com/2023/12/observer-theory/">observers like us</a> “parse” <a href="https://writings.stephenwolfram.com/2021/11/the-concept-of-the-ruliad/">the ruliad</a> of all possible computational processes. And what we’ve seen here now suggests that there’s a remarkable commonality between the foundations of physics and biology. Both are anchored in computational irreducibility. And both sample slices of computational reducibility. Physics because that’s what observers like us do to get descriptions of the world that fit in our finite minds. Biology because that’s what biological evolution does in order to achieve the “coarse objectives” set by natural selection. </p> <p>The intuition of physics tends to be that there are ultimately simple models for things, whereas in biology there’s a certain sense that everything is always almost infinitely complicated, with a new effect to consider at every turn. But presumably that’s in large part because what we study in biology tends to quickly come face to face with computational irreducibility—whereas in physics we’ve been able to find things to study that avoid this. But now the commonality in foundations between physics and biology suggests that there should also be in biology the kind of structure we have in physics—complete with general laws that allow us to make useful, broad statements. And perhaps the simple model I’ve presented here can help lead us there—and in the end help build up a new paradigm for thinking about biology in a fundamentally theoretical way. 
</p> <h2 id="historical-notes" style=" border-top: solid 2px #ccc; padding-top: .8rem; margin-top: 2rem !important;">Historical Notes</h2> <p>There’s a long—if circuitous—history to the things I’m discussing here. Basic notions of heredity—particularly for humans—were already widely recognized in antiquity. Plant breeding was practiced from the earliest days of agriculture, but it wasn’t until the late 1700s that any kind of systematic selective breeding of animals began to be commonplace. Then in 1859 Charles Darwin described the idea of “natural selection” whereby competition of organisms in their natural environment could act like artificial selection, and, he posited, would over long periods lead to the development of new species. He ended his <em><a href="https://datarepository.wolframcloud.com/resources/On-the-Origin-of-Species/">Origin of Species</a></em> with the claim that:</p> <blockquote> <p style="font-style: normal; line-height: 1.4; font-size: .98rem; text-indent: 0;">… from the war of nature … the production of the higher animals, directly follows. … and whilst this planet has gone cycling on according to the fixed law of gravity, from so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved.</p> </blockquote> <p>What he appears to have thought is that there would somehow follow from natural selection a general law—like the law of gravity—that would lead to the evolution of progressively more complex organisms, culminating in the “higher animals”. 
But absent the kind of model I’m discussing here, nothing in the later development of traditional evolutionary theory really successfully supported this—or was able to give much analysis of it.</p> <p>Right around the same time as Darwin’s <em>Origin of Species</em>, <a href="https://www.wolframalpha.com/input?i=Gregor+Mendel">Gregor Mendel</a> began to identify simple probabilistic laws of inheritance—and when his work was rediscovered at the beginning of the 1900s it was used to develop mathematical models of the frequencies of genetic traits in populations of organisms, with key contributions to what became the field of population genetics being made in the 1920s and 1930s by <a href="https://www.wolframalpha.com/input?i=J.B.S.+Haldane">J. B. S. Haldane</a>, <a href="https://www.wolframalpha.com/input?i=R.A.+Fisher">R. A. Fisher</a> and <a href="https://www.wolframalpha.com/input?i=Sewall+Wright">Sewall Wright</a>, who came up with the concept of a “fitness landscape”. </p> <p>On a quite separate track there had been efforts ever since antiquity to classify and understand the growth and form of biological organisms, sometimes by analogy to physical or mathematical ideas—and by the 1930s it seemed fairly clear that chemical messengers were somehow involved in the control of growth processes. But the mathematical methods used for example in population genetics basically only handled discrete traits (or simple numerical ones accessible to biometry), and didn’t really have anything to say about something like the development of complexity in the forms of biological organisms.</p> <p>The 1940s saw the introduction of what amounted to electrical-engineering-inspired approaches to biology, often under the banner of cybernetics. 
Idealized neural networks were introduced by <a href="https://en.wikipedia.org/wiki/Warren_Sturgis_McCulloch" target="_blank">Warren McCulloch</a> and <a href="https://en.wikipedia.org/wiki/Walter_Pitts" target="_blank">Walter Pitts</a> in 1943, and soon the idea emerged (notably in the work of <a href="https://en.wikipedia.org/wiki/Donald_O._Hebb" target="_blank">Donald Hebb</a> in 1949) that learning in such systems could occur through a kind of adaptive evolution process. And by the time practical electronic computing began to emerge in the 1950s there was widespread belief that ideas from biology—including evolution—would be useful as an inspiration for what could be done. Often what would now just be described as adaptive algorithms were couched in biological evolution terms. And even when iterative methods were used for optimization (say in industrial production or engineering design) they were sometimes presented as being grounded in biological evolution.</p> <p>Meanwhile, by the 1960s, there began to be what amounted to Monte Carlo simulations of population-genetics-style evolutionary processes. A particularly elaborate example was work by <a href="https://en.wikipedia.org/wiki/Nils_Aall_Barricelli" target="_blank">Nils Barricelli</a> on what he called “numeric evolution” in which a fairly complicated numerical-cellular-automaton-like “competition between organisms” program with “randomness” injected from details of data layout in computer memory showed what he claimed to be biological-evolution-like phenomena (such as symbiosis and parasitism). </p> <p>In a different direction there was an attempt—notably by <a href="https://writings.stephenwolfram.com/2003/12/john-von-neumanns-100th-birthday/">John von Neumann</a>—to “mathematicize the foundations of biology” leading by the late 1950s to what we’d now call 2D cellular automata “engineered” in complicated ways to show phenomena like self-reproduction. 
The followup to this was mostly early-theoretical-computer-science work, with no particular connection to biology, and no serious mention of adaptive evolution. When the Game of Life was introduced in 1970 it was widely noted as “doing lifelike things”, but essentially no scientific work was done in this direction. By the 1970s, though, L-systems and fractals had introduced the idea of recursive tree-like or nested structures that could be generated by simple algorithms and rendered by computer graphics—and seemed to give forms close to some seen in biology. My own work on 1D cellular automata (starting in 1981) focused on systematic scientific investigation of simple programs and what they do—with the surprising conclusion that even very simple programs can produce highly complex behavior. But while I saw this as informing the generation of complexity in things like the growth of biological organisms, I didn’t at the time (as I’ll describe below) end up seriously exploring any adaptive evolution angles.</p> <p>Still another thread of development concerned applying biological-like evolution not just to parameters but to operations in programs. And for example in 1958 <a href="https://en.wikipedia.org/wiki/Richard_M._Friedberg" target="_blank">Richard Friedberg</a> at IBM tried making random changes to instructions in machine-code programs, but didn’t manage to get this to do much. (Twenty years later, superoptimizers in practical compilers did begin to successfully use such techniques.) Then in the 1960s, <a href="https://en.wikipedia.org/wiki/John_Henry_Holland" target="_blank">John Holland</a> (who had at first studied learning in neural nets, and was then influenced by <a href="https://en.wikipedia.org/wiki/Arthur_Burks" target="_blank">Arthur Burks</a> who had worked on cellular automata with von Neumann) suggested representing what amounted to programs by simple strings of symbols that could readily be modified like genomic sequences. 
The typical idea was to interpret the symbols as computational operations—and to assign a “fitness” based on the outcome of those operations. A “genetic algorithm” could then be set up by having a population of strings that was adaptively evolved. Through the 1970s and 1980s occasional practical successes were reported with this approach, particularly in optimization and data classification—with much being made of the importance of sexual-reproduction-inspired crossover operations. (Something that began to be used in the 1980s was the much simpler approach of simulated annealing—that involves randomly changing values rather than programs.)</p> <p>By the beginning of the 1980s the idea had also emerged of adaptively modifying the structure of mathematical expressions—and of symbolic expressions representing programs. There were notable applications in computer graphics (e.g. by <a href="https://en.wikipedia.org/wiki/Karl_Sims" target="_blank">Karl Sims</a>) as well as to things like the 1984 <em>Core War</em> “game” involving competition between programs in a virtual machine. In the 1990s <a href="https://www.wolframalpha.com/input?i=John+Koza">John Koza</a> was instrumental in developing the idea of “genetic programming”, notably as a way to “automatically create inventions”, for example in areas like circuit and antenna design. And indeed to this day scattered applications of these methods continue to pop up, particularly in geometrical and mechanical design.</p> <p>From the very beginning there’d been controversy around Darwin’s ideas about evolution. First, there was the issue of conflict with religious accounts of creation. But there were also—often vigorous—disagreements within the scientific community about the interpretation of the fossil record and about how large-scale evolution was really supposed to operate. 
A notable issue—still very active in the 1980s—was the relationship between the “freedom of evolution” and the constraints imposed by the actual dynamics of growth in organisms (and interactions between organisms). And despite much insistence that the only reasonable “scientific” (as opposed to religious) point of view was that “natural selection is all there is”, there were nagging mysteries that suggested there must be other forces at work.</p> <p>Building on the possibilities of computer experimentation (as well as things like my work on cellular automata) there emerged in the mid-1980s, particularly through the efforts of <a href="https://www.wolframalpha.com/input?i=Chris+Langton">Chris Langton</a>, a focus on investigating computational models of “artificial life”. This resulted in all sorts of simulations of ecosystems, etc. that did produce a variety of evolution-related phenomena known from field biology—but typically the models were far too complex in their structure for it to be possible to extract fundamental conclusions from them. Still, there continued to be specific, simpler experiments. For example, in 1986, for his book <em>The Blind Watchmaker</em>, <a href="https://www.wolframalpha.com/input?i=Richard+Dawkins">Richard Dawkins</a> made pictures of what he called “biomorphs”, produced by adaptively adjusting parameters for a simple tree-growth algorithm based on the overall shapes generated. 
</p> <p>In the 1980s, stimulated by my work, there were various <a href="https://www.complex-systems.com/abstracts/v01_i01_a11/">isolated studies</a> of “rule evolution” in cellular automata (as well as art and museum exhibits based on this), and in the 1990s there was more systematic work—notably by <a href="https://en.wikipedia.org/wiki/James_P._Crutchfield" target="_blank">Jim Crutchfield</a> and <a href="https://en.wikipedia.org/wiki/Melanie_Mitchell" target="_blank">Melanie Mitchell</a>—on using genetic algorithms to try to evolve cellular automaton rules to solve tasks like density classification. (Around this time “evolutionary computation” also began to emerge as a general term covering genetic algorithms and other usually-biology-inspired adaptive computational methods.)</p> <p>Meanwhile, accelerating in the 1990s, there was great progress in understanding actual molecular mechanisms in biology, and in figuring out how genetic and developmental processes work. But even as huge amounts of data accumulated, enthusiasm for large-scale “theories of biology” (that might for example address the production of complexity in biological evolution) seemed to wane. (The discipline of systems biology attempted to develop specific, usually mathematical, models for biological systems—but there never emerged much in the way of overarching theoretical principles, except perhaps, somewhat specifically, in areas like immunology and neuroscience.)</p> <p>One significant exception in terms of fundamental theory was <a href="https://www.wolframalpha.com/input?i=Gregory+Chaitin">Greg Chaitin</a>’s concept from around 2010 of “metabiology”: an effort (see below) to use ideas from the theory of computation to understand very general features of the evolution of programs and relate them to biological evolution.</p> <p>Starting in the 1950s another strand of development (sometimes viewed as a practical branch of artificial intelligence) involved the idea of “machine learning”. 
Genetic algorithms were one of half a dozen common approaches. Another was based on artificial neural nets. For decades machine learning languished as a somewhat esoteric field, dominated by engineering solutions that would occasionally deliver specific application results. But then in 2011 there was unexpectedly dramatic success in using neural nets for image identification, followed in subsequent years by successes in other areas, and culminating in 2022 with the arrival of large language models and ChatGPT. </p> <p>What hadn’t been anticipated was that the behavior of neural nets can change a lot if they’re given sufficiently huge amounts of training. But there isn’t any good understanding of just why this is so, and just how successful neural nets can be at what kinds of tasks. Ever since the 1940s it has been recognized that there are relations between biological evolution and learning in neural nets. And having now seen the impressive things neural nets can do, it seems worthwhile to look again at what happens in biological evolution—and to try to understand why it works, not least as a prelude to understanding more about neural nets and machine learning.</p> <h2 id="personal-notes">Personal Notes</h2> <p>It’s strange to say, but most of what I’ve done here I should really have done forty years ago. And I almost did. Except that I didn’t try quite the right experiments. And I didn’t have the intuition to think that it was worth trying more. </p> <p>Forty years later, I have new intuition, particularly informed by experience with modern machine learning. But even now, what made possible what I’ve done here was a chance experiment done for a somewhat different purpose.</p> <p>Back in 1981 I had become very interested in the question of how complexity arises in the natural world, and I was trying to come up with models that might capture this. 
Meanwhile, I had just finished <a href="https://writings.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/">Version 1.0 of SMP</a>, the forerunner to <a href="https://www.wolfram.com/language/">Mathematica and the Wolfram Language</a>—and I was wondering how one might generalize its pattern-matching paradigm to “general AI”. </p> <p>As it happened, right around that time, neural nets gained some (temporary) popularity. And seeing them as potentially relevant to both my topics I started simulating them and trying to see what kind of general theory I could develop about them. But I found them frustrating to work with. There seemed to be too many parameters and details to get any clear conclusions. And, at a practical level, I couldn’t get them to do anything particularly useful.</p> <p>I decided that for my science question I needed to come up with something much simpler. And as a kind of minimal merger of spin systems and neural nets I ended up <a href="https://www.wolframscience.com/nks/chap-1--the-foundations-for-a-new-kind-of-science#sect-1-4--the-personal-story-of-the-science-in-this-book">inventing cellular automata</a> (only later did I discover that versions of them had been invented several times before). </p> <p>As soon as I started doing experiments on them, I discovered that cellular automata were a window into an amazing new scientific world—that I have continued to explore in one way or another ever since. My key methodology, at least at first, was just to enumerate the simplest possible cellular automaton rules, and see what they did. The diversity—and complexity—of their behavior was remarkable. 
But the simplicity of the rules meant that the details of “successive rules” were usually fairly different—and while there were <a href="https://www.wolframscience.com/nks/chap-6--starting-from-randomness#sect-6-2--four-classes-of-behavior">common themes in their overall behavior</a>, there didn’t seem to be any particular structure to “rule space”. (Occasionally, though, <a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/#computational-irreducibility-and-rule-30">particularly in finding examples for exposition</a>, I would look at slightly more complicated and “multicolored” rules, and I certainly anecdotally noticed that rules with nearby rule numbers often had definite similarities in their behavior.)</p> <p>It so happened that around the time I started publishing about cellular automata in 1983 there was a fair amount of ambient interest in theoretical biology. And (perhaps in part because of the “cellular” in “cellular automata”) I was often invited to theoretical biology conferences. People would sometimes ask about adaptation in cellular automata, and I would usually just emphasize what individual cellular automata could do, without any adaptation, and what significance it might have for the development of organisms.</p> <p>But in 1985 I was going to a conference (at Los Alamos) on “Evolution, Games and Learning” and I decided I should take a look at the relation of these topics to cellular automata. 
But, too quickly, I segued away from investigating adaptation to trying to see what kind of pattern matching and other operations cellular automata <a href="https://content.wolfram.com/sw-publications/2020/07/approaches-complexity-engineering.pdf">might be able to be explicitly set up to do</a>: </p> <p><a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg1.png'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg1A.png' alt='Click to enlarge' title='Click to enlarge' width='550' height='161'></a></p> <p>Many aspects of this paper still seem quite modern (and in fact should probably be investigated more now!). But—even though I absolutely had the tools to do it—I simply failed at that time to explore what I’ve now explored here.</p> <p>Back in 1984 Problem 7 in my “<a href="https://content.wolfram.com/sw-publications/2020/07/problems-theory-cellular-automata.pdf">Twenty Problems in the Theory of Cellular Automata</a>” was “How is different behavior distributed in the space of cellular automaton rules?” And over the years I’d occasionally think about “cellular automaton rule space”, wondering, for example, what kind of geometry it might have, particularly in the continuum limit of infinitely large rules.</p> <p>By the latter half of the 1980s “theoretical biology” conferences had segued to “artificial life” ones. And when I went to such conferences I was often frustrated. People would show me simulations that seemed to have far too many parameters to ever be able to conclude much. People would also often claim that natural selection was a “very simple theory”, but <a href="https://www.wolframscience.com/nks/notes-8-5--studying-natural-selection/">as soon as it was “implemented” there’d be all kinds of issues</a>—and choices to be made—about population sizes, fitness cutoffs, interactions between organisms, and so on. 
And the end result was usually a descent into some kind of very specific simulation without obvious robust implications.</p> <p>(In the mid-1980s I put a fair amount of effort into developing both the content and the organization of a new direction in science that I called “complex systems research”. My emphasis was on systems—like cellular automata—that had definite simple rules but highly complex behavior. Gradually, though, “complexity” started to be a popular general buzzword, and—I suspect partly to distinguish themselves from my efforts—some <a href="https://writings.stephenwolfram.com/2019/06/my-part-in-an-origin-story-the-launching-of-the-santa-fe-institute/">people started emphasizing</a> that they weren’t just studying complex systems, they were studying complex adaptive systems. But all too often this seemed mostly to provide an excuse to dilute the clarity of what could be studied—and I was sufficiently put off that I paid very little attention.)</p> <p>By the mid-1990s, I was in the middle of writing <em><a href="https://www.wolframscience.com/nks/">A New Kind of Science</a></em>, and I wanted to use biology as an example application of my methodology and discoveries in the computational universe. In a section entitled “<a href="https://www.wolframscience.com/nks/chap-8--implications-for-everyday-systems#sect-8-5--fundamental-issues-in-biology">Fundamental Issues in Biology</a>” I argued (as I have here) that computational irreducibility is a fundamentally stronger force than natural selection, and that when we see complexity in biology it’s most likely of “computational origin” rather than being “sculpted” by natural selection. 
And as part of that discussion, I included a <a href="https://www.wolframscience.com/nks/p391--fundamental-issues-in-biology/">picture of the “often-somewhat-gradual changes”</a> in behavior that one sees with successive 1-bit changes in a <em>k</em> = 3, <nobr><em>r</em> = 1</nobr> cellular automaton rule (yes, the book was not in color):</p> <p><a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg2A.png'><img src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg2A.png' alt='Click to enlarge' title='Click to enlarge' width='340' height='415'/></a></p> <p>This wasn’t done adaptively; it was basically just looking along a “random straight line” in rule space. And indeed both here and in most of the book, I was concerned with what systems like cellular automata “naturally do”, not what they can be constructed (or adaptively evolved) to do. I did give “constructions” of how cellular automata can perform particular computational tasks (like <a href="https://www.wolframscience.com/nks/p640--computations-in-cellular-automata/">generating primes</a>), and, somewhat obscurely, in a section on “<a href="https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-10--intelligence-in-the-universe">Intelligence in the Universe</a>” I explored finding <em>k</em> = 3, <nobr><em>r</em> = 1</nobr> rules that can <a href="https://www.wolframscience.com/nks/p833--intelligence-in-the-universe/">successfully “double their input”</a> (my reason for discussing these rules was to highlight the difficulty of saying whether one of these cellular automata was “constructed for a purpose” or was just “doing what it does”):</p> <p><a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg3A.png'><img src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg3A.png' alt='Click to enlarge' title='Click to enlarge' width='340' 
height='415'/></a></p> <p>Many years went by. There’d be an occasional project at our <a href="https://education.wolfram.com/programs/">Summer School</a> about rule space, and occasionally about adaptation. I maintained an interest in foundational questions in biology, gradually collecting information and sometimes giving talks about the subject. Meanwhile—though I didn’t particularly internalize the connection then—by the mid-2010s, through our practical work on it in the <a href="https://www.wolfram.com/language/">Wolfram Language</a>, I’d gotten quite up to speed with <a href="https://www.wolfram.com/language/core-areas/machine-learning/">modern machine learning</a>. Around the same time I also heard from my friend Greg Chaitin about his efforts (as he put it) to “prove Darwin” using the kind of computational ideas he’d applied in thinking about the foundations of mathematics. </p> <p>Then in 2020 came our <a href="https://www.wolframphysics.org/" target="_blank" rel="noopener">Physics Project</a>, with its whole formalism around things like multiway graphs. It didn’t take long to realize that, yes, what I was calling “<a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/">multicomputation</a>” wasn’t just relevant for fundamental physics; it was something quite general that could be applied in many areas, which by 2021 I was <a href="https://writings.stephenwolfram.com/2021/09/multicomputation-a-fourth-paradigm-for-theoretical-science/#potential-application-areas">trying to catalog</a>:</p> <p><img src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg5.png' alt='Areas of multicomputation' title='Areas of multicomputation' width='432' height='67'/></p> <p>I did some thinking about each of these. 
The one I tackled most seriously first was <a href="https://writings.stephenwolfram.com/2022/03/the-physicalization-of-metamathematics-and-its-implications-for-the-foundations-of-mathematics/">metamathematics, about which I finished a book in 2022</a>. Late that year I was finishing a (<a href="https://writings.stephenwolfram.com/2023/02/a-50-year-quest-my-personal-journey-with-the-second-law-of-thermodynamics/">50-year-in-gestation</a>) project—informed by our Physics Project—on <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/">understanding the Second Law of thermodynamics</a>, and as part of this I made what I thought was some progress on thinking about the <a href="https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/#the-mechanoidal-phase-and-bulk-molecular-biology">fundamental character of biological systems</a> (though not their adaptive evolution). </p> <p>And then ChatGPT arrived. And in addition to being <a href="https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/">involved with it technologically</a>, I started to think about the science of it, and particularly about <a href="https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/">how it could work</a>. Part of it seemed to have to do with unrecognized regularities in human language, but part of it was a reflection of the emerging “meta discovery” that somehow if you “bashed” a machine learning system hard enough, it seemed like it could manage to learn almost anything. </p> <p>But why did this work? At first I thought it must just be an “obvious” consequence of high dimensionality. But I soon realized there was more to it. 
And as part of trying to understand the boundaries of what’s possible I ended up a couple of months ago writing a piece exploring “<a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/">Can AI Solve Science?</a>”:</p> <p> <a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/"><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050224personalimg6.png' alt='Can AI Solve Science?' title='Can AI Solve Science?' width='449' height='359'> </a></p> <p>I talked about different potential objectives for science (making predictions, generating narrative explanations, etc.). And deep inside the piece I had a section entitled “<a href="https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/#exploring-spaces-of-systems">Exploring Spaces of Systems</a>” in which I talked about science problems of the form “Can one find a system that does X?”—and asked whether systems like neural nets could somehow let one “jump ahead” in what would otherwise be huge exhaustive searches. As a sideshow to this I thought it might be interesting to compare with what a non-neural-net adaptive evolution process could do.</p> <p>Remembering Greg Chaitin’s ideas about connecting the halting problem to biological evolution, I wondered if perhaps one could just adaptively evolve cellular automaton rules to find ones that generated a pattern with a particular finite lifetime. I imagined it as a classic machine learning problem, with a “loss function” one needed to minimize. </p> <p>And so it was that just after 1 am on February 22 I wrote three lines of Wolfram Language code—and tried the experiment:</p> <p> <a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg7.png'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg7A.png' alt='Click to enlarge' title='Click to enlarge' width='617' height='269'/></a></p> <p>And it worked! 
I managed to find cellular automaton rules that would generate patterns living for exactly 50 steps:</p> <p> <a class='magnific image' alt='' title='' href='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg9.png'><img loading='lazy' src='https://content.wolfram.com/sites/43/2024/05/sw050124personalimg9.png' alt='Click to enlarge' title='Click to enlarge' width='445' height='448'> </a></p> <p>In retrospect, I was slightly lucky. First, that this ended up being such a simple experiment to try (at least in the Wolfram Language) that I did it even though I didn’t really expect it to work. And second, that for my very first experiment I picked parameters that happened to immediately work (<em>k</em> = 4, lifetime 50, etc.).</p> <p>But, yes, I could in principle have done the same experiment 40 years ago, though without the Wolfram Language it wouldn’t have been so easy. Still, the computers I had back then were powerful enough that I could in principle have generated the same results then as now. But without my modern experience of machine learning I don’t think I would have tried—and I would certainly have given up too easily. And, yes, it’s a little humbling to realize that I’ve gone so many years assuming adaptive evolution was out of the reach of simple, clean experiments. But it’s satisfying now to be able to check off another mystery I’ve long wondered about. And to think that much more about the foundations of biology—and machine learning—might finally be within reach. </p> <h2 id='thanks' style='font-size:1.2rem'>Thanks</h2> <p style='font-size:90%'>Thanks to Brad Klee, Nik Murzin and Richard Assar for their help. 
</p> <p style='font-size:90%'>The specific results and ideas I’ve presented here are mostly very recent, but they build on background conversations I’ve had—some recently, some more than 40 years ago—with many people, including: Sydney Brenner, Greg Chaitin, Richard Dawkins, David Goldberg, Nigel Goldenfeld, Jack Good, Jonathan Gorard, Stephen J. Gould, Hyman Hartman, John Holland, Christian Jacob, Stuart Kauffman, Mark Kotanchek, John Koza, Chris Langton, Katja Della Libera, Aristid Lindenmayer, Pattie Maes, Bill Mydlowec, John Novembre, Pedro de Oliveira, George Oster, Norman Packard, Alan Perelson, Thomas Ray, Philip Rosedale, Robert Rosen, Terry Sejnowski, Brian Silverman, Karl Sims, John Maynard Smith, Catherine Wolfram, Christopher Wolfram and Elizabeth Wolfram.</p> ]]></content:encoded> <wfw:commentRss>https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> </channel> </rss>