Thursday, January 15, 2015

Truth and beauty

I should have been warned off of Aeon because of the Evans’ fiasco, but the magazine is not really all that culpable. Its content just reflects the stuff that is “out there” and the editors seem to largely act to make topical stuff available for easy reading in one place. This is how I came to read this. It just popped into my mailbox recently one morning and I couldn’t resist.  It’s about the relation between truth and beauty, or more accurately, how and whether aesthetic judgments about a theory are or should be taken to be marks of its truth.  The author Philip Ball thinks not. I’m not sure that I agree, but it got me thinking, so I thought I would try and write something about it. I’ve done this before (here) and some of what I say rehearses earlier themes. Here it is.

Mr. Ball notes that many scientists (Einstein, Dirac, Weyl (three real biggies), Greene and Arkani-Hamed are mentioned by name) have believed that theoretical beauty is a mark of truth. Some can be quoted as saying that theoretical beauty even trumps empirical verification. Mr. Ball takes these quotes at face value. Greene suggests that what Einstein and Weyl (the two culprits) intended is that beauty should count as having some evaluative power  re truth but that “Ultimately, theories are judged how they fare when faced with cold, hard, experimental facts” (p. 3 Ball quoting Greene).[1] I’m not sure what Einstein or Weyl meant. But I personally am interested in how beauty is relevant in theory evaluation, if it is relevant at all. This is, after all, a theme talked up in early minimalism, which I personally found useful. So is beauty linked to truth? And if so, how?

Before approaching these questions, it is worth trying to pin down what beauty is: what are the marks of theoretical beauty?  As Ball notes, scientists themselves are all over the place on this. For some, it’s like porn to Supreme Court justices (viz. they know it when they see it). But Ball’s essay contains at least one attempt at a specification. Arkani-Hamed (AH) (he is, btw, an important young hot shot (here)) associates theoretical beauty with a certain kind of inevitability:

“There are very few principles and there’s no other possible way they could work once you understand them deeply enough” (p. 4, Ball quoting Arkani-Hamed).

Note three things about AH’s proposal: beauty is the combination of (i) simplicity of the Occam variety (few principles) coupled with (ii) a certain kind of modality (no other possible way they could work), and (iii) it is not surface visible in that it takes intellectual effort to perceive (“once you understand them deeply enough”). The third is of particular interest wrt Ball’s many comments on the issue, for much of his discussion, I believe, revolves around the fact that beauty is hard to discern, that there can be lots of disagreement and that it is elusive.  All fair points, but not really immediately relevant to AH’s point.  AH’s view seems to be that finding beauty requires hard intellectual work. It may be in the eye of the beholder, but only an eye that’s put in the hours to train itself to see properly. So beauty need not be (and generally won’t be) immediately evident on inspection. Consequently, the fact that what’s beautiful is not immediately discernible or that it is sensitive to cultural mores does not seem all that relevant to AH’s point. That said, let’s concentrate on points (i) and (ii): simple and inevitable.

Occam’s razor is now part of the accepted methodological wisdom. All things being equal (ATE), simpler theories are better than more complex ones. And how is “simple” measured? By the number of axioms or assumptions (how to enumerate these is not trivial but I will put that to one side). So a theory with three axioms trumps one with four and one with four beats one with five ATE.  As Ball notes, things are seldom equal, so applying Occam in the wild is never easy. But, it is a principle that all are ready to accept in some form and we should ask why?  Why is simpler better, or, more specifically, truer?

In the good old 16th and 17th centuries (if not earlier) there was theological justification for this assumption. We live in God’s universe governed by his/her laws. God is an elegant thinker and would never do in a complicated manner what could be done simply (Why not? Is God lazy?).[2] Thus, God’s character guarantees a universe governed by simple laws.[3]

Many of us find this justification hard to accept nowadays, but luckily for us, there are other ways of buying into this viewpoint. Thus, simpler theories are generally more epistemologically solid than are profligate ones. Think of a stool carrying a weight. If three legged, each leg supports 1/3 the load. If four legged, each supports 1/4. Take weight as evidence and legs as axioms and you can see that the fewer axioms there are, the greater the evidence for each. Bayesians can even formalize this insight, and they have tended to make a big deal out of it. So, something like Occam is something even the theologically fussy can sign onto as a virtue of theory.
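One standard way Bayesians cash this out is by comparing marginal likelihoods. The sketch below is an illustration only, not something from Ball's essay or from this post: it assumes a toy coin-tossing setup in which a "simple" theory fixes the chance of heads at 0.5 and a "complex" theory leaves that chance free, with a uniform prior. The function names are hypothetical. The free parameter has to spread its prior support over many possible outcomes, so when the data are unremarkable the simpler theory comes out ahead.

```python
from math import factorial


def marginal_fixed(k, n, p=0.5):
    """Probability of one particular sequence with k heads in n tosses
    under the simple theory, where the chance of heads is fixed at p."""
    return p**k * (1 - p) ** (n - k)


def marginal_free(k, n):
    """Probability of the same sequence under the complex theory, where the
    chance of heads is free and averaged over a uniform prior on [0, 1].
    The integral of p^k (1-p)^(n-k) dp equals k! (n-k)! / (n+1)!."""
    return factorial(k) * factorial(n - k) / factorial(n + 1)


def bayes_factor(k, n):
    """Evidence ratio, simple over complex. Values above 1 favour the simple theory."""
    return marginal_fixed(k, n) / marginal_free(k, n)


if __name__ == "__main__":
    print(bayes_factor(5, 10))  # about 2.71: balanced data favour the simple theory
    print(bayes_factor(9, 10))  # about 0.11: lopsided data favour the extra parameter
```

The second call is the "things are seldom equal" case: once the data demand it, the extra assumption earns its keep.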

Importantly, for my purposes, this epistemological virtue carries metaphysical kudos. If we assume that theories that are better empirically supported are more likely to be true, then simpler theories ATE are more likely to be true. I’m not sure why we think that better supported theories are more likely to be true, but we do think this and this provides at least one link between theoretical simplicity and truth.

Of course, things are much much more complicated. Comparing different theories can be very hard. Indeed, there is no general recipe for how to compare different axioms, definitions, primitives etc. And there seems to be no absolute measure of “simplicity,” or at least none that garners general assent. But, though there is no general measure, there are lots of local ones and the principle can and does have teeth in many places. Examples: why assume an aether if all goes well without it? Why assume two conceptions of mass if you can get away with one? And, most relevant to us: Why assume a syntactic theory with internal levels if they are not required empirically? Why assume three rule types if one suffices? We all know the drill, and a good one it is, for this kind of simplicity ties elegance together with a notion of “independent evidence,” and every scientist worth her/his salt loves independent evidence (consilience rules!!!).

So, we value simple theories as pointing towards truth and if simplicity is part of beauty, this partly explains why we value beauty if it can be had (which, as Ball notes, is not always clear given that things are seldom equal).

There is a further reason to value simple theories: they are easier to explore. Simple theories are often readily intelligible (as things with fewer moving parts often are). So, methodologically, if one’s aim is to find out what’s what then using an instrument (theory) that is understandable (simple) to probe things is ATE better than one that is opaque.  Simple ideas are generally more manageable, so they enjoy a kind of methodological or epistemological advantage. But note, this does not imply that they are more often true, though perhaps (weasel word!) a case can be made that they are good ways of getting to truth. If so, beauty may not mark truth but it may be the best road to it.

There is still another way to link simplicity and truth: via explanation. Here’s what I have in mind. What’s the “ugliest” possible theory of anything? I would say it’s a list. Why? Because it has zero explanatory value.[4] The virtue of simple theories is that they seem to carry more explanatory oomph.  And what better mark of a theory’s truth than its explanatory power? What we want out of a theory is not only that it cover the data, but that it cover it in such a way as to explain why the data is the way it is.

This locution, “why the data is the way it is” has two related sub-parts with slightly different foci. It means explaining not only why we have the data we happen to have, but also the data we will have and could have. And it means explaining why we don’t have other than the data we have (viz. why the data we don’t see is missing).[5]  The main problem with a list, then, is that it does not tell you how to expand itself so as to include what it must and exclude what it must. A simple theory, being general, does. Simple theories can explain and part of being beautiful is having explanatory oomph, which modally relates to what is possible.[6]
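The contrast between a list and a theory can be made concrete with a toy case. The following is a minimal sketch under assumptions that are not in the post: the "data" are strings from a made-up language (some number of a's followed by the same number of b's), and both function names are hypothetical. The list only records what has been seen; the rule says what could be seen and what could not.

```python
# Hypothetical observations from a made-up language: n a's followed by n b's.
observed = {"ab", "aabb", "aaabbb"}


def list_theory(s):
    """The 'list': accepts exactly the items already observed, nothing else."""
    return s in observed


def rule_theory(s):
    """A one-rule theory: some n >= 1 copies of 'a' followed by n copies of 'b'."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


if __name__ == "__main__":
    # Both cover the data we happen to have.
    print(all(list_theory(s) and rule_theory(s) for s in observed))  # True
    # Only the rule extends to data we could have but have not yet seen.
    print(list_theory("aaaabbbb"), rule_theory("aaaabbbb"))  # False True
    # The rule excludes "aab" on principle; the list excludes it only by omission.
    print(list_theory("aab"), rule_theory("aab"))  # False False
```

Both reject "aab", but only the rule gives a reason: the list would have to be amended by hand the moment a new item turned up, while the rule already commits itself about every unseen case.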

Note the explanatory property of simple theories already introduces the modal feature of beauty that AH noted. Why X? Because X is the only way things could have been. Theories explain the actual in terms of the possible.  This suggests the following thought: what makes a theory truly beautiful is that it perfectly explains the actual in terms of the possible. Or, put another way, in the best case, the fit between what is possible and what is actual is perfect. All that can be observed has been and all that has been observed can be. There is no spillover.[7]

A joke I’ve told before (see here) illustrates the logic, I think: why are there 1-hump camels and 2-hump camels but not 3,4…N-humped camels? Because there are concave camels and convex camels and that’s all there is.  The “joke” illustrates how changing our conception of humps (look at the curves not the humps) allows us to exhaust the kinds of humps+valley combos we expect to find. Looking at the number of humps invites the question of why there are at most 2. Why 2, and not 3,4,…N? The stopping point seems arbitrary. Looking at the curves and seeing them as concave and convex seems to exhaust the space of options (simple curves being one or the other). And this restricted space also coincides with what we find. The possible hump+valley options perfectly fit the attested realizations (viz. 2 and only 2). Thus, once we think in terms of curves, it seems clear that things could not have been otherwise hump/camel-wise (but see comments to above link for problems with this “joke”).

Now, before I get lots of comments taking the “joke” apart (again, see earlier post), let me try to illustrate what I have in mind with a more relevant syntactic example (there are several others in the earlier post). Chomsky has famously argued that properly understood Merge yields both phrase structure and movement.  What is the proper way to understand it? Well, merge is an operation that takes two linguistic items A and B and puts them together (forms a set). What As and Bs? Chomsky says that there are two possibilities: (i) neither A nor B contains the other or (ii) one of A or B contains the other. In the first case, Merge(A,B) yields a constituent like {A,B}. In the other it yields a constituent like {B,{A… B}}. Thus, if (i) and (ii) exhaust the options then it looks like every possible instance of a very simple combination rule (merge) results in just the two products that we find prominently in Gs (viz. products of PS rules and movement rules). What makes Chomsky’s story attractive is that it looks like the two principal properties of Gs (hierarchy and displacement) follow exhaustively from the possible ways two inputs can fall under the rule. Either the to-be-combined form a part-whole relation or they do not. Thus, a simple exhaustive account of the options yields (all and only?) the attested structural dependencies.[8]
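The two cases can be sketched in a few lines. This is a toy rendering under stated assumptions, not Chomsky's formalism: lexical items are plain strings, constituents are Python frozensets, and the names `contains` and `merge` are hypothetical. Labels, features and copies versus repetitions are all ignored, and merging an item with itself is not handled sensibly. The point is only that a single set-forming operation, plus the question of whether one input contains the other, gives exactly two kinds of output.

```python
def contains(whole, part):
    """True if `part` occurs anywhere inside the nested set `whole`."""
    if not isinstance(whole, frozenset):
        return False  # a lexical item (a string) contains nothing
    return any(member == part or contains(member, part) for member in whole)


def merge(a, b):
    """Form the set {a, b} and report which of the two cases applied.

    'external': neither input contains the other (phrase structure).
    'internal': one input contains the other (movement/displacement).
    """
    kind = "internal" if contains(a, b) or contains(b, a) else "external"
    return frozenset({a, b}), kind


if __name__ == "__main__":
    dp, kind_dp = merge("the", "book")  # {the, book}
    vp, kind_vp = merge("read", dp)     # {read, {the, book}}
    moved, kind_mv = merge(vp, dp)      # {{the, book}, {read, {the, book}}}
    print(kind_dp, kind_vp, kind_mv)    # external external internal
```

Nothing in `merge` itself distinguishes the two cases; the difference lies entirely in what the inputs happen to be, which is the sense in which both products fall out of one rule.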

I don’t know about you, but if Chomsky is right here, then I consider this to be a very nice kind of account. Why do Gs have both PS rules and movement rules? Because the simplest conception of the combination operation has these two “kinds” of rules (and only these) as consequence.  

There are other examples of this kind of thinking in the syntax literature as I discussed here. Nor is this restricted to syntax. It is also a staple of phonological feature theories, where the aim is to adumbrate all and only the possible linguistic sounds and, if I understand Heinz and Idsardi’s Science paper, the range of possible phonological processes.

I think that there is another way of describing what lends such stories AH’s feeling of inevitability. When properly framed, a theory serves to close off further questions. Here’s what I mean. A really satisfying account brings a kind of question closure with it. The concave/vex theory of camels explains why the question of 3 humped camels is, in some sense, ill-formed. Chomsky’s account of structure dependence discussed here not only removes linear processes from the grammar but explains why they could not have been there. Properly understood, such rules cannot be stated within the theory and that’s why they don’t exist. Such rules don’t merely fail to exist, they cannot exist for they cannot be stated. Thus, they are not merely contingently absent, they are necessarily absent. Indeed, when the theory is properly understood, one sees that their possibility is actually inconceivable. It’s this sense of theoretical beauty that AH’s remark pointed to, I am suggesting.

So, is beauty a mark of truth? I think so. The problem is that which conception of beauty is the right one is generally what is theoretically up for grabs. The theorist’s challenge is to provide beautiful accounts; simple stories that exhaustively and completely adumbrate the options that completely describe what in fact happens. Such stories are simple and exhaustive and hence beautiful.  So, does beauty count? Sure. Is it the only virtue? No. But IMO it is one often worth sacrificing a few data points to. But you knew I would say that, right?

[1] I am not sure what Greene meant here. “Ultimately” is a very long time. As Keynes noted, ultimately (in the long run) we are all dead. I suspect what Greene is doing here is CYAing a bit. Among themselves, scientists like to talk about the intangibles that count towards theory evaluation. To the public, they like to appear hard-headed so that they can crap on the artsy-fartsy emotive types, especially those with religious or spiritual inclinations. Talking about experimental facts serves this aim well.  In truth, everyone wants new data to back up interesting claims and everyone wants theories that are not clunky. How these different features get weighted at any particular time is an art, not the product of algorithm.  So, here Greene, a well known purveyor of beauty in theory, is trying to cover his public posterior. 
[2] In this regard God is the anti-Thomas Mann, of whom Peter Gay said: “Mann did not like to be simple if it was at all possible to be complicated.”
[3] I never quite understood why these thinkers thought they knew God’s aesthetic preferences or work habits. But, it seems that they could and did.
[4] For the statistically inclined compare histograms with the statistics that describe them. The former represents actual data points. It does not explain them in any way. The latter carries some explanatory force as it not only “accounts” for the histograms but (in the best case) describes where a possible data point can and cannot fall.
[5] This has obvious relationship to the issue of negative data near and dear to a linguist’s heart.
[6] The link between explanatoriness and truth gets us into deep waters very quickly: why after all should the universe be comprehensible? Damn if I know. Descartes’ benevolent deity? Darwin? Dumb luck?
[7] Needless to say, this cannot be understood as every possible instance has been observed, rather an instance of every type has been observed and no instance of an impossible type has been. 
[8] I am not sure whether this argument is in fact accurate. Prima facie, there are more kinds of grammatical rules (e.g. construal, agreement, deletion).  My own view is that the right next step is to try and unify construal, etc. with movement in some way. Doing this would then derive all possible G operations from Merge as Chomsky wants to do. I suspect, though, that my desires here are idiosyncratic outliers theoretically. Still, unless this is done, Chomsky’s “derivation” is not complete or exhaustive and so, not “perfect.”


  1. Of somewhat related interest, this post on reducibility:

    The most salient thing in that post relevant to this one is that, if you accept reductionism, then if you imagine yourself in such an advanced state of knowledge that you have a full chain of understanding from fundamental particle physics right up to neurochemistry, and then you suddenly find at some higher level, like consciousness, a phenomenon that does not comport with the rest of your framework, then your phenomenon of consciousness has just falsified your fundamental physical laws.

    This is an extreme example to try to demonstrate a broader issue, which is applicable regardless of metaphysical commitments to reductionism or emergence, that theory construction and ideas of elegance must always be tempered by a wariness that the specific domains we are probing with elegant descriptions may suddenly melt between our fingers when we look at how they fit together with other domains at other levels - for example, the Standard Model, beautiful as it is, undermined by cosmological data.

    What worries me about things like minimalism (although I have little experience of it) is that beautiful things can be made at that level of pristine abstraction, even in accounting for a great mess of empirical data, but when looking to combine it with other domains of knowledge, is there not the possibility that you could have constructed a complete and maximally efficient computational design only to find out that it can't be implemented on the particular nervous system we're looking at because of some fact of neurobiology that was unconsidered or unknown?

    What I would characterise as my central concern is that the intrinsic appeal of elegance is a dangerous lure in the absence of perfect knowledge, and I think our approach to it is upside-down because of our (well-earned) pride in scientific achievement. I don't think beauty is a marker of truth; truth teaches us what beauty is. Arkani-Hamed's comments, and the idea of knowing it when you see it, are useful here, but in an unusual way - it's not that we have some ineffable, inborn conception of the beautiful that somehow guides us; it's more that, at the moment of truth, there is an act of revelation. Some parts of the world that were once distant snap together and only then does it all seem obvious and unavoidable, but we were not led there by our own intuitions of inevitability or it would have always been obvious. Perhaps too much of beauty is hindsight.

    I think that to go about it the reverse way - to enact methodologies and paradigms in the concerted pursuit of beauty - makes us rather like Kepler desperately trying to assemble a geometric model of the planetary orbits in the name of Euclidean elegance when there's a telescope sitting unused on the table that we should dust off and use to make a few measurements, letting beauty follow in its own time, if at all. That way, it's much less depressing when you uncover the hidden astronomical object that would have otherwise obliterated your model.

    1. Thx for the additional source. I read the scientia salon post when it came up a couple of days ago and thought of linking to it. You've saved me the trouble.

      A comment or two on what you said: I think that all hard and fast methodological dicta are dangerous. However, I also find myself convinced that insightful theory comes from considering aesthetic considerations. Need they be treated cautiously? Sure. As does experimental data! Just as you should never trust a theory without good experimental data, you should be wary of data points that are not substantiated by a good theory. There is a subtle back and forth between such desiderata and I feel that we often, mistakenly, take facts as "hard" and other criteria of theoretical felicity as soft. I do not agree. And there are many such examples in the history of science (for example, retaining the conservation of energy and waiting 30 years or so for the neutrino to save it). At any rate, what I do think is wise is to appreciate how delicate the scientific enterprise is and that there are many competing and important factors.

      As for Minimalism, I actually think that there is pretty robust data suggesting that something like it MUST be correct, at least if one wants UG, the capacity for language, to be evolvable. I will write about this next week reviewing a recent terrific little paper by Chomsky. The gist, however, is that unless something like it is true then there is little reason to think that we will ever have a story for how it might have evolved. In addition, Minimalism in many of its current forms functions to unify a great deal of earlier grammatical theory. This does not make it "true" but this is certainly a well respected mark of truth.

      So, I think that I disagree with the overall thrust of your comment, except to agree that like all interesting notions, theoretical beauty needs to be handled gingerly. But this is no less true for experimental results. Science is hard and there are no robust rules. Some people have a sense of beauty, some are better at finding relevant data. Both activities are worthwhile and both are fraught with lures that can mislead. There are no hard and fast rules. And that's what makes it fun.

  2. Replies
    1. Except when it isn't, or apparently isn't. Rambling things can be beautiful. But I do agree that as regards theory, simple is nicer. Thx again for the slime mold stuff btw.

  3. This reminds me of Michael Della Rocca's argument for the principle of sufficient reason. He argues that there is no non-question-begging argument against the principle of sufficient reason.

    For anyone who cares but doesn't want to read the paper, the argument goes that there are intuitions we have about explicability—namely, we think/intuit that at least some things (facts, phenomena) are explainable. And this puts pressure on us to say that all things are explainable. If all things are explainable, then the principle of sufficient reason follows. So if one wants to reject the principle of sufficient reason and still say that some things (but not all) are explicable, then one would need an account of why that is so. That is, the only way to do this and not beg the question—in the technical sense—against the person committed to the principle of sufficient reason is to draw a principled line between the explicable and the non-explicable. And he provides some arguments that no such principled line can be drawn (or at least that it has yet to be drawn and probably never could/will be drawn).

    So either one accepts the principle of sufficient reason, in which case everything is explicable—whether we have (or could have) epistemic access to the explanation is another question entirely—or one rejects the principle of sufficient reason and must commit to nothing being explicable. Or, in terms from Norbert's blog post, either one commits to the correct theory being one in which everything has a principled explanation or one commits to the correct theory being just a list of all of the facts. There's no (coherent) intermediate position.

    Della Rocca (pp. 10-11) tries to make a case that the latter option is a non-option because it entails that all things are brute facts—i.e., not explicable—which is in conflict with the intuition that at least some things are explicable—i.e., not brute facts. To me, this seems to be reasonable, and it is an intuition that I share, but I'm not sure what case can be made for its being true independent of the fact that many (all?) intuit it to be so. (This is Norbert's note [6], just in other words.) Without some independent case for this intuition being true, there's always the possibility that the intuition is false, as some human intuitions are, in fact, false.

    So, in my opinion, the import of Della Rocca's argument is methodological, not metaphysical (pending an independent case that the intuition that some things are explicable is in fact a true intuition). In other words, perhaps the correct theory really just is a list of all of the facts; who knows. But if we want to even try to come up with a theory of how the world works that isn't just a list of all of the facts—that is, if we want to try to understand the world—, we have to be committed to there being a principled explanation for everything (at least if we want to be consistent). Maybe trying to understand the world is misguided, but what else are we going to do?

    I'm not sure how much this comment adds—to some extent (though perhaps not entirely), I think it's a bit of a rehashing of what Norbert said in the "How to Play the Game" post—but, at any rate, I thought some might find the Della Rocca paper interesting, and it seemed sad to only post the link.