I should have been warned off of Aeon because of the Evans’ fiasco, but the magazine is not really
all that culpable. Its content just reflects the stuff that is “out there” and the
editors seem to largely act to make topical stuff available for easy reading in
one place. This is how I came to read this.
It just popped into my mailbox recently one morning and I couldn’t resist. It’s about the relation between truth and
beauty, or more accurately, how and whether aesthetic judgments about a theory
are or should be taken to be marks of its truth. The author Philip Ball thinks not. I’m not
sure that I agree, but it got me thinking, so I thought I would try and write
something about it. I’ve done this before (here) and some
of what I say rehearses earlier themes. Here it is.
Mr. Ball notes that many scientists (Einstein, Dirac, Weyl
(three real biggies), Greene and Arkani-Hamed are mentioned by name) have
believed that theoretical beauty is a mark of truth. Some can be quoted as
saying that theoretical beauty even trumps empirical verification. Mr. Ball takes
these quotes at face value. Greene suggests that what Einstein and Weyl (the
two culprits) intended is that beauty should count as having some evaluative
power re truth but that “Ultimately,
theories are judged by how they fare when faced with cold, hard, experimental
facts” (p. 3 Ball quoting Greene).[1]
I’m not sure what Einstein or Weyl meant. But I personally am interested in how
beauty is relevant in theory evaluation, if it is relevant at all. This is,
after all, a theme talked up in early minimalism, which I personally found
useful. So is beauty linked to truth? And if so, how?
Before approaching these questions, it is worth trying to
pin down what it is: what are the marks of theoretical beauty? As Ball notes, scientists themselves are all
over the place on this. For some, it’s like porn to Supreme Court justices
(viz. know it when they see it). But Ball’s essay contains at least one attempt
at a specification. Arkani-Hamed (AH) (he is, btw, an important young hot shot
(here)) associates
theoretical beauty with a certain kind of inevitability:
“There are very few principles and
there’s no other possible way they could work once you understand them deeply
enough” (p. 4, Ball quoting Arkani-Hamed).
Note three things about AH’s proposal: beauty is the
combination of (i) simplicity of the Occam variety (few principles) coupled with (ii) a certain kind of modality (no
other possible way they could work)
and (iii) is not surface visible in that it takes intellectual effort to
perceive (“once you understand them deeply enough”). The third is of
particular interest wrt Ball’s many comments on the issue, for much of his
discussion, I believe, revolves around the fact that beauty is hard to discern,
that there can be lots of disagreement and that it is elusive. All fair points, but not really immediately
relevant to AH’s point. AH’s view seems
to be that finding beauty requires hard intellectual work. It may be the eye of
the beholder, but only an eye that’s put in the hours to train itself to see
properly. So beauty need not be (and generally won’t be) immediately evident on
inspection. Consequently, the fact that what’s beautiful is not immediately
discernable or that it is sensitive to cultural mores does not seem all that
relevant to AH’s point. That said, let’s concentrate on points (i) and (ii):
simple and inevitable.
Occam’s razor is now part of the accepted methodological
wisdom. All things being equal (ATE),
simpler theories are better than more complex ones. And how is “simple”
measured? By the number of axioms or assumptions (how to enumerate these is not trivial but I will put that to one
side). So a theory with three axioms trumps one with four and one with four
beats one with five ATE. As Ball notes,
things are seldom equal, so applying
Occam in the wild is never easy. But, it is a principle that all are ready to
accept in some form, and we should ask why.
Why is simpler better, or, more specifically, truer?
In the good old 16th and 17th
centuries (if not earlier) there was theological justification for this
assumption. We live in God’s universe governed by his/her laws. God is an
elegant thinker and would never do in a complicated manner what could be done
simply (Why not? Is God lazy?).[2]
Thus, God’s character guarantees a universe governed by simple laws.[3]
Many of us find this justification hard to accept nowadays,
but luckily for us, there are other ways of buying into this viewpoint. Thus,
simpler theories are generally more epistemologically solid than are profligate
ones. Think of a stool carrying a weight. If three legged, each leg supports 1/3
the load. If four legged, each supports 1/4. Take weight as evidence and legs
as axioms and you can see that the fewer axioms there are, the more evidence
each one carries. Bayesians can even formalize this insight, and they have
tended to make a big deal out of it. So, something like Occam is something even
the theologically fussy can sign onto as a virtue of theory.
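One standard way the Bayesian formalization goes can be sketched with a toy case. The coin-flip setup and the function names below are assumptions made for illustration; they are not from Ball’s essay. A "simple" theory commits to a fair coin and has no free parameter, while a "flexible" theory lets the bias be anything between 0 and 1. The flexible theory has to spread its bets over every possible bias, so when the data look fair, the simple theory gets more support from the very same evidence.

```python
from fractions import Fraction
from math import factorial

def evidence_fixed(heads, tails, p=Fraction(1, 2)):
    """Probability of one particular flip sequence if the bias is fixed at p."""
    return p ** heads * (1 - p) ** tails

def evidence_free(heads, tails):
    """Probability of the same sequence averaged over a uniform prior on the bias.

    This is the integral of p^h (1-p)^t over [0, 1], i.e. h! t! / (h + t + 1)!.
    """
    return Fraction(factorial(heads) * factorial(tails),
                    factorial(heads + tails + 1))

def bayes_factor(heads, tails):
    """How many times better the fixed (simpler) theory predicts the data."""
    return evidence_fixed(heads, tails) / evidence_free(heads, tails)

print(float(bayes_factor(5, 5)))   # 5 heads, 5 tails: favors the simple theory
print(float(bayes_factor(10, 0)))  # 10 heads, 0 tails: favors the flexible one
```

With 5 heads and 5 tails the factor is 693/256, roughly 2.7 in favor of the simple theory. With 10 heads in a row it drops to 11/1024, so the preference for simplicity holds only when all things are equal, which is the ATE proviso again.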
Importantly, for my purposes, this epistemological virtue
carries metaphysical kudos. If we assume that theories that are better
empirically supported are more likely to be true, then simpler theories ATE are
more likely to be true. I’m not sure why
we think that better supported theories are more likely to be true, but we do
think this and this provides at least one link between theoretical simplicity
and truth.
Of course, things are much much more complicated. Comparing
different theories can be very hard. Indeed, there is no general recipe for how
to compare different axioms, definitions, primitives etc. And there seems to be
no absolute measure of “simplicity,”
or at least none that garners general assent. But, though there is no general
measure, there are lots of local ones and the principle can and does have teeth
in many places. Examples: why assume an aether if all goes well without it? Why
assume two conceptions of mass if you can get away with one? And, most relevant
to us: Why assume a syntactic theory with internal levels if they are not
required empirically? Why assume three rule types if one suffices? We all know the
drill, and a good one it is, for this kind of simplicity ties elegance together
with a notion of “independent evidence,” and every scientist worth her/his salt
loves independent evidence
(consilience rules!!!).
So, we value simple theories as pointing towards truth and
if simplicity is part of beauty, this partly explains why we value
beauty if it can be had (which, as
Ball notes, is not always clear given that things are seldom equal).
There is a further reason to value simple theories: they are
easier to explore. Simple theories are often readily intelligible (as things
with fewer moving parts often are). So, methodologically, if one’s aim is to
find out what’s what then using an instrument (theory) that is understandable
(simple) to probe things is ATE better than one that is opaque. Simple ideas are generally more manageable,
so they enjoy a kind of methodological or epistemological advantage. But note,
this does not imply that they are more often true, though perhaps (weasel
word!) a case can be made that they are good ways of getting to truth. If so,
beauty may not mark truth but it may be the best road to it.
There is still another way to link simplicity and truth: via
explanation. Here’s what I have in mind. What’s the “ugliest” possible theory
of anything? I would say it’s a list. Why? Because it has zero explanatory
value.[4]
The virtue of simple theories is that they seem to carry more explanatory
oomph. And what better mark of a
theory’s truth than its explanatory power? What we want out of a theory is not
only that it cover the data, but that it cover it in such a way as to explain
why the data is the way it is.
This locution, “why the data is the way it is” has two related
sub-parts with slightly different foci. It means explaining not only why we
have the data we happen to have, but also the data we will have and could have.
And it means explaining why we don’t have data other
than what we have (viz. why the data we don’t see is missing).[5] The main problem with a list, then, is that
it does not tell you how to expand itself so as to include what it must and
exclude what it must. A simple theory, being general, does. Simple theories can
explain and part of being beautiful is having explanatory oomph, which modally
relates to what is possible.[6]
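The contrast between a list and a general theory can be made concrete with a small sketch. The toy "language" below (strings of some number of a’s followed by the same number of b’s) and both function names are assumptions chosen for illustration only. The list and the rule agree on every attested item, but only the rule says anything about items that have not been seen, and only the rule says why the bad cases are bad.

```python
# The "list": nothing but the attested data points.
observed = {"ab", "aabb", "aaabbb"}

def list_theory(s):
    """Accept a string only if it has already been observed."""
    return s in observed

def rule_theory(s):
    """Accept a string if it is n a's followed by n b's, for some n >= 1."""
    n = len(s) // 2
    return len(s) >= 2 and s == "a" * n + "b" * n

for s in ["aabb", "aaaabbbb", "abab"]:
    print(s, list_theory(s), rule_theory(s))
```

On "aabb" both say yes. On the unseen "aaaabbbb" the list simply has no opinion beyond "not here," while the rule includes it. On "abab" both say no, but only the rule excludes it for a reason.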
Note the explanatory property of simple theories already
introduces the modal feature of beauty that AH noted. Why X? Because X is the
only way things could have been.
Theories explain the actual in terms of the possible. This suggests the following thought: what
makes a theory truly beautiful is that it perfectly
explains the actual in terms of the possible. Or, put another way, in the best
case, the fit between what is possible and what is actual is perfect. All that
can be observed has been and all that has been observed can be. There is no
spillover.[7]
A joke I’ve told before (see here)
illustrates the logic, I think: why are there 1-hump camels and 2-hump camels
but not 3,4…N-humped camels? Because there are concave camels and convex camels
and that’s all there is. The “joke”
illustrates how changing our conception of humps (look at the curves not the
humps) allows us to exhaust the kinds
of humps+valley combos we expect to find. Looking at the number of humps
invites the question of why there are
at most 2. Why 2, and not 3,4,…N? The stopping point seems arbitrary. Looking
at the curves and seeing them as concave and convex seems to exhaust the space
of options (simple curves being one or the other). And this restricted space
also coincides with what we find. The possible
hump+valley options perfectly fit the attested realizations (viz. 2 and only 2).
Thus, once we think in terms of curves, it seems clear that things could not
have been otherwise hump/camel-wise (but see comments to above link for
problems with this “joke”).
Now, before I get lots of comments taking the “joke” apart (again,
see earlier post), let me try to illustrate what I have in mind with a more
relevant syntactic example (there are several others in the earlier post).
Chomsky has famously argued that properly
understood Merge yields both phrase structure and movement. What is the proper way to understand it?
Well, merge is an operation that takes two linguistic items A and B and puts
them together (forms a set). What As and Bs? Chomsky says that there are two
possibilities: (i) neither A nor B contains the other or (ii) one of A or B
contains the other. In the first case, Merge(A,B) yields a constituent like
{A,B}. In the other it yields a constituent like {B,{A… B}}. Thus, if
(i) and (ii) exhaust the options then it looks like every possible instance of
a very simple combination rule (merge) results in just the two products that we
find prominently in Gs (viz. products of PS rules and movement rules). What
makes Chomsky’s story attractive is that it looks like the two principal
properties of Gs (hierarchy and displacement) follow exhaustively from the
possible ways two inputs can fall under the rule. Either the to-be-combined
items stand in a part-whole relation or they do not. Thus, a simple exhaustive account of
the options yields (all and only?) the attested structural dependencies.[8]
I don’t know about you, but if Chomsky is right here, then I
consider this to be a very nice kind
of account. Why do Gs have both PS rules and movement rules? Because the
simplest conception of the combination operation has these two “kinds” of rules
(and only these) as consequences.
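The logic of the Merge argument can be sketched in a few lines. This is only a toy encoding under stated assumptions: lexical items are plain strings, constituents are Python frozensets, and the names merge, contains and merge_kind are made up for the illustration rather than taken from Chomsky. The point it tries to show is that one set-forming operation plus the two exhaustive cases (part-whole or not) gives the two kinds of output.

```python
def contains(whole, part):
    """True if `part` occurs anywhere inside the constituent `whole`."""
    if not isinstance(whole, frozenset):
        return False  # a lexical item (string) has no internal parts
    return any(member == part or contains(member, part) for member in whole)

def merge(a, b):
    """The single combination rule: put A and B together in a set."""
    return frozenset({a, b})

def merge_kind(a, b):
    """Classify an application of merge by how its two inputs are related."""
    if contains(a, b) or contains(b, a):
        return "internal"  # one input is part of the other: displacement
    return "external"      # neither contains the other: phrase structure

vp = merge("eat", "what")        # {eat, what}
print(merge_kind("eat", "what")) # external
print(merge_kind("what", vp))    # internal
moved = merge("what", vp)        # {what, {eat, what}}
```

The sketch ignores labels, copies versus occurrences, and the degenerate case of merging an item with itself, so it illustrates the exhaustiveness argument and nothing more.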
There are other examples of this kind of thinking in the
syntax literature as I discussed here. Nor is
this restricted to syntax. It is also a staple of phonological feature
theories, where the aim is to adumbrate all and only the possible linguistic
sounds and, if I understand Heinz and Idsardi’s Science paper, the range of possible phonological processes.
I think that there is another way of describing what lends
such stories AH’s feeling of inevitability. When properly framed, a theory
serves to close off further questions. Here’s what I mean. A really satisfying
account brings a kind of question closure with it. The concave/vex theory of
camels explains why the question of 3 humped camels is, in some sense,
ill-formed. Chomsky’s account of structure dependence discussed here not only
removes linear processes from the grammar but explains why they could not have
been there. Properly understood, such rules cannot be stated within the theory and
that’s why they don’t exist. Such rules don’t merely fail to exist, they cannot
exist for they cannot be stated. Thus, they are not merely contingently absent,
they are necessarily absent. Indeed, when the theory is properly understood,
one sees that their possibility is actually inconceivable. It’s this sense of
theoretical beauty that AH’s remark pointed to, I am suggesting.
So, is beauty a mark of truth? I think so. The problem is
that which conception of beauty is
the right one is generally what is theoretically up for grabs. The theorist’s challenge
is to provide beautiful accounts: simple stories that exhaustively and
completely adumbrate the options that completely describe what in fact happens.
Such stories are simple and exhaustive and hence
beautiful. So, does beauty count? Sure.
Is it the only virtue? No. But IMO it is one often worth sacrificing a few data
points to. But you knew I would say that, right?
[1]
I am not sure what Greene meant here. “Ultimately” is a very long time. As
Keynes noted, ultimately (in the long run) we are all dead. I suspect what
Greene is doing here is CYAing a bit. Among themselves, scientists like to talk
about the intangibles that count towards theory evaluation. To the public, they
like to appear hard-headed so that they can crap on the artsy-fartsy emotive
types, especially those with religious or spiritual inclinations. Talking about
experimental facts serves this aim well.
In truth, everyone wants new data to back up interesting claims and
everyone wants theories that are not clunky. How these different features get
weighted at any particular time is an art, not the product of an algorithm. So, here Greene, a well-known purveyor of
beauty in theory, is trying to cover his public posterior.
[2]
In this regard God is the anti-Thomas Mann, of whom Peter Gay said: “Mann did
not like to be simple if it was at all possible to be complicated.”
[3]
I never quite understood why these thinkers thought they knew God’s aesthetic
preferences or work habits. But, it seems that they could and did.
[4]
For the statistically inclined compare histograms with the statistics that
describe them. The former represents actual data points. It does not explain
them in any way. The latter carries some explanatory force as it not only
“accounts” for the histograms but (in the best case) describes where a possible
data point can and cannot fall.
[5]
This has an obvious relationship to the issue of negative data near and dear to a
linguist’s heart.
[6]
The link between explanatoriness and truth gets us into deep waters very
quickly: why after all should the universe be comprehensible? Damn if I know.
Descartes’ benevolent deity? Darwin? Dumb luck?
[7]
Needless to say, this cannot be understood as saying that every possible instance has been
observed; rather, an instance of every type has been observed and no instance of
an impossible type has been.
[8]
I am not sure whether this argument is in fact accurate. Prima facie, there are more kinds of grammatical rules (e.g.
construal, agreement, deletion). My own
view is that the right next step is to try and unify construal, etc. with
movement in some way. Doing this would then derive all possible G operations
from Merge as Chomsky wants to do. I believe, however, that my desires here are
theoretically idiosyncratic. Still, unless this is done, Chomsky’s
“derivation” is not complete or exhaustive and so, not “perfect.”