I should have been warned off of Aeon because of the Evans’ fiasco, but the magazine is not really
all that culpable. Its content just reflects the stuff that is “out there” and the
editors seem to largely act to make topical stuff available for easy reading in
one place. This is how I came to read this.
It just popped into my mailbox recently one morning and I couldn’t resist. It’s about the relation between truth and
beauty, or more accurately, how and whether aesthetic judgments about a theory
are or should be taken to be marks of its truth. The author, Philip Ball, thinks not. I’m not
sure that I agree, but it got me thinking, so I thought I would try and write
something about it. I’ve done this before (here) and some
of what I say rehearses earlier themes. Here it is.
Mr. Ball notes that many scientists (Einstein, Dirac, Weyl
(three real biggies), Greene and Arkani-Hamed are mentioned by name) have
believed that theoretical beauty is a mark of truth. Some can be quoted as
saying that theoretical beauty even trumps empirical verification. Mr. Ball takes
these quotes at face value. Greene suggests that what Einstein and Weyl (the
two culprits) intended is that beauty should count as having some evaluative
power re truth but that “Ultimately,
theories are judged how they fare when faced with cold, hard, experimental
facts” (p. 3 Ball quoting Greene).[1]
I’m not sure what Einstein or Weyl meant. But I personally am interested in how
beauty is relevant in theory evaluation, if it is relevant at all. This is,
after all, a theme talked up in early minimalism, which I personally found
useful. So is beauty linked to truth? And if so, how?
Before approaching these questions, it is worth trying to
pin down what it is: what are the marks of theoretical beauty? As Ball notes, scientists themselves are all
over the place on this. For some, it’s like porn to Supreme Court justices
(viz. know it when they see it). But Ball’s essay contains at least one attempt
at a specification. Arkani-Hamed (AH) (he is, btw, an important young hot shot
(here)) associates
theoretical beauty with a certain kind of inevitability:
“There are very few principles and
there’s no other possible way they could work once you understand them deeply
enough” (p. 4, Ball quoting Arkani-Hamed).
Note three things about AH’s proposal: beauty is the
combination of (i) simplicity of the Occam variety (few principles) coupled with (ii) a certain kind of modality (no
other possible way they could work),
and (iii) it is not surface visible in that it takes intellectual effort to
perceive (“once you understand them deeply enough”). The third is of
particular interest wrt Ball’s many comments on the issue, for much of his
discussion, I believe, revolves around the fact that beauty is hard to discern,
that there can be lots of disagreement and that it is elusive. All fair points, but not really immediately
relevant to AH’s point. AH’s view seems
to be that finding beauty requires hard intellectual work. It may be in the eye of
the beholder, but only an eye that’s put in the hours to train itself to see
properly. So beauty need not be (and generally won’t be) immediately evident on
inspection. Consequently, the fact that what’s beautiful is not immediately
discernible or that it is sensitive to cultural mores does not seem all that
relevant to AH’s point. That said, let’s concentrate on points (i) and (ii):
simple and inevitable.
Occam’s razor is now part of the accepted methodological
wisdom. All things being equal (ATE),
simpler theories are better than more complex ones. And how is “simple”
measured? By the number of axioms or assumptions (how to enumerate these is not trivial but I will put that to one
side). So a theory with three axioms trumps one with four and one with four
beats one with five ATE. As Ball notes,
things are seldom equal, so applying
Occam in the wild is never easy. But, it is a principle that all are ready to
accept in some form, and we should ask why.
Why is simpler better, or, more specifically, truer?
In the good old 16th and 17th
centuries (if not earlier) there was theological justification for this
assumption. We live in God’s universe governed by his/her laws. God is an
elegant thinker and would never do in a complicated manner what could be done
simply (Why not? Is God lazy?).[2]
Thus, God’s character guarantees a universe governed by simple laws.[3]
Many of us find this justification hard to accept nowadays,
but luckily for us, there are other ways of buying into this viewpoint. Thus,
simpler theories are generally more epistemologically solid than are profligate
ones. Think of a stool carrying a weight. If three legged, each leg supports 1/3
the load. If four legged, each supports 1/4. Take weight as evidence and legs
as axioms and you can see how the fewer axioms there are, the more evidence
supports each one. Bayesians can even formalize this insight, and they have
tended to make a big deal out of it. So, something like Occam is something even
the theologically fussy can sign onto as a virtue of theory.
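Here is a rough sketch of how the Bayesian version is usually put. This is the standard textbook formulation rather than anything in Ball’s essay, and the symbols are my own labels: $M_1$ and $M_2$ are two competing theories, $D$ is the data, and $\theta$ stands for whatever adjustable assumptions a theory carries.

```latex
\frac{P(M_1 \mid D)}{P(M_2 \mid D)}
  = \frac{P(D \mid M_1)}{P(D \mid M_2)} \cdot \frac{P(M_1)}{P(M_2)},
\qquad
P(D \mid M) = \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta .
```

Since $P(D \mid M)$ has to sum to one over all possible data sets, a theory flexible enough to accommodate many different outcomes can give only a small share to the one actually observed. So, ATE, the theory with fewer adjustable assumptions tends to get the larger evidence term.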
Importantly, for my purposes, this epistemological virtue
carries metaphysical kudos. If we assume that theories that are better
empirically supported are more likely to be true, then simpler theories ATE are
more likely to be true. I’m not sure why
we think that better supported theories are more likely to be true, but we do
think this and this provides at least one link between theoretical simplicity
and truth.
Of course, things are much much more complicated. Comparing
different theories can be very hard. Indeed, there is no general recipe for how
to compare different axioms, definitions, primitives etc. And there seems to be
no absolute measure of “simplicity,”
or at least none that garners general assent. But, though there is no general
measure, there are lots of local ones and the principle can and does have teeth
in many places. Examples: why assume an aether if all goes well without it? Why
assume two conceptions of mass if you can get away with one? And, most relevant
to us: Why assume a syntactic theory with internal levels if they are not
required empirically? Why assume three rule types if one suffices? We all know the
drill, and a good one it is, for this kind of simplicity ties elegance together
with a notion of “independent evidence,” and every scientist worth her/his salt
loves independent evidence
(consilience rules!!!).
So, we value simple theories as pointing towards truth and
if simplicity is part of beauty, this partly explains why we value
beauty if it can be had (which, as
Ball notes, is not always clear given that things are seldom equal).
There is a further reason to value simple theories: they are
easier to explore. Simple theories are often readily intelligible (as things
with fewer moving parts often are). So, methodologically, if one’s aim is to
find out what’s what then using an instrument (theory) that is understandable
(simple) to probe things is ATE better than one that is opaque. Simple ideas are generally more manageable,
so they enjoy a kind of methodological or epistemological advantage. But note,
this does not imply that they are more often true, though perhaps (weasel
word!) a case can be made that they are good ways of getting to truth. If so,
beauty may not mark truth but it may be the best road to it.
There is still another way to link simplicity and truth; via
explanation. Here’s what I have in mind. What’s the “ugliest” possible theory
of anything? I would say it’s a list. Why? Because it has zero explanatory
value.[4]
The virtue of simple theories is that they seem to carry more explanatory
oomph. And what better mark of a
theory’s truth than its explanatory power? What we want out of a theory is not
only that it cover the data, but that it cover it in such a way as to explain
why the data is the way it is.
This locution, “why the data is the way it is” has two related
sub-parts with slightly different foci. It means explaining not only why we
have the data we happen to have, but also the data we will have and could have.
And it means explaining why we don’t have data other
than the data we have (viz. why the data we don’t see is missing).[5] The main problem with a list, then, is that
it does not tell you how to expand itself so as to include what it must and
exclude what it must. A simple theory, being general, does. Simple theories can
explain and part of being beautiful is having explanatory oomph, which modally
relates to what is possible.[6]
Note the explanatory property of simple theories already
introduces the modal feature of beauty that AH noted. Why X? Because X is the
only way things could have been.
Theories explain the actual in terms of the possible. This suggests the following thought: what
makes a theory truly beautiful is that it perfectly
explains the actual in terms of the possible. Or, put another way, in the best
case, the fit between what is possible and what is actual is perfect. All that
can be observed has been and all that has been observed can be. There is no
spillover.[7]
A joke I’ve told before (see here)
illustrates the logic, I think: why are there 1-hump camels and 2-hump camels
but not 3,4…N-humped camels? Because there are concave camels and convex camels
and that’s all there is. The “joke”
illustrates how changing our conception of humps (look at the curves not the
humps) allows us to exhaust the kinds
of humps+valley combos we expect to find. Looking at the number of humps
invites the question of why there are
at most 2. Why 2, and not 3,4,…N? The stopping point seems arbitrary. Looking
at the curves and seeing them as concave and convex seems to exhaust the space
of options (simple curves being one or the other). And this restricted space
also coincides with what we find. The possible
hump+valley options perfectly fit the attested realizations (viz. 2 and only 2).
Thus, once we think in terms of curves, it seems clear that things could not
have been otherwise hump/camel-wise (but see comments to above link for
problems with this “joke”).
Now, before I get lots of comments taking the “joke” apart (again,
see earlier post), let me try to illustrate what I have in mind with a more
relevant syntactic example (there are several others in the earlier post).
Chomsky has famously argued that properly
understood Merge yields both phrase structure and movement. What is the proper way to understand it?
Well, merge is an operation that takes two linguistic items A and B and puts
them together (forms a set). What As and Bs? Chomsky says that there are two
possibilities: (i) neither A nor B contains the other or (ii) one of A or B
contains the other. In the first case, Merge(A,B) yields a constituent like
{A,B}. In the other it yields a constituent like {B,{A… B}}. Thus, if
(i) and (ii) exhaust the options then it looks like every possible instance of
a very simple combination rule (merge) results in just the two products that we
find prominently in Gs (viz. products of PS rules and movement rules). What
makes Chomsky’s story attractive is that it looks like the two principal
properties of Gs (hierarchy and displacement) follow exhaustively from the
possible ways two inputs can fall under the rule. Either the to-be-combined items
form a part-whole relation or they do not. Thus, a simple exhaustive account of
the options yields (all and only?) the attested structural dependencies.[8]
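For those who like to see the logic spelled out, here is a minimal sketch of the two cases. It is my own toy illustration, not Chomsky’s formalism: I assume lexical items are plain strings and constituents are Python frozensets, and the function names `merge`, `contains` and `merge_kind` are hypothetical labels introduced only for this example.

```python
def contains(whole, part):
    """Return True if `part` occurs anywhere inside the set `whole`."""
    if not isinstance(whole, frozenset):
        return False  # a bare lexical item has no parts
    return any(m == part or contains(m, part) for m in whole)


def merge(a, b):
    """Combine two syntactic objects into the unordered set {a, b}."""
    return frozenset({a, b})


def merge_kind(a, b):
    """Classify a would-be Merge(a, b) by the part-whole relation of its inputs."""
    if contains(a, b) or contains(b, a):
        return "internal"  # case (ii): one input contains the other (movement)
    return "external"      # case (i): neither contains the other (phrase structure)


# Toy derivation: build {you, {saw, what}}, then re-merge "what" with the whole.
vp = merge("saw", "what")
clause = merge("you", vp)
moved = merge("what", clause)  # has the shape {B, {A ... B}}
```

With these definitions, `merge_kind("saw", "what")` falls under case (i) and `merge_kind("what", clause)` under case (ii). Since any two inputs either stand in a containment relation or do not, the classifier has no third branch to take, which is the exhaustiveness point in miniature.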
I don’t know about you, but if Chomsky is right here, then I
consider this to be a very nice kind
of account. Why do Gs have both PS rules and movement rules? Because the
simplest conception of the combination operation has these two “kinds” of rules
(and only these) as consequences.
There are other examples of this kind of thinking in the
syntax literature as I discussed here. Nor is
this restricted to syntax. It is also a staple of phonological feature
theories, where the aim is to adumbrate all and only the possible linguistic
sounds and, if I understand Heinz and Idsardi’s Science paper, the range of possible phonological processes.
I think that there is another way of describing what lends
such stories AH’s feeling of inevitability. When properly framed, a theory
serves to close off further questions. Here’s what I mean. A really satisfying
account brings a kind of question closure with it. The concave/vex theory of
camels explains why the question of 3 humped camels is, in some sense,
ill-formed. Chomsky’s account of structure dependence discussed here not only
removes linear processes from the grammar but explains why they could not have
been there. Properly understood, such rules cannot be stated within the theory and
that’s why they don’t exist. Such rules don’t merely fail to exist, they cannot
exist for they cannot be stated. Thus, they are not merely contingently absent,
they are necessarily absent. Indeed, when the theory is properly understood,
one sees that their possibility is actually inconceivable. It’s this sense of
theoretical beauty that AH’s remark pointed to, I am suggesting.
So, is beauty a mark of truth? I think so. The problem is
that which conception of beauty is
the right one is generally what is theoretically up for grabs. The theorist’s challenge
is to provide beautiful accounts; simple stories that exhaustively and
completely adumbrate the options that completely describe what in fact happens.
Such stories are simple and exhaustive and hence
beautiful. So, does beauty count? Sure.
Is it the only virtue? No. But IMO it is one often worth sacrificing a few data
points to. But you knew I would say that, right?
[1]
I am not sure what Greene meant here. “Ultimately” is a very long time. As
Keynes noted, ultimately (in the long run) we are all dead. I suspect what
Greene is doing here is CYAing a bit. Among themselves, scientists like to talk
about the intangibles that count towards theory evaluation. To the public, they
like to appear hard-headed so that they can crap on the artsy-fartsy emotive
types, especially those with religious or spiritual inclinations. Talking about
experimental facts serves this aim well.
In truth, everyone wants new data to back up interesting claims and
everyone wants theories that are not clunky. How these different features get
weighted at any particular time is an art, not the product of an algorithm. So, here Greene, a well-known purveyor of
beauty in theory, is trying to cover his public posterior.
[2]
In this regard God is the anti-Thomas Mann, of whom Peter Gay said: “Mann did
not like to be simple if it was at all possible to be complicated.”
[3]
I never quite understood why these thinkers thought they knew God’s aesthetic
preferences or work habits. But, it seems that they could and did.
[4]
For the statistically inclined compare histograms with the statistics that
describe them. The former represents actual data points. It does not explain
them in any way. The latter carries some explanatory force as it not only
“accounts” for the histograms but (in the best case) describes where a possible
data point can and cannot fall.
[5]
This has an obvious relationship to the issue of negative data near and dear to a
linguist’s heart.
[6]
The link between explanatoriness and truth gets us into deep waters very
quickly: why after all should the universe be comprehensible? Damn if I know.
Descartes’ benevolent deity? Darwin? Dumb luck?
[7]
Needless to say, this cannot be understood as saying that every possible instance has been
observed; rather, an instance of every type has been observed and no instance of
an impossible type has been.
[8]
I am not sure whether this argument is in fact accurate. Prima facie, there are more kinds of grammatical rules (e.g.
construal, agreement, deletion). My own
view is that the right next step is to try and unify construal, etc. with
movement in some way. Doing this would then derive all possible G operations
from Merge as Chomsky wants to do. However, I believe that my desires here are
idiosyncratic outliers theoretically. Still, unless this is done, Chomsky’s
“derivation” is not complete or exhaustive and so, not “perfect.”
Of somewhat related interest, this post on reducibility: http://bit.ly/1x6JJqV
The most salient thing in that post relevant to this one is that, if you accept reductionism, then if you imagine yourself in such an advanced state of knowledge that you have a full chain of understanding from fundamental particle physics right up to neurochemistry, and then you suddenly find at some higher level, like consciousness, a phenomenon that does not comport with the rest of your framework, then your phenomenon of consciousness has just falsified your fundamental physical laws.
This is an extreme example to try to demonstrate a broader issue, which is applicable regardless of metaphysical commitments to reductionism or emergence, that theory construction and ideas of elegance must always be tempered by a wariness that the specific domains we are probing with elegant descriptions may suddenly melt between our fingers when we look at how they fit together with other domains at other levels - for example, the Standard Model, beautiful as it is, undermined by cosmological data.
What worries me about things like minimalism (although I have little experience of it) is that beautiful things can be made at that level of pristine abstraction, even in accounting for a great mess of empirical data, but when looking to combine it with other domains of knowledge, is there not the possibility that you could have constructed a complete and maximally efficient computational design only to find out that it can't be implemented on the particular nervous system we're looking at because of some fact of neurobiology that was unconsidered or unknown?
What I would characterise as my central concern is that the intrinsic appeal of elegance is a dangerous lure in the absence of perfect knowledge, and I think our approach to it is upside-down because of our (well-earned) pride in scientific achievement. I don't think beauty is a marker of truth; truth teaches us what beauty is. Arkani-Hamed's comments, and the idea of knowing it when you see it, are useful here, but in an unusual way - it's not that we have some ineffable, inborn conception of the beautiful that somehow guides us; it's more that, at the moment of truth, there is an act of revelation. Some parts of the world that were once distant snap together and only then does it all seem obvious and unavoidable, but we were not led there by our own intuitions of inevitability or it would have always been obvious. Perhaps too much of beauty is hindsight.
I think that to go about it the reverse way - to enact methodologies and paradigms in the concerted pursuit of beauty - makes us rather like Kepler desperately trying to assemble a geometric model of the planetary orbits in the name of Euclidean elegance when there's a telescope sitting unused on the table that we should dust off and use to make a few measurements, letting beauty follow in its own time, if at all. That way, it's much less depressing when you uncover the hiding astronomical object that would have otherwise obliterated your model.
Thx for the additional source. I read the scientia salon post when it came up a couple of days ago and thought of linking to it. You've saved me the trouble.
A comment or two on what you said: I think that all hard and fast methodological dicta are dangerous. However, I also find myself convinced that insightful theory comes from considering aesthetic considerations. Need they be treated cautiously? Sure. As does experimental data! Just as you should never trust a theory without good experimental data, you should be wary of data points that are not substantiated by a good theory. There is a subtle back and forth between such desiderata and I feel that we often, mistakenly, take facts as "hard" and other criteria of theoretical felicity as soft. I do not agree. And there are many such examples in the history of science (for example, retaining the conservation of energy and waiting 30 years or so for the neutrino to save it). At any rate, what I do think is wise is to appreciate how delicate the scientific enterprise is and that there are many competing and important factors.
As for Minimalism, I actually think that there is pretty robust data suggesting that something like it MUST be correct, at least if one wants UG, the capacity for language, to be evolvable. I will write about this next week reviewing a recent terrific little paper by Chomsky. The gist, however, is that unless something like it is true then there is little reason to think that we will ever have a story for how it might have evolved. In addition, Minimalism in many of its current forms functions to unify a great deal of earlier grammatical theory. This does not make it "true" but this is certainly a well respected mark of truth.
So, I think that I disagree with the overall thrust of your comment, except to agree that like all interesting notions, theoretical beauty needs to be handled gingerly. But this is no less true for experimental results. Science is hard and there are no robust rules. Some people have a sense of beauty, some are better at finding relevant data. Both activities are worthwhile and both are fraught with lures that can mislead. There are no hard and fast rules. And that's what makes it fun.
Beauty is simplicity.
Except when it isn't, or apparently isn't. Rambling things can be beautiful. But I do agree that as regards theory, simple is nicer. Thx again for the slime mold stuff btw.
This reminds me of Michael Della Rocca's argument for the principle of sufficient reason. He argues that there is no non-question-begging argument against the principle of sufficient reason.
For anyone who cares but doesn't want to read the paper, the argument goes that there are intuitions we have about explicability—namely, we think/intuit that at least some things (facts, phenomena) are explainable. And this puts pressure on us to say that all things are explainable. If all things are explainable, then the principle of sufficient reason follows. So if one wants to reject the principle of sufficient reason and still say that some things (but not all) are explicable, then one would need an account of why that is so. That is, the only way to do this and not beg the question—in the technical sense—against the person committed to the principle of sufficient reason is to draw a principled line between the explicable and the non-explicable. And he provides some arguments that no such principled line can be drawn (or at least that it has yet to be drawn and probably never could/will be drawn).
So either one accepts the principle of sufficient reason, in which case everything is explicable—whether we have (or could have) epistemic access to the explanation is another question entirely—or one rejects the principle of sufficient reason and must commit to nothing being explicable. Or, in terms from Norbert's blog post, either one commits to the correct theory being one in which everything has a principled explanation or one commits to the correct theory being just a list of all of the facts. There's no (coherent) intermediate position.
Della Rocca (pp. 10-11) tries to make a case that the latter option is a non-option because it entails that all things are brute facts—i.e., not explicable—which is in conflict with the intuition that at least some things are explicable—i.e., not brute facts. To me, this seems to be reasonable, and it is an intuition that I share, but I'm not sure what case can be made for its being true independent of the fact that many (all?) intuit it to be so. (This is Norbert's note [6], just in other words.) Without some independent case for this intuition being true, there's always the possibility that the intuition is false, as some human intuitions are, in fact, false.
So, in my opinion, the import of Della Rocca's argument is methodological, not metaphysical (pending an independent case that the intuition that some things are explicable is in fact a true intuition). In other words, perhaps the correct theory really just is a list of all of the facts; who knows. But if we want to even try to come up with a theory of how the world works that isn't just a list of all of the facts—that is, if we want to try to understand the world—, we have to be committed to there being a principled explanation for everything (at least if we want to be consistent). Maybe trying to understand the world is misguided, but what else are we going to do?
I'm not sure how much this comment adds—to some extent (though perhaps not entirely), I think it's a bit of a rehashing of what Norbert said in the "How to Play the Game" post—but, at any rate, I thought some might find the Della Rocca paper interesting, and it seemed sad to only post the link.
Cheers.