Monday, June 25, 2018

Physics envy blues

These are rough days for those who suffer from Physics Envy (e.g. ME!!!!). It appears (at least if the fights in the popular press are any indication) that there is trouble in paradise and that physicists are beset with “a growing sense of unease” (see here, p. 1).[1] Why? Well, for a variety of reasons, it seems. For some (e.g. BA) the problem is that the Standard Theory in QM (ST) is unbelievably well grounded empirically (making “predictions verified to within one-in-ten-billion chance of error” (p. 1)) and yet there are real questions that ST has no way of addressing (e.g. “Where does gravity come from? Why do matter particles always possess three, ever heavier copies, with peculiar patterns in their masses, and why does the universe contain more matter than antimatter?” (p. 1-2)). For others, the problem seems to be that physicists have become over-obsessed with (mathematical) beauty and this has resulted in endless navel gazing and a failure to empirically justify the new theory (see here).[2]

I have no idea if this is correct. However, it is not a little amusing to note that it appears that part of the problem is that ST has just been too successful! No matter where we point the damn thing it gives the right empirical answer, up to 10 decimal places. What a pain!

And the other part of the problem is that it appears that the ways that theorists had hoped to do better (super-symmetry, string theory etc.) have not led to novel empirical results. 

In other words, the theory that we have that is very well grounded doesn’t answer questions to which we would love to have answers and the theories that provide potential answers to the questions we are interested in have little empirical backing. This hardly sounds like a crisis to me. Just the normal state of affairs when we have an excellent effective theory and are asking ever more fundamental questions. At any rate, it’s the kind of problem that I would love to see in linguistics.

However, this is not the way things are being perceived. Rather, the perception is that the old methods are running out of steam and that this requires new methods to replace them. I’d like to say a word or two about this.

First off, there seems to be general agreement that the strategy of unification in which “science strives to explain seemingly disparate ‘surface’ phenomena by identifying, theorizing and ultimately proving their shared ‘bedrock’ origin” (BA:2) has been wildly successful heretofore. Or as BA puts it in a more restrained manner, it “has yielded many notable discoveries.” So the problem is not with the reasonable hope that such theorizing might bear fruit but with the fact that the current crop of ambitious attempts at unification has not been similarly fecund. Again as BA puts it: “It looks like the centuries-long quest for top-down unification has stalled…” (BA:2).

Say that this is so. What is the alternative? Gelman, in discussing similar issues, titles a recent post “When does the quest for beauty lead science astray?” (here). The answer should be, IMO, never. It is never a bad idea to look for beautiful theories because beauty is what we attribute to theories that explain (i.e. have explanatory oomph) and as that is what we want from the sciences it can never be misleading to look for beautiful theories. Never.

However, beauty is not the only thing we want from our theories. We want empirical coverage as well. To be glib, that’s part of what makes science different from painting. Theories need empirical grounding in addition to beauty. And sometimes you can increase a theory’s coverage at the expense of its looks and sometimes you can enhance its looks by narrowing its data coverage. All of this is old hat and if correct (and how could it be false really) then at any given time we want both theories that are pretty and also cover a lot of empirical ground. Indeed, a good part of what makes a theory pretty is how it covers the empirical ground. Let me explain.

SH identifies (here) three dimensions of theoretical beauty: Simplicity, Naturalness and Elegance (I have capitalized them here as they are the three Graces of scientific inquiry).[3]

Theories are simple when they can “be derived from a few assumptions.” The fewer axioms the better. Unification (showing that two things that appear different are actually underlyingly the same) is a/the standard way of inducing simplicity. Note that theories with fewer axioms that cover the same empirical ground will necessarily have more intricate theorems to cover this ground. So simpler theories will have greater deductive structure, a feature necessary for explanatory oomph. There is nothing more scientifically satisfying than getting something for nothing (or, more accurately, getting two (better still some high N) for the price of one). As such, it is perfectly reasonable to prize simpler theories and treat them as evident marks of beauty.[4]

Furthermore, when it comes to simplicity it is possible to glimpse what makes it such a virtue in a scientific context. Simpler theories not only have greater deductive structure, they are also better grounded empirically in the sense that the fewer the axioms, the more empirical weight each of them supports. You can give a Bayesian rendition of this truism, but it is intuitively evident.
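
One rough way to spell out such a Bayesian rendition, as a sketch only: suppose two theories T_1 and T_2 both entail the data D, that T_1 rests on k axioms and T_2 on m > k axioms, and that each axiom is independent of the others with the same prior probability p (0 < p < 1). The independence and equal-prior assumptions, like the symbols themselves, are illustrative idealizations and not part of the discussion above.

$$
P(T_1) = p^{k}, \qquad P(T_2) = p^{m}, \qquad P(D \mid T_1) = P(D \mid T_2) = 1
$$

$$
\frac{P(T_1 \mid D)}{P(T_2 \mid D)}
  = \frac{P(D \mid T_1)\,P(T_1)}{P(D \mid T_2)\,P(T_2)}
  = \frac{p^{k}}{p^{m}}
  = p^{-(m-k)} > 1
$$

On these assumptions, equal coverage leaves the theory with fewer axioms ahead, by a factor that grows with each axiom saved. This captures only the prior side of the truism; how much weight a given axiom carries depends on what it is actually used to derive.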

The second dimension of theoretical beauty is Naturalness. As SH points out, naturalness is an assumption about types of assumptions, not the number. This turns out to be a notion best unpacked in a particular scientific locale. So, for example, one reason Chomsky likes to mention “computational” properties when talking about FL is that Gs are computational systems so computational considerations should seem natural.[5] Breakthroughs come when we are able to import notions from one domain into another and make them natural. That is why a good chunk of Gallistel’s arguments against neural nets and for classical computational architectures amounts to arguing that classical CS notions should be imported into cog-neuro and that we should be looking to see how these primitives fit into our neuro picture of the brain. The argument with the connectionist/neural net types revolves around how natural a fit there is between the digital conception of the brain that comes from CS and the physical conception that we get from neuroscience. So, naturalness is a big deal, but it requires lots of local knowledge to get a grip. Natural needs an index. Or, to put this negatively, natural talk gets very windy unless grounded in a particular problem or domain of inquiry.

The last feature of beauty is Elegance. SH notes that this is the fuzziest of the three. It is closely associated with the “aha-effect” (i.e. explanatory oomph). It lives in the “unexpected” connections a theory can deliver (a classic case being Dirac’s discovery of anti-matter). SH notes that it is also closely connected to a theory’s “rigidity” (what I think is better described as brittleness). Elegant theories are not labile or complaisant. They stand their ground and break when bent. Indeed, it is in virtue of a theory’s rigidity/brittleness that it succeeds in having a rich deductive structure and explanatory oomph. Why so? Because the more brittle/rigid a theory is the less room it has for accommodating alternatives, and the fewer things a theory makes possible the more it explains when what it allows as possible is discovered to be actual.

We see this in linguistics all the time. It is an excellent argument in favor of one analysis A over another B that A implies C (i.e. A and not-C are in contradiction) whereas B is merely consistent with C (i.e. B and not-C are consistent). A less rigid B is less explanatory than a more rigid A and hence A is the superior explanatory account. Note that this reasoning makes sense only if one is looking at a theory’s explanatory features. By assumption, a more labile account can cover the same empirical ground as a more brittle one. The difference is in what they exclude, not what they cover. Sadly, IMO, the lack of appreciation of the difference between ‘covering the data’ and ‘explaining it’ often leads practitioners to favor “flexible” theories, when just the opposite should be the case. Favoring flexibility makes sense if one takes the primary virtue of a theory to be regimented description. It makes no sense at all if one’s aims lie with explanation.
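
For those who like such things spelled out, here is one hedged way to cash out the A-versus-B reasoning in probabilistic terms. Assume (my idealization, not anything forced by the linguistics) that A entails C, so P(C | A) = 1, while B merely tolerates C, so P(C | B) = q for some q with 0 < q < 1. Then finding that C holds shifts the odds as follows:

$$
\frac{P(A \mid C)}{P(B \mid C)}
  = \frac{P(C \mid A)}{P(C \mid B)} \cdot \frac{P(A)}{P(B)}
  = \frac{1}{q} \cdot \frac{P(A)}{P(B)}
$$

Since 1/q > 1, discovering C favors the brittle A over the flexible B, and the more B hedges (the smaller q), the bigger A’s gain. The flip side is that not-C would refute A outright while leaving B standing, which is exactly the risk that buys the explanatory payoff.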

SH’s observations make it clear (at least to me) why theoretical beauty is prized and why we should be pursuing theories that have it. However, I think that there is something missing from SH’s account (and, IMO, Gelman’s discussion of it (here)). It doesn’t bind the three theoretical Graces as explicitly to the notion of explanation as it should. I have tried to do this a little in the comments above, but let me say a bit more, or say the same things one more time in perhaps a slightly different way.

Science is more than listing facts. It trucks in explanations. Furthermore, explanations are tied to the why-questions that identify problems that explanations are in service of elucidating. Beauty is a necessary ingredient in any answer to a why-question but what counts as beautiful will heavily depend on what the particular question at issue is. What makes beauty hard to pin down is this problem/why-question relativity. We want simple theories, but not too simple. What is the measure? Well, stories just as complicated as required to answer the why-question at issue. Ditto with natural. Natural in one domain wrt one question might be unnatural in another wrt a different question. And of course the same is the case with brittle. Nobody wants a theory that is so brittle that it is immediately proven false. However, if one’s aim is explanation then beauty will be relevant and what counts as beautiful will be contestable and rightly contested. In fact, most theorizing is argument over how to interpret Simple, Natural and Elegant in the domain of inquiry. That’s what makes it so important to understand the subtleties of the core questions (e.g. in linguistics: Why linguistic creativity? Why linguistic promiscuity? How did FL/UG arise?). At the end of the day (and, IMO, also at sunrise, high noon and sunset) the value of one’s efforts must be judged against how well the core questions of the discipline have been elucidated and their core problems explained. And this is a messy and difficult business. There is no way around this.

Actually, this is wrong. There is a way around this. One can confuse science with technology and replace explanation with data regimentation and coverage. There is a lot of that going around nowadays in the guise of Big Data and Deep Learning (see here and here). Indeed some are calling for a revamped view of what the aim of science ought to be: simulation replacing explanation and embracing the obscurantism of overly complex uninterpretable “explanations.” For these kinds of views, theoretical beauty really is a sideshow.

At any rate, let me end by reiterating the main point: beauty matters, it is always worth pursuing, it is question relative and cannot really ever be overdone. However, there is no guarantee that the questions we most want answered can be, or that the methods we have used successfully till now will continue to be fruitful. It seems that physicists feel that they are in a rut and that much of what they do is not getting traction. I know how they feel. But why expect otherwise? And what alternative is there to looking for beautiful accounts that cover reasonable amounts of data in interesting and enlightening ways? 

[1] This post is by Ben Allanach, a Cambridge University physicist. I will refer to this post as ‘BA.’
[2] Sabine Hossenfelder (SH) has written a new book on this topic, Lost in Math, that is getting a lot of play in the popular science blogosphere lately. Here is a strong recommendation from one of our own, Shravan Vasishth, who believes that “she deserves to be world famous” for her views.
[3] The Three Graces were beauty/youth, mirth and elegance (see here). This is not a bad fit with our three. Learning from the ancients, it strikes me that adding “mirth” to the catalogue of theoretical virtues would be a good idea. We not only want beautiful theories, we should also aim for ones that delight and make one smile. Great theories are not somber. Rather they tickle one’s fancy and even occasionally generate a fat smile, an “ah damn” shake of the head and an overall good feeling.
[4] SH contrasts this discussion of simplicity with Occam’s, claiming that the latter is the injunction to choose the simpler of two accounts that cover the same ground. The conception SH endorses is “absolute simplicity,” not a relative one. I frankly do not understand what this means. Unification makes sense as a strategy because it leads to a simpler theory relative to the non-unified one. Maybe what SH is pointing to is the absence of the ceteris paribus clause in the absolute notion of simplicity. If so, then SH is making the point we noted above: simpler theories might be empirically more challenged and in such circumstances Occam offers little guidance. This is correct. There is no magic formula for deciding what to opt for in such circumstances, which is why Occam is such a poor decision rule most of the time.
[5] See his discussion of Subjacency in On Wh Movement, for example.


  1. A very nice discussion! The idea that rigidity is a theoretical virtue goes back to Einstein. In his hands it means something like: a theory is right or wrong, but not modifiable without becoming nonsense. So, GR is rigid because it entails that gravitational attraction is inversely proportional to the square of the distance, and it can’t be modified to entail that it is the cube of the distance, to accommodate, say, a possible world where gravity is measurable in terms of the cube of the distance. GR is unlike classical mechanics in this respect. So, right, rigidity is good, because it explains phenomena in a very deep way, as virtual physical necessities (MP-ish pun intended). If GR is right at all in any respect, then gravity MUST be a function of the square of the distance. This kind of reasoning applies in spades to linguistics, such as issues of structural dependence vis-à-vis PoS issues, the binarity of merge, etc. If one thinks there is nothing deep going on, little wonder that the explanations on offer are shallow.

    1. Hi John,
      Could you amplify your point about binarity of merge and how it relates to rigidity?

  2. As regards binarity of merge, I had in mind the thought that if you want a single operation, only binary merge is possibly sufficient. Why? Let’s say you come across some phenomena that appear to require ternary merge (co-ordination, etc.). If you accept the appearances, then you lose the sufficiency of the single operation, for you can’t use ternary merge for [NP black dog], etc. So, you know that binary merge will suffice for a whole range of structures, and nothing else will do. Equally, showing how binary merge can handle co-ordination, double objects, etc. counts as a deeper explanation than positing two different operations, for the disparate structures at the surface are shown to follow from the single principle that is now proving to be actually sufficient. Of course, this thought trades on a bunch of empirical claims. My observation, here, only concerns the logic of the situation. So, the MP-ish/rigid hypothesis is that a single structure building operation suffices for all structures. (We know we can curry everything to the unary case, but that is a trick insofar as the new unary function will record the original arity.)

    One might think, ‘Well, why not n-ary merge as a basic principle? This is bound to work.’ This proposal introduces massive redundancy: for any structure we find, it could have been put together in a ‘flat’ way according to our n-ary merge. The explanation for why the structure is not flat, therefore, is shifted from merge to some other factors, such as processing ones, perhaps. Alternatively, you could let some portion, as it were, of n-ary merge be used in this language or that, but that makes the merge hypothesis hyper flaccid.

    I should say, I think appeals to physics are interesting for paradigms of logic and explanation, but I don’t think the above reasoning relies upon any ‘third factor’ ideas.

    1. Isn't that more of a general parsimony argument? I took Norbert's rigidity to be when a theory makes a bunch of predictions that can't be altered. I can't think of any very firm predictions that binarity makes.

    2. Insofar as there ARE "general parsimony" arguments (well, they exist, certainly; the question is rather what force do/should they carry), following Elliott Sober's "Ockham's Razors" book, one might be able to assimilate at least one aspect of 'rigidity' to one approach to parsimony. Here are some quotes from reviews of the Sober book that suggest why that might be so (emphases added by me):

      "According to the second paradigm, parsimony is relevant to estimating a model’s predictive accuracy since models with FEWER ADJUSTABLE PARAMETERS are less prone to “overfitting,” that is, less prone to describe random noise in the data rather than the true relationship of the variables of interest."
      (Bengt Autzen, Philosophy of Science, v 83, 2016)

      "...when it comes to estimating the predictive accuracy of a model from a frequentist perspective, likelihood is only one relevant measure, another one being the NUMBER OF ADJUSTABLE PARAMETERS in the model."
      (Michael Baumgartner, Australasian Journal of Philosophy, v 96, 2018)

      "...specify a metric for assessing the comparative parsimony of hypotheses of a particular type (e.g., NUMBER OF ADJUSTABLE PARAMETERS,..."
      (Daniel Steel, Notre Dame Philosophical Reviews, 2016.1.27)

      IF having scarcity of adjustable parameters is a mark of rigidity (see John's comparison of general relativity with classical mechanics in his first comment above), then to the extent that Sober's views are correct, one might assimilate (some of) rigidity to (one aspect of one type of) parsimony. Perhaps. (I'm not a hedgehog--I like to share my hedges.)


  3. I don't think so. The idea is that binarity does explain phenomena in a way alternatives don't. So, binarity leads to the positing of covert structure, which explains novel phenomena you didn't have in mind at the time. It also insists that relations are binary - obviously - so you get to reduce c-command. Etc. Etc. Proliferating operations might fit the facts but produces shallow explanations close to the phenomena in the way a single operation doesn't. I agree with Norbert's point that Occam offers little when it comes to deciding between alternatives.

    1. I haven't read Lost in Math yet though I have bought it -- maybe that will make it all clear. Thanks anyway.

  4. It looks great. Steven Weinberg's Dreams of a Final Theory is a classic in the same vein, although he is quite optimistic, as I recall.

  5. QM was very flexible in the beginning. But QM deals with discrete particles, with fields having discrete sets of states. It's a very clear game, and an electron is an electron anywhere.

    But is a verb a verb anywhere? There is nothing comparable in, say, biology.