Friday, September 26, 2014

Never trust a fact that is not backed up with a decent theory, and vice versa

Experimental work is really hard. Lila once said, loud enough for me to hear, that you need to be nuts to do an experiment if you really don't have to. The reason is that they are time consuming, hard to get right and even harder to evaluate. We have recently been finding out how true this is, with paper after paper coming out arguing that much (most?) of what's in our journals is of dubious empirical standing. And this is NOT because of fraud, but because of how experiments get done, reported, evaluated and assessed with respect to professional advancement. Add to this the wonders of statistical methodology (courses in stats seem designed as how-to manuals for gaming the system) and what we end up with is, it appears, junk science with impressive looking symbols. Paul Pietroski sent me this link to a book that reviews some of this in social psychology. But don't snigger: the problems cited go beyond this field, as the piece indicates.

I said that statistical methods are partly to blame for this. This should not be taken to imply that such methods, when well used, are not vital to empirical investigation. Of course they are! The problem is, first, that they are readily abusable, and second, that the industry has often left the impression that facts that are statistically scrutinized are indubitable. In other words, the statistical industry has left the impression that facts are solid while theories are just airy-fairy confabulation, if not downright horse manure. And you get this from the greats, i.e. how theory is fine but in the end the test of a true theory comes from real world experiments, yada yada yada. It is often overlooked how misleading real world experiments can be and how it often takes a lot of theory to validate them.
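To make the "readily abusable" point concrete, here is a toy simulation (illustrative numbers only, not modeled on any real study) of one common abuse: testing many measures and reporting whichever one clears p < .05. Even when no effect exists anywhere, most such studies find something "significant" to report.

```python
# Sketch: why "statistically scrutinized" facts can mislead.
# We simulate an experimenter who tests 20 independent measures on
# pure noise (no real effect anywhere) and reports any hit with
# roughly p < .05. All numbers here are illustrative.
import math
import random

def t_stat(xs, ys):
    """Two-sample t statistic (equal n, pooled variance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    vy = sum((y - my) ** 2 for y in ys) / (n - 1)
    return (mx - my) / math.sqrt((vx + vy) / n)

random.seed(1)
trials, hits = 2000, 0
for _ in range(trials):
    # 20 "measures", each compared across two groups drawn from the
    # SAME distribution -- the null hypothesis is true for all of them.
    for _ in range(20):
        xs = [random.gauss(0, 1) for _ in range(30)]
        ys = [random.gauss(0, 1) for _ in range(30)]
        if abs(t_stat(xs, ys)) > 2.0:  # roughly p < .05, two-sided
            hits += 1
            break  # report the first "significant" finding

# With 20 shots at alpha = .05, well over half of these null studies
# produce a publishable-looking result.
print(f"Null studies with a reportable 'effect': {hits / trials:.0%}")
```

This is garden-variety multiple testing, not fraud, which is exactly why it is so common and so hard to detect after the fact.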

I think that there is a take home message here. Science is hard. Thinking is hard. There is no magic formula for avoiding error or making progress. What makes these things hard is that they involve judgment, and this cannot be automated or rendered algorithmically safe. Science/thinking/judgment is not long division! But many think that it really is. That speculation is fine so long as it is made to meet the factual tribunal on a regular basis. On this view, the facts are solid, the theories in need of justification. Need I say that this view has a home in a rather particular philosophical view? Need I say that this view has a name (psst, it starts with 'Emp…). Need I say that this view has, ahem, problems? Need I say that the methodological dicta this view favors are misleading at best? We like to think that there are clear markers of the truth. That there is a method which, if we follow it, will get us to knowledge if only we persevere. There isn't. Here's a truism: We need both solid facts and good theories, and which justify which is very much a local contextual matter. Facts need theoretical speculation as much as theoretical speculations need facts. It's one big intertwined mess, and if we forget this, we are setting ourselves up for tsuris (a technical term my mother taught me).


  1. Wise words from Norbert. Never blindly trust statistics! Who knew? The slogan "There are lies, damned lies, and statistics" probably has more alleged 'originators' than words. But always good to remind the ignorant. And experiments can go wrong and/or be poorly designed. So to avoid the insanity of unnecessary experiments one has to focus on solid theory construction before one jumps into the sea of data. Again, who knew?

    So what are the criteria for good theories? Seemingly, this depends on who one asks. And at what time of day [=point in their career] one asks. Take Chomsky: early in his career he was all for simplicity and parsimony [SP]. Until Paul Postal's "The Best Theory" came along. Instantaneously, SP turned into vices, making Paul's theory not merely wrong but "the worst theory". Theoretical entities happily multiplied beyond all necessity. Until, in one of Chomsky's heroic re-inventions of the field [MP], SP made a powerful comeback. [Presumably it was out of politeness that Chomsky never mentioned Paul's [and other GSers'] earlier championing of SP and renamed them 'perfect, optimal solutions' - who wants to re-open old wounds?]. But who can predict when Chomsky will have his next epiphany and yet again change the criteria for a good theory?

    Maybe one ought to look at theoretical proposals Norbert approves of? The Hornstein-Boeckx Suggestion [H-BS] about I-universals certainly should qualify:

    "I-universals are likely to be...quite abstract. They need not be observable. ... the fact that every language displayed some property P does not imply that P is a universal in the I-sense. Put more paradoxically, the fact that P holds universally does not imply that P is a universal. Conversely, some property can be an I-Universal even if manifested only in a single language. The only thing that makes something an I-universal ... is that it is a property of our innate ability to grow language” (Hornstein & Boeckx, 2009, 81).

    Ingeniously, H-BS makes I-universals not only unfalsifiable but also unconfirmable (P could be a property of an extinct language). Another beneficial consequence of H-BS is that both P and ¬P can be I-universals: for example, recursion is an I-universal because it is attested in at least one language, and the absence of recursion is an I-universal because its absence is attested in at least one language. With theoretical innovations like H-BS who needs data or experiments, or knowledge about biology and potential neurophysiological implementation of our innate ability to grow language?

    Call me conservative, but if I were in the kind of theoretical glass-residence Norbert so proudly occupies I'd use paper-airplanes to attack my opponents instead of the massive boulders he routinely tosses around. Unless I were to subscribe to this innovative definition of science: "We investigate [cognitive structures] using capacities which allow us to carry out considered reflection on the nature of the world. It is given various names in different cultures. It’s called myth, or magic, or in modern times you call it science" (Chomsky, 2009, 383). Let the mystical H-BS magic continue then...

  2. I'm all for good theories, and for tempering one's faith in statistics. But rather than basking in the hue and cry of repli-gate, I'd like to give a shout out to a nice recent article by Jon Maner. The May 2014 issue of Perspectives on Psychological Science was devoted to methods in analyzing data -- the usual stuff. But Maner's article takes a different approach. The title says it all: "Let’s Put Our Money Where Our Mouth Is: If Authors Are to Change Their Ways, Reviewers (and Editors) Must Change With Them". His argument is that reviewers create unrealistic expectations of the cleanliness of findings, and that this creates pressure for researchers to meet those expectations.

  3. I second Colin's suggestion that you take a look at the Maner paper he links to above. One thing that he focuses on, and something that I always found surprising about many published psych papers, is the apparent theoretical triviality of many of the results. As he puts it, reviewers are happy to evaluate the methodology but not the ideas. In fact, from the little I can tell, even in the intro and concluding sections theoretical speculation is frowned upon. As Maner observes, this is something worth changing. Why does method trump substance? I could go all empiricist on you, but I think that there is another reason: it's hard to engage with the ideas that the experiments are in service of. This takes judgment and some careful thinking, and this is onerous. Nonetheless it is vital, as Maner observes. At any rate, I second Colin's suggestion: take a look at the piece: it's short and not mealy-mouthed.

  4. Just why would linguists spend so much time discussing how others get things wrong when they are blessed with one of the greatest theory constructors alive and can learn from him how to get it right? Recently Chomsky remarked:

    "... a Merge-based system is the most elementary, so we assume it to be true of language unless empirical facts force greater UG complexity."[Chomsky, 2009, 26]

    A very similar request for simplicity was made decades ago, during the ‘linguistic wars’:

    I have tried to suggest that Homogeneous II is a kind of minimal linguistic theory. Given what is known and accepted about linguistic structure, it is not possible, I think, to conceive of a theory which involves less conceptual elaboration. At the same time, extra elaborations can be justified only by providing direct empirical grounds for any proposed additions. [Postal, 1969/72, 157-8]

    Back then of course Chomsky told the world why simplicity is the mother of all vices:

    "Postal's... theory ... is simpler [than the extended standard theory ST]. It is also less restrictive and less falsifiable, and thus less capable of dealing with ...the issues raised at the level of explanatory adequacy." [Chomsky, 1974, 48]

    Given that Norbert holds that Chomsky has never changed his mind on anything of importance, maybe he can elaborate on how one reconciles the conflicting advice from these two quotes? And please nothing as trivial as the boring tale that Postal was confused. MP is a lot simpler than Homogeneous II. So just HOW can it provide any explanatory adequacy?

    1. This is a reply to both your comments. Short version: Please don't derail the discussion.

      1) Even if this were a case of the pot calling the kettle black, it doesn't change anything about the kettle being black. Whatever misgivings one may have about contemporary linguistics (or rather, syntax, we really shouldn't conflate the two all the time), the fact remains that fields with a strong empiricist bent are currently going through a small crisis that has even been noted in popular science blogs like Neuroskeptic. That is something worth discussing, and Colin's comment offers some interesting insight on the matter. So even if we buy your claim that Minimalism is no better off because shoddy statistics is replaced by shoddy theory, we can still talk about the shoddy statistics. That's not "tossing around massive boulders" or "attacking opponents", that's just a bunch of scientists having a chitchat about what's happening in neighboring fields. You know, the kind of thing one might see on a blog frequented by scientists in their spare time.

      2) As for the Chomsky-Postal discussion, how is this pertinent to the issue at hand? Chomsky might have changed his position, and so his standards for what makes a good theory may have changed, possibly for less than noble reasons. That still doesn't invalidate Norbert's truism that you want to have both a good theory and solid evidence. He's talking about the position that all you need in order to do good science is a lot of data and a mechanical procedure for turning data into generalizations (elaborate statistics, computational simulations), with no theory to speak of at all. I don't think he's erecting a strawman, either; the EU's Human Brain Project, for instance, falls into this category as far as I understand its mission statement.

      3) The quote from H&B similarly doesn't invalidate Norbert's truism, and there's nothing particularly unsound about it either. Typological language universals need not be language universals in the I-language sense.
      If you prefer a less loaded terminology, the grammar formalism need not exactly carve out the class of natural languages, it suffices to pick a superset thereof. The predicted yet unattested languages can then be ruled out by learnability and processing requirements. That's just a standard strategy for factoring the workload and spreading it out over several parts of your theory.
      In the other direction, it is also true that an I-language universal need not be instantiated in the real world. For instance, just because the class of languages generated by your grammar formalism is closed under union doesn't mean that the class of natural languages is closed under union (because the latter may be a subclass of the former). Looking at individual languages, the grammar formalism may allow for recursion, so recursion is available for every I-language, but that doesn't entail that every attested language uses recursion. By the same logic, then, we may actually be living in a world where not a single attested language shows some specific I-language universal. It's a very unlikely scenario, but we cannot rule it out.

    2. This comment has been removed by the author.

    3. I don't think replicability has anything to do with empiricism. The replication crisis is maybe most acute in social psychology, where the theories are weakest, but there are replication crises unfolding in e.g. cancer research, which, even using Norbert's idiosyncratic definition of empiricism, isn't empiricist (e.g. this report that only 11% of the studies replicated).

    4. Yes it does. For me Empiricism has both a metaphysical and an epistemological dimension. For linguists both matter. However, there is also a philosophy-of-science difference that I've talked about in older posts. For empiricists, laws are summaries of data. For rationalists, they are descriptions of mechanisms. This is a big difference, and I believe an important one.

    5. @ewan: Don't flatter her, trolls at least bring the lulz. Quite generally, not feeding a troll doesn't help if said troll nonetheless stays around for years. The more casual reader may infer from our silence that we agree or, even worse, that there are no counterarguments because everything she says is the truth.

    6. @Alex: you're right that there are actually several crises going on at the same time, all of which are somehow related to experimental methods, but in slightly different ways.

      1) A replicability crisis in the social sciences. The discussion there is pretty interesting because some social scientists have taken the stance that if an experiment can't be replicated, that's not a problem with the experiment but with the experimenters attempting the replication: basically, they're just not sophisticated enough to reliably run these advanced experiments. If I remember my sociology of science class correctly, that's basically a return to 19th century science, where the reputation of the investigator, rather than replicability, was considered the crucial determinant of the reliability of the findings.

      2) A statistics crisis in neuroscience, where it has come to light that many papers have seriously flawed statistics. This has been a hot-button issue since the dead salmon experiment five years ago, and the debate seems to be mostly about whether neuroscientists need more statistics training, whether the publication bias towards surprising results favors shoddy statistics, or whether this could be prevented by having stronger theories that act as a coarse "bullshit detector".

      3) The positive result crisis in medicine, where the severe publication bias against negative results has led to worries that many published results can't be replicated and we just don't hear about it. This discussion seems to be mostly about the academic infrastructure: basically, studies should be announced in advance, and then we can tell from the number of unpublished studies how many replication attempts failed.
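      The file-drawer mechanism in 3) is easy to see in a toy simulation. The sketch below (illustrative numbers only, not drawn from any real study) runs many small studies of one modest true effect and then "publishes" only the significant ones; the published record ends up badly overstating the effect, so honest replications look like failures.

```python
# Sketch of the file-drawer problem: one small true effect, many
# labs, but only "significant" results get published.
# All parameters are illustrative.
import random

random.seed(7)
TRUE_EFFECT = 0.2   # small real difference between conditions
N = 20              # participants per group, per study

published, all_estimates = [], []
for _ in range(5000):  # 5000 independent studies of the same effect
    xs = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    ys = [random.gauss(0, 1) for _ in range(N)]
    est = sum(xs) / N - sum(ys) / N  # estimated effect size
    all_estimates.append(est)
    # crude significance screen: z > 1.96 with se = sqrt(2/N)
    if est / (2 / N) ** 0.5 > 1.96:
        published.append(est)

mean_all = sum(all_estimates) / len(all_estimates)
mean_pub = sum(published) / len(published)
print(f"true effect:                      {TRUE_EFFECT}")
print(f"mean estimate, all studies:       {mean_all:.2f}")
print(f"mean estimate, published studies: {mean_pub:.2f}")
# The published record substantially inflates the true effect, so a
# replication powered for the published size will tend to "fail".
```

      Nothing here depends on the details: any significance filter applied to noisy estimates selects the overestimates.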

      I'm not sure if these crises all stem from the same problem --- Norbert would probably group them all under the umbrella of empiricism. And there might be even more going on in other fields. What's more, the discussion itself is biased by virtue of how the media and the publishing business work --- there could be fields that are strongly experimental, completely theory-free, hardcore empiricist, and don't run into any problems. Since the absence of problems isn't particularly newsworthy, we probably won't hear about such a field.

      Overall, there's a lot going on, and I don't know nearly enough about these fields to clearly make out the root of all these problems. That's why I really appreciated Colin's link to a paper written from an insider's perspective.

    7. I'd just like to interject a brief comment on the crisis in neuroscience. I agree that there is a statistics issue in the cog neuro community, but I'd also like to point out that we have a valuable architectural system of priors - the brain is a finite space, and there have been thousands of experiments performed examining the response profile of the finite number of brain regions.

      So, when people make some preposterous claim about a particular brain area's function, we can immediately cross-reference this claim with the many other studies reporting effects in this region and see if the new claim makes sense. I don't know of a direct analogue in the social sciences, where it seems strange and idiosyncratic claims can be made without too much justification.

      I feel that the big issue in cog neuro isn't rigorous stats or replicability, but theory construction, as Norbert pointed out more generally - people just want to find the neural basis of X without considering the larger architectural considerations.

    8. I don't see that: maybe have a look at e.g. Peter Dayan's book "Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems".

      There is a lot of theory in cognitive neuroscience: it's just called computational neuroscience and it requires a lot of maths.

    9. I'll take a look; I'm not familiar with that literature. However, this doesn't mean that experimental cog neuro people are theoretically grounded - many of them aren't, and that was my point. Millions of dollars of funding are often wasted on bad fMRI experiments, and I don't think they're chiefly bad because of the stats, I think they're chiefly bad because of the theory.