Faculty of Language: A Bayes Backlash?

Sunday, October 6, 2013

Thanks to Ewan for sending me the link to this upcoming critical review of Bayesianism as cognitive theory. This time the authors are Gary Marcus and Ernest Davis (M&D). Here's the abstract:

"An increasingly popular theory holds that the mind should be viewed as a near-optimal or rational engine of probabilistic inference, in domains as diverse as word learning, pragmatics, naive physics and predictions of the future. We argue that this view, often identified with Bayesian models of inference, is markedly less promising than widely believed, and is undermined by post hoc practices that merit wholesale reevaluation. We also show that the common equation between probabilistic and rational or optimal is not justified."

The paper reviews most of the widely cited banner Bayesian papers produced by Griffiths, Tenenbaum, Chater, etc. and, IMO, shows that they are seriously defective. The problems include:

Models do not generalize to "slightly different configurations" (3).
The "probabilistic-cognition literature as a whole may disproportionately report successes...which would lead to a distorted perception of the applicability of the approach" (3).
Priors and decision procedures appear to be chosen on an ad hoc basis and these "seemingly innocuous design choices can yield models with arbitrarily different predictions" (5).

M&D identify a methodological source for these problems: "...the tendency of advocates of probabilistic theories of cognition (like researchers using many computational frameworks) to follow a breadth-first search strategy in which the formalism is extended to an ever broader range of domains...rather than a depth-first strategy, in which some challenging domain is explored in great detail with respect to a wide range of tasks" (3-4).

This breadth vs depth notion resonates in the ling domain in particular. Research in Generative Grammar over the last 60 years has uncovered a bunch of pretty good generalizations about how grammars function (e.g. Binding Theory, Bounding Theory, ECP etc.). I have often asked those who do not like Generative accounts to show me how to derive these generalizations using their favored assumptions. So far, no such derivations have been forthcoming. But a good alternative theory should aim to explain the properties of these rich, detailed domains, at least if the intent is to convince people like me (most likely not a top priority).

At any rate, back to M&D. This is the third of a series of papers that have come out very critical of the new Bayesian turn (e.g. see here). They all make similar points, citing different data/arguments each time: Bayesianism per se is too loose to explain much, and the results that have been touted have been massively oversold. Or to put this in M&D's words: "...probabilistic models have not yielded a robust account of cognition. They have not converged on a uniform architecture that is applied across tasks; rather there is a family of different models, each depending on highly idiosyncratic assumptions tailored to an individual task...the approach is well on its way to becoming a Procrustean bed into which all problems are fit..." (8). I don't know about you, but to me this does not sound like a rave review.

M&D, like the other critics before them, note the undeniable: Bayes is "a useful tool," nothing to sneeze at, but not quite the magic lever that will break open the hard problems of human cognition. It strikes me that now may be the time to short your Bayes stock.

5 comments:

Avery Andrews, October 6, 2013 at 4:39 PM
The article doesn't incline me to dump my position, which is in any event not the one attacked. My line of thought goes as follows:

1. Bayesian reasoning is optimal if you have the correct numbers to plug in for your circumstances as you know them (as medical statistics can provide for the basic 'do I have cancer?' example). Evolution can be counted on to provide these numbers for organisms that learn how to recognize predators, food sources, etc., so we can expect that smart animals will contain implementations of Bayesian learners (perhaps multiple ones for different tasks).

2. The faculty of language appears to have emerged/been thrown together hastily from a pile of stuff that was lying around, and clearly involves learning, which by 1. can be assumed to be a computable-by-nervous-system approximation to Bayesian learning. But it remains to ascertain what the priors are, and exactly how the likelihoods are calculated. I'll add that, as far as I can see, you don't really need Bayes' theorem to do this: if the evidence e (the PLD) is seen stuff, the hypothesis h (the grammar) is unseen stuff, and we suppose that the grammar selection task is to maximize P(h)P(e|h)=P(e,h), then no P(e) or P(h|e) is required, since we don't care how probable it is that the grammar that looks the best is actually the correct one.

So maybe my position isn't really Bayesian at all but 'pre-Bayesian', though that would apply to most of the linguistically relevant stuff I'm aware of, such as Lisa Pearl's and Amy Perfors' work. But the specific (pre-?) Bayesian contribution is to clarify how indirect negative evidence can in principle be used, if it turns out to be needed (which I think is almost certainly the case, but that's a different discussion).
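
To make the point about maximizing P(h)P(e|h) concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration, not something taken from the comment or from the work it cites: the two toy "grammars" (G_small, G_big), the uniform likelihoods, and the prior are all hypothetical. It shows two things: picking the grammar by the joint P(e,h) gives the same answer as picking it by the posterior P(h|e), because P(e) is the same constant for every grammar; and data that never strays outside the smaller grammar gradually favors it, which is one simple way indirect negative evidence can enter.

```python
from fractions import Fraction

# Hypothetical toy setup (not from the post): each "grammar" is just the set
# of sentence types it generates, all taken to be equally probable.
GRAMMARS = {
    "G_small": {"a", "b"},
    "G_big": {"a", "b", "c", "d"},
}
# Assumed prior, deliberately tilted toward the larger grammar.
PRIOR = {"G_small": Fraction(1, 4), "G_big": Fraction(3, 4)}


def likelihood(data, grammar):
    """P(e|h): product of uniform per-sentence probabilities under the grammar."""
    sentences = GRAMMARS[grammar]
    p = Fraction(1)
    for s in data:
        p *= Fraction(1, len(sentences)) if s in sentences else Fraction(0)
    return p


def joint(data, grammar):
    """P(h) * P(e|h) = P(e, h); no division by P(e)."""
    return PRIOR[grammar] * likelihood(data, grammar)


def best_by_joint(data):
    """Select the grammar with the largest joint score."""
    return max(GRAMMARS, key=lambda g: joint(data, g))


def best_by_posterior(data):
    """Select by P(h|e); assumes at least one grammar generates the data."""
    evidence = sum(joint(data, g) for g in GRAMMARS)  # P(e)
    return max(GRAMMARS, key=lambda g: joint(data, g) / evidence)


if __name__ == "__main__":
    for data in (["a"], ["a", "b"], ["a", "b", "a"], ["a", "c"]):
        print(data, best_by_joint(data), best_by_posterior(data))
```

With these made-up numbers, one sentence is not enough to overcome the prior, but from two in-subset sentences on the smaller grammar wins, and a single sentence outside it rules it out. The two selection functions agree in every case, since dividing by P(e) rescales all scores by the same factor.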