Thursday, April 14, 2016

Once more into the breach: Re (3d)

So, what makes an inductive theory Bayesian? I have no idea. Nor, it appears, does anyone else. This is too bad. Why? Because though it is always the case that particular models must be evaluated on their own merits (as Charles rightly notes in the previous post), the interest in particular models, IMO, stems from the light they shine on the class of models of which they are particular instances. In other words, specific models are interesting both for their empirical coverage AND (IMO, more importantly) for the insight they provide into the theoretical commitments a model embodies (hence, one model from a class of models).

My discussion of Bayes rested on the assumption that Bayes commits one to some interesting theoretical claims and that the specific models offered are in service of advancing the more general claims that Bayes embodies. From where I sit, it seems that for many there are no theoretical claims that Bayes embodies, so the supposition that a Bayes model intends to tell us something beyond what the specific model is a model of is off base. Ok, I can live with that. It just means that the whole Bayes thing is not that interesting, except technologically. What's of potential interest are the individual proposals, but they have no theoretical legs, as they are not in service of larger claims.

I should add, however, that many "experts" are not quite so catholic. Here is a quote from Gelman and Shalizi's paper on Bayes.

The common core of various conceptions of induction is some form of inference from particulars to the general – in the statistical context, presumably, inference from the observations y to parameters describing the data-generating process. But if that were all that was meant, then not only is ‘frequentist statistics a theory of inductive inference’ (Mayo & Cox, 2006), but the whole range of guess-and-test behaviors engaged in by animals (Holland, Holyoak, Nisbett, & Thagard, 1986), including those formalized in the hypothetico-deductive method, are also inductive. Even the unpromising-sounding procedure, ‘pick a model at random and keep it until its accumulated error gets too big, then pick another model completely at random’, would qualify (and could work surprisingly well under some circumstances – cf. Ashby, 1960; Foster & Young, 2003). So would utterly irrational procedures (‘pick a new model at random when the sum of the least significant digits in y is 13’). Clearly something more is required, or at least implied, by those claiming that Bayesian updating is inductive. (25-26)
Note the theories that they count as "inductive" under the general heading but find to be unlikely candidates for the Bayes moniker. Did you catch which rules they consider inductive but not Bayesian? Here are two, in case you missed them: "the whole range of guess-and-test behaviors" and even "pick a model at random and keep it until its accumulated error gets too big, then pick another model completely at random." G&S take it that if even these methods count as instances of Bayesian updating, then there is nothing interesting to discuss, for that denudes Bayes of any interesting content.

Of course, you will have noticed that these two procedures are in fact the ones that people (e.g., Charles, Trueswell and Gleitman and co.) have argued actually characterize acquisition in various linguistic domains of interest. Thus, reasonably enough (at least if they understand things the way Gelman and Shalizi do), they conclude that these methods are not interestingly Bayesian (or, for that matter, "inductive," except in a degenerate sense).
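To make the contrast vivid, here is a minimal sketch (mine, not G&S's or anyone's actual model; the hypothesis space, data, and error threshold are invented purely for illustration) of the two kinds of learner at issue: a Bayesian updater, which re-weights every hypothesis after every datum, and the "pick a model at random and keep it until its accumulated error gets too big" learner, which tracks exactly one hypothesis at a time.

import random

# Toy setup, invented purely for illustration: "grammars" are coin biases,
# "data" are coin flips. Nothing here is anyone's actual acquisition model.
HYPOTHESES = [0.1, 0.3, 0.5, 0.7, 0.9]   # candidate values of P(heads)
DATA = [1, 1, 0, 1, 1, 1, 0, 1]          # observed flips, 1 = heads

def bayesian_update(hypotheses, data):
    # Full Bayesian updating: EVERY hypothesis is re-weighted on EVERY datum.
    posterior = {h: 1.0 / len(hypotheses) for h in hypotheses}  # flat prior
    for y in data:
        for h in hypotheses:
            posterior[h] *= h if y == 1 else 1.0 - h  # likelihood of y under h
        total = sum(posterior.values())
        posterior = {h: p / total for h, p in posterior.items()}  # renormalize
    return posterior

def pick_and_stick(hypotheses, data, threshold=2.0):
    # G&S's "unpromising" procedure: hold ONE hypothesis, score only it,
    # and swap it for a fresh random one when accumulated error gets too big.
    current = random.choice(hypotheses)
    error = 0.0
    for y in data:
        error += abs(y - current)
        if error > threshold:
            current = random.choice(hypotheses)
            error = 0.0
    return current

print(bayesian_update(HYPOTHESES, DATA))   # a distribution over all hypotheses
print(pick_and_stick(HYPOTHESES, DATA))    # a single surviving hypothesis

The structural difference is the whole point: the first learner carries the entire hypothesis space through every update; the second consults nothing but its current guess.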

So, there is a choice: treat "Bayes" as an honorific, in which case there is no interesting content to being Bayesian beyond "hooray!!", or treat it as having content, in which case it seems opposed to systems like "guess-and-test" or "pick at random." Which one picks is irrelevant to me. It would be nice to know, however, which is intended when someone offers up a Bayesian model. In the first case, 'Bayesian' just means "one that I think is correct." In the second, it has slightly more content. But what that is? Beats me.

One last thing. It is possible to understand the Aspects model of G acquisition as Bayesian (I have this from an excellent (let's say, impeccable) source). Chomsky took the computational intractability of that model (its infeasibility) to imply that we need to abandon the Aspects model in favor of a P&P view of acquisition (though whether this is tractable is an open question as well). In other words, Chomsky took seriously the mechanics of the Aspects model and thought that its intractability indicated that it was fatally flawed. Good for him. He opted for being wrong over being vacuous. May this be a lesson for us all.

5 comments:

  1. If someone could explain to me how the Aspects model is Bayesian when it isn't even probabilistic, then I would be grateful.

    I kind of see how you could argue that the evaluation metric is a sort of prior, though of course it may be improper (not a fatal flaw).
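    Concretely, one way to cash that out (the notation here is my own gloss; nothing in Aspects states it this way): let the evaluation metric's preference for shorter grammars play the role of the prior, so that

    P(G \mid D) \;\propto\; P(D \mid G)\, P(G), \qquad P(G) \;\propto\; 2^{-\mathrm{length}(G)},

    where length(G) is the number of symbols G occupies in the grammatical notation. If the weights 2^{-\mathrm{length}(G)} do not sum to anything finite over the full space of grammars, the prior is improper, as flagged above.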

    1. Hi Alex, I don't think Aspects talks much about the technical formulation of the Evaluation Metric. But LSLT does. There are several places where it comes up, and some of the formulations are certainly information-theoretic and thus probabilistic. See Chapter IV, around page 140 or so. (This refers to the 1955 version; I'm not sure if it shows up in the 1975 book. I think Bob Berwick has a scanned copy somewhere.) Later in that chapter, there is a hand calculation to determine the syntactic categories for 300 words from a dictionary. I think that was joint work with Peter Elias; this, I know, didn't appear in the 1975 book. As you can imagine, it's not computationally trivial. There was then a Master's thesis by Anatol Holt, who worked with Chomsky on a clustering approach to categories (http://dspace.mit.edu/bitstream/handle/1721.1/37712/30912464-MIT.pdf?sequence=2). The complexity and ineffectiveness of the method were, again from an impeccably reliable source, why Chomsky abandoned the earlier approaches.

    2. I am aware of the probabilistic stuff in LSLT. But AFAIK, Chomsky completely abandoned that by Aspects, and it is the Aspects model I am interested in here.

    3. I don't think that part is in the 1975 version, but the whole 1955 155MB iguana can be found on Berwick's site here.

    4. @Alex: Two points. First, though Aspects does not state matters in terms of probabilities, I have been told by a Bayesian expert (a guy whose name starts with a T and rhymes with 'cherry bomb') that adding such probabilities to the basic framework would be very natural.

      Second, one of the nice things about trying to articulate Bayes's principal features (3a-d in earlier posts) is that one can ask whether other systems have ANY of them. The eval metric in Aspects understands its task as evaluating ALL Gs ALL the time wrt ALL of the data. It's this universal feature that a well-placed confidential source cites as the root of the infeasibility. Oddly, from the little I know, this is also a feature of Bayes that makes it intractable (infeasible, anyone?). So one can share properties with Bayes even in the absence of probabilities, and the shared features may be enough to suggest moving to another model.

      One of the nice things about trying to get clear on what properties an idealization has is that it becomes possible to isolate its features and identify the contentious ones. Aspects IS Bayes-like in this one respect, and this is recognized to be a problem. Is it entirely Bayes-like? Well, not as it stands. Does it matter for the issue at hand? Not so far as I can tell. So is Aspects Bayesian? Well, maybe not. Let's just say Bayesianish.
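      To put a crude number on the ALL-Gs-ALL-the-time point (a back-of-the-envelope count assuming the most naive implementation; none of this is in Aspects itself): if at each of $n$ data points the learner re-scores every grammar in the candidate space $\mathcal{G}$ against everything seen so far, the total work is

      \sum_{t=1}^{n} |\mathcal{G}| \cdot t \;=\; |\mathcal{G}|\,\frac{n(n+1)}{2} \;=\; \Theta(|\mathcal{G}|\, n^2),

      as against an expected $\Theta(n)$ for a learner that tracks a single G and re-samples only on failure. And since $|\mathcal{G}|$ grows exponentially with the size of the grammars considered, it's the $|\mathcal{G}|$ factor, not the $n^2$, that does the real damage.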
