
Monday, February 11, 2013

What's Chomsky Thinking Now


Here’s a short post linking to what I think is a very accessible summary of Chomsky’s current views about language and its biological basis. The discussion is not extended and the cognoscenti will not likely learn anything new (though I did).  Here are some points he touches on:

1. Distinction between generalizations about language and UG, which is "the genetic basis for language."
2. That there are no real group differences (vs individual differences) between humans as regards FL/UG, implying that there has been no significant change in FL for a very long time.
3. Early theories of UG allowed for a huge amount of variation between languages. Over the last 40 years, theory has narrowed the range of this difference.
4. A simple statement of the goals of a useful theory: "A plausible theory has to account for the variety of languages and the detail that you see in the surface study of languages – and, at the same time, be simple enough to explain how language could have emerged quickly, through some small mutation of the brain, or something like that."
5. Analogy between UG and Jacob's Universal Genome hypothesis.
6. Argument for a species specific native capacity for language starts from the simple observation that humans alone can "pick out anything that is relevant to language" from the "great blooming, buzzing confusion" that is the stimulus input.
7. Fast mapping of words to meanings indicating that Quine's "museum myth" is in fact reality.
8. Pattern recognition insufficient for language acquisition.
9. Theory of Mind orthogonal to language acquisition problem.
10. Linguistic interests are different from Languistic ones.
11. Is the idea that culture influences language meaningful?

The distinction in (1) is particularly important to reiterate as there has been rampant confusion on this point. Chomsky’s views are not Greenberg’s and a lot of the criticism of Universals has come from running the two together.  I also liked the observations concerning the implications of research on autism for what look like Tomasello-like views about language development. The remarks are blunt, but raise a relevant point.

I also found the observations concerning how little UG has apparently changed over the last 30,000 years important to stress (2). Recall that the Papuans (and the Pirahã?), who were very isolated until recently, are all capable of learning the same languages in the same way as anyone else. In fact, I know of no group of people whose kids, suitably located, cannot learn any language, in contrast to any other animal. Why? Because humans have essentially the same UG and other animals don't have one. This is sufficient to raise the generative research question: what do we have that they don't, and how did it get there?

Similarly, cultural changes have left UG pretty much intact (11), so far as we can tell. Older varieties of English, Icelandic, and Japanese look, from a UG vantage point, pretty much like their contemporary counterparts, indicating that the same UG operated then as operates now. Chomsky develops these points more elaborately elsewhere, but if you are like me and people ask what you do, then this is something short and readable to give them, if, of course, these are the questions you are interested in.

Sunday, February 10, 2013

Another Impossibility Argument


Things that are not possible don’t happen.  Sounds innocuous huh?  Perhaps, but that’s the basis of POS arguments.  If the evidence cannot dictate the outcome of acquisition, and nonetheless the outcome is not random, then there must be something other than the evidence guiding the process. Substitute ‘PLD’ for ‘evidence’ and ‘UG’ for ‘something other than the evidence’ and you get your standard POS argument. There have been and can be legitimate fights about how rich the data/PLD is and how rich UG is, but the form of the argument is dispositive and that’s why it’s so pretty and so useful.  A good argument form is worth its weight in theoretical gold. 

That this is what I believe is not news for those of you who have been following this blog.  And don’t worry, at least for today, I am not going to present another linguistic POS argument (although I am tempted to generate some new examples so that we can get off of the Yes/No question data).  Rather, what I want to do is publicize another application of the same argument form, the one deployed in Gallistel/King (G/K) in favor of the conclusion that connectionist architectures are biologically impossible.

The argument that they provide is quite simple: connectionism requires too many neurons to code various competences. They dub the problematic fact over which connectionist models necessarily stumble and fall 'the infinitude of the possible' (IoP). The problem as they understand it is endemic: computations in a connectionist/neural net architecture (C/NN) cannot be "implemented by compact procedures." This means that such a C/NN cannot "produce as an output an answer that the maker of the system did not hard wire into the look-up tables" (261). In effect, C/NNs are fancy lists (aka look-up tables) where all possibilities are computed out rather than being implicit in more compact form in a generative procedure. And this leads to a fundamental problem: the brain is just not big enough to house the required C/NNs. Big as brains are, they are still too small to explicitly code all the required possible cognitive states.

G/K’s argument is all in service of the claim that neuroscientists must assume that brains are effectively Turing-von Neumann (TvN) machines with addressable, symbolic, read/write memories. In a nutshell:

… a critical distinction between procedures implemented by means of look-up tables and … compact procedures … is that the specification of the physical structure of a look-up table requires more information than will ever be extracted by the use of that table. By contrast, the information required to specify the structure of a mechanism that implements a compact procedure may be hundreds of orders of magnitude less than the information that can be extracted using that mechanism (xi).

What’s the argument? Like POS arguments, it starts with a rich description of various animal competences. The three that play starring roles in the book are dead reckoning in ants, bee dancing, and food caching behavior in jays. Those of you who like Animal Planet will love these sections. It is literally unbelievable what these bugs and birds can do. Here is a glimpse of jay behavior as reported in G/K (cf. 213-217).

In summer, when times are good, scrub jays collect and store food in different locations for later winter feasting. They cache this food in as many as 10,000 different locations. Doing this involves remembering what they hid, where they hid it, when they hid it, whether they emptied it, if the morsel was tasty, how quickly the morsel goes bad, and who was watching them when they hid it. This is a lot of information. Moreover, it is very specific information, sensitive to six different parameters. And the values for these parameters are indeterminate, so the number of possible memories these jays can produce and access is potentially unbounded. Though there is an upper bound on the actual memories stored, the number of potentially storable memories is effectively unbounded (aka infinite). This is the big fact, and it has a big implication. In order to store these memories the jays need some sort of template that roughly says ‘stored X at Y at time Z, X goes bad in W days, X is +/- retrieved, X is +/- tasty, storing was +/- observed.’ This template requires a brain/mind that can link variables, value variables, write to memory and retrieve from memory so as to store useful information and access it when necessary. Note, we can treat this template as a large sentence frame, much like ‘X weighs Y pounds,’ and, like the latter, there is no upper bound on the number of possible realizations of this frame (e.g. John weighs 200 pounds, Mary weighs 90 pounds, Trigger weighs 1000 pounds, etc.). These templates, combined with the capacity to substitute actual food type/time/place etc. values for the variables, constitute “compact procedures” for coding the relevant information. Notice how “small” such a procedure is relative to the number of its actual instances (a finite specification versus an unbounded number of instances).
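To make the compact-procedure/look-up-table contrast concrete, here is a minimal sketch in Python (mine, not G/K’s; the field names and helper functions are invented for illustration). The Cache record plays the role of the jay’s template: a few lines of finite specification, plus a read/write memory, cover an unbounded space of possible caches.

    from dataclasses import dataclass
    from datetime import date

    # The "template": one finite specification with variables, the analogue
    # of 'stored X at Y at time Z, X goes bad in W days, ...'
    @dataclass
    class Cache:
        food: str        # X: what was stored
        location: str    # Y: where
        stored_on: date  # Z: when
        decays_in: int   # W: days until the morsel goes bad
        tasty: bool      # +/- tasty
        retrieved: bool  # +/- already emptied
        observed: bool   # +/- another jay was watching

    memory: list[Cache] = []          # read/write memory

    def store(c: Cache) -> None:
        memory.append(c)              # write a valued instance of the template

    def worth_visiting(today: date) -> list[Cache]:
        # read back and compute over the stored values
        return [c for c in memory
                if not c.retrieved
                and (today - c.stored_on).days < c.decays_in]

The template is a compact procedure in G/K’s sense: the specification is tiny relative to the unbounded set of Cache instances it can generate and store. A look-up table with a pre-wired entry for every possible food x location x date x decay-rate x ... combination is exactly what the infinitude of the possible rules out for a finite brain.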

If this description is correct (G/K review the evidence extensively), here is what is neuronally impossible: to list all the potential instantiations of this kind of proposition and simply choose the ones that are actual. Why?

… the infinitude of the possible looms. There are many possible locations, many possible kinds of food, many possible rates of decay, many possible delays between caching and recovery – and no restrictions on the possible combinations of these possibilities. No architecture with finite resources can cope with this infinitude by allocating resources in advance to every possible combination. (217)

However, current neuroscience takes read/write memory (a necessary feature of a system able to code the above information) to be neurobiologically implausible. Thus, current neuroscience only investigates systems (viz. C/NNs) that cannot in principle handle these kinds of behavioral data. What’s the principled reason for this inadequacy? Computation in C/NNs is not implemented by compact procedures. Rather, C/NNs are effectively elaborate look-up tables and so cannot “output an answer that the maker of the system did not hard wire into one of its look-up tables” (261).

That’s the impossibility argument. If we assume that brains mediate cognition then it cannot be the case that animal brains are C/NN devices.

I strongly recommend reading these sections of G/K’s book. There is terrific detailed discussion of how many neurons it would take to realize a C/NN capable of dead reckoning. By G/K’s estimates (261) it would take all the neurons in an ant’s brain (ants are terrific dead reckoners) to realize a more or less adequate system of dead reckoning.

This is fun stuff, but I am no expert in these matters and though I tend to trust Gallistel when he tells me something, I am in no position to independently verify his calculations regarding the number of neurons required to implement a reasonable C/NN device.[1] However, G/K’s conclusion should resonate with generativists. Grammars are compact procedures for coding what is essentially an infinite number of possible sentences, and UG is (part of) a compact procedure for coding what is (at the very least) a very large number of possible Gs.[2] Thus, whatever might be true of other animals, human brains, being clearly capable of language, cannot be C/NNs.[3] Why do I mention this? For two reasons:

First, there is still a large cohort of neuroscientists, psychologists and computationalists who try to analyze linguistic phenomena in C/NN terms. They look at pictures of brains, see interconnected neurons, and conclude that our brains are C/NNs and that our linguistic competence must be analyzed to fit these “data.” G/K argue that this is exactly the wrong conclusion to draw. Note that the IoP is trivially obvious in the domain of language, so the G/K argument is very potent here. And that is cause for celebration, as these same net-enchanted types are also rather grubby empiricists.

G/K discuss the unholy affinities between associationism and C/NN infatuation (cf. chapter 11) (the slogan “what fires together wires together” reveals all). Putting a stake through the C/NN worldview also serves to weaken the empiricist-associationist learning conception of acquisition.[4] I doubt it will kill it. Nothing can, it appears. But perhaps it can intellectually wound it yet again (though only if the G/K material is taken seriously, which I suspect it won’t be, either because the net-nuts won’t get it or because they will simply ignore it). So an attack on C/NNs is also a welcome attack on empiricist/associationist conceptions, and that’s always a valuable public service.

Second, this is very good news for the minimalistically inclined. Here’s why. Minimalists are counting on animal brains being similar enough to human brains for there to be features of the former that can be used to explain some of the properties of FL/UG. Recall the conceit: take the cognitive capacities of our ancestors as given and ask what we need to add to get linguistically capable minds/brains. However, were animal brains C/NNs while ours clearly are not (recall how easy it is to launch the IoP considerations in the domain of language), then it is very hard to see how something along these lines could be true. Is it really plausible that the shift to language brought with it an entirely new kind of brain architecture? The question answers itself. So G/K’s conclusions regarding animal brain architectures are very good news.

Note, we can turn this argument around: if, as minimalism requires (and seems independently reasonable), human brains are continuous with non-human ones, and if the IoP requires that brains have TvN architectures, then language in humans provides a strong prima facie argument for TvN architectures in animals. For the reasons that G/K offer (not unlike those Chomsky originally deployed), human linguistic competence cannot supervene on C/NN systems. And as G/K note, this has profound implications for current work in neuroscience, viz. the bulk of the work is looking in the wrong places for answers that cannot possibly serve. Quite a conclusion, but that’s what a good impossibility argument delivers.

Let me end with one last observation: there is a tendency to think that neuroscientists hold the high intellectual ground and that cognitive science/linguistics must accommodate itself to their conclusions. G/K demonstrate that this is nonsense. Cognition supervenes on brains. If some kind of brain cannot support what we know are the cognitive facts, then this view of the brain must be wrong. Anything else would be old-fashioned dualism. Hmm, wouldn’t it be amusing if, to defend their actual practice, neuroscientists had to endorse hard core Cartesian dualism? Yet, if G/K are right, that’s what they are effectively doing right now.



[1] Note that the G/K argument is about physical implementation and so is additional (though related) to the arguments presented by Fodor and Pylyshyn or Marcus against connectionism. Not only does C/NN get the cognition wrong, it is also physically impossible given how many neurons it would take to implement a C/NN device to do even a half-assed job. 
[2] If UG is parametric with a finite number of parameters then the number of possible Gs will also be finite. However, it is virtually certain that the number of parameters (if there are parameters) is very high (see here for discussion) and so the space of possibilities is very, very large. Of course, if there are no parameters, then there need be no upper bound on the number of possible Gs.
[3] Which does not mean that for some kinds of pattern recognition the brain cannot use C/NNs. The point is that these are not sufficient.
[4] The cognoscenti will notice that I am here carrying on my effort to change our terminology so that we use ‘learning’ to denote one species of acquisition, the species tightly tied to empiricism/associationism.

Thursday, February 7, 2013

More on Modularity


I just read an interesting paper by Musolino and Landau (M&L) on Williams syndrome (WS) and its implications for the modularity of FL (“Genes, language, and the nature of scientific explanations: The case of Williams syndrome”). WS is a genetic disorder that has been relatively well localized (“about 25 genes missing from chromosome 7” (124)). As M&L describe it, the disorder consists in “a highly unusual cognitive profile, with individuals showing severe impairments in a range of spatial functions but strikingly fluent and well-structured language” (124). Not surprisingly, WS is “often cited as a compelling demonstration of the modularity of mind” (124).

For reasons that elude me, the idea that minds have a modular structure seems extremely irritating to some.  What makes this hard to understand is that the standard brain mapping enterprise (the main current activity of neuroscience (see here)) seems to presuppose that the brain is divided into different sectors, with different properties, doing different things. This sure sounds like a commitment to something very much like modularity.  However, given the relation between modularity and domain specificity, the usual suspects have been pushing back against the idea that WS argues for modular minds/brains.  I am no expert in these areas, but I recommend the M&L paper for clarifying the ins and outs of the “debate.” Note the scare quotes. What is most interesting about their reconstruction of the discussion is the disinclination of the non-modularists (aka ‘neuro-constructivists’) to actually engage with linguistic data.  Here’s what I mean.

M&L review work that seems to indicate that WS kids function pretty much like we do in rather complex linguistic tasks. Not identically, but pretty much. Indeed, there are two basic facts when one compares WS kids to “typicals.” First, both WS kids and typicals perform well above chance in a series of rather complex scope experiments involving negation and disjunctive ‘or’ (136-7). Second, typicals of the same “mental age” as WS kids do better than the WS kids, while younger typicals and WS kids perform more or less the same. M&L argue, reasonably it seems to me, that both facts need accounting for. And they provide an explanation (roughly, that the high success rate of both WS kids and typicals stems from their having roughly the same kind of competence, while the differences stem from lesser processing capacities in WS kids). I found the account very plausible, but that’s not what I want to highlight.

What I want to highlight is the following difference: whereas M&L try to offer an explanation for the two facts noted above, the anti-modular neuro-constructivists never seem to offer an account of the first fact. Rather than provide an analysis explaining how the kids clearly do what they do, they point to the profile differences and conclude that WS kids must be using different mechanisms. What mechanisms? Well, M&L quote Thomas and Karmiloff-Smith who, citing (of course) Christiansen & Chater and Rumelhart & McClelland, suggest the following: “presumably …lexical or semantic/pragmatic compensatory mechanisms … that contain some but not all of the grammatical properties outlined in a generative theory, or … computational mechanisms that approximate formal syntactic systems under some processing conditions but not others” (139). Whew! This is to hand waving what a jumbo jet is to a Piper Cub. Wave any faster and wind power would become our answer to global warming! In other words, the above sentence is pure, unadulterated poodle poop. M&L are more generous and observe that there is a bit of a gap (“a non-trivial task”) between these insinuations and anything resembling an explanation of the observed phenomena (cf. 140). M&L are clearly very nice people.

I have no idea whether M&L’s own proposed explanation is correct. Given how interesting the implications of WS are for the modularity issue, it is reasonable to demand high standards of evidence (why? because the more interesting a conclusion, the more evidence we should demand). However, at least M&L seem to be playing a game we can all recognize. The neuro-constructivists seem not to be playing at all.

Why don’t the anti-modularists feel compelled to provide a detailed account of the linguistic data? Here are my totally uninformed hunches:

First, I suspect they don’t really consider linguistic data to be serious data. It’s not as sexy or impressive sounding as “plasticity, adaptation, interactivity, and lexical or semantic/pragmatic compensatory mechanisms” (139). The fact that we know something about the grammar underlying these phenomena while we know next to nothing about these other factors should not blind us to how pale and wan ‘syntax’ looks next to the muscular-sounding four horsemen of neuroscience noted above.

Second, the neuro-constructivists are just sure that modularity must be wrong. After all, it implies domain specificity, and as any decent empiricist knows (and these people have clear empiricist sympathies), this is impossible. I discussed a version of this position in the last post (here) and here it pops up again. I find this deep bias against modular, domain-specific knowledge quite incomprehensible, especially for anyone who watches Animal Planet or reads the Tuesday Science Times. Hourly we find out that animals have the most incredible domain specific knowledge (the latest being dung beetles, who apparently navigate using the Milky Way). And if they do, wouldn’t it be downright weird if humans didn’t? If anything, the burden of proof is heavily on those who resist any modularity in humans, for such resistance relies on an odd kind of biological dualism, one that sequesters human cognition from the rest of mammalian psychology.

M&L have done the rest of us a service. Whether they are right or not about WS and modularity (my money IS on them), arguing against it requires explaining all the data, the linguistic data included. Those that don’t do so are not even wrong. They are irrelevant.

Tuesday, February 5, 2013

There's No There There


I grew up in a philosophy department and so I always feel a warm glow of nostalgia when a book by an eminent practitioner of this dark art turns his/her attention to my tiny areas of interest. Recently, Jesse Prinz, Distinguished Professor in Philosophy at CUNY, has directed his attention to the shortcomings of rationalist efforts in the mental sciences in an effort to resuscitate empiricist conceptions of mind. The book, Beyond Human Nature, is very much worth looking at (but for god’s sake don’t buy it!) for it is a good primer on just how little empiricist conceptions, despite the efforts of their mightiest minds, have to offer those seriously interested in cognition. I’m not talking modest contributions, I’m talking NADA! Before proceeding, some warnings. I am a severe thaasophobe, with an attention span that requires quick capture. Banalities and weasel wording can induce immediate narcoleptic seizure. Prinz held me to the end of chapter 6 before Morpheus barred further progress. I never fight with gods. Consequently, remarks here are limited to the first six chapters, and of these they concentrate mainly on chapter 6, this being dedicated to what I know best, the Chomsky program in generative grammar. With this caveat, let’s push forward, though caveat lector: this post is way too long.

My main objection to the book is that it refuses to play by the rules of the game. I have discussed this before (here), but it is worth reviewing what is required to be taken seriously. Here are the ground rules.

First, we have made many empirical discoveries over the years and the aim must be to explain these facts.  In linguistics these include Island effects, fixed subject effects, binding theory effects etc. I know I have repeated myself endlessly about this, but it seems that no matter how often we emphasize this, critics refuse to address these matters. Prinz is no exception, as we shall see. 

Second, if one is interested in not merely debunking generative grammar but the whole rationalist enterprise in cognition then attention must be paid to the results there, and there have been many. We have reviewed some by Gleitman and Spelke (here, here) but there are many more (e.g. by Baillargeon on causality and Wynn on numbers a.o.). Prinz touches on these but is coy about offering counter analyses.  Rather he is satisfied with methodological reflections on the difficulties this kind of work must deal with and dismisses 40 years of research and literally hundreds of detailed proposals by pointing out the obvious, viz. that experiments must be done carefully and that this is hard. Not exactly big news. 

Third, though acknowledging these data points is a necessary first step, more is required. In addition one must propose alternative mechanisms that derive the relevant facts.  It is not enough to express hopes, desires, expectations, wishes etc. We need concrete proposals that aim to explain the phenomena.  Absent this, one has contributed nothing to the discussion and has no right to be taken seriously.

That’s it. These are the rules of the game. All are welcome to play. So what does Prinz do? He adopts a simple argumentative form, which can be summarized as follows.

1. He accepts that there are biological bases for cognition but holds that they vastly underdetermine human mental capacities. He dubs his position “nurturism” (just what we need, another neologism) and contrasts it with “naturism.”[1]
2. His main claim is that cognition has a heavy cultural/environmental component and that rationalism assumes that “all brains function in the same way…” and “our behavior is mostly driven by biology” (102).
3. He reviews some of the empirical arguments for rationalism and concludes that they are not apodictic, i.e. that they are logically inconclusive.
4. He buttresses point 3 by citing work purporting to show methodological problems with rationalist proposals. Together, 3 and 4 allow Prinz to conclude that matters are unsettled, i.e. to declare a draw.
5. Given the draw, the prize goes to the “simpler” theory. Prinz declares that less nativism is always methodologically preferable to more, and so, given the empirical standoff, the laurel goes to the empiricists.

That’s the argument. Note what’s missing: no counter proposals about relevant mechanisms. In short, Prinz is violating the rules of the game, a no-no.  Nonetheless, let’s look a bit more into his argument. 

First, Prinz allows that there is some innate structure to minds (e.g. see around loc 152).[2] The question is not whether there is native structure, but how much and what kind. For Prinz, associationist machinery (i.e. anything coming in through the senses with any kind of statistical massaging) is permissible. Domain specific modular knowledge with no simple perceptual correlates is not (cf. 171).

This is standard associationism at its grubbiest. So despite his insistence that the truth must lie at some point between the naturist and nurturist extremes, Prinz erects his standard on pretty conventional empiricist ground. No modularity for him. It’s general learning procedures or nothing.

Why does Prinz insist on this rather naïve version of empiricism? He wants to allow for cultural factors to affect human mental life. For some reason, he seems to think that this is inconsistent with rationalist conceptions of the mind. Why is beyond me. Even if the overall structure of minds/brains is the same across the species, this does not prevent modulation by all sorts of environmental and cultural factors. After all, humans have four-chambered hearts as a matter of biology, but how good an individual heart is for marathons is surely heavily affected by cultural/environmental factors (e.g. training regimens, diet, altitude, blood doping etc.).

So too with cognition. Indeed, within linguistics, this has been recognized as a boundary condition on reasonable theorizing since the earliest days of generative grammar. The standard view is that UG provides design specifications for particular Gs, and particular Gs can be very different from one another. In standard P&P theories the differences are tied to varying parameter settings, but even parameter-free theories recognize the fact of variation and aim to explain how distinct Gs can be acquired on the basis of PLD.

Indeed, one of the standard arguments for some cognitive invariance (i.e. UG) arises from the fact that despite all the attested variation among particular Gs, they have many properties in common. Comparative syntax and the study of variation have been the source of some of the strongest arguments in favor of postulating a rich domain specific UG. In short, the problem from the outset has been to explain both the invariance and the variation. Given all of this, Prinz’s suggestion that rationalists ignore variation is simply mystifying.[3]

Moreover, he seems ignorant of the fact that to date this is really the only game in town.  Prinz is staking a lot on the new statistical learning techniques to supply the requisite mechanisms for his empiricism. However, to date, purely statistical approaches have had rather modest success. This is not to say that stats are useless. They are not. But they are not the miracle drug that Prinz seems to assume they are. 

This emerges rather clearly in his discussion of that old chestnut, the poverty of the stimulus argument (POS), using the only example that non-linguists seem to understand: polar questions. Sadly, Prinz’s presentation of the POS demonstrates once again how subtle the argument must be, for he clearly does not get it. The problem (as Paul Pietroski went over in detail here and as I reviewed again here) is to explain constrained homophony (i.e. the existence of systematic gaps in sound-meaning pairings). It is not to explain how to affix stars, question marks and other diacritics to sentences (i.e. not how to rank linguistic items along an acceptability hierarchy). There has been a lot of confusion on this point, and it has vitiated much of the criticism of Chomsky’s original argument. The confusion likely stems from the fact that whereas an acceptability hierarchy is a standard byproduct of a theory of constrained homophony, the converse is not true, i.e. a theory of acceptability need not say much about the origins of constrained homophony. But as the question of interest is how to relate sound and meaning (viz. the generative procedures relating them), simply aiming to distinguish acceptable from unacceptable sentences is to aim in the wrong direction.

Why is this important? Because of the myriad dumb critiques of Chomsky’s original POS argument that fail precisely because they misconstrue the explanandum. The poster child of this kind of misunderstanding is Reali and Christiansen (R&C), which, of course, Prinz points to as providing a plausible statistical model for language acquisition. As Prinz notes (loc 2513), R&C’s analysis counts bigram and trigram word frequencies and, from just such counting, is able to discriminate (1) from (2).

(1)  Is the puppy that is barking angry?
(2)  Is the puppy barking is angry?

Prinz is delighted with this discovery.  As he says:

This is an extremely important finding. By their second birthday, children have heard enough sentences to select between grammatical and ungrammatical questions even when they are more complex than the questions they have heard (loc 2513).

The problem, however, is that even if this is correct, the R&C proposal answers the wrong question. The question is: why can’t kids form sentences like (2) with the meaning “is it the case that the angry puppy is barking,” on analogy with (1)’s meaning “is it the case that the barking puppy is angry”? This is the big fact. And it exists quite independently of the overall acceptability of the relevant examples. Thus (3) carries only the meaning we find in (1), not (2) (i.e. (3) cannot mean “is it the case that the puppy that barked was the one that Bill kissed”).

(3)  Did the puppy Bill kissed bark?

This is the same fact as in (1) and (2), but with no unacceptable string to account for, i.e. no analogue of (2). Bigrams and trigrams are of no use here. What we need is a rule relating form to meaning, and an explanation of why some conceivable rules are absent, resulting in the inexpressibility of some meanings by some sentences. Unfortunately for Prinz, R&C’s proposals don’t even address this question, let alone provide a plausible answer.
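To see concretely why n-gram statistics answer the wrong question, here is a toy sketch (mine, not R&C’s actual model; the mini-corpus is invented). A bigram counter can happily rank (1) above (2) on string statistics alone, yet the very same scorer assigns a number to (3) while saying nothing whatsoever about which meaning (3) can or cannot carry:

    from collections import Counter

    # Stand-in for the child-directed speech R&C counted over.
    corpus = [
        "is the puppy angry",
        "the puppy that is barking is happy",
        "is the dog that is sleeping hungry",
    ]

    bigrams = Counter()
    for sent in corpus:
        words = ["<s>"] + sent.split() + ["</s>"]
        bigrams.update(zip(words, words[1:]))

    def score(sentence: str) -> int:
        # Number of the sentence's bigrams attested in the corpus.
        words = ["<s>"] + sentence.split() + ["</s>"]
        return sum(1 for bg in zip(words, words[1:]) if bigrams[bg] > 0)

    print(score("is the puppy that is barking angry"))  # 7: preferred
    print(score("is the puppy barking is angry"))       # 5: dispreferred
    print(score("did the puppy Bill kissed bark"))      # a number, but nothing
    # here pairs the string with one meaning and bars the other

The scorer’s output is a ranking over strings; constrained homophony is a fact about the absence of certain sound-meaning pairings, and no amount of string ranking touches it.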

Why do Prinz and R&C so completely misunderstand what needs explaining? I cannot be sure, but here is a conjecture: they confuse data coverage with using data to probe structure. For Chomsky, the contrast between (1) and (2) results from the fact that (1) can be generated by a structure-dependent grammar while (2) cannot be. In other words, these differences in acceptability reflect differences in possible generative procedures. It is generative procedures that are the objects of investigation, not the acceptability data. As Cartwright argued (see here), empiricists are uncomfortable with the study of underlying powers/structures, viz. here the idea that there are mental powers with their own structural requirements. Empiricists confuse what something does with what it is. This confusion is clearly at play here, with the same baleful effects that Cartwright noted are endemic to empiricist conceptions of scientific explanation.

I could go on sniping at Prinz’s misunderstandings and poor argumentation. And I think I will do so to make two more points. 

First, Prinz really seems to have no idea how poor standard empiricist accounts have been.  Classical associationist theories have been deeply unsuccessful.  I want to emphasize this for Prinz sometimes leaves the impression that things are not nearly so hopeless. They are.  And not only in the study of humans, but in mammal cognition quite generally. 

Gallistel is the go-to guy on these issues (someone I am sure Prinz has heard of; after all, he teaches just across the bridge at Rutgers). He and King review some of the shortcomings in Memory and the Computational Brain, but there is a more succinct recapitulation of the conceptual trials and empirical tribulations of the standard empiricist learning mechanisms in a recent paper (here). It’s not pretty. Not only are there a slew of conceptual problems (e.g. how to deal with the effects of non-reinforcement (69)), but the classical theories fail to explain much at all. Here’s Gallistel’s conclusion (79):

Associationist theories have not explained either the lack of effect of partial reinforcement on reinforcements to acquisition or the extinction-prolonging effect of partial reinforcement. Nor have they explained spontaneous recovery, reinstatement, renewal and resurgence except by ad hoc parametric assumptions…I believe these failures derive from the failure to begin with a characterization of the problem that specific learning mechanisms and behavioral systems are designed to solve. When one takes an analysis of the problems as one’s point of departure…insights follow and paradoxes dissolve. This perspective tends, however, to lead the theorist to some version of rationalism, because the optimal computation will reflect the structure of the problem, just as the structure of the eye and the ear reflect the principles of optics and acoustics.

Gallistel’s arguments hinge on trying to understand the detailed mechanisms underlying specific capacities. It’s when the rubber hits the road that the windy generalities of empiricism start looking less than helpful.  Sadly, Prinz never really gets down to discussions of mechanisms, i.e. he refuses to play by the rules.  Maybe it’s the philosopher in him.

So what does Prinz do instead? He spends a lot of time discussing methodological issues that he hopes will topple the main results. For example, he discusses how difficult it can be to interpret eye gaze, the standard measure used in infant and toddler studies (loc 1547). Eye gaze can be hard to interpret. What change it is indexing can be unclear. Sometimes it indexes stimulus similarity, other times novelty. Sometimes it is hard to tell if it’s tracking a surface change in the stimulus or something deeper. And that’s why people who use eye gaze measures try to determine what eye gaze duration is actually indexing in the particular context in which it’s being used. That’s part of good experimental design in these areas. I know this because it is extensively discussed in the lab meetings I sit in on (thanks, Jeff) whenever eye gaze duration is used to measure knowledge in the as yet inarticulate. The technique has been used for a long, long time. Hence its potential pitfalls are well known, and for precisely this reason it is very unlikely that all the work that uses it will go down the intellectual drain for the trivial methodological reasons that Prinz cites. To put it bluntly: Baillargeon, Carey, Spelke, Wynn etc. are not experimentally inept. As Prinz notes, there are hundreds (thousands?) of studies using this technique that all point in the same rationalist direction. However blunt a measure eye gaze is, the number of different kinds of experiments all pointing to the same conclusion is more than a little suggestive. If Prinz wants to bring this impressive edifice crashing down, he needs to do a lot more than note what is common knowledge, viz. that eye gaze needs contextual interpretation.

And of course, Prinz knows this. He is not aiming for victory; he is shooting for a tie (cf. loc 1512). He doesn’t want to show that rationalists are wrong (just that “they don’t make their case”) and empiricists right (OK, he does want this, but he clearly believes this goal is out of reasonable reach). Rather, he wants to muddy the waters, to insinuate that there is less to the myriad rationalist conclusions than meets the eye (and there is a lot here to meet an unbiased eye), and consequently (though this does not follow, as he no doubt knows) that there is more to empiricist conceptions than there appears to be. Why? Because he believes that “empiricism is the more economical theory” (loc 1512) and should be considered superior until rationalists prove they are right.

This strategy, aside from setting a very low bar for empiricist success, conveniently removes the necessity of presenting alternative accounts or mechanisms for any phenomena of interest. Thus, whereas rationalists try to describe human cognitive capacities and explain how they might actually arise, Prinz is satisfied with empiricist accounts that just point out that there is a lot of variation in behavior and gesture towards possible socio-environmental correlates. How this all translates into what people know or why they do what they do is not something Prinz demands of empiricist alternatives.[4] He is playing for a tie, assured in the belief that this is all he needs. Why does he believe this? Because he believes that empiricism is “a more economical theory.”

Why assume this? Why think that empiricist theories are “simpler”? Prinz doesn’t say, but here is one possible reason: domain specificity in cognition requires an account of its etiology. In other words, how did the innate structure get there (think Minimalism)? But if this is the reason, then it is not domain specificity that is problematic, but any difference in cognitive power between descendant and ancestor. Here’s what I mean.

Say that humans speak language but other animals don’t. Why? Here’s one explanation: we have domain specific structure they don’t. Here’s another: we have computational/statistical capacities they don’t. Why is the second account inherently methodologically superior to the first? The only reason I can think of is that enhanced computational/statistical capacities are understood as differences in degree (e.g. a little more memory) while domain specific structures are understood as differences in kind. The former are taken to be easy to explain, the latter problematic. But is this true?

There are two reasons to think not. Consider the language case. Here’s the big fact: there’s nothing remotely analogous to our linguistic capacities in any other animal. If this is due to just a slight difference in computing capacity (e.g. some fancier stats package, a little more memory) then we need a pretty detailed story demonstrating this. Why? Because it is just as plausible that a little less computing capacity should not result in this apparently qualitative difference in linguistic capacity (indeed, this was the motivation behind the earlier teach-the-chimps/gorillas-to-talk efforts). What we might expect is more along the following lines: slower talk, shorter sentences, fewer ‘you know’s interspersed in speech. But a complete lack of linguistic competence, why expect this? Maybe the difference really is just a little more of what was there before, but, as I said, we need a very good story to accept this. Need I say that none has been provided?

Second, there are many different ways of adding to computational/statistical power. For example, some ways of statistically massaging data are computationally far more demanding than others (e.g. it is no secret that Bayesianism, if interpreted as requiring the updating of all relevant alternatives, is too computationally expensive to be credible, and that’s why many Bayesians claim not to believe that this is possible).[5] If the alternative to domain specific structure is novel (and special) counting methods, then what justifies the view that the emergence of the latter is easier to explain than the emergence of the former?
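For the flavor of the combinatorics, here is a hedged toy (not anyone’s actual model; the likelihood function is made up): treat each candidate grammar as a setting of n binary parameters and do the fully exhaustive Bayesian update just described. Every datum forces a pass over all 2^n hypotheses:

    from itertools import product

    def exhaustive_update(n_params, likelihood, datum):
        # Enumerate all 2**n_params "grammars" and renormalize over
        # every one of them for a single data point.
        hypotheses = list(product([0, 1], repeat=n_params))
        prior = 1.0 / len(hypotheses)
        unnorm = {h: prior * likelihood(h, datum) for h in hypotheses}
        z = sum(unnorm.values())
        return {h: p / z for h, p in unnorm.items()}

    # Fine at toy scale (8 hypotheses) with a made-up likelihood...
    post = exhaustive_update(3, lambda h, d: 0.9 if h[0] == d else 0.1, 1)

    # ...but the hypothesis space doubles with each added parameter:
    for n in (10, 20, 30, 40):
        print(n, 2 ** n)   # 30 parameters is already ~10**9 updates per datum

Whether brains could implement even approximations of this is exactly the kind of question that has to be settled by concrete proposals, not by declaring the statistical story “simpler” in advance.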

Prinz’s methodological assumption here is not original. Empiricists often assume that rationalism is the more complex hypothesis. But this really depends on the details, and no general methodological conclusions are warranted. Sometimes domain specific structures allow for economizing on computational resources.[6] At any rate, none of this can be adjudicated a priori. Specific proposals need to be put forward and examined. This is how the game is played. There are no methodological shortcuts to the argumentative high ground.

This post has gone on far too long. To wrap up, then: there is a rising tendency to think well again of ideas that have been deservedly buried. Prinz is the latest herald of the resurrection. However, empiricist conceptions of the mind should be left to molder peacefully with other discredited ideas: flat earth, epicycles, and phlogiston. Whatever the merits of these ideas may once have been (these last three did have some, once), they are no longer worth taking seriously. So too with classical empiricism, as Prinz’s book ably demonstrates.



[1] I’m going to drop the naturism/nurturism lingo and just return to the conventional empiricism/rationalism labels.
[2] All references are to the Kindle version.
[3] I’m no expert in these areas but it seems obvious to me that the same can be said about most work in cognitive psychology. Actually, here, if anything, the obsession with individual differences (aka cognitive variation) has retarded the search for invariances. In recent decades this has been partly remedied. Baillargeon, Spelke, Carey, Gleitman a.o. have pursued research strategies logically analogous to the one described above for generative grammar.
[4] I should be careful here. I am discussing capacities, but Prinz mainly trucks in behavior (loc 112-132). Most rationalists aim to understand the structure of mental capacities, not behavior. What someone does is only partially determined by his/her mental capacities. Behavior, at least for linguists of the generative variety, is not an object of study, at least at present (and in my own view, never). I am going to assume that Prinz is also interested in capacities, though if he is not then his discussion is irrelevant to most of what rationalists are aiming to understand.
[5] See here, here and here for discussion.
[6] Berwick had an excellent presentation to this effect at the recent LSA meeting in Boston. I’ll see if I can get the slides and post them.