Faculty of Language: Bolhuis


Sunday, February 15, 2015

A paper that deserves classic status

Phil Lieberman has written an important piece (here) (henceforth PL). It’s a reply to the Bolhuis, Tattersall, Chomsky and Berwick (BTCB) piece on Merge and Darwin’s Problem (discussed here). What makes Lieberman’s piece important is that it is an almost perfect example (being short not among its least attractive qualities) of the natural affinities that ideas have for one another. In this case, the following conceptions exert strong mutual attractions:

(i) Language as communication

(ii) Associationism

(iii) Anti-modularity (i.e. cognition as general intelligence)

(iv) Gradualist conceptions of natural selection (NS) as the sole (or most important) mechanism of evolution

(v) Connectionist models of the brain.

Though they may not strictly speaking imply one another, chances are that if you are attracted to one you will find the others attractive as well. Why is this?

Lieberman’s paper offers one line of argument that links these conceptions together. I would like to review these links here for I believe PL’s main message is important, precisely because it is wrong. As many of you may have noticed, I am of the opinion that Empiricism is a coherent, intellectually tight position with wide-ranging (unfortunate) implications both for the study of language, mind and brains and for scientific methodology more generally. I believe that the degree to which this is so is often underestimated. PL provides an example of its various strands coming together. I did not find the piece particularly persuasive, nor particularly well crafted. However, it is often in less worried versions of a set of ideas that one can more clearly see their underlying logic. PL offers us an opportunity to examine these. Hence the importance of the piece. So let’s dive in.

Concerning language, say you believe (iv) (as PL puts it: “Language evolved over millions of years”) then for NS to work its magic there has to be something common between our ancestors and ourselves. As it is evident that what "we" do with "language" is entirely different from what “they” do with it, to tell an NS story we need to find some common property between what we do and they do language-wise that NS can focus on to get us from them to us. The only plausible common factor is vocalization with the common purpose of communication. So if you like (iv) you will naturally like (i). And vice versa: if you see language’s “primary role as communication” then you can see a way of understanding what we do as emerging from what they do given a sufficiently long time span (viz. millions of years).

So, (i) and (iv) come as a bundle. Moreover, both of these suggest (iii). How so? Well ‘modularity’ is the term we use to mark a qualitative difference. The visual system is different in kind from the ~~visual~~ (changed 2/16/15) auditory system. Each has its own specialized operations and primitives. Vision and audition are not reflections of some common “sensing” system. Thus, modules are mental organs with their own distinctive and specialized properties (i.e. properties that are not like those found elsewhere). But these are just the kinds of things that NS per se is not that good at explaining the origins of all by itself. NS is good at finding the genetic gold among the genetic dross. It in itself provides no account for how the gold got there in the first place. In other words, given variation NS can enhance some traits and demote others. However, this presupposes a set of selectables and if what emerges is qualitatively distinctive from what came before, then NS by itself cannot account for its emergence. Some other source for the novelty needs to be found. As modules are precisely such novelties, if you buy into (i) and (iv), then you will also purchase (iii).

So no modules. But that strongly suggests that all cognition is of a piece. After all, if they are not qualitatively different they are more or less the same (and NS and associationism love more-or-less-ism with all of its lovely hill climbing). In other words, the belief that the basic mechanisms of thought are effectively the same all over leads directly to (ii), associationism. What better universal cognitive glue than “imitation and associative learning”? So, we have paths that conceptually relate (i) and (ii) and (iii) and (iv).

A side note: The conceptual link between (ii) and (iv) has long been noted. For example, Chomsky commented on the common logic between NS accounts and classic associationism in his review of Skinner. Indeed, Skinner argued (as Chomsky noted) that one of the virtues of behaviorism, his species of associationism, was its affinity with NS.

What’s the common core? Well, both NS and associationism are species of environmentalism. They share the common conceit that structure is largely a reflex of environmental shaping, a process that requires repeated environmental feedback to guide the process of evolution or learning (e.g. hill climbing with back propagation). In one case what’s shaped is the genome, in the other the mind. However, both conceptions assume that the structure of the “inside” is a pretty direct function of the shaping effects of the outside. The common logic was recently detailed once again by Fodor and Piattelli-Palmarini (here). So it is not particularly surprising that aficionados of one will be seduced by the other, which means that those partial to (ii) will find (iv) attractive and vice versa.

So all that’s left is (v), and as Gallistel has shown, connectionism is just the brain mechanism of choice for associationism (see, e.g. here and here). So we can complete the circle. Starting from any of (i)-(v), we stand a pretty good chance of getting to all of the others. The link is not quite deductive, but the affinities are more than mildly attractive.

PL manages to add one more little bedfellow to this gang of five. These mutually supporting ideas also induce an adherent inability to distinguish Chomsky from Greenberg Universals. As I’ve been wont to note before (here), Empiricism and Associationism can plausibly accommodate the latter but not the former. And, right on cue, the PL paper makes the connection. Language variation (i.e. absence of Greenberg Universals) is taken to prove the impossibility of a Universal Merge operation (i.e. a Chomsky Universal). Thus, the PL paper argues that the fact that languages differ implies that they cannot be underlyingly the same, the presupposition being that identity/similarity in surface patterning is a necessary feature of a linguistic universal. If you are an Empiricist, it really is hard to see how to distinguish Chomsky from Greenberg.

There is much more nostalgic material in this little piece: Piraha makes a cameo appearance near the end (you could have predicted this, right?), as do FoxP2, Kanzi the bonobo, the Gardner chimps, and various unfounded assertions about the recursive properties of dancing. None of the claims are argued for really, simply asserted. However, given (i)-(v) you can construct (and then deconstruct) the arguments for yourself. The piece is not convincing, but, IMO, as convincing as it can be given its starting points.

BTCB reply to PL (here) and make all the obvious points. IMO, they are completely correct (but I would think this, wouldn’t I?). BTCB identify a property of language that they want an evolutionary account for (viz. hierarchical discrete recursion (HDC)). They want to know how HDC of the kind we find in natural language could have evolved. They note that this is not the only question relevant to the evolution of language, but it is a good question and a pretty good place to start. Curiously, this seems to be the one question that most EVOLANG types really don’t want to address. And it is clear why: it is the one that least (in the sense of 'not at all') lends itself to standard NS styles of explanation. It points to a cognitively distinctive species specific system whose properties seem sui generis. If correct (and right now there is no reason to think it is not) it argues that natural language really is cognitively different, at least in part. PL can’t believe this (why? See (i)-(v) above), as also seems to be true for most everyone else in the EVOLANG bidness. But it is, and that’s the main problem with PL’s little rebuttal. It fails to even recognize, let alone tackle, the hard EVOLANG problem: how did HDC arise?

To end: PL’s is a very useful paper. It is an object lesson in how ideas come in bunches and exhibit a certain logic and affinity. (i)-(v) above are particularly incestuous. PL’s paper exhibits these affinities. His argument is weak and that’s because (i)-(v) are wrong. And producing a very weak argument that exposes very weak premises is a very useful thing to do. PL has done us all a great favor in replying to BTCB. Take advantage of his generosity and learn.

Monday, September 1, 2014

A concise discussion of Darwin's Problem

A bunch of the good guys (Bolhuis, Tattersall, Chomsky and Berwick (BTCB)) have a new paper out in PLoS Biology that reviews in a concise manner the logic relating the Minimalist Program (MP) and Darwin’s problem (DP). The paper is worth knowing about for several reasons, not the least being that it came out in PLoS Biology, an important venue for bio research. It also is a nice short paper that one can give to friends (I in fact just sent it to an econ buddy of mine) if they are curious about what kinds of BIG issues linguists are trying to address.

For the cognoscenti, the discussion will be quite familiar (though if you are like me and love greatest hits albums, this will be a pleasure to read). It starts where all good evo discussions have to start: with a characterization of the faculty whose emergence one is interested in explaining. BTCB hits all the required notes: language is not speech, nor an instrument for communication. Or, more precisely, externalization is not relevant to the key features of the “language faculty per se” (1). Rather (and get ready for the surprise), “language is a computational cognitive mechanism that has hierarchical syntactic structure at its core…” (1). In other words, the target of evo explanation in the domain of language is how this generative capacity came to be fixed as a biological property of humans. [1]

With the target of explanation specified, BTCB goes on to make the observation that MP has the properties required to allow an assault on this problem. What the paper does not say (but I think is important), is that prior to the emergence of MP linguists had little to contribute (or more accurately, little they could contribute) to the question of how FL evolved. How does MP advance the evo issues? By providing “an extremely simple” account of human syntax, “simple” being the key feature. In other words, what MP affords (or, at least, promises) is a characterization of syntax as comprising a single simple combinatoric operation, which, when combined with the “general cognitive requirement for computationally minimal or efficient search,” “suffices to account for much of human language syntax” (p. 1-2). In other words, what MP does is so simplify the structure of FL that it makes it possible to understand how an FL with these characteristics might have come into being. Or, to put this another way, prior to MP the understood structure of FL was so complex and sui generis (had so many moving and interacting parts) that it was impossible to see how it could have evolved. In short, the only hope we have of providing an account of how FL arose in the species is if FL has an MPish structure. Note that this is very much a conceptual argument. Evo details, as BTCB notes and as we return to anon, will be very hard to come by.

BTCB proceeds to observe two nice features (consequences?) of a simple FL: (i) it could have emerged all at once, and (ii) it would have remained stable given its lack of moving parts. As BTCB note, there is evidence that both these features are correct.

The second (viz. that FL has remained largely unchanged since its emergence) is almost certainly correct as “there is no doubt that a normal child from England raised in northern Alaska would readily learn Eskimo-Aleut, or vice versa” (p. 2). In other words, so far as we can tell all humans (even those from isolated communities) share a common FL as witnessed by the fact that any human can learn any language if properly environmentally situated. As BTCB notes, the uniformity and stability of FL “points to the absence of major evolutionary change since the emergence of the language faculty” (2). It also, IMO, supports the idea that FL is not itself the end-product of selection for if it were we might expect to see continuing differential changes in FL’s structure, with different groups having slightly different FLs facilitating the use and acquisition of some languages at the expense of others. We, apparently, do not see this, which suggests that all FLs are of a piece, which would make sense if they were very simple in an MPish sort of way.

Let’s now turn to the issue of rapid emergence. Finding evidence relevant to making evo claims turns out to be very (very very very…) difficult, with the available evidence being “quite indirect.” BTCB reiterates a point made by Lewontin long ago (here) that getting non-trivial evidence that bears on the issues is not at all easy. In fact, BTCB identifies exactly one useful type of evidence for dating the emergence of “language,” and it comes from archeology. The evidence is the sudden widespread explosion of symbolic artifacts in the archeological record roughly 100 thousand years ago. These artifacts have been used as “proxies” for language, the idea being that the emergence of language (and the cultural evolution it supports?) is the main causal factor behind this sudden rise in symbolic artifacts. This, BTCB emphasizes, is “quite indirect” evidence for the presence of a fully operational FL, but it is all we’ve really got given the exigencies of finding the standard kinds of relevant evidence commonly used to advance reasonable evo hypothesizing (again, see Lewontin on this for an elaborate and useful review). This archeological evidence (which I assume that Tattersall is responsible for reviewing here) points to a relatively rapid emergence of FL about 100kya (p. 3). The artifact-proxies for language emerge in the archeological record all at once, in many places and quickly. It suggests that whatever took place did not happen gradually (contra many standard Darwinian tropes). Note, the possibility of rapid emergence is conceivable if what made it possible is a simple addition to an otherwise available system, the kind of system MP aims to provide.

To wrap up: BTCB has two important virtues: (i) It illustrates the strong conceptual bond between MP and DP, and (ii) it illustrates how meager the actual data bearing on evo concerns in the domain of language really are. As a matter of fact we still know next to nothing about how FL emerged. Moreover, we are unlikely to learn anything in the near future about the details given how hard it will be to find relevant evidence bearing on the issue. Nonetheless, BTCB shows that some progress has been made, but mainly from the linguistic/conceptual side. I think that BTCB is right in thinking that MP is a conceptually important move forward, as any conceivable account of the emergence of FL will require something like MP. So if you are Darwin enchanted then you’d better become a card-carrying minimalist. It’s the only hope, even if it is a faint one.

[1] The paper also makes some nice methodological observations concerning what an evolutionary account can and cannot hope to deliver. BTCB observe that as a matter of logic, evo accounts cannot deliver explanations of how what has evolved actually works. This is why a grammatical characterization of FL is so vital. Evo accounts need specifications of mechanisms. Specifications of mechanism can proceed quite happily without evo accounts of how they got there. For further discussion of this point, see the Bolhuis and Wynne paper referred to in the notes. It’s worth a read.

Wednesday, January 16, 2013

More on Darwin's Problem

Berwick, Friederici, Chomsky and Bolhuis (BFCB) have a new paper that discusses Darwin’s Problem and Broca’s Problem (i.e. how brains embody language). The paper is a good short review of some of the relevant issues. I found parts very provocative and timely. Parts confusing. Here are some (personal) highlights with commentary.

1. BFCB review two facts that set boundary conditions on any evolutionary speculations: (i) “[h]uman language appears to be a recent evolutionary development” (roughly in the last 100,000 years citing Tattersall) and (ii) “the capacity for language has not evolved in any significant way since human ancestors left Africa” (roughly 50-80,000 years ago). In sum, “that the human language faculty emerged suddenly in evolutionary time and has not evolved since. (p.1)” These two features suggest two conclusions.

First, that UG emerged more or less fully formed and that whatever precipitated its emergence was something pretty simple. It was simple in two ways. Its design structure was not the result of a lot of selective massaging and whatever triggered the change must have been pretty minimal, e.g. one mutation. I use ‘precipitate’ deliberately. The suggested picture is of a chemical reaction where the small addition of a single novel element results in a drastic qualitative change. For language the idea is that some small addition to the pre-existing cognitive apparatus results in the distillation of FL/UG.

I like this picture a lot. It is the one that Chomsky has presented several times in outlining the target of minimalist speculation. If one assumes (as I do) that GB (or its very near cousins, viz. LFG, GPSG, HPSG, RG etc.), for example, roughly describes FL/UG then the project is to try to understand how something of this apparent complexity is actually quite simple. This will involve two separate but related projects: (i) eliminating the internal modularity of FL/UG as described by GB and (ii) showing that many of the operational constraints are actually reflections of more general cognitive/computational features of mammal minds.

I have discussed (ii) in various other posts (see here, here and here). As regards (i), Chomsky’s unification of Ross’s Islands via Subjacency and ‘Move alpha’ (see ‘On Wh Movement’), offers a good model of what to look for, though the minimalist unification envisioned here is far more ambitious as it involves unifying domains of grammar that Generative Grammar (GG) has taken to be very different from day one. For example, since the get-go GG has distinguished phrase structure rules from movement rules and both from construal rules. Unificationist ambitions (aka: theoretical hubris?) motivate trying to reduce these apparently distinct kinds of rules to a common core. You gentle readers will no doubt know of certain current suggestions of how to unify Phrase Structure and Movement rules as species of Merge (E and I respectively). There has also been a small (in my humble opinion, much too small!) industry aiming to unify movement and control (yours truly among others, efforts reviewed in Boeckx, Hornstein and Nunes) and movement and binding (starting with Chomsky’s adoption of Lebeaux’s suggestion regarding reflexives in Knowledge of Language). From my seat in the peanut gallery, these attempts have been very suggestive and largely persuasive (I would think that, wouldn’t I?), though there are still some puzzles to be tamed before victory is declared. At any rate, aside from unification being a general scientific virtue, the project gains further empirical motivation in the context of Darwin’s problem given the boundary conditions adumbrated in BFCB.
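
For those who like to see such things run, here is a toy Python sketch of what treating phrase structure and movement as species of one operation might look like. It is an illustration under loud assumptions, not anything taken from BFCB or the minimalist literature: syntactic objects are encoded as strings (atoms) or two-membered frozensets, the helper names `merge`, `contains` and `depth` are invented for the occasion, and labels, features and linear order are all ignored. External Merge combines two separate objects; Internal Merge is the very same operation applied to an object and something already inside it.

```python
# Toy encoding (an assumption made for illustration): a syntactic object is
# either a lexical atom (a str) or an unordered two-membered frozenset of
# syntactic objects.


def merge(x, y):
    """Combine two syntactic objects into the unordered set {x, y}."""
    return frozenset([x, y])


def contains(obj, part):
    """Return True if `part` is `obj` itself or occurs anywhere inside it."""
    if obj == part:
        return True
    if isinstance(obj, frozenset):
        return any(contains(member, part) for member in obj)
    return False


def depth(obj):
    """Count the Merge steps along the deepest branch of `obj`."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(member) for member in obj)


# External Merge: the two inputs are separate objects.
vp = merge("eat", "what")
tp = merge("you", vp)

# Internal Merge: "what" already occurs inside tp, and the same operation
# re-merges it at the root, leaving the lower occurrence in place (a copy).
assert contains(tp, "what")
cp = merge("what", tp)

print(depth(cp))
```

Nothing distinguishes the two cases inside `merge` itself; the only difference is whether the second input is already contained in the first, which is the sense in which a single operation could cover both structure building and displacement.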

The second consequence is that the evolution of FL/UG has little to do with natural selection (NS). Why? If FL emerged 100,000 years ago and humanity started dispersing 80,000 years ago then this leaves a very short time for NS to work its (generally assumed) gradual magic. Note that whatever took place must have happened entirely before the move out of Africa for otherwise we would expect group variation in FL/UG. If NS was the prime factor in the evolution of FL/UG why did it stop after a mere 20,000 years? Did NS only need 20,000 years to squeeze out all the possible variation? If so, there couldn’t have been much to begin with (i.e. the system that emerged was more or less fully formed). If not, then why do we see no variation in FL/UG across different groups of humans? Over the last 40,000 years we have encountered many isolated groups of people, with very distinctive customs living in very diverse and remote environments. Despite these manifest differences all humans share a common FL, as attested to by the fact that kids from any of these groups can learn the language of any other in essentially the same way (even the Piraha!). The absence of any perceptible group differences in FL/UG suggests that NS did not drive the change from pre-grammatical to grammatical or if it did so then there was very little variation to begin with.

2. BFCB provide an argument that communication is “ancillary to language design.” This relates to a previous post (here) where I discussed two competing evolutionary scenarios, one driven by communication, the other by enhanced cognition. As even a non-careful reader will have surmised, I am sympathetic to the second scenario. However, truth be told, I don’t understand the argument BFCB provide for the claim that communicative efficacy is only (at best) a secondary consideration for grammar design. The paper notes “the deletion of copies” which “make[s] sentence production easier renders sentence perception harder.” They conclude from this that deletion “follows the computational dictates of factor (iii)” (i.e. third factor concerns) over the “principle of communicative efficiency.” This, they continue, supports the conclusion “that externalization (a fortiori communication) is ancillary to language design.(4)”

Here’s what I don’t get: why is communicative efficiency measured by easing the burden on the receiver rather than on the sender? Why is ease of interpretation diagnostic of a communicative end but ease of expression is not and is taken instead to reflect third factor concerns? Inquiring minds would love an answer as this presumed asymmetry appears to license the conclusion that deletion is a third factor consequence and that mapping to AP is a late accretion. I don’t see the logic here.

Moreover, doesn’t this further imply that there is no deletion on the way to CI? And don’t we regularly assume that there are “LF” deletions (e.g. see Chomsky’s 1993 paper that launched the Minimalist Program)? Why should there be “deletion” of copies at LF if deletion is simply a way of reducing the computational burden arising from having to express phonological material? I don’t get it. Help!

3. The paper has an important discussion of human lexicalization and what it means for Darwin’s problem. Human language has two distinctive features.

The first is the nature of the computational system, viz. its hierarchical recursion. I’ve discussed this elsewhere (here and here) so I will spare you more of the same.

The second concerns computational atoms, i.e. words. There are at least two amazing things about them. First, we have soooo many and they are learned soooo quickly! Words are “learned with amazing rapidity, one per waking hour at the peak period of language acquisition. (5)” I’ve discussed some of this before and Lila has chimed in with various correctives. However, as syntacticians like me tend to focus on grammar, rather than words, the large size and rapid speed of vocabulary acquisition bears constant repeating. Just like no other animal has anything like human grammatical structure, no other animal has anything quite like our lexicon, either quantitatively or qualitatively.

Let’s spend a second on these qualitative features. As BFCB note, human lexical items “appear to be radically different from anything found in animal communication. (4)” In discussing work by Laura Petitto (one of Nim Chimpsky’s original handlers), BFCB highlight her conclusion that “chimps do not really have “names for things” at all. They only have a hodge-podge of loose associations,” in contrast with even the youngest children whose earliest words are “used in a kind-concept constrained way” (5).

In fact these lexical constraints are remarkably complex. Chomsky has repeatedly noted (e.g. starting with Reflections on Language and in almost every subsequent philo book since) that “[e]ven the simplest elements of the lexicon do not pick out (‘denote’) mind independent entities. Rather their regular use relies crucially on the complex ways in which humans interpret the world: in terms of such properties as psychic continuity, intention and goal, design and function, presumed cause and effect, Gestalt properties and so on” (5). This raises Plato’s problem in the domain of lexical acquisition and, given the vast noted difference between human lexical concepts and animal “words,” a strong version of Darwin’s problem as well.

It would be nice if we could tie the two distinctive features of human language (viz. unbounded hierarchical structure and vast and intricate vocabulary) together somehow. Boy would it be nice. I have a rough rule of scientific thumb: keep miracles to a minimum! We already need (at least) one for grammar, now it looks like we need a second for the human lexicon. Can these be related? Please?!

Here’s some idle speculation with the hope that wishing might make it so. First consider the complexity of lexicalized concepts. In the previous post on Darwin's Problem, I noted the H-VSK hypothesis that what grammar adds is the capacity to combine predicates from otherwise encapsulated modules together into single representations. I suggested that this fits well with the autonomy of syntax thesis, which is basically another way of describing the fact that grammatical operations, unlike module internal operations, are free to apply to predicates independently of their module specific “meanings.” Autonomy, in effect, allows distinct predicates to be brought together. The power to conjoin properties from different modules smells similar to what we find lexical items in human language doing, viz. they combine disparate features from various modules (natural physics, persons, natural biology etc.) to construct complex predicates. If so, the emergence of an abstract syntax may be a pre-condition for the formation of human lexical concepts, which are complex in that they combine features from various informationally encapsulated modules (Note: this conjecture has roots in earlier speculations of the Generative Semanticists).

Let’s now address the size of the lexicon and its speed of acquisition. Lila noted in her posts (see here and her paper here) that syntactic bootstrapping lies behind the explosion of lexical acquisition that we witness in kids. Until the syntax kicks in, lexical items are acquired slowly and laboriously. After it kicks in, we get an explosion in the growth of the lexicon. So, following H-VSK, grammar matters in forming complex predicates (aka lexical items) as it allows words to combine cross module features and, following Gleitman and colleagues, grammar underpins the explosive growth of the lexicon. If this is correct, then maybe the two miracles are connected, but please don’t ask me for the details. As I said, this is VERY speculative.

4. Broca’s problem gets a lot of airtime in BFCB. I am no expert in the matters discussed, but I confess to having been more skeptical than I expected about the reported results. Friends more knowledgeable than I am in these matters tell me that the reported results are extremely contentious and that very reasonable people strongly disagree with the specific reported claims. Here is a paper by Rogalsky and Hickok that reviews the evidence that Broca’s area shows specific sensitivity to syntactic structure. Their conclusion does not fit well with that in BFCB: “…our review leads us to conclude that there is no compelling evidence that there are sentence specific processing regions within Broca’s area” (p. 1664). Oh well.

To end: IMO, the most valuable part of BFCB is how it frames Darwin’s problem in the domain of language. It correctly stresses that before addressing the evolution question we need to know what it is that we think evolved: the basic design of the system. FL has two parts: a hierarchical recursive syntax and a conceptually distinctive and very large lexicon. FL seems to have sprung up very rapidly. The Minimalist Program asks how this could have happened and has started to engage in the unification necessary to offer a reasonable conjecture. It’s a sign of a fecund research program that it renews itself by adding new questions and combining them with old results to make new conjectures and launch new research projects. As BFCB show, by this measure, Generative Grammar is alive and well.
