Sunday, February 15, 2015

A paper that deserves classic status

Phil Lieberman has written an important piece (here) (henceforth PL). It’s a reply to the Bolhuis, Tattersall, Chomsky and Berwick (BTCB) piece on Merge and Darwin’s Problem (discussed here). What makes Lieberman’s piece important is that it is an almost perfect example (being short not among its least attractive qualities) of the natural affinities that ideas have for one another. In this case, the following conceptions exert strong mutual attractions:

(i) Language as communication
(ii) Associationism
(iii) Anti-modularity (i.e. cognition as general intelligence)
(iv) Gradualist conceptions of natural selection (NS) as the sole (or most important) mechanism of evolution
(v) Connectionist models of the brain

Though they may not strictly speaking imply one another, chances are that if you are attracted to one you will find the others attractive as well. Why is this?

Lieberman’s paper offers one line of argument that links these conceptions together. I would like to review these links here for I believe PL’s main message is important, precisely because it is wrong. As many of you may have noticed, I am of the opinion that Empiricism is a coherent, intellectually tight position with wide ranging (unfortunate) implications both for the study of language, mind and brains and for scientific methodology more generally. I believe that the degree to which this is so is often underestimated. PL provides an example of its various strands coming together. I did not find the piece particularly persuasive, nor particularly well crafted. However, it is often in less worried versions of a set of ideas that one can more clearly see their underlying logic. PL offers us an opportunity to examine these. Hence the importance of the piece. So let’s dive in.

Concerning language, say you believe (iv) (as PL puts it: “Language evolved over millions of years”) then for NS to work its magic there has to be something common between our ancestors and ourselves. As it is evident that what "we" do with "language" is entirely different from what “they” do with it, to tell an NS story we need to find some common property between what we do and what they do language-wise that NS can focus on to get us from them to us. The only plausible common factor is vocalization with the common purpose of communication. So if you like (iv) you will naturally like (i). And vice versa: if you see language’s “primary role as communication” then you can see a way of understanding what we do as emerging from what they do given a sufficiently long time span (viz. millions of years).

So, (i) and (iv) come as a bundle. Moreover, both of these suggest (iii). How so? Well ‘modularity’ is the term we use to mark a qualitative difference. The visual system is different in kind from the auditory system (changed 2/16/15). Each has its own specialized operations and primitives. Vision and audition are not reflections of some common “sensing” system. Thus, modules are mental organs with their own distinctive and specialized properties (i.e. properties that are not like those found elsewhere). But these are just the kinds of things whose origins NS per se is not that good at explaining all by itself. NS is good at finding the genetic gold among the genetic dross. It in itself provides no account for how the gold got there in the first place. In other words, given variation NS can enhance some traits and demote others. However, this presupposes a set of selectables and if what emerges is qualitatively distinctive from what came before, then NS by itself cannot account for its emergence. Some other source for the novelty needs to be found. As modules are precisely such novelties, if you buy into (i) and (iv), then you will also purchase (iii).

So no modules. But that strongly suggests that all cognition is of a piece. After all if they are not qualitatively different they are more or less the same (and NS and associationism love more-or-less-ism with all of its lovely hill climbing). In other words, the belief that the basic mechanisms of thought are effectively the same all over leads directly to (ii), associationism. What better universal cognitive glue than “imitation and associative learning”? So, we have paths that conceptually relate (i) and (ii) and (iii) and (iv).

A side note: The conceptual link between (ii) and (iv) has long been noted. For example, Chomsky commented on the common logic between NS accounts and classic associationism in his review of Skinner. Indeed, Skinner argued (as Chomsky noted) that one of the virtues of behaviorism, his species of associationism, was its affinity with NS.

What’s the common core? Well, both NS and associationism are species of environmentalism. They share the common conceit that structure is largely a reflex of environmental shaping, a process that requires repeated environmental feedback to guide the process of evolution or learning (e.g. hill climbing with back propagation). In one case what’s shaped is the genome, in the other the mind. However both conceptions assume that the structure of the “inside” is a pretty direct function of the shaping effects of the outside. The common logic was recently detailed once again by Fodor and Piattelli-Palmarini (here). So it is not particularly surprising that aficionados of one will be seduced by the other, which means that those partial to (ii) will find (iv) attractive and vice versa.

So all that’s left is (v), and as Gallistel has shown, connectionism is just the brain mechanism of choice for associationism (see, e.g. here and here). So we can complete the circle. Starting from any of (i)-(v), we stand a pretty good chance of getting to all of the others. The link is not quite deductive, but the affinities are more than mildly attractive.

PL manages to add one more little bedfellow to this gang of five. These mutually supporting ideas also induce in their adherents an inability to distinguish Chomsky from Greenberg Universals. As I’ve been wont to note before (here), Empiricism and Associationism can plausibly accommodate the latter but not the former. And, right on cue, the PL paper makes the connection. Language variation (i.e. absence of Greenberg Universals) is taken to prove the impossibility of a Universal Merge operation (i.e. a Chomsky Universal). Thus, the PL paper argues that the fact that languages differ implies that they cannot be underlyingly the same, the presupposition being that identity/similarity in surface patterning is a necessary feature of a linguistic universal. If you are an Empiricist, it really is hard to see how to distinguish Chomsky from Greenberg.

There is much more nostalgic material in this little piece: Piraha makes a cameo appearance near the end (you could have predicted this, right?), as do FoxP2, Kanzi the bonobo, the Gardner chimps, and various unfounded assertions about the recursive properties of dancing. None of the claims are argued for really, simply asserted. However, given (i)-(v) you can construct (and then deconstruct) the arguments for yourself. The piece is not convincing, but, IMO, as convincing as it can be given its starting points.

BTCB reply to PL (here) and make all the obvious points. IMO, they are completely correct (but I would think this wouldn’t I?). BTCB identify a property of language that they want an evolutionary account for (viz. hierarchical discrete recursion (HDC)). They want to know how HDC of the kind we find in natural language could have evolved. They note that this is not the only question relevant to the evolution of language, but it is a good question and a pretty good place to start. Curiously, this seems to be the one question that most EVOLANG types really don’t want to address. And it is clear why: it is the one that least (in the sense of 'not at all') lends itself to standard NS styles of explanation. It points to a cognitively distinctive species specific system whose properties seem sui generis. If correct (and right now there is no reason to think it is not) it argues that natural language really is cognitively different, at least in part. PL can’t believe this (why? See (i)-(v) above), as also seems to be true for most everyone else in the EVOLANG bidness. But it is, and that’s the main problem with PL’s little rebuttal. It fails to even recognize, let alone tackle, the hard EVOLANG problem: how did HDC arise?

To end: PL’s is a very useful paper. It is an object lesson in how ideas come in bunches and exhibit a certain logic and affinity. (i)-(v) above are particularly incestuous. PL’s paper exhibits these affinities. His argument is weak and that’s because (i)-(v) are wrong. And producing a very weak argument that exposes very weak premises is a very useful thing to do. PL has done us all a great favor in replying to BTCB. Take advantage of his generosity and learn.


  1. If I were David Adger I would make a big deal out of this typo: " The visual system is different in kind from the visual system." But, then, it may be more productive to assume one 'visual' was meant to be 'auditory' and move on to a genuine question. I know you do not talk to me but some of your readers may appreciate some clarification:

    "BTCB identify a property of language that they want an evolutionary account for (viz. hierarchical discrete recursion (HDC)). They want to know how HDC of the kind we find in natural could have evloved. They note that this is not the only question relevant to the evolution of language, but it is a good question and a pretty good place to start."

    Assuming there is a 'language' missing after 'natural' [?], 2 questions arise:
    1. Is at this point anything known about the biological implementation of HDC? If not, then it would not seem such a good place to start...
    2. How does the Chomskyan account explain the fact that the extensive vocalization apparatus we need for 'mere externalization' just happened to be there when it was needed?

  2. In their reaction to Lieberman’s piece, Bolhuis et al. criticize Lieberman for defining language “functionally”. At the same time they claim that their individual-computational version of FL is “domain-specific” and, I suppose, organ-like. How, the curious reader wonders, can domain-specificity and organ status be defined without functional considerations? Personally, I believe in modularity and individual-computational capacities distinct from general intelligence. However, in the case of language, the domain of such capacities is neither intrinsic nor provided by things like natural selection (as in the case of organs). The individual-computational capacity is what it is (namely FLN) thanks to externalization, i.e., its application in an invented, shared lexical environment (cf. Saussure’s “langue”).

    1. 'Functional' has two interpretations. The first is use based, as in the function of a hammer is to pound nails. The other is what relates to what: as FL is a function that maps lexical atoms to Phon Sem pairs. FL has a functional interpretation in the second sense: FL is what it is because of what it maps to. However, and this I take it is BTCB's point, it has no function in the first sense. In particular, the property they are interested in, discrete hierarchical recursion, does not have the properties it has BECAUSE it is used in communication, though having these properties might indeed facilitate communication. Now, this might not be so for other properties of FL. So, for all I know, Gs map to AP by first mapping via phonological phrases because of the communicative benefits of so doing. This is not a proposal, but it is something that BTCB should be willing to consider as within their ambit of concerns.

      I am quite sure that none of this is new to you. You understand these distinctions as well as, if not better than, I do. As such, I may have misunderstood your point. At any rate, there is one sense of functional (what does the system map to?) that is perfectly consistent with their rejection of PL's point.

  3. Norbert, we mostly agree, particularly about the fact that ‘functionality’ has two interpretations. The first one (“use determines form”) we both are against, I once upon a time under the label “the radical autonomy of syntax” in my 1987 book Domains and Dynasties. However, the second form of functionality (“mapping determines function”) deserves further critical reflection as well in the case of language. Compare it to (the individual capacity for) arithmetic in its relation to business accounting. The latter gives the former a function in your second sense. However, it would make us vulnerable to panglossian ridicule if we would call arithmetic “business accounting in the narrow sense” or “the faculty of business accounting.” The point is that we are talking about APPLICATIONS, which entail a form of agentive function assignment not found in the biology of our internal organs. So, in principle, we could GIVE other applications to Merge-type recursive computation, as we in fact do in the construction of natural numbers.

    1. Yes, I think we do agree. If there is a function it is something amorphous like: it serves the development and expression of thought. But all this points to is the understanding that Language relates to meaning and expression and this does not seem like much of a "use." That said, I am not sure I understand what you intend with the following locution:

      "form of agentive function assignment not found in the biology of our internal organs."

      What are you getting at here? I am sure that I agree, but I am not sure what I am agreeing to.

    2. @Jan K and Norbert: Yes we use the term 'function' in the Tinbergian sense, i.e. 'what is the trait for'. Rather confusingly, Aristotle called this 'final cause', e.g. the final cause of a chair is that it is used to sit in. Wynne and I discussed this matter in more detail in an essay in Nature that is in our ref list. Essentially, 'cause' and 'function' are logically distinct. What we think Lieberman is doing is "fallaciously confounding the function(s) [e.g. 'communication'] of a trait [e.g. language] with its mechanism [or 'cause']". Clearly, if you think that language is the same as 'communication', you end up with a completely different evolutionary reconstruction. A similar thing happens when you equate 'language' with 'speech'.

    3. Norbert, you agree with the following. Internal organs, like the heart and the kidneys, are functional by non-agentive, natural causes, for instance by natural selection. Organs that are partially under our control, like the lungs, are a different story. They have an internal function, as in breathing, but they can also be GIVEN a function by creating a particular cultural context, as when you are a musician and use your lungs to play your favorite wind instrument. The latter type I call “agentive function assignment”, known as “application” in ordinary language. Language, in my opinion, is an APPLICATION of specific biological structures rather than a biological structure (“FLN”) itself. As I wrote earlier here, I find Stan Dehaene’s approach to reading (“recycling”) paradigmatic in this regard.

      @Johan What is called HDC above cannot be seen as a Tinbergian “trait for” language, for the same reason why the nose cannot be seen as a “trait for” carrying glasses. To read cultural functions into biological structures (as is common in biolinguistics) is what Gould and Lewontin called “the panglossian fallacy.”

    4. Jan, I recall you and Norbert having virtually the same discussion a couple of years ago, including you using the same 'noses did not evolve for carrying glasses' example. If memory serves right this discussion led nowhere then. What makes you think things will be different this time? There is no indication in Norbert's post or his replies to you that he is willing to budge an inch from his dogmatism...

    5. @Jan K
      Thx for reminding me of the App view of FL. I liked that idea when you mooted it (and began referring to it as the App idea in writing), as I did Dehaene's. Well, I mostly liked it. The one thing I did not buy is that FL is ALL just an App, though it is, hopefully, MOSTLY an App. In Dehaene's story the face region and FL were not Apps. They were the ingredients for the reading App. What's the analogue of these two fixed faculties in the case of FL? I have no trouble believing that many parts of FL have been recruited from other cognitive domains (e.g. checking seems like a non-domain specific operation), but I am less convinced that this is so for whatever operation licenses recursion (let's call it 'merge'). I don't see how this is reducible to other available operations in the cognitive domain. Of course this stems from the idea that recursion of the kind we find in language really is different from anything we find elsewhere. This may be wrong, but right now I don't see any reason to doubt it. If so merge would be like the face recognition system in Dehaene, a given in terms of which the reading App is designed though not an App itself. So, as I don't see how it could be Apps all the way down I think that BTCB are probably right in assuming that the one thing that has not been recruited from elsewhere is merge or some analogue. As for the rest I am "appy" to go along with you. Thx.

    6. @JanK Gould & Lewontin were quite right, and if you take a look at Bolhuis & Wynne 2009, you will see that we completely agree with them. I think we should forget about functional arguments here (as we say in the original PLoS Biol paper), they are not helpful. "Communication" is a possible function of language, but as we say in the essay "that particular function is largely irrelevant in this context". "HDC" is the result of repeated application of merge. Merge seems to have evolved very recently. Functional considerations don't come into it. I think the whole functional argument is a red herring here. It's a fallacy, as we say in the reply to Lieberman.

    7. @ Johan You seem to misunderstand me. I am with you against Lieberman (about communication) and I have always been an anti-functionalist as far as Merge is concerned (and about much evolution in general). What I object to is the idea that Merge has anything to do with language in abstraction from the linguistic function we have GIVEN to it by applying it to "invented" lexical material. Hence my critique of the panglossian concept of FLN.

    8. @ Norbert Thanks for appreciating the app view. In my opinion, “recruiting from elsewhere” does not necessarily mean recruiting from some other functional domain (as in Dehaene’s reading example). It can also mean recruiting from epiphenomenal, functionless material (a “spandrel”). We have no idea where Merge came from. But that leaves the logic of application intact, even if Merge has only one application (I mentioned, in fact, two). No matter what, Merge only has linguistic relevance thanks to the fact that we applied it to the lexical expressions of our culture.

    9. @Jan K
      "Merge only has linguistic relevance thanks to the fact that we applied it to the lexical expressions of our culture."

      There is one reading of this where it is a truism, i.e. merge has linguistic relevance because it is expressed in linguistic forms. Sure. Were it not expressed in such forms we would not think it relevant to linguistics/language. But I suspect you mean more than this. I have a feeling that we might be going after different things. What I want (and here I think that BTCB have similar desires) is an account of the evolutionary origin of merge (or whatever the recursive engine is) as witnessed in natural language Gs. There are several options. First, merge was co-opted from some other cognitive domain. I don't think that there is any evidence for this, nor, it seems, do you. Second, it was selected for. This implies that the capacity for merge was latent in our genome and environmental demands brought it to the surface. Again, I don't think that either of us thinks that this is the case, if for no other reason than that there is little reason to think that a merge-like operation exists anywhere else in the animal kingdom. Third, it was the product of mutation, in other words, a miracle. This is roughly my view and BTCB's view. Merge just popped in as is about 100kya. Fourth, there was a merge-like spandrel that was unconnected to any interface system that became somehow connected. So Merge was "there" in the genome but its links to CI and AP were not and until so linked there is no FL. This is possible I guess but I find it hard to think of how to distinguish this from the third option. And the question still arises, is this spandrel unique to us or shared across biology like eyeless or the seeds of vocalization? If the latter, then why are we the only ones that realize it in any form? Or do you think that other animals have merge too? At any rate, this is very subtle and it seems that we agree (do we?) that FL was not selected for nor co-opted from other domains of cognition. I am not sure that I have much to say about what might have taken place after rejecting these two options.

    10. @ Norbert It would be disappointing from an explanatory point of view if Merge turned out to be the result of divine intervention. Actually, we have no idea what this miraculous arrival of Merge would mean. It can’t simply mean the addition of a computational facility that can implement Merge. Merge is so simple that almost any computational mechanism could implement it, from Gallistel’s DNA computers to your iPhone. Running Merge wouldn’t make your iPhone more human. So, I propose that we look in a slightly different direction, namely not only at the presence of Merge but also at the fact that it can be freely accessed and used. For that, we have to focus on another remarkable human trait, the fact that we live in symbiosis with shared, external memories. Lexical items are the main carriers of our external memory, which, moreover, might have become more potent over time by classical Darwinian means (i.e., gradually). With a stable and powerful external memory and its freely accessible lexical items, the addition of compounding these by Merge might have been a relatively small step. In other words we should, I think, stop treating “externalization” as a kind of afterthought but rather see shared, external memory as the necessary breeding ground for Merge.

    11. @Jan K, this is a very interesting idea which merits careful consideration. Are there any coherent reasons to believe that Merge should be prior to other cognitive and cultural capacities that make human language what it is today? This seems to be a default assumption, but its grounds are unclear. Exploring the idea that "externalisation" might play a crucial role for the emergence (or adaptiveness) of Merge could lead to productive interaction between those interested in the evolution of human linguistic abilities in general and those committed to the importance of Merge.

    12. @Jan K
      By 'miracle' I intend 'mutation' or something similarly random. As, so far as I know, mutations are neither predictable nor selected for, why they occur will not be explicable. Of course, once they occur why they persist may be explicable and the kinds of explanations we will look for abound. My understanding of BTCB is that they take the emergence of merge in the genome of ancestors to be a fortuitous accident. It arose without rhyme or reason (hence a miracle).

      That said, I agree that lexicalization really is important and sadly neglected. In fact, I think that it is quite unclear what a lexical item is and how it functions. On a more personal note, I would go further. I don't personally think that merge is the locus of the interesting action. I have argued that labeling is the right place to look to understand recursion. Labeling is the capacity to form equivalence classes based on lexical atoms. It's the capacity to map a complex of atoms to one of its members and then treat it like you would the atom itself. So we gained a recursive procedure for forming larger and larger complexes which we then conjoin using a rather trivial set union operation. However, regardless of the details here, the source of this recursive engine, just like the source of merge, seems adventitious.

      That said, I second Mark's enthusiasm: were it possible to show how this linked to externalization it would be interesting. Anytime you want to post something on this, let me know. I think FoL readers would be very interested.

    13. Thanks, Mark and Norbert, for your encouragement. You can relax, Norbert: of course I knew what you meant by 'miracle'!

  4. I always feel a momentary pang at being a scientist when I read an article like this one. But on the bright side, that particular instance was especially reader-friendly: the author is kind enough to produce a howler in just about every paragraph lest the attention of the reader should wane. I guess the concise response was worth it though.

  5. Yesterday, I came across a reference to this paper which treads the same five-point path to un-enlightenment:

    I wasn't sure, though, whether its argument was just a bit silly or completely ridiculous. Invoking the five pillars, they suggest that sequential analyses of linguistic structure are better than hierarchical ones.

    My uncertainty regarding how poorly to think of this comes from their attempt to make a distinction along the lines of language *use*, which I assume is calling upon pillar (i) - language as communication - to evade criticism from people who actually believe in grammar.

    But then I'm left wondering whether what they're saying is just fatuous and obviously wrong, or whether they're sneaking in one of Dan Dennett's so-called 'deepities': by relying on ambiguity as to what 'language' is, they're saying something that is actually trivially true (the production of language is word-by-word sequential because we can't vocalise in trees) while pretending that it's a profound conjecture.

    Speaking as a master's student, this makes me anxious for two reasons. The first is that it makes me doubtful of my own ability to spot what is and isn't drivel because I see that something is published by the Royal Society and assume that if I have such an aversion to it, then I must be dimly missing something. Alternatively, if it is as bad as I suspect, then the fact that it has been published by the Royal Society makes me extraordinarily depressed about the field. If it is the case - and I really hope I am misrepresenting this! - that today's researchers can't even agree on whether or not language structure is hierarchical, what the hell am I doing at university?!

    1. The paper you link to came out in 2012 and Norbert dealt with it under the title "3 psychologists walk into a bar".

    2. Thanks for the link - I'll see if it puts my mind at ease!