Wednesday, February 11, 2015


Bob Berwick sent me this article on songbird and human brains. The paper is a pretty hard slog for someone with my meager biological and computational talents; however, the abstract is relatively comprehensible even to someone like me. Plus, Bob was good enough to hold my hand and explain what the whole thing meant. Here’s how Bob explained the results to me:

The paper appeared in Science on December 12; the lead author, Andreas Pfenning, was in Erich Jarvis’ bird lab at Duke.[1] As its title says, it’s about convergent transcriptional specializations in vocal learners – birds vs. us.
They sifted through thousands of genes and gene-expression profiles in the brains of songbirds, parrot, hummingbird, dove, quail, macaque, and us, correlating distinctive expression profiles against a sophisticated hierarchical decomposition of known brain anatomy in all the species. The aim was to find out whether the subregions where certain genes were expressed more highly matched up across species lines in the vocal learners (songbirds, parrot, hummingbird, us) as opposed to the non-vocal learners (dove, quail, macaque). And the answer was Yes: the same transcriptional profiles could be aligned across all vocal learners, but not between vocal learners and non-vocal learners.

I am lucky because that first author, Andreas Pfenning, who did a lot of the computational work as part of his PhD under Jarvis, is now a post-doc here at MIT working on genomics, just 2 floors below me. So I had him walk me through the article, and this is what he told me: The bottom line (see Fig. 1 and Table 1 of the article) is that the sets of regulatory elements – stuff that gets genes “read” (transcribed) faster or slower – are the same in song-learning birds and humans: about 50 or so genes being regulated. That’s pretty amazing considering that birds and humans are separated from a common ancestor by at least 310 million years. That’s a lot of evolutionary time. Yet both sets of species converged on the same solution for vocal learning.

Now, it might be that there are just not that many ways to build a vocal learning system and it’s all been highly conserved, a lot like the eye. And going a bit further, it’s not hard to imagine that all vertebrates have the same basic toolkit for vocal learning, which is then switched on by just a few regulatory changes and voilà, you’ve got a song to sing. Now as to why it’s not switched on in other primates – who knows? But what it does hint at is that it might not take that long to get the “input/output” peripheral device built, a key part of “externalization”. And if so, then maybe that doesn’t take a lot of gene tweaking – perhaps as few as 1 or 2 genes out of the 50 – and it all gets done a lot faster than updating the MBTA subway trains (currently half disabled because they are more than 50 years old and run on DC current; sorry, I couldn’t help myself – there’s nearly 2 meters of snow here and the mass transit has broken down). So evolving externalization might not be as hard as Chomsky has thought. And there you go: something for us to think about for evolang.
Oh – they found an intriguing nonassociation: between the birds’ Pallium or HVC areas, and their putative counterparts in human, Broca’s and Wernicke’s areas. Big caveat 1: all these results are associations – correlation, not causation. There’s a lot of work now to figure out what the actual “genetic circuitry” is – what actually causes what, what the implicated genes actually do, etc. They do a bit of that in the full paper, and there are several more interesting results that I won’t cover here. Big caveat 2: just because they didn’t find associations doesn’t mean they aren’t there, as they say.
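For readers who, like me, find the genomics hard going, the flavor of the cross-species matching Bob describes can be caricatured in a few lines of code. This is a toy sketch only – the region names are real enough, but the expression numbers are invented and this is emphatically not Pfenning et al.’s actual pipeline, gene set, or statistics:

```python
# Toy sketch: match brain regions across species by correlating
# per-region gene-expression vectors. All numbers are invented
# for illustration; not the paper's data or method.
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical expression profiles (values: a handful of genes per region).
bird = {"RA": [5.1, 0.2, 3.3], "Area X": [1.0, 4.2, 0.9]}
human = {"laryngeal motor cortex": [4.8, 0.4, 3.0],
         "anterior striatum": [1.2, 4.0, 1.1]}

# For each bird song nucleus, find the best-matching human region.
for b_region, b_profile in bird.items():
    best = max(human, key=lambda h: pearson(b_profile, human[h]))
    print(b_region, "->", best)
```

On these made-up numbers, RA lines up with laryngeal motor cortex and Area X with anterior striatum – which is, very loosely, the kind of alignment the paper reports for vocal learners but not for non-learners.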

So, it looks increasingly likely that songbird brains are goodish neural models for human brains when it comes to the study of vocalization. Berwick, Beckers, Okanoya, and Bolhuis have already noted the “linguistic” similarities between birdsong and phonology. It now looks like this behavioral convergence might rest on a convergence in brain organization, rooted in exploitation of the same genetic mechanisms.

As Bob notes, this sure looks like (but note his caution here) another case of the eyeless gene. In other words, it looks like there is effectively one way to get vocalization off the ground biologically, and all vocal species use the same basic genetic tricks to get this capacity in place. Moreover, given that vocalizers are scattered across phyla and clades, it also looks like this trick is a biological option that can be pressed into service under the right conditions (whatever these happen to be; from what I can tell, what these conditions are is quite obscure. After all, not all birds sing, nor all apes, nor… And the question is why not, if indeed this really is an available option. In other words, what are the downsides of vocalization such that not every animal blathers away all the time?).

To my mind, this makes Chomsky’s conjecture that Natural Language is Meaning with sound (rather than meaning and sound) quite attractive.[2] Vocalization is something that natural selection can call on under the right circumstances, and coupling a recursive FL (which enhances thought) to an externalization mechanism that would leverage this capacity by allowing communication of these thoughts seems like a plausible candidate for a “right circumstance” (all of this is very speculative, of course). Note that, given the spotted ubiquity of vocalization (i.e. across very different animals: birds, mice, whales, humans), the causal line that Chomsky suggested (recursion, then externalization) makes sense. At the very least, the capacity to communicate (if we identify this with vocalization) does not bring with it the kind of grammatical system we find in humans. Biologically, it seems, there is a plausible story taking you from Merge to externalization, but none from externalization to Merge. Or, to put this another way: that humans vocalize is not that surprising, given that this is the sort of capacity that seems to be just sitting there genetically for the taking. What is not just there for the taking is hierarchical recursion. Given the latter, there is plausibly further evolutionary utility in being able to vocalize. Hence, Chomsky’s conjecture: Merge first, externalization second.

Of course, if this is correct, it lends some support to the view that core parts of FL did not arise for communicative ends or in response to the advantages of communication. Communication was an add-on: core properties of FL arose first, and then the extra benefits of being able to communicate the thoughts that the newly e”Merge”nt mechanisms made available came online. To repeat, all of this is highly speculative, but it is intriguing to see that one standard mechanism for communication (i.e. vocalization) is effectively the same system in birds and people (and, I would bet, mice and whales and…) and that it seems to be latently sitting there, ready for service.

Interesting stuff. Thx Bob.

[1] A. R. Pfenning et al., Science 346, 1256846 (2014). DOI: 10.1126/science.1256846.

[2] This line of argument was first mooted by Paul Pietroski in conversation. I think I got the argument right. If so, thx. If not, sorry. He is of course responsible for whatever mangling may have occurred.


  1. It does make you wonder how the capacity for vocalisation got into the genes and why it stayed there largely unused. It's a bit like a bird having the ability to fly, but never bothering to actually do so (to misquote NC rather badly ...)

    1. Is it? Sure, when linked to an FL with CI interface links it seems obvious that adding externalization would be very useful. So useful that even without vocalization (ASL) it finds a way to externalize. But what if not so hooked up? Can't we imagine that being able to "sing" might come with real risks as well as rewards? At any rate, it seems to be a very old network. Many distantly related animals vocalize: some birds, some primates, some mice, some whales. Not exactly a natural class. If they all exploit the same kinds of networks to do this (i.e. this is really like eyeless), then until hooked up to an FL there must be a downside to making lots of even pretty noise.

  2. So, should we or should we not be pessimistic regarding comparative animal studies and genes (cf. Hauser et al 2014 and post on this blog about the same paper)?

    1. On sound issues, optimistic. Seems birds are good model organisms. For syntax, not so much. But studying externalization should tell us something interesting. It already has.

  3. Interesting, because now we are in the ballpark of the folk theory of language that says "it's because we can talk." Did you ever hear Geoff Hinton talk about why he made a glove out of distinctive features (you have to see it to believe it)? It was so that he could put it on lab chimps and they could all say "Let me out," thereby simultaneously giving us all more empathy for chimps and proving Chomsky wrong that humans had some special capacity for language. "So," (his words), "it was doubly ideological."

    I say this because the macaques (their example of a non-vocal-learning, human-like thing) almost look minimally different from us; they just can't learn words (calls, whatever). I say it as a warning: it would be tempting for someone with such a predisposition to conclude that learning words is all that separates us from the apes. If we didn't know how syntactically poor signing apes are, I would be compelled. Studying exactly what non-human primates can do, and being precise about the contrast with language, is now more important.

  4. Finally had a chance to look at Pfenning et al in more detail. As usual, Bob is spot on. Very interesting to see that vocal learners employ the same suite of genes, at least on the motor side of things. Important to realise we're talking externalisation here, not syntax. But indeed this work shows that possible means of externalisation could have evolved quite quickly. As Berwick et al (2013) put it in TICS: "It may be that the neural mechanisms that evolved from a common ancestor, combined with the auditory–vocal learning ability that evolved in both humans and songbirds, contributed to the emergence of language uniquely in the human lineage."

    Bob also noticed: "Oh – they found an intriguing nonassociation: between the birds’ Pallium or HVC areas, and their putative counterparts in human, Broca’s and Wernicke’s areas." This requires some qualification. First, Pfenning et al. state that "We allowed Wernicke’s area to be any subregion in the superior temporal gyrus". That's quite a generalisation! Second, in the songbirds (zebra finches), they only sampled "song control nuclei—Area X, HVC (a letter-based name), LMAN (lateral magnocellular nucleus of the nidopallium), and RA". These are essentially motor and premotor nuclei. So NCM (caudomedial nidopallium), considered by some to be a 'Wernicke-like' brain region (note the inverted commas and the word '-like' here), wasn't sampled [see Bolhuis et al. (2010) Nat Rev Neurosci for a detailed review of these issues].

    Third, we're talking about convergent evolution here. These are functionally analogous systems. E.g., we think that NCM is functionally analogous to human auditory association cortex (incl. Wernicke's area). Also, in terms of neural connectivity these brain regions are pretty similar. But this does not necessarily mean that they should be homologous, nor does it mean that the same suite of genes should be involved (but we don't know that yet, because Pfenning et al. didn't sample NCM). Convergence could also mean that different taxa have come up with similar solutions to similar problems (e.g. auditory–vocal learning) but using different brain regions and different genes in the process.