Saturday, January 5, 2019

Turing and Chomsky

There are two observations that motivate the Minimalist Project. 

The first is that the emergence of FL is a rather recent phenomenon biologically, say roughly 50-100kya. The argument based on this observation is that if biological complexity is a function of natural selection (NS) and NS is gradual, then the observation that language biologically arose “merely” 50-100kya implies that whatever arose could not have been particularly complex. Why? Because complexity would require shaping by slow selection pressures and 50-100,000 years is not enough time to shape anything very complex. That’s the argument. And it relies, ahem, on many assumptions, not all of them at all obvious.

First, why think that 50-100,000 years is not enough time to develop a complex cognitive organ? Maybe that’s a lot of time. Second, how do we measure complexity? Biology selects genes, but MP measures complexity wrt the simplicity of the principles of FL/UG. Why assume that the phenotypic simplicity of linguistic descriptions of FL/UG lines up well with the simplicity of the genetic foundations that express these phenotypic traits?[1]

This second problem is, in fact, not unique to EvoLang. It is part and parcel of the “phenotypic gambit” that I discussed elsewhere (here). Nonetheless, the fact that this is a general issue in Evo accounts does not mean it is not also a problem for MP arguments. Third, every time one picks up the papers nowadays one reads that someone is arguing that language emerged further and further back. Apparently, many believe that Neanderthals jabbered as much as we did, and if this is the case we push back the emergence of language many 100,000s of years. Of course, we have no idea what such language consisted in even if it existed (did it have an FL like ours?), but there is no question that were this fact established (and it is currently considered admissible I am told) then the simple-minded argument noted above becomes less persuasive.

All in all then, the first kind of Evo motivation for a simpler FL/UG, though not nothing, is not particularly dispositive (some might even think it downright weak (and we might not be able to strongly rebut this churlish skepticism)). 

But there is a second argument, and I would like to spotlight it here. The second argument is that whenever FL/UG arose, it has remained stable since its inception. In other words, FL/UG has been conserved in the species since it arose. How do we know this? Well, largely because any human kid can learn any human language in effectively the same way if prompted by the relevant linguistic input. We should be very surprised that this is so if indeed FL/UG is a very complex system that slowly arose via NS. Why? Because if it did so slowly arise, why did it suddenly STOP evolving? Why don’t we have various FL/UGs with different human groups enjoying bespoke FL/UGs specially tailored to optimally fit the peccadillos of their respective languages or dialects? Why don’t we have ethnically demarcated FL/UGs, some of which are ultra sensitive to rich morphology and some more sensitive to linear properties of strings? In other words, if FL/UG is complex why is it basically the same across the species, even in groups that have been relatively isolated from other human groups over longish periods of time? Note, the problem of stability is the flip side of the problem of recency. If large swaths of time make for easier gradual selection stories, they also exacerbate the problem of stability. Stasis in the face of environmental diversity (and linguistic environments sure have the appearance of boundless diversity, as my typologically inclined colleagues never tire of reminding me) is a problem when gradual NS is taken to shape genetic material to optimally fit environmental demands.

Curiously, the fact of stability over large periods of Evo time has become a focus of interest in the Evo world (think of Hox genes). The term of art for this sort of stability is “strong conservation” and the phenomenon of interest has been the strong conservation of certain basic genetic mechanisms over extremely long periods of Evo time. I just read about another one of these strongly conserved mechanisms in Quanta (here). The relevant conserved mechanism is one that explains biological patterns like those that regulate “[t]he development of mammalian hair, the feathers of birds and even those ridges on the roof of your mouth” (2). It is a mechanism that Turing first mooted before anyone knew much about genes or development or much else of our contemporary bio wisdom (boy was this guy smart!). There are two interesting features of these Turing Mechanisms (TMs). First, they are very strongly conserved (as we shall see) and second, they are very simple. In what follows I would like to moot a claim that is implicit in the Quanta discussion: that simplicity enables strong conservation. You can see why I like this idea. It provides a biological motivation for “simple” mechanisms that seems relevant to the language case. Let me discuss the article a bit.

It makes several observations. 

First, the relevant TM, what is called a “reaction-diffusion” mechanism, is “beautifully simple.” Here is the description (2):

It requires only two interacting agents, an activator and an inhibitor, that diffuse through tissue like ink dropped in water. The activator initiates some process, like the formation of a spot, and promotes the production of itself. The inhibitor halts both actions. 

Despite this simplicity, the process can regulate widely disparate kinds of patterns: “spaced dots, stripes, and other patterns” including the pattern of feathers on birds, hair, and, of relevance in the article, denticles (the skin patterning) on sharks (2). 
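
To make the “two agents suffice” point concrete, here is a minimal sketch of the textbook linear-stability check for a two-component reaction-diffusion system. It is not the model from the Quanta piece or from Fraser’s study: the function names (`growth_rate`, `fastest_mode`) and all the numbers are assumptions picked for illustration. The sketch asks whether a small ripple of wavenumber k around a uniform state grows or dies out, given how the activator and inhibitor act on one another and how fast each diffuses.

```python
import math


def growth_rate(k, fu, fv, gu, gv, Du, Dv):
    """Largest real part of the eigenvalues of J - k^2 * D.

    J = [[fu, fv], [gu, gv]] is the (assumed) linearized reaction term:
    fu > 0  activator promotes itself, fv < 0  inhibitor suppresses it,
    gu > 0  activator promotes the inhibitor, gv < 0  inhibitor decays.
    Du and Dv are the diffusion rates of activator and inhibitor.
    """
    q = k * k
    a = fu - Du * q          # activator self-feedback, damped by diffusion
    d = gv - Dv * q          # inhibitor decay, damped by diffusion
    trace = a + d
    det = a * d - fv * gu
    disc = trace * trace - 4.0 * det
    if disc >= 0:
        return (trace + math.sqrt(disc)) / 2.0
    return trace / 2.0       # complex pair: real part is trace / 2


def fastest_mode(Dv, Du=1.0, fu=1.0, fv=-1.0, gu=2.0, gv=-1.5,
                 k_max=3.0, steps=300):
    """Scan wavenumbers 0..k_max and return (best growth rate, its k).

    All default values are illustrative assumptions, not measured rates.
    """
    ks = [k_max * i / steps for i in range(steps + 1)]
    return max((growth_rate(k, fu, fv, gu, gv, Du, Dv), k) for k in ks)


if __name__ == "__main__":
    for Dv in (1.0, 10.0):
        rate, k = fastest_mode(Dv)
        verdict = "pattern can grow" if rate > 0 else "stays uniform"
        print(f"inhibitor diffusion {Dv}: rate {rate:.3f} at k={k:.2f} ({verdict})")
```

With these illustrative values, equal diffusion rates leave every ripple decaying, whereas letting the inhibitor diffuse ten times faster than the activator makes a band of ripples with k > 0 grow while the uniform state (k = 0) stays stable. That is the classic Turing signature: a preferred spacing emerges from nothing but the two interacting agents.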

Second, this mechanism is very strongly conserved. Since the same TM regulates both bird feathers and shark denticles, we are talking about a mechanism conserved over hundreds of millions of years (4). As the article puts it, quoting the author of the study (2):

According to Gareth Fraser, the researcher who led the study, the work suggests that the developing embryos of diverse backboned species set down patterns of features in their outer layers of tissue in the same way — a patterning mechanism “that likely evolved with the first vertebrates and has changed very little since.”

Third, the simplicity of the basic pattern forming mechanism does not preclude variation of patterns. Quite the contrary in fact. The simplicity of the mechanism lends itself to accommodating variation. Here is a longish quote (6):

To test whether a Turing-like mechanism could create the wide range of denticle patterns seen in other sharks and their kin, the researchers tweaked the production, degradation and diffusion rates of the activator and inhibitor in their model. They found that relatively simple changes could produce patterns that matched much of the diversity seen in this lineage. The skates, for example, tend to have more sparsely patterned denticles; by either increasing the diffusion rate or decreasing the degradation rate of the inhibitor, the researchers could make more sparse patterns emerge.
Once the initial pattern is set, other, non-Turing mechanisms complete the transformation of these rows into fully formed denticles, feathers or other epithelial appendages. “You have these deeply conserved master regulator mechanisms that act early on in the development of these appendages,” Boisvert explained, “but downstream, species-specific mechanisms kick in to refine that structure.” Still, Boisvert stressed how remarkable it is that the mechanism underlying so many different biological patterns was theorized “by a mathematician with no biological training, at a time when little about molecular biology was understood.”
So, the simple mechanisms can be tweaked to generate pattern diversity and can be easily combined with other downstream non-TM “species-specific” mechanisms to “refine the structure” the basic TM lays down.
Fourth, the similarity of mechanism exists despite a wide variety of functions supported. Feathers are not hairs, and hairs and feathers are not denticles. They serve different functions, yet formally they are generated by the same mechanism. In other words, the similarity is formal not functional and it is at this abstract formal (think “syntactic”) level that the common biological basis of these traits is revealed.
Fifth, the discovery of TMs like this one (and Hox, I assume) “bolsters a growing theme in developmental biology that “nature tends to invent something once, and plays variations on that theme”” (quote is from Alexander Schier of Harvard bio). 
Sixth, the article moots the main point relevant to this wandering disquisition: that the reason TMs are conserved is that they are so very simple (6):
Turing mechanisms are theoretically not the only ways to build patterns, but nature seems to favor them. According to Fraser, the reliance on this mechanism by so many far-flung groups of organisms suggests that some kind of constraint may be at work. “There simply may not be many ways in which you can pattern something,” he said. Once a system emerges, especially one as simple and powerful as a Turing mechanism (my emphasis, NH), nature runs with it and doesn’t look back.
What makes the mechanism simple? Well, one feature that is relevant for linguists of the MP stripe is that you really cannot take part of the reaction-diffusion function and get it to work at all. You need both parts to generate a pattern and you need nothing but these two parts to generate the wide range of patterns attested.[2] In other words, half a reaction-diffusion mechanism does you no good and once you have a whole one you need nothing more (see first quoted passage above). I hope that this sounds familiar (don’t worry, I will return to this in a moment).
I think that each point made is very linguistically suggestive, and we could do worse than absorb these suggestions as regulative ideals for theoretical work in linguistics moving forward. Let me elaborate.
First, simplicity of mechanism can account for stability of that mechanism in that simple mechanisms are easily conservable. Why? Because they are the minimum required to generate the relevant patterns (the reaction-diffusion pattern is as simple a system as one needs to generate a wide variety of patterns). Being minimal means that so long as such patterns eventuate in functionally useful structure at least this much will be needed. And given that simple generative procedures combine nicely with other more specific “rules” they will be able to accommodate both variation and species-specific bespoke adjustments. Simple rules then are both stable (because simple) and play well with others (because they can be added onto) and that is what makes them very biologically useful.[3]
IMO, this carries over to operations like Merge perfectly. Merge based dependencies come in a wide variety of flavors. Indeed, IMO, phrase structure, movement, binding, control, c-selection, constituency, structure dependence, case, theta assignment all supervene on merge based structures (again, IMO!). This is a wide variety of different linguistic functions all built on the same basic Merge generated pattern. Moreover, it is compatible with a large amount of language specific variation, variation that will be typically coded into lexical specifications. In effect, Merge creates an envelope of possibilities that lexical features will choose among. The analogy to the above Turing Mechanisms and the specificity of hair vs skin vs feathers should be obvious.
Second, Merge, like TMs, is a very simple recursive function. What does it do? All it does is combine two expressions and nothing more! It doesn’t change the expressions in combining them in any way. It doesn’t do anything but combine them (e.g. adds no linear information). So if you want a combination operation then Merge will be as simple an operation as you could ask for. This very simplicity and the fact that it can generate a wide range of functionally useful dependencies is what makes it stable, on a par with TMs.
Third, we should steal a page from the biologists and assume that “nature tends to invent something once.” In the linguistic context this means we should be very wary of generative redundancy in FL/UG, of having different generative operations serving the same kinds of structural ends. So, we should be very suspicious of theories that multiply ways of establishing non-local dependencies (e.g. both I-merge and Agree under Probing) or ways of forming relative clauses (e.g. both matching (Agree) and raising (i.e. I-merge)).[4] In other words, if Merge is required to generate phrase structure and it also suffices to generate non-local dependencies then we should not immediately assume that we have other ways of generating these non-local dependencies. It seems that nature is Ockhamist, and so venerating Ockham is both methodologically and metaphysically (i.e. biologically, linguistically) condign.
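
For readers who like to see such things spelled out, here is a minimal sketch of Merge as bare set formation, one common textbook rendering and not a formalization given in this post. The names `merge`, `terms` and `internal_merge`, and the toy lexical items, are hypothetical. The point of the sketch is only that one combining operation, which neither alters its inputs nor orders them, yields both phrase structure and (when one input is already contained in the other) the configuration usually described as movement.

```python
def merge(a, b):
    """Combine two expressions into the unordered set {a, b}, and nothing more."""
    return frozenset((a, b))


def terms(x):
    """Yield x and every expression contained in x, at any depth."""
    yield x
    if isinstance(x, frozenset):
        for part in x:
            yield from terms(part)


def internal_merge(x, part):
    """Merge x with an expression it already contains (assumed I-merge encoding).

    No new operation is introduced: this is merge() applied to a term of x.
    """
    if part == x or part not in set(terms(x)):
        raise ValueError("part must be a proper term of x")
    return merge(part, x)


if __name__ == "__main__":
    dp = merge("the", "dog")           # no linear order: {the, dog}
    vp = merge("saw", dp)              # dp is embedded unchanged
    moved = internal_merge(vp, dp)     # {dp, {saw, dp}}
    print(dp == merge("dog", "the"))
    print(dp in vp)
    print(moved == merge(dp, vp))
```

Nothing here is specific to any construction; any language-particular detail would have to be added downstream, much as the species-specific mechanisms are layered on top of the Turing pattern.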
Fourth, it is hard to read this article and not recognize that the theoretical temperament behind Turing’s conjectures about mechanism is very similar to the one that motivates Chomsky. Here is a nice version of that theoretical sentiment (6):
“Biological diversity, across the board, is based on a fairly restricted set of principles that seem to work and are reused over and over again in evolution,” said Fraser. Nature, in all its exuberant inventiveness, may be more conservative than we thought.
And all that linguistic diversity we regularly survey might also be the output of a very restricted set of very simple Generative Procedures. That is the MP hope (and as I have noted, IMO it has been reasonably well vindicated (as I have argued in various papers recently released or forthcoming)), and it is nice to see that it is finding a home in mainstream biology.[5]
Enough. The problem of stability of FL/UG smells a lot like the problem of deep conservation in biology. It also seems like simplicity might have something to say about why this might be the case. If so, the second motivation for MP simplicity might just have some non-trivial biological motivation.[6]
[1]It is likely worse than this. As Jerry Fodor often noted, we are doubly removed from the basic mechanisms in that genes grow brains and brains secrete minds. The inference from behavior to genes thus must transit through tacit assumptions about how brains subvene minds. We know very little about this in general and especially little about how brains support linguistic cognition. Hence, all inferences from phenotypic simplicity to genetic simplicity are necessarily tenuous. Of course, if this is the best that one can do, one does it realizing the pitfalls. Hence this is not a critique, just an observation, and one, apparently, that extends to virtually every attempt to ground “behavior” in genes (as Lewontin long ago noted). 
[2]Here’s another thought to chew on: it is the generative procedure that is the same (a reaction-diffusion mechanism) not the outputs. So it is the functions in intension that are conserved, not the extensions thereof, which are very different.
[3]I cannot currently spell this out but I suspect that simplicity ties in with modularity. You get a simple mechanism and it easily combines with others to create complexity. If modularity is related to evolvability (which sure smells right) then simplicity will be the kind of property that evolving systems prize.
[4]This is one reason I am a fan of Sportiche’s recent efforts to reanalyze all relativization in terms of raising (aka, I-merge). More specifically, we should resist the temptation to assume that when we see different constructions evincing different patterns that the generative procedures underlying these patterns are fundamentally different.
[5]And we got there first. It is interesting to see that Chomsky’s reasoning is being recapitulated inside biology. Indeed, contrary to the often voiced complaint that linguistics is out of step with the leading ideas in biology, it seems to have been very much ahead of the curve. 
[6]Of course, it does not need this to be an important ideal. Methodological virtue also prizes simplicity. But this is different, and if tenable, important.


  1. Is there actually good evidence for the assertion that any child can learn any language? I agree that it seems clear that children can learn to produce mutually intelligible E-language when exposed to community input in which that E-language is produced. This argument crucially relies on children sharing the ability to acquire the same I-language though. Do we have any empirical evidence that that is true?

    1. Good question. What evidence do we have that adults acquire the same I language? Not that much. Indeed, I doubt that they do, in the sense that no two native speakers "have" the exact same Gs. Rather they have Gs similar enough to "produce mutually intelligible E-language," as you put it. From what I can tell, we should have roughly the same level of confidence concerning kids and their Gs. In fact, what we mean when we say that speakers have the same G or that a G is the G of a given NL is that speakers of that NL have Gs with the property you note.

      Second point: it does seem a truism that kids placed in any speech community end up looking like all other speakers of that community (more or less). So, we have about as much evidence that any kid can acquire any language as we have evidence that humans acquire Gs, and that evidence seems pretty good to me.

      But nice question. Thx.

  2. Chapter 10, I think, of Sapir's _Language_ (1921) has some relevant discussion. I think I remember reading somebody saying something about adopted children in one of the older texts, but have not managed to find it yet.

    But I don't think we have solid evidence that children from say Southeast Asia where inflection seems to be absent over a wide area would not have any minor deficits if transplanted to Russia or pre-contact Australia; not only would valid research ethics rule out the necessary experiments, but racism etc. would make them unconvincing even if they were done.

    But there is definitely nothing clear enough to have been noticed by anybody.

  3. A great post! Two observations. First, I must take issue with (or perhaps don't understand) the logic of the introduction. I don't see any necessary relation between how long ago FL attained its current form and how long it took to evolve. The fact that it "arose" in the form that we know it 50-100kya tells us nothing about how long it was developing. Had it arisen just a hundred years ago that would hardly mean it evolved in a hundred years. It is not even clear how one could determine the starting point for its evolution.

    Second: here's a just-so-story that is plausible enough (imo) to consider. Imagine that Merge came suddenly into being in our ancestors ten million years ago, but was restricted to one particular module – say, plan-formation – to some reproductive advantage. Over the millennia, it drifted, sometimes uselessly or detrimentally, into other modules, but occasionally found its way into C-I systems in ways that provided the cognitive advantages Chomsky and others have suggested, even without spoken language. Evidence that humans were getting "smarter" before the clear advent of communicative language certainly exists, after all. At some point – and this would be the leap forward – the computations feeding C-I also found their way to A-P, and suddenly we creatures use language to communicate, which of course changes everything about human life – and *that's* what we see emerging so suddenly 50-100kya.

    In other words, the slow spread of Merge across cognitive domains may indeed have been a long evolutionary process, resulting in the development of significantly complex structures, but culminating in the leap to language-as-communication. Merge itself of course did not "evolve": it arose, and then got put to a variety of uses, like the TMs you describe.

    1. I agree with this first observation (but felt like I was missing some step in the argument). A more relevant start date might be the most recent common ancestor of humans and the common chimpanzee, say, which would be circa 5 mya.