Tuesday, January 31, 2017

Some short reads

Here are a couple of things to read that I found amusing.

The first two concern a power play by Elsevier that seems serious. The publisher seems to be about to get into a big fight with countries about access to their journals. Nature reports that Germany, Taiwan and Peru will soon have an Elsevier embargo placed on them, with the journals that it publishes no longer available to scientists in these countries. This seems to me a big deal, and I suspect that this will be a turning point in open access publishing. However big Elsevier is, were I their consigliere, I would counsel against getting into fights with countries, especially ones with big scientific establishments.

There is more, in fact. It seems that Elsevier is also developing its own impact numbers, ones that make its journals look better than the other numbers do (see here). Here's one great quote from the link: "seeing a publisher developing its own metrics sounds about as appropriate as Breitbart news starting an ethical index for fake news."

Embargoing countries and setting up one's own impact metric: seems like fun times.

Here is a second link that I point to just for laughs and because I am a terrible technophobe. Many of my colleagues are LaTeX fans. A recent paper suggests that whatever its other virtues, LaTeX is a bit of a time sink. Take a look. Here's the synopsis:
To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors. On most measures, expert LaTeX users performed even worse than novice Word users. LaTeX users, however, more often report enjoying using their respective software.
Ok, I admit it, my schadenfreude is tingling.

A small addendum to the previous post on Syntactic Structures

Here’s a small addition to the previous post prompted by a discussion with Paul Pietroski. I am always going on about how focusing on recursion in general as the defining property of FL is misleading. The interesting feature of FL is not that it produces (or can produce) recursive Gs but that it produces the kinds of recursive Gs that it does. So the minimalist project is not to explain how recursion arises in humans but how a specific kind of recursion arises in the species. What kind? Well the kind we find in the Gs we find. What kind are these? Well not FSGs nor CFGs, or at least this is what Syntactic Structures (SS) argues.

Let me put this another way: GG has spent the last 60 years establishing that human Gs have a certain kind of recursive structure. SS argued for transformational grammar on the grounds that FSGs (which were recursive) were inherently too weak and that PSGs (also recursive) were empirically inadequate. Transformational Gs, SS argued, are the right fit.
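
For the computationally inclined, here is a minimal sketch (the toy rule and sentences are illustrative assumptions of mine, not SS's actual examples) of the sort of nested dependency that finite-state grammars provably cannot generate for unbounded depth, but that a single recursive rule handles with ease:

```python
# Toy recursive rule (a stand-in, not SS's grammar):
#   S -> "it rains"
#   S -> "if " S " then it rains"
# Each application nests one more if/then dependency (an a^n b^n pattern),
# which no finite-state grammar can generate for unbounded n.

def nested(n):
    """Return a sentence with n properly nested if/then dependencies."""
    if n == 0:
        return "it rains"
    return "if " + nested(n - 1) + " then it rains"  # the recursive step

for n in range(3):
    print(nested(n))
# it rains
# if it rains then it rains
# if if it rains then it rains then it rains
```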


So, when people claim that the minimalist problem is to explain the sources of recursion, or observe that there may be/is recursion in other parts of cognition and thereby claim to “falsify” the project, it seems to me that they are barking up a red herring (I love the smell of mixed metaphors in the morning!). From where I sit, the problem is explaining how an FL that delivers TGish recursive Gs arose, as this is the kind of FL that we have and these are the kinds of Gs that it delivers. SS makes clear that, even “in the earliest days of GG,” not all recursive Gs are created equal and that the FL and Gs of interest have specific properties. It’s the sources of this kind of recursion we want to explain. This is worth bearing in mind when issues of recursion (and its place in minimalist theory) make it to the spotlight.

Friday, January 27, 2017

On Syntactic Structures

I am in the process of co-editing a volume on Syntactic Structures (SS) that is due out in 2017 to celebrate (aka, exploit) the 60th anniversary of the publication of this seminal work. I am part of a gang of four (the other culprits being Howard Lasnik, Pritti Patel, and Charles Yang, supervised/inspired by Norbert Corver). We have solicited about 15 shortish pieces on various themes. The tentative title is something like The continuing relevance of Syntactic Structures to GG. Look for it on your newsstands sometime late next year. It should arrive just in time for the 2017 holidays and is sure to be a great Xmas/Hanukkah/Kwanzaa gift. As preparation for this editorial escapade, I have re-read SS several times and have tried to figure out for myself what its lasting contribution is. Clearly it is an important historical piece as it sparked the Generative Enterprise. The question remains: What SS ideas have current relevance? Let me mention five.

The first and most important idea centers on the aims of linguistic theory (ch 6). SS contrasts the study of grammatical form and the particular internal (empirically to be determined) “simplicity” principles that inform it with discovery procedures that are “practical and mechanical” (56) methods that “an investigator (my emph, NH) might actually use, if he had the time, to construct a grammar of the language directly from the raw data” (52). SS argues that a commitment to discovery procedures leads to strictures on grammatical analysis (e.g. bans on level mixing) that are methodologically and empirically dubious.

The discussion in SS is reminiscent of the well-known distinction in the philo of science between the context of discovery and the context of justification. How one finds one’s theory can be idiosyncratic and serendipitous; justifying one’s “choice” is another matter entirely. SS makes the same point.[1] It proposes a methodology of research in which grammatical argumentation follows more or less the standard of reasoning in the sciences more generally: data plus general considerations of simplicity are deployed to argue for the superiority of one analysis over another. SS contrasts this with the far stronger strictures Structuralists endorsed, principles which, if seriously practiced, would sink most any serious science. In practice, then, what SS is calling for is that linguists act like regular scientists (in modern parlance, reject methodological dualism).

Let me be a bit more specific. The structuralism that SS was arguing against took as a methodological dictum that the aim of analysis was to classify a corpus into a hierarchy of categories conditioned by substitution criteria. So understood, grammatical categories are classes of words, which are definable as classes of morphemes, which are definable as classes of phonemes, which are definable as classes of phones. The higher levels are, in effect, simple generalizations over lower level entities. The thought was that higher level categories were entirely reducible to lower level distributional patterns. In this sort of analysis, there are no (and can be no) theoretical entities, in the sense of real abstract constructs that have empirical consequences but are not reducible or definable in purely observational terms. By arguing against discovery procedures and in favor of evaluation metrics, SS is in effect arguing for the legitimacy of theoretical linguistics. Or, more accurately, for the legitimacy of normal scientific inquiry into language, free of methodological constrictions that would cripple physics were they applied there.

Let me put this another way: Structuralism adopted a strong Empiricist methodology in which theory was effectively a summary of observables. SS argues for the Rationalist conception of inquiry in which theory must make contact with observables, but is not (and cannot be) reduced to them. Given that the Rationalist stance simply reflects common scientific practice, SS is a call for linguists to start treating language scientifically and not hamstring inquiry by adopting unrealistic, indeed non-scientific, dicta. This is why SS (and GG) is reasonably seen as the start of the modern science of linguistics.

Note that the discussion here in SS differs substantially from that in chapter 1 of Aspects, though there are important points of contact.[2] SS is Rationalist as concerns the research methodology of linguists. Aspects is Rationalist as concerns the structure of human mind/brains. The former concerns research methodology. The latter concerns substantive claims about human neuro-psychology.

That said, there are obvious points of contact. For example, if discovery procedures fail methodologically, then this strongly suggests that they will also fail as theories of linguistic mental structures. Syntax, for example, is not reducible to properties of sound and/or meaning despite its having observable consequences for both. In other words, the Autonomy of Syntax thesis is just a step away from the rejection of discovery procedures. It amounts to the claim that syntax constitutes a viable G level that is not reducible to the primitives and operations of any other G level.

To beat this horse good and dead: Gs contain distinct levels that interact with empirically evaluable consequences, but they are not organized so that higher levels are definable in terms of generalizations over lower level entities. Syntax is real. Phonology is real. Semantics is real. Phonetics is real. These levels have their own primitives and principles of operation. The levels interact, but are ontologically autonomous. Given the modern obsession with deep learning and its implicit endorsement of discovery procedures, this point is worth reiterating and keeping in mind. The idea that Gs are just generalizations over generalizations over generalizations, which seems to be the working hypothesis of Deep Learners and others,[3] has a wide following nowadays, so it is worth recalling the SS lesson that discovery procedures both don’t work and are fundamentally anti-theoretical. It is Empiricism run statistically amok!

Let me add one more point and then move on. How should we understand the SS discussion of discovery procedures from an Aspects perspective given that they are not making the same point? Put more pointedly, don’t we want to understand how a LAD (aka, kid) goes from PLD (a corpus) to a G? Isn’t this the aim of GG research? And wouldn’t such a function be a discovery procedure?

Here’s what I think: yes and no. What I mean is that SS makes a distinction that is still important to keep in mind. Principles of FL/UG are not themselves sufficient to explain how LADs acquire Gs. More is required. Here’s a quote from SS (56):

Our ultimate aim is to provide an objective, non-intuitive way to evaluate a grammar once presented, and to compare it with other proposed grammars. We are thus interested in describing the form of grammars (equivalently, the nature of linguistic structure) and investigating the empirical consequences of adopting a certain model for linguistic structure, rather than in showing how, in principle, one might have arrived at the grammar of a language.

Put in slightly more modern terms: finding FL/UG does not by itself provide a theory of how the LAD actually acquires a G. More is needed. Among other things, we need accounts of how we find phonemes, morphemes, and many of the other units of analysis the levels require. The full theory will be very complex, with lots of interacting parts. Many mental modules will no doubt be involved. Understanding that there is a peculiarly linguistic component to this story does not imply forgetting that it is not the whole story. SS makes this very clear. However, focusing on the larger problem often leads to ignoring the fundamental linguistic aspects of the problem, what SS calls the internal conditions on adequacy, many/some of which will be linguistically proprietary.[4]

So, the most important contribution of SS is that it launched the modern science of linguistics by arguing against discovery procedures (i.e. methodological dualism). And sadly, the ground that SS should have cleared is once again infested. Hence the continuing relevance of the SS message.

Here are four more ideas of continuing relevance.

First, SS shows that speaker intuitions are a legitimate source of linguistic data. The discussions of G adequacy in the first several chapters are all framed in terms of what speakers know about sentences. Indeed, that Gs are models of human linguistic behavior over an unbounded domain is quite explicit (15):

…a grammar mirrors the behavior of speakers who, on the basis of a finite and accidental experience with language, can produce or understand an indefinite number of new sentences. Indeed, any explication of “grammatical in L” …can be thought of as offering an explanation for this fundamental aspect of linguistic behavior.

Most of the data presented for choosing one form of G over another involves plumbing a native speaker’s sense of what is and isn’t natural for his/her language. SS has an elaborate discussion of this in chapter 8 where the virtues of “constructional homonymity” (86) as probes of grammatical adequacy are elaborated. Natural languages are replete with sentences that have the same phonological form but differ thematically (flying planes can be dangerous) or that have different phonological forms but are thematically quite similar (John hugged Mary, Mary was hugged by John). As SS notes (83): “It is reasonable to expect grammars to provide explanations for some of these facts” and for theories of grammar to be evaluated in terms of their ability to handle them.
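
Constructional homonymity has a direct computational rendering: one string, more than one derivation. Here is a minimal sketch (the toy grammar and category labels are illustrative assumptions of mine, not SS's analysis) in which a small CKY-style parser assigns "flying planes can be dangerous" two distinct structures, differing in how "flying" combines with "planes":

```python
from collections import defaultdict

# Toy binary rules in Chomsky-normal-form style; the labels are invented.
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "Ger", "N"),      # "flying planes" = the activity of flying planes
    ("NP", "Adj", "N"),      # "flying planes" = planes that fly
    ("VP", "Modal", "CopP"),
    ("CopP", "Cop", "Adj"),
]
LEXICON = {
    "flying": {"Ger", "Adj"},
    "planes": {"N"},
    "can": {"Modal"},
    "be": {"Cop"},
    "dangerous": {"Adj"},
}

def parses(words):
    n = len(words)
    # chart[i][j] maps a category to all trees spanning words[i:j]
    chart = [[defaultdict(list) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                  # seed with the words
        for cat in LEXICON[w]:
            chart[i][i + 1][cat].append((cat, w))
    for span in range(2, n + 1):                   # build larger spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, b, c in BINARY:
                    for left in chart[i][k][b]:
                        for right in chart[k][j][c]:
                            chart[i][j][lhs].append((lhs, left, right))
    return chart[0][n]["S"]

def show(t):
    if len(t) == 2:
        return "[%s %s]" % t
    return "[%s %s %s]" % (t[0], show(t[1]), show(t[2]))

# Prints two bracketings, one per analysis of "flying".
for tree in parses("flying planes can be dangerous".split()):
    print(show(tree))
```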

It is worth noting that the relevance of constructional homonymity to “debates” about structure dependence has recently been highlighted once again in a paper by Berwick, Pietroski, Yankama and Chomsky (see here and here for discussion). It appears that too many forget that linguistic facts go beyond the observation that “such and such a string…is or is not a sentence” (85). SS warns against forgetting this, and the world would be a better place (or at least dumb critiques of GG would be less thick on the ground) had this warning from 60 years ago been heeded.

Second, SS identifies the central problem of linguistics as how to relate sound and meaning (the latter being more specifically thematic roles (though this term is not used)). This places Gs and their structure at the center of the enterprise. Indeed, this is what makes constructional homonymity such an interesting probe into the structure of Gs. There is an unbounded number of these pairings and the rules that pair them (i.e. Gs) are not “visible.” This means the central problem in linguistics is determining the structure of these abstract Gs by examining their products. Most of SS exhibits how to do this and the central arguments in favor of adding transformations to the inventory of syntactic operations involve noting how transformational grammars accommodate such data in simple and natural ways.

This brings us to the third lasting contribution of SS. It makes a particular proposal concerning the kind of G natural languages embody. The right G involves Transformations (T). Finite State Gs don’t cut it, nor can simple context free PSGs. T-grammars are required. The argument against PSGs is particularly important. It is not that they cannot generate the right structures but that they cannot do so in the right way, capturing the evident generalizations that Gs embodying Ts can do.

Isolating Ts as grammatically central operations sets the stage for the next 50 years of inquiry: specifying the kinds of Ts required and figuring out how to limit them so that they don’t wildly overgenerate.

SS also proposes the model that until very recently was at the core of every GG account. Gs contained a PSG component that generated kernel sentences (which effectively specified thematic dependencies) and a T component that created further structures from these inputs. Minimalism has partially stuck to this conception. Though it has (or some versions have) collapsed PSG kinds of rules and T rules, treating both as instances of Merge, minimalist theories have largely retained the distinction between operations that build thematic structure and those that do everything else. So, even though Ts and PSG rules are formally the same, thematic information (roughly the info carried by kernel sentences in SS) is the province of E-merge applications, and everything else the province of I-merge applications. The divide between thematic information and all other kinds of semantic information (aka the duality of interpretation) has thus been preserved in most modern accounts.[5]

Last, SS identifies two different linguistic problems: finding a G for a particular L and finding a theory of Gs for arbitrary L. This can also be seen as explicating the notions “grammatical in L” for a given language L vs the notion of “grammatical” tout court. This important distinction survives to the present as the difference between Gs and FL/UG. SS makes it clear (at least to me) that the study of the notion grammatical in L is interesting to the degree that it serves to illuminate the more general notion grammatical for arbitrary L (i.e. Gs are interesting to the degree that they illuminate the structure of FL/UG). As a practical matter, the best route into the more general notion proceeds (at least initially) via the study of the properties of individual Gs. However, SS warns against thinking that a proper study of the more general notion must await the development of fully adequate accounts of the more specific.

Indeed, I would go further. The idea that investigations of the more general notion (e.g. of FL/UG) are parasitic on (and secondary to) establishing solid language particular Gs is to treat the more general notion (UG) as the summary (or generalization) of properties of individual Gs. In other words, it is to treat UG as if it were a kind of Structuralist level, reducible to the properties of individual Gs. But if one rejects this conception, as the SS discussion of levels and discovery procedures suggests we should, then prioritizing G facts and investigation over UG considerations is a bad way to go.

I suspect that the above conclusion is widely appreciated in the GG community, with only those committed to a Greenbergian conception of Universals dissenting. However, the logic carries over to modern minimalist investigation as well. The animus against minimalist theorizing can, IMO, be understood as reflecting the view that such airy speculation must play second fiddle to real linguistic (i.e. G based) investigations. SS reminds us that the hard problem is the abstract one, that this is the prize we need to focus on, and that it will not just solve itself if we concentrate on the “lower” level issues. This would hold true if the world were fundamentally Structuralist, with higher levels of analysis just being generalizations of lower levels. But SS argues repeatedly that this is not right. It is a message that we should continue to rehearse.

Ok, that’s it for now. SS is chock full of other great bits and the collection we are editing will, I am confident, bring them out. Till then, let me urge you to (re)read SS and report back on  your favorite parts. It is excellent holiday reading, especially if read responsively accompanied by some good wine.


[1] What follows uses the very helpful and clear discussion of these matters by John Collins (here): 26-7.
[2] Indeed, the view in Aspects is clearly prefigured in SS, though is not as highlighted in SS as it is later on (see discussion p. 15).
…a grammar mirrors the behavior of speakers who, on the basis of a finite and accidental experience with language, can produce or understand an indefinite number of new sentences. Indeed, any explication of “grammatical in L” …can be thought of as offering an explanation for this fundamental aspect of linguistic behavior.
[3] Elissa Newport’s work seems to be in much the same vein in treating everything as probability distributions over lower level entities bottoming out in something like syllables or phones.
[4] Of course, the ever hopeful minimalist will hope that not very much will be such.
[5] I would be remiss if I did not point out that this is precisely the assumption that the movement theory of control rejects.

Wednesday, January 25, 2017

Where have the MOOCs gone?

I was reading the latest (February 9, 2017) issue of the NYR and the lead article is a review of The Revenge of Analog (by David Sax). The reviewer is Bill McKibben and it discusses a mini revolt against the digital world that is apparently taking place. I have no personal experience of this, but the article is a fun read (here). There is one point that it makes that I found interesting and that comports with my own impression: MOOCs have disappeared. There was a time when that's all that we heard about, at least in academia. How MOOCs would soon displace all standard teaching, and how everyone had to find a way of translating their courses into web-accessible/MOOC form or be left far behind, the detritus of the forward march of pedagogy. This coincided with the view that education would be saved if only every student could be equipped with a tablet that would serve as a conduit to the endless world of information out there. MOOCs were the leading edge of a technological revolution that would revamp our conception of education at all levels beyond recognition.

I was very skeptical of this at the time (see, e.g. here and here). Sax tries to explain what went wrong.  Here's McKibben on Sax.
The notion of imagination and human connection as analog virtues comes across most powerfully in Sax’s discussion of education. Nothing has appealed to digital zealots as much as the idea of “transforming” our education systems with all manner of gadgetry. The “ed tech” market swells constantly, as more school systems hand out iPads or virtual-reality goggles; one of the earliest noble causes of the digerati was the One Laptop Per Child global initiative, led by MIT’s Nicholas Negroponte, a Garibaldi of the Internet age. The OLPC crew raised stupendous amounts of money and created machines that could run on solar power or could be cranked by hand, and they distributed them to poor children around the developing world, but alas, according to Sax, “academic studies demonstrated no gain in academic achievement.” Last year, in fact, the OECD reported that “students who use computers very frequently at school do a lot worse in most learning outcomes.”
At the other end of the educational spectrum from African villages, the most prestigious universities on earth have been busy putting courses on the Web and building MOOCs, “massive open online courses.” Sax misses the scattered successes of these ventures, often courses in computer programming or other technical subjects that aren’t otherwise available in much of the developing world. But he’s right that many of these classes have failed to engage the students who sign up, most of whom drop out.
Even those who stay the course “perform worse, and learn less, than [their] peers who are sitting in a school listening to a teacher talking in front of a blackboard.” Why this is so is relatively easy to figure out: technologists think of teaching as a delivery system for information, one that can and should be profitably streamlined. But actual teaching isn’t about information delivery—it’s a relationship. As one Stanford professor who watched the MOOCs expensively tank puts it, “A teacher has a relationship with a group of students. It is those independent relationships that is the basis of learning. Period.”

The diagnosis fits with my perceptions as well. One of the problems MOOCs would always face, IMO, is that they left out how social an activity teaching and learning is. This is not so for everything, but it is so for many things, especially non-technical things. The problem is not the efficient transfer of information, but figuring out how to awaken the imaginative and critical sensibilities of students. (Note: these are harder to "test" than is info transfer.) The MOOC conception treated students as "consumers" rather than "initiates." Ideas are not techniques. They need a different kind of exploration. Indeed, half of teaching is getting students to appreciate why something is important, and that comes from the personal relations established between teacher and student, students to one another, and student to teacher. This, at any rate, is Sax's view, and if he is right then the failure of MOOCs as a general strategy for education makes sense. Or, at least, this would be a good reason for their failure.

BTW, there is a nice version of the relevant distinction in the movie The Prime of Miss Jean Brodie, where Brodie notes that 'education' comes from the Latin 'ex ducere' (to lead out), whereas much of what goes on is better seen as 'intrusion,' as in 'thrust in.' Information can be crammed, thrust, delivered. Education must be more carefully curated.

Like I said, I wish this were the reason for the end of MOOCs, but I doubt it. What probably killed them (if they are indeed dead) was likely their cost. They did not really save any money, though they shifted cash from one set of pockets to another. In other words, they were scam-like, and the fad has run its day. Will it return? Almost certainly. We just need a new technological twist to allow for repackaging of the stuff. Maybe when virtual reality is more prevalent it will serve as the new MOOC platform and we will see the hysteria/hype rise again. There is no end to technological utopias because of their sales value. So expect a rise of MOOCs or MOOC equivalents some time soon at a university near you. But don't expect much more the next time around.

BTW, the stuff on board games was fascinating. Is this really a thing?

Friday, January 20, 2017

Tragedy, farce, pathos

Dan Everett (DE) has written once again on his views about Piraha, recursion, and the implications for Universal Grammar (here). I was strongly tempted to avoid posting on it, for it adds nothing new of substance to the discussion (and will almost certainly serve to keep the silliness alive) beyond a healthy dose of self-pity and self-aggrandizement. It makes the same mistakes, in almost the same way, and adds a few more irrelevancies to the mix. If history surfaces first as tragedy and second as farce (see here), then pseudo debates in their moth-eaten n-th iteration are just pathetic. The Piraha “debate” has long since passed its sell-by date. As I’ve said all that I am about to say before, I would urge you not to expend time or energy reading this. But if you are the kind of person who slows down to rubberneck a wreck on the road and can’t help but find the ghoulish fascinating, this post is for you.

The DE piece makes several points.

First, that there is a debate. As you all know, this is wrong. There can be no debate if the controversy hinges on an equivocation. And it does, for what the DE piece claims about the G of Piraha, even if completely accurate (which I doubt, but the facts are beyond my expertise), has no bearing on Chomsky’s proposal, viz. that recursion is the only distinctively linguistic feature of FL. This is a logical point, not an empirical one. More exactly, the controversy rests on an equivocation concerning the notion “universal.” The equivocation has been a consistent feature of DE’s discussions and this piece is no different. Let me once again explain the logic.

Chomsky’s proposal rests on a few observations. First, that humans display linguistic creativity. Second, that humans are only accidentally native speakers of their native languages.

The first observation is manifest in the fact that, for example, a native speaker of English can effortlessly use and understand an unbounded number of linguistic expressions never before encountered. The second is manifest in the observation that a child deposited in any linguistic community will grow up to be a linguistically competent native speaker of that community’s language, with linguistic capacities indistinguishable from those of any of the other native speakers (e.g. wrt his/her linguistic creativity).

These two observations prompt some questions.

First, what underlying mental architecture is required to allow for the linguistic creativity we find in humans? Answer 1: a mind that has recursive rules able to generate ever more sophisticated expressions from simple building blocks (aka, a G). Question 2: what kind of mental architecture must such a G-competent being have? Answer 2: a mind that can acquire recursive rules (i.e. a G) from the products of those rules (i.e. generated examples of the G). Why recursive rules? Because linguistic productivity just names the fact that human speakers are competent with respect to an unbounded number of different linguistic expressions.

Second, why assume that the capacity to acquire recursive Gs is a feature of human minds in general rather than simply a feature of those human minds that have actually acquired recursive Gs? Answer: because any human can acquire any G that generates any language. So the capacity to acquire language in general requires the meta-capacity to acquire recursive rule systems (aka, Gs). As this meta-capacity seems to be restricted to humans (i.e. so far as we know, only humans display the kind of recursive capacity manifested in linguistic creativity), and as this capacity is most clearly manifest in language, Chomsky’s conjecture is that if there is anything linguistically specific about the human capacity to acquire language, the linguistic specificity resides in this recursive meta-capacity.[1] Or to put this another way: there may be more to the human capacity to acquire language than the recursive meta-capacity, but at least this meta-capacity is part of the story.[2] Or, to put this yet another way: absent the innately given meta-capacity to acquire (certain specifiable kinds of) recursive Gs, humans would not be able to acquire the kinds of Gs that we know they in fact do acquire (e.g. Gs like those English, French, Spanish, Tagalog, Arabic, Inuit, Chinese … speakers have in fact acquired). Hence, humans must come equipped with this recursive meta-capacity as part of FL.

Ok, some observations: recursion in this story is principally a predicate of FL, the meta-capacity. The meta-capacity is to acquire recursive Gs (with specific properties that GG has been in the business of identifying for the last 50 years or so). The conjecture is that humans have this meta-capacity (aka FL) because they do in fact display linguistic creativity (and, as the DE paper concedes, native speakers of non-Piraha languages regularly display linguistic creativity, implicating the internalization of recursive language specific Gs) and because the linguistic creativity a native speaker of (e.g.) English displays could have been displayed by any person raised in an English linguistic milieu. In sum, FL is recursive in the sense that it has the capacity to acquire recursive Gs, and speakers of any language have such FLs.

Observe that FL must have the capacity to acquire recursive Gs even if not all human Gs are recursive. FL must have this capacity because all agree that many/most (e.g.) non-Piraha Gs are recursive in the sense that Piraha is claimed not to be. So, the following two claims are consistent: (1) some languages have non-recursive Gs but (2) native speakers of those languages have recursive FLs. This DE piece (like all the other DE papers on this topic) fails, once again, to recognize this. A discontinuous quote (4):

 If there were a language that chose not to use recursion, it would at the very least be curious and at most would mean that Chomsky’s entire conception of language/grammar is wrong….

Chomsky made a clear claim –recursion is fundamental to having a language. And my paper did in fact present a counterexample. Recursion cannot be fundamental to language if there are languages without it, even just one language.

First an aside: I tend to agree that it would indeed be curious if we found a language with a non-recursive G, given that virtually all of the Gs that have been studied are recursive. Thus finding one that is not would be odd for the same reason that finding a single counterexample to any generalization is always curious (which is why I tend not to believe DE’s claims and tend to find the critique by Nevins, Pesetsky and Rodrigues compelling).[3] But, and this is the main take home message, curious or not, it is at right angles to Chomsky’s claim concerning FL, for the reasons outlined above. The capacity to acquire recursive Gs is not falsified by the acquisition of a non-recursive one. Thus, logically speaking, the observation that Piraha does not have embedded clauses (i.e. does not display one of the standard diagnostics of a recursive G) does not imply that Piraha speakers do not have recursive FLs. Thus, DE’s claims are completely irrelevant to Chomsky’s even if correct. That point has been made repeatedly and, sadly, it has still not sunk in. I doubt that for some it ever will.

Ok, let’s now consider some other questions. Here’s one: is this linguistic meta-capacity permanent or evanescent? In other words, one can imagine that FL has the capacity to acquire recursive Gs but that once it has acquired a non-recursive G it can no longer acquire a recursive one. DE’s article suggests that this is so for Piraha speakers (p. 7). Again, I have no idea if this is indeed the case (if true it would constitute evidence for a strong version of the Sapir-Whorf hypothesis), but this claim, even if correct, is at right angles to Chomsky’s claim about FL. Species specific dedicated capacities need not remain intact after use. It could be true that FL is only available for first language acquisition, and this would mean that second languages are acquired in different ways (maybe by piggybacking on the first G acquired).[4] However, so far as I know, neither Chomsky nor GG has ever committed hostages to this issue. Again, I am personally skeptical that having a Piraha G precludes you from the recursive parts of a Portuguese G, but I have nothing but prejudicial hunches to sustain the skepticism. At any rate, it doesn’t bear on Chomsky’s thesis concerning FL. The upshot: DE’s remarks are once again at right angles to Chomsky’s claims, so, interesting as the possibility raised might be for issues relating to second language acquisition, it is not relevant to Chomsky’s claims about the recursive nature of FL.

A third question: is the meta-capacity culturally relative? DE’s piece suggests that it is, because the actual acquisition of recursive Gs might be subject to cultural influences. The point seems to be that if culture influences whether an acquired G is recursive, then culture must influence whether the meta-capacity is recursive as well. But this does not follow. Let me explain.

All agree that the details of an actual G are influenced by all sorts of factors, including culture.[5] This must be so and has been insisted upon since the earliest days of GG. After all, the G one acquires is a function of FL and the PLD used to construct that G. But the PLD is itself a function of what the child actually hears, and there is no doubt that which utterances are performed is influenced by the culture of the utterers.[6] So, that culture has an effect on the shape of specific Gs is (or should be) uncontroversial. However, none of this implies that the meta-capacity to build recursive Gs is itself culturally dependent, nor does DE’s piece explain how it could be. In fact, it has always been unclear how external factors could affect this meta-capacity. You either have a recursive meta-capacity or you don’t. As Dawkins put it (see here for discussion and references):

… Just as you can’t have half a segment, there are no intermediates between a recursive and a non-recursive subroutine. Computer languages either allow recursion or they don’t. There’s no such thing as half-recursion. It’s an all or nothing software trick… (383)

Given this “all or nothing” quality, what would it mean to say that the capacity (i.e. the innately provided “computer language” of FL) was dependent on “culture”? Of course, if what you mean is that the exercise of the capacity is culture dependent, and what you mean by this is that it depends on the nature of the PLD (and other factors) that might themselves be influenced by “culture,” then duh! But if this is what DE’s piece intends, then once again it fails to make contact with Chomsky’s claim concerning the recursive nature of FL. The capacity is what it is, though of course the exercise of the capacity to produce a G will be influenced by all sorts of factors, some of which we can call “culture.”[7]
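
Dawkins’s point can be made concrete in a few lines (a toy illustration of my own; the function names and sentences are invented): a subroutine either calls itself or it does not, and only the self-calling version yields unbounded depth.

```python
# All-or-nothing: the only difference between these routines is the self-call,
# and that is the difference between unbounded and fixed depth.

def embed_recursive(depth):
    """Self-calling version: one rule, embeddings of any depth."""
    if depth == 0:
        return "it rains"
    return "John thinks that " + embed_recursive(depth - 1)  # recursion

def embed_flat():
    """Non-recursive version: can only list a fixed, finite set of templates."""
    return ["it rains", "John thinks that it rains"]

print(embed_recursive(5))  # depth limited only by resources
print(embed_flat())        # depth fixed in advance, come what may
```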

Two more points and we are done.

First, there is a source for the confusion in DE’s papers (and it is the same one I have pointed to before). DE’s discussion treats all universals as if Greenbergian. Here’s a quote from the current piece that shows this (I leave it as an exercise to the reader to uncover the Greenbergian premise):

The real lesson is that if recursion is the narrow faculty of language, but doesn’t actually have to be manifested in a given language, then likely more languages than Piraha…could lack recursion. And by this reasoning we derive the astonishing claim that, although recursion would be the characteristic that makes human language possible, it need not actually be found in any given language. (8)

Note the premise: unless every G is recursive then recursion cannot be “that which makes human languages possible.” But this only makes sense if you understand things as Greenberg does. If you understand the claim as being about the capacity to acquire recursive Gs then none of this follows.

Nor are we led to absurdity. Let me froth here. Of course, nobody would think that we had a capacity for constructing recursive Gs unless we had reason to think that some Gs were so. But we have endless evidence that this is the case. So, given that there is at least one such G (indeed, endlessly many), humans clearly must have the capacity to construct such Gs. So, though we might have had such a capacity and never exercised it (this is logically possible), we are not really in that part of the counterfactual space. All we need to get the argument going for a recursive meta-capacity is mastery of at least one recursive G, and there is no dispute that there exists such a G and that humans have acquired it. Given this, the only coherent reason for thinking a counterexample (like Piraha) could be a problem is if one understood the claim to universality as implying that a universal property of FL (i.e. a feature of FL) must manifest itself in every G. And this is to understand ‘universal’ a la Greenberg and not as Chomsky does. Thus we are back to the original sin in DE’s oeuvre: the insistence on a Greenbergian conception of universal.

Second, the piece makes another point. It suggests that DE’s dispute with Chomsky is actually over whether recursion is part of FL or part of cognition more generally. Here’s the quote (10):

…the question is not whether humans can think recursively. The question is whether this ability is linked specifically to language or instead to human cognitive accomplishments more generally…

If I understand this correctly, it is agreed that recursion is an innate part of human mental machinery. What’s at issue is whether there is anything linguistically proprietary about it. Thus, Chomsky could be right to think that human linguistic capacity manifests recursion but that this is not a specifically linguistic fact about us as we manifest recursion in our mental life quite generally.[8]

Maybe. But frankly it is hard to see how DE’s writings bear on these very recondite issues. Here’s what I mean: Human Gs are not merely recursive but exhibit a particular kind of recursion. Work in GG over the last 60 years has been in service of trying to specify what kind of recursive Gs humans entertain. Now, the claim here is that we find the kind of structure we find in human Gs in cognition more generally. This is empirically possible. Show me! Show me that other kinds of cognition have the same structures as those GGers have found occur in Gs.  Nothing in DE’s arguments about Piraha have any obvious bearing on this claim for there is no demonstration that other parts of cognition have anything like the recursive structure we find in human Gs.

But let’s say that we establish such a parallelism. There is still more to do. Here is a second question: is FL recursive because our mental life in general is, or is our mental life in general recursive because we have FL?[9] This is the old species specificity question all over again. Chomsky’s claim is that if there is anything species special about human linguistic facility it rests in the kind of recursion we find in language. To rebut this species specificity requires showing that this kind of recursion is not the exclusive preserve of linguistically capable beings. But, once again, nothing in DE’s work addresses this question. No evidence is presented trying to establish the parallel between the kind of recursion we find in human Gs and any animal cognitive structures.

Suffice it to say that the kind of recursion we find in language is not cognitively ubiquitous (so far as we can tell) and that if it occurs in other parts of cognition it does not appear to be rampant in non-human animal cognition. And, for me at least, that is linguistically specific enough. Moreover, and this is the important point as regards DE’s claims, it is quite unclear how anything about Piraha will bear on this question. Whether or not Piraha has a recursive G will tell us nothing about whether other animals have recursive minds like ours.

Conclusion? The same as before. There is no there there. We find arguments based on equivocation and assertions without support. The whole discussion is irrelevant to Chomsky’s claims about the recursive structure of FL and whether that is the sole UGish feature of FL.[10]

That’s it. As you can see, I got carried away. I didn’t mean to write so much. Sorry. Last time? Let’s all hope so.


[1] Here you can whistle some appropriate Minimalist tune if you would like. I personally think that there is something linguistically specific about FL given that we are the only animals that appear to manifest anything like the recursive structures we find in language. But, this is an empirical question. See here for discussion.
[2] Chomsky’s minimalist conjecture is that this is the sole linguistically special capacity required.
[3] Indeed, such odd counterexamples place a very strong burden of proof on the individual arguing for them. Sometimes this burden of proof can be met. But singular counterexamples that float in a sea of regularity are indeed curious and worthy of considerable skepticism. However, that’s not my point here. It is a different one: the Piraha facts, whatever they turn out to be, are irrelevant to the claim that FL has the capacity to acquire recursive Gs, as this is what Chomsky has been proposing. Thus, the facts regarding Piraha, whatever they turn out to be, are logically irrelevant to Chomsky’s proposal.
[4] This seems to be the way that Sakel conceives of the process (see here). Sakel is the person the DE piece cites on how Piraha speakers behave with Portuguese as a second language. That speakers build their second G on the scaffolding provided by a first G is quite plausible a priori (though whether it is true is another matter entirely). And if this is so, then features of one’s first G should have significant impact on properties of one’s second G. Sakel, btw, is far less categorical in her views than what DE’s piece suggests. Last point: a nice “experiment,” if this interests you, is to see what happens when a speaker acquires Portuguese and Piraha simultaneously, both as first Gs. What should we expect? I dunno, but my hunch is that both would be acquired swimmingly.
[5] So, for example, dialects of English differ wrt the acceptability of Topicalization. My community used it freely and I find such sentences great. My students at UMD were not that comfortable with this kind of displacement. I am willing to bet that Topicalization’s alias (i.e. Yiddish Movement) betrays a certain cultural influence.
[6] Again, see note 4 and Sakel’s useful discussion of the complexity of Portuguese input to the Piraha second language acquirer.
[7] BTW, so far as I can tell, invoking “culture” is nothing but a rhetorical flourish most of the time. It usually means nothing more than “not biology.” However, how culture affects matters and which bits do what is often (always?) left unsettled. It often seems to me that the word is brandished a bit like garlic against vampires, mainly there to ward off evil biological spirits.
[8] On this view, DE agrees that there is FLB but no FLN, i.e. a UGish part of FL.
[9] In Minimalist terms, is recursion a UGish part of FL or is there no UG at all in FL.
[10] There is also some truly silly stuff in which DE speculates as to why the push back against his views has been so vigorous. Curiously, DE does not countenance the possibility that it is because his arguments though severely wanting have been very widely covered. There is some dumb stuff on Chomsky’s politics, Wolfe junk, and general BS about how to do science. This is garbage and not worth your time, except for psycho-sociological speculation.

Friday, January 13, 2017

No time but...

It's file reading season, so my time is limited right now. That and the hangover from the holidays have made me slower. Nonetheless, I thought that I would post this very short piece that I read on computational models in chemistry (thx to Bill Idsardi for sending it my way). The report is interesting to linguists, IMO, for several reasons.

First, it shows how a mature science deals with complexity. It appears to be virtually impossible to do a full theoretically rigorous computation given the complexity of the problem, "but for the simplest atoms." Consequently, chemists have developed approximation techniques (algorithms/functions) to figure out how electrons arrange themselves in "bulk materials." These techniques are an artful combination of theory and empirical parameter estimation and they are important precisely because one cannot compute exact results for these kinds of complex cases. This is so even though the relevant theory is quite well known.

I suspect that if and when linguists understand more and more about the relevant computations involved in language we will face a similar problem. Even were we to know everything about the underlying competence and the relevant mental/brain machinery that uses this knowledge, I would expect the interaction effects among the various (many) interacting components to be very complex. This will lead to apparent empirical failure. But, and this is the important point, this is to be expected even if the theory is completely correct. This is well understood in the real sciences. My impression is that this bit of wisdom is still considered way out there in the mental sciences.

Second, the discussed paper warns against thinking that rich empirics can substitute for our lack of understanding. What the reported paper does is test algorithms by seeing how they perform in the simple cases where exact solutions can be computed. How well do the techniques we apply in sophisticated cases work in the simple ones? The question: how well "different...algorithms approximated the relatively exact solutions."

The discovery was surprising: After a certain point, more sophisticated algorithms started doing worse at estimating the geometry of electrons (this is what one needs to figure out a material's chemical properties). More interesting still, the problem was most acute for "algorithms based on empirical data." Here's the relevant quote:

Rather than calculating everything based on physical principles, algorithms can replace some of the calculations with values or simple functions based on measurements of real systems (an approach called parameterization). The reliance on this approach, however, seems to do bad things to the electron density values it produces. "Functionals constructed with little or no empiricism," the authors write, "tend to produce more accurate electron densities than highly empirical ones."

It seems that when "we have no idea what the function is," throwing data at the problem can make things worse. This should not be surprising. Data cannot substitute for theoretical insight. Sadly, this trivial observation is worth mentioning given the spirit of the age.
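
A loose analogy in code (this is generic curve fitting, not the paper's density functionals, and the numbers are invented): when the true function is unknown, piling on empirically fitted parameters can track noise in the measurements and land farther from the exact answer than a simpler, more principled model.

```python
# Generic overfitting demo (an analogy only, not the chemistry paper's method).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 12)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)  # noisy "measurements"

x_dense = np.linspace(0, 1, 200)
y_true = np.sin(2 * np.pi * x_dense)  # the "exact solution" we can check against

for degree in (3, 11):  # modest vs. heavily parameterized fit
    coeffs = np.polyfit(x, y, degree)
    rms = np.sqrt(np.mean((np.polyval(coeffs, x_dense) - y_true) ** 2))
    print(f"degree {degree}: RMS error vs exact = {rms:.3f}")
# The degree-11 fit typically interpolates the noise and ends up farther
# from the exact curve than the modest degree-3 fit.
```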

Here is one possible implication for theoretical work in linguistics: we often believe that one tests a theory best by seeing how it generalizes beyond the simple cases that motivate it. But in testing a theory in a complex case (where we know less) we necessarily must make assumptions based less on theory and more on the empirical details of the case at hand. This is not a bad thing to do. But it carries its own risks, as this case illustrates. The problem with complex cases is that they likely provoke interaction effects. To domesticate these effects we make useful ad hoc assumptions. But doing this makes the fundamental principles more opaque in the particular circumstance. Not always, but often.

Friday, January 6, 2017

Where Norbert gets pounded for his biological ignorance

I recently co-wrote a comment (see here) on a piece by Berlinski and Uriagereka on Vergnaud's theory of case (see here and here), of which I am a fan, much to one of my colleagues' continual dismay. The replies were interesting, especially the one by Berlinski. He excoriated the Jacobian position I tried to defend. His rebuttal should warm the hearts of many who took me to task for thinking that it was legit to study FL/UG without doing much cross G inquiry. He argues (and he knows more about this than I do) that the Jacob idea that the same fundamental bio mechanisms extend from bacteria to butterflies is little more than myth. The rest of his comment is worth reading too, for it rightly points out that the bio-ling perspective is, to date, more perspective and less biology.

Reaction? Well, I really like the reply's vigorous pushback. However, I don't agree. In fact, I think that what he admires about Vergnaud's effort is precisely what makes the study of UG in a single language so productive.  Here's what I mean.

Vergnaud's theory aimed to rationalize facts about the distribution of nominals in a very case-weak language (viz. English). It did this elegantly and without much surface morphology to back it up. Interestingly, as readers of FoL have noted, the cross G morpho evidence is actually quite messy and would not obviously support Vergnaud's thesis (though I am still skeptical that it fails here). So, the argument style that Vergnaud used and that Berlinski really admires supports the idea that intensive study of the properties of a single G is a legit way to study the general properties of FL/UG. In fact, it suggests that precisely because this method of study brackets overt surface features it cannot be driven by the distributions of these features, which is what much cross G inquiry studies. Given this logic, intensive study of a single G, especially if it sets aside surface reflexes, should generalize. And GG has, IMO, demonstrated that this logic is largely correct. So, many of the basic features of UG, though discovered studying a small number of languages, have generalized quite well cross linguistically. There have, of course, been modifications and changes. But, overall, I think that the basic story has remained the same. And I suspect that not a few biologists would make the same point about their inquiries. If results from model systems did not largely generalize then they would be of little use. Again, maybe they aren't, but this is not my impression.

Ok, take a look at Berlinski's comments. And if you like this, take a look at his, IMO, most readable book Black Mischief.