Wednesday, September 19, 2018

Generative grammar's Chomsky Problem

Martin Haspelmath (MH) and I inhabit different parts of the (small) linguistics universe. Consequently, we tend to value very different kinds of work and look to answer very different kinds of questions. As a result, when our views converge, I find it interesting to pay attention. In what follows I note a point or two of convergence. Here is the relevant text that I will be discussing (henceforth MHT, for “MH text”).[1]

MHT’s central claim is that “Chomsky no longer argues for a rich UG of the sort that would be relevant for the ordinary grammarian and, e.g. for syntax textbooks” (1). It extends a similar view to me: “even if he is not as radical about a lean UG as Chomsky's 21st century writings (where nothing apart from recursion is UG), Hornstein’s view is equally incompatible with current practice in generative grammar” (MHT's emphasis, (2)).[2]

Given that neither Chomsky nor I seems to be inspiring current grammatical practice (btw, thx for the company MH), MHT notes that “generative grammarians currently seem to lack an ideological superstructure.” MHT seems to suggest that this is a problem (who wants to be superstructure-less after all?), though it is unclear for whom, other than Chomsky and me (what’s a superstructure anyhow?). MHT adds that Chomsky “does not seem to be relevant to linguistics anymore” (2).

MHT ends with a few remarks about Chomsky on alien (as in extra-terrestrial) language, noting a difference between him and Jessica Coon on this topic. Jessica says the following (2):

 When people talk about universal grammar it’s just the genetic endowment that allows humans to acquire language. There are grammatical properties we could imagine that we just don’t ever find in any human language, so we know what’s specific to humans and our endowment for language. There’s no reason to expect aliens would have the same system. In fact, it would be very surprising if they did. But while having a better understanding of human language wouldn’t necessarily help, hopefully it’d give us tools to know how we might at least approach the problem.

This is a pretty vintage late 1980s bioling view of FL. Chomsky demurs, thinking that perhaps “the Martian language might not be so different from human language after all” (3). Why? Because Chomsky proposes that many features of FL might be grounded in generic computational properties rather than idiosyncratic biological ones. In his words:

We can, in short, try to sharpen the question of what constitutes a principled explanation for properties of language, and turn to one of the most fundamental questions of the biology of language: to what extent does language approximate an optimal solution to conditions that it must satisfy to be usable at all, given extralinguistic structural architecture?

MHT finds this opaque (as do I, actually), though the intent is clear: to the degree that the properties of FL and the Gs it gives rise to are grounded in general computational properties, properties that a system would need to have “to be usable at all,” then to that degree there is no reason to think that these properties would be restricted to human language (i.e. there is no reason to think that they would be biologically idiosyncratic).

MHT’s closing remark about this is to reiterate his main point: “Chomsky’s thinking since at least 2002 is not really compatible with the practice of mainstream generative grammar” (3-4).

I agree with this, especially MHT's remark about current linguistic practice. Much of what interests Chomsky (and me) is not currently high up on the GG research agenda. Indeed, I have argued (here) that much of current GG research has bracketed the central questions that originally animated GG research and that this change in interests is what largely lies behind the disappointment many express with the Minimalist Program (MP).

More specifically, I think that though MP has been wildly successful in its own terms and that it is the natural research direction building on prior results in GG, its central concerns have been of little mainstream interest. If this assessment is correct, it raises a question: why the mainstream disappointment with MP and why has current GG practice diverged so significantly from Chomsky’s? I believe that the main reason is that MP has sharpened the two contradictory impulses that have been part of the GG research program from its earliest days. Since the beginning there has been a tension between those mainly interested in the philological details of languages and those interested in the mental/cognitive/neuro implications of linguistic competence.

We can get a decent bead on the tension by inspecting two standard answers to a simple question: what does linguistics study? The obvious answer is language. The less obvious answer is the capacity for language (aka, linguistic competence). Both are fine interests (actually, I am not sure that I believe this, but I want to be concessive (sorry Jerry)). And for quite a while it did not much matter to everyday research in GG which interest guided inquiry, as the standard methods for investigating the core properties of the capacity for language proceeded via a filigree philological analysis of the structures of language. So, for example, one investigated the properties of the construal modules by studying the distribution of reflexives and pronouns in various languages. Or by studying the locality restrictions on question formation (again in particular languages) one could surmise properties of the mentalist format of FL rules and operations. Thus, the way that one studied the specific cognitive capacity a speaker of a particular language L had was by studying the details of the language L, and the way that one studied more general (universal) properties characteristic of FL and UG was by comparing and contrasting constructions and their properties across various Ls. In other words, the basic methods were philological even if the aims were cognitive and mentalistic.[3] And because of this, it was perfectly easy for the work pursued by the philologically inclined to be useful to those pursuing the cognitive questions and vice versa. Linguistic theory provided powerful philological tools for the description of languages and this was a powerful selling point.

This peaceful commensalism ends with MP. Or, to put it more bluntly, MP sharpens the differences between these two pursuits because MP inquiry only makes sense in a mentalistic/cognitive/neuro setting. Let me explain.

Here is a very short history of GG. It starts with two facts: (1) native speakers are linguistically productive and (2) any human can learn any language. (1) implies that natural languages are open ended and thus can only be finitely characterized via recursive rule systems (aka grammars (Gs)). Languages differ in the rules their Gs embody. Given this, the first item on the GG research agenda was to specify the kinds of rules that Gs have and the kinds of dependencies Gs care about. Having an inventory of such rules in hand sets up the next stage of inquiry.

The second stage begins with fact (2). Translated into Gish terms it says that any Language Acquisition Device (aka, child) can acquire any G. We called this meta-capacity to acquire Gs “FL” and we called the fine structure of FL “UG.” The fact that any child can acquire any G despite the relative paucity and poverty of the linguistic input data implies that FL has some internal structure. We study this structure by studying the kinds of rules that Gs can and cannot have. Note that this second project makes little sense until we have candidate G rules. Once we have some, we can ask why the rules we find have the properties they do (e.g. structure dependence, locality, c-command). Not surprisingly then, the investigation of FL/UG and the investigation of language particular Gs naturally went hand in hand and the philological methods beloved of typologists and comparative grammarians led the way. And boy did they lead! GB was the culmination of this line of inquiry. GB provided the first outlines of what a plausible FL/UG might look like, one that had grounding in facts about actual Gs. 

Now, this line of research was, IMO, very successful. By the mid 90s, GG had discovered somewhere in the vicinity of 25-35 non-trivial universals (i.e. design features of FL) that were “roughly” correct (see here for a (partial) list). These “laws of grammar” constitute, IMO, a great intellectual achievement. Moreover, they set the stage for MP in much the way that the earlier discovery of rules of Gs set the stage for GB style theories of FL/UG. Here’s what I mean.

Recall that studying the fine structure of FL/UG makes little sense unless we have candidate Gs and a detailed specification of some of their rules. Similarly, if one’s interest is in understanding why our FL has the properties it has, we need some candidate FL properties (UG principles) for study. This is what the laws of grammar provide: candidate principles of FL/UG. Given these we can now ask why we have these kinds of rules/principles and not other conceivable ones. And this is the question that MP sets for itself: why this FL/UG? MP, in short, takes as its explanandum the structure of FL.[4]

Note, if this is indeed the object of study, then MP only makes sense from a cognitive perspective. You won’t ask why FL has the properties it has if you are not interested in FL’s properties in the first place. So, whereas the minimalist program so construed makes sense in a GG setting of the Chomsky variety, where a mental organ like FL and its products are the targets of inquiry, it is less clear that the project makes much sense if one's interests are largely philological (in fact, it is pretty clear to me that it doesn’t). If this is correct, and if it is correct that most linguists have mainly philological interests, then it should be no surprise that most linguists are disappointed with MP inquiry. It does not deliver what they can use, for it is no longer focused on questions analogous to the ones that were prominent before and which had useful spillover effects. The MP focus is on issues decidedly more abstract and removed from immediate linguistic data than heretofore.

There is a second reason that MP will disappoint the philologically inclined. It promotes a different sort of inquiry. Recall that the goal is explaining the properties of FL/UG (i.e. the laws of grammar are the explananda). But this explanatory project requires presupposing that the laws are more or less correct. In other words, MP takes GB as (more or less) right.[5] MP's added value comes in explaining it, not challenging it.

In this regard, MP is to GB what Subjacency Theory is to Ross’s islands. The former takes Ross’s islands as more or less descriptively accurate and tries to derive them on the basis of more natural assumptions. It would be dumb to aim at such a derivation if one took Ross’s description to be basically wrong-headed. So too here. Aiming to derive the laws of grammar requires believing that these are basically on the right track. However, this means that so far as MP is concerned, the GBish conception of UG, though not fundamental, is largely empirically accurate. And this means that MP is not an empirical competitor to GB. Rather, it is a theoretical competitor, in the way that Subjacency Theory is to Ross’s description of islands. Importantly, empirically speaking, MP does not aim to overthrow (or even substantially revise the content of) earlier theory.[6]

Now this is a problem for many working linguists. First, many don’t share the sanguine view that I have of GB and the laws it embodies. In fact, I think that many (most?) linguists doubt that we know very much about UG or FL or that the laws of grammar are even remotely correct. If this is right, then the whole MP enterprise will seem premature and wrong-headed to them. Second, even if one takes these as decent approximations to the truth, MP will encourage a kind of work that will be very different from earlier inquiry. Let me explain.

The MP project so conceived will involve two subparts. The first one is to derive the GB principles. If successful, this will mean that we end up empirically where we started: MP will recover the content of GB. Of course, if you think GB is roughly right, then this is a good place to end up. But the progress will be theoretical, not empirical. It will demonstrate that it is reasonable to think that FL is simpler than GB presents it as being. However, the linguistic data covered will, at least initially, be very much the same. Again, this is a good thing from a theoretical point of view. But if one’s interests are philological and empirical, then this will not seem particularly impressive, as it will largely recapitulate GB's empirical findings, albeit in a novel way.

The second MP project will be to differentiate the structure of FL and to delineate those parts that are cognitively general from those that are linguistically proprietary. As you all know, the MP conceit is that linguistic competence relies on only a small cognitive difference between us and our apish cousins. MP expects FL’s fundamental operations and principles to be cognitively and computationally generic rather than linguistically specific. When Chomsky denies UG, what he denies is that there is a lot of linguistic specificity to FL (again: he does not deny that the GB-identified principles of UG are indeed characteristic features of FL). Of course, hoping that this is so and showing that it might be/is are two very different things. The MP research agenda is to make good on this. Chomsky’s specific idea is that Merge and some reasonable computational principles are all that one needs. I am less sanguine that this is all that one needs, but I believe that a case can be made that this gets one pretty far. At any rate, note that most of this work is theoretical and it is not clear that it makes immediate contact with novel linguistic data (except, of course, in the sense that it derives GB principles/laws that are themselves empirically motivated (though recall that these are presupposed rather than investigated)). And this makes for a different kind of inquiry than the one that linguists typically pursue. It worries about finding more natural, more basic principles and showing how these can be deployed to derive the basic features of FL. So a lot more theoretical deduction and a lot less (at least initially) empirical exploration.

Note, incidentally, that in this context, Chomsky’s speculations about Martians and his disagreement with Coon are a fanciful and playful way of making an interesting point. If FL’s basic properties derive from the fact that it is a well designed computational system (its main properties follow from generic features of computations), then we should expect other well designed computational systems to have similar properties. That is what Chomsky is speculating might be the case.

So, why is Chomsky (and MP work more generally) out of the mainstream? Because mainstream linguistics is (and has always been, IMO) largely uninterested in the mentalist conception of language that has always motivated Chomsky’s view of language. For a long time, the difference in motivations between Chomsky and the rest of the field was of little moment. With MP that has changed. The MP project only makes sense in a mentalist setting and invites projects without direct implications for further philological inquiry. This means that the two types of linguistics are parting company. That’s why many have despaired about MP. It fails to have the crossover appeal that prior syntactic theory had. MHT's survey of the lay of the linguistic land accurately reflects this, IMO.

Is this a bad thing? Not necessarily, intellectually speaking. After all, there are different projects and there is no reason why we all need to be working on the same things, though I would really love it if the field left some room for the kind of theoretical speculation that MP invites.

However, the divergence might be sociologically costly. Linguistics has gained most of its extramural prestige from being part of the cog-neuro sciences. Interestingly, MP has generated interest in that wider world (and here I am thinking cog-neuro and biology). Linguistics as philology is not tethered to these wider concerns. As a result, linguistics in general will, I believe, be less central to general intellectual life than it was in earlier years when it was at the center of work in the nascent cognitive and cog-neuro sciences. But I could be wrong. At any rate, MHT is right to observe that Chomsky’s influence has waned within linguistics proper. I would go further. The idea that linguistics is and ought to be part of the cog-neuro sciences is, I believe, a minority position within the discipline right now. The patron saint of modern linguistics is not Chomsky, but Greenberg. This is why Chomsky has become a more marginal figure (and why MH sounds so delighted). I suspect that down the road there will be a reshuffling of the professional boundaries of the discipline, with some study of language of the Chomsky variety moving in with cog-neuro and some returning to the language departments. The days of the idea of a larger common linguistic enterprise, I believe, are probably over.

[1]I find that this is sometimes hard to open. Here is the url to paste in: 

[2]I should add that I have a syntax textbook that puts paid to the idea that Chomsky’s basic current ideas cannot be explicated in one. That said, I assume that what MHT intends is that Chomsky’s views are not standard text book linguistics anymore. I agree with this, as you will see below.
[3]This was and is still the main method of linguistic investigation. FoLers know that I have long argued that PoS style investigations are different in kind from the comparative methods that are the standard and that when applicable they allow for a more direct view of the structure of FL. But as I have made this point before, I will avoid making it here. For current purposes, it suffices to observe that whatever the merits of PoS styles of investigation, these methods are less prevalent than the comparative method is.
[4]MHT thinks that Chomsky largely agrees with anti-UG critics in “rejecting universal grammar” (1). This is a bit facile. What Chomsky rejects is that the kinds of principles we have identified as characteristic of UG are linguistically specific. By this he intends that they follow from more general principles. What he does not do, at least this is not what I do, is reject the principles of UG as targets of explanation. The problem with Evans and Levinson and Ibbotson and Tomasello is that their work fails to grapple with what GG has found in 60 years of research. There are a ton of non-trivial Gish facts (laws) that have been discovered. The aim is to explain these facts/laws, and ignoring them or not knowing anything about them is not the same as explaining them. Chomsky “believes” that language has properties that previous work on UG has characterized. What he is questioning is whether these properties are fundamental or derived. The critics of UG that MHT cites have never addressed this question, so they and Chomsky are engaged in entirely different projects.
            Last point: MHT notes that neophytes will be confused about all of this. However, a big part of the confusion comes from people telling them that Chomsky and Evans/Levinson and Ibbotson/Tomasello are engaged in anything like the same project.
[5]Let me repeat for the record that one can do MP and presuppose some conception of FL other than GB. IMO, most of the different “frameworks” make more or less the same claims. I will stick to GB because this is what I know best, and MP indeed has targeted GB conceptions most directly.
[6]Or, more accurately, it aims to preserve most of it, just as General Relativity aimed to preserve most of Newtonian mechanics.


  1. I think there's another component lurking here, which has, in some sense, been with us since the inception of the Minimalist Program. And it is the following:

    In parallel to articulating the Minimalist Program (about which, I basically agree with everything you say here, save for maybe the pessimistic coda), Chomsky has also been putting forth a series of specific proposals: probe-goal, Phase Theory, uninterpretable features, Feature Inheritance, the Minimal Labeling Algorithm. I might be wrong about this, but from where I sit, these are not quite MPish proposals. That is, if we consider GB explorations to be "level 1"; and MP explorations (how GB-like principles could sprout from a more stripped down, minimally-stated FL) to be "level 2"; then the proposals just enumerated are something like "level 1.5": they are attempts to restate certain "level 1"(=GB) generalizations in ways that may make the eventual "level 2"(=MP) work (deriving them from a simple FL) easier than it would have been, if it had been working directly with the original "level 1"(=GB) statements of those generalizations.

    (I think this is what some people have in mind when they attempt to differentiate between the Minimalist Program and a particular Minimalist Theory; if so, then what I'm trying to highlight here is that a lot of the work that Chomsky himself has done in the past 20 years or so is actually developing a particular Minimalist Theory, not developing the Program.)

    And now here's the issue, as I see it: these specific proposals (probe-goal, Phase Theory, uninterpretable features, Feature Inheritance, the Minimal Labeling Algorithm) have been, in my opinion, increasingly bad. Just not good linguistics. Phase Theory, on its best day, is a restatement of Subjacency with hints of Barriers sprinkled in. I find the recent labeling stuff (in the "Problems of Projection" papers) to be outright embarrassing, being both theoretically retrograde (the Conceptual-Intentional system cares about syntactic labels? really?? didn't we spend the early 80's earning hard-won insights such as c-selection not, in fact, reducing to semantics?), and empirically dead-on-arrival (see, e.g., here).

    I could go on, but the point is this: to the extent that people look to Chomsky's own proposals as the standard for what work under the MP umbrella will look like, they've been confronted with a pretty poor collection of exemplars. This sends the (implicit) message that those with MPish concerns can only pursue them by first substituting the "level 1" results we know with these shoddy "level 1.5" replacements. In one sense, this is all fine: science is practiced by humans, humans are imperfect and sometimes do bad work, a bad proposal doesn't invalidate the framework within which it was proposed, etc. etc. But I think you are underestimating the sociological impact that these proposals have had on the perception of MPish work in the (ever so slightly) wider generative syntax community. This is not ending up "empirically where we started"; this is a fairly unfettered backslide.

    1. (To be clear: I think that probe-goal was a real, substantive step forward, shedding new light on Rizzi's 1990 results; Phase Theory was more or less neutral (for the reasons mentioned above), and then the backslide begins. ymmv, of course.)

    2. Why do you think it's ridiculous to propose that the CI system cares about labels?

      (I think I know the answer to this, but I'd rather not assume.)

    3. I was perhaps a bit sloppy: I see no reason to object to C-I caring about labels; what I am objecting to is the idea that labels exist (or are added) to satisfy a C-I need. And the reasons for that are well-known and well-trodden: there's the work from the 80s that I already mentioned (by Pesetsky & others) showing that c-selection doesn't reduce to semantics. There's also the much more venerable observations, you know, that verbs aren't (necessarily) "actions" and nouns aren't (necessarily) "things." And it's not just nouns and verbs: Bobaljik & Wurmbrand have a recent paper on questions with declarative syntax (i.e., so-called "declarative" C doesn't have a fully consistent semantic interpretation); Ritter & Wiltschko have a paper (2009) showing that the semantic content of Infl can be tense, or person, or location, depending on the language; and so on. I think, at the end of the day, this is fully general: labels might have (loose) semantic correlates, but they are not themselves semantic in nature. I think the evidence here is quite overwhelming.

      In light of this, pushing a theory of labeling where labeling subserves a semantic need seems fundamentally wrong-headed to me.

    4. Those studies, though, only lead to the conclusion that the CI system isn't sensitive to labels if you assume a certain theory of the CI system.

    5. "pushing a theory of labeling where labeling subserves a semantic need seems fundamentally wrong-headed to me." And to me. Labeling is needed to allow selection to work. L(exical)-selection, the hardest case, not reducible to Case or semantic selection, has for the most part simply fallen off the agenda and been forgotten, it seems to me (please correct me if I'm wrong). But no C-I requirement can plausibly tell us that "angry at", "proud of", "interested in", etc. pair the way they do. We need l-selection, which means we need labels that are at least as fine-tuned as distinguishing "at" from "of", "in", etc. requires. These relations are fundamentally *syntactic*, and so any theory of syntax that claims that these relations can be captured by or at the C-I interface has an uphill battle for the hearts and minds. I think the struggle of our published Minimalist systems to be able to even code such relations is one reason for some of the skepticism about the system's potential for wider success as well.

    6. This comment has been removed by the author.

    7. @Jason Yes, Label theory is wrong. But so is every scientific theory. The question for me is "Is it less wrong than the alternatives?" Of course, that question presupposes that there are alternatives, and I know of no other plausible alternative theories of projection that have been worked out so as to make any definite predictions (I know Norbert has an alternative in the works). Perhaps I'm mistaken, though.

  2. "I suspect that down the road there will be a reshuffling of the professional boundaries of the discipline, with some study of language of the Chomsky variety moving in with cog-neuro and some returning to the language departments."

    That's assuming that the language departments would want their linguists back. As far as I know language programs have largely excised grammar from their UG curricula, at least in the US. So what would they do with linguists? Not a lot of potential for additional grant revenue, and if you want a large-enrollment linguistics intro to improve the department metrics, you can just hire an adjunct/lecturer since there's many freshly minted PhDs with no other choice if they want to stay in academia.

    Similarly, no cog-sci department will jump at the opportunity to hire a generative grammarian when they can have a computational modeling neuro-guy/gal, whose research can draw from multiple funding pots and might even make the news.

    I don't see that changing anytime soon. So linguists will stay under the same institutional roof, although each department's roof might be heavily slanted in one of the two directions. Not too different from what we have right now.

  3. I don't read Chomsky's idea as being that labels encode or predict semantic properties. It is more that Merge itself doesn't include projection or headedness, and the C-I interface requires the results of Merge to have an identity. Thus, one needs a mechanism to determine from the properties of the merged items, which item projects. This was actually kinda suggested in MP, so is not really that new.

    1. @John: I don't understand what the assumption that "the C-I interface requires the results of Merge to have an identity" is supposed to rest on. There is copious evidence (e.g. c-selection) that syntax cares about the identity of the results of Merge, in a way that cannot reduce to anything the C-I interface should reasonably care about. (Of course, you can situate any kind of filtration "at the C-I interface," but that would just be an abuse of the technical vocabulary unless it is shown that this filter can be reasonably attributed to the demands of Conceptual-Intentional interpretation, and c-selection most definitely doesn't fit the bill.)

      Note, moreover, that even if we counterfactually grant the quoted assumption above – "the C-I interface requires the results of Merge to have an identity" – the idea that this identity would be stated in terms of syntactic categories seems incongruous to me. To name but one example, in most contemporary semantic theories, nouns and verbs are both predicates of type <e,t>, i.e., semantics seems to emphatically not care about the categorial distinction between the two. (Granted, D0 and Num0 do different things with this denotation than T0, Asp0, and v0 do; but that brings us back to c-selection, about which, see above.) So the C-I interface seems like the last place in the grammar that should care about the distinction between "NP" and "VP."

    2. I mean that the C-I interface cares that [a, b] is headed by one or the other item. 'Kill Bill' specifies an event, not an individual. It doesn't matter what the labels themselves are, and I assume that no-one is suggesting that semantics cares for NP and VP as such. I'd also suggest that standard type theory in semantics is wholly descriptive and has deep conceptual limitations. I do think that talk of labels is perhaps misbegotten. NC could have appealed to a 'head algorithm' instead.

    3. @John: Even if this is so, the "head algorithm" that you envision would merely recapitulate a syntax-internal computation that's already necessary in order to do c-selection (which also needs to know whether "kill Bill" is headed by "kill" or by "Bill"), so I don't see how this move would salvage Chomsky's proposal.

    4. Yes, I think that is basically right, and is nigh-on what Cecchetto and Donati suggest. You might think of heads as being determined by C-selection (more or less), but that still isn't Merge doing the work.

    5. We might have different ideas of what NC proposes.

  4. "By the mid 90s, GG had discovered somewhere in the vicinity of 25-35 non-trivial universals (i.e. design features of FL) that were “roughly” correct. These “laws of grammar” constitute, IMO, a great intellectual achievement."

    I wish this were true, though it's easy to see how the often triumphalist rhetoric of MGG authors may lead some people to think that there are major discoveries here. But in fact, all we have is an impressive widening of the phenomena – an achievement in charting the territory, no doubt (akin to James Cook's and Joseph Banks's achievements in the 18th century), but no clear progress in understanding what drives these kinds of phenomena. In the 1970s, Jackendoff thought that X-bar theory was a major breakthrough in depth of understanding, and in the 1980s, many thought that the unification of binding and constraints on movement were a breakthrough. But those hopes have long been gone, and nothing has replaced those bold ideas, as far as I can see. We have a lot of phenomena in a lot of languages, often with fancy names that outsiders may take as demonstrating our smartness ("incorporation", "weak cross-over", "heteroclisis", "parasitic gap", "deponency", "split ergativity"), but in fact, we don't have a general theory that explains much of this. In some cases, (thanks to Greenberg) we have a good idea about the (lack of) cross-linguistic generality of these phenomena, but in most cases, we don't.

    1. I don't think anything has emerged to challenge the most basic insights about for example A-bar dependencies. They are strikingly uniform across constructions, so that within a language, conditions on question formation will parallel conditions on relative clause formation, for example. Across languages, they are used for the same kinds of things, like question formation and relative clause formation, and have their tails in thematic positions and their heads in c-commanding positions. They are sensitive to relativized minimality type intervention, i.e. they don't freely cross things that are too similar to themselves. They are sensitive to certain kinds of structural boundaries such as finite clause boundaries, which give various kinds of evidence of successive-cyclicity, and adjunct boundaries, which often induce island effects. There are different specific theories of these facts, for example some involving SLASH and some involving traces, and some involving barriers and some involving phases, but if you abstract away from the details, it seems to me they are all based on a set of facts which are broadly accepted and don't face serious empirical challenge.

    2. When it comes to X-bar theory, it is true that the theory of projection from a lexical head turned out not to explain as much as the hopes expressed in Jackendoff's 1977 book. But I contest the characterizations "nothing has replaced those bold ideas" and "[we have] no clear progress in understanding what drives these kinds of phenomena." X-bar theory was supplemented in the 80s and 90s by theories of functional heads and extended projections, using tools like the Mirror Principle (originally explored by Baker) and the incipient theory of head movement. These theories were developed in a huge number of works exploring the properties of individual functional heads like D and Num and C and T and Asp, but also in works developing the broader theory of the extended projections of the clause and noun phrase by people like Grimshaw, Cinque, Travis, Ritter, Wiltschko, and many others. As for the unification of binding and movement, I would say the central facts still underlie the generative focus on structure as a central feature of utterances, even if the parallels between specific cases of movement and specific cases of binding are not as simple as Chomsky thought in 1981, so I reject the characterization "those hopes have long been gone." Yes, we have moved beyond the characterizations of 1977 and 1981, but I see progress, not failure. (In case it isn't clear, both this remark and my previous one are replies to Martin Haspelmath's comment of September 21st. The previous one was more of an echo of Norbert, because the successes of GG that he pointed to included the A-bar phenomena I mentioned. But the theory of functional heads and extended projections that I mention here did not figure prominently in Norbert's 2015 list of GG's achievements.)

    3. Yes, Cinque (1999) does constitute real progress, because it makes a very specific proposal about the functional sequence, along the lines of (but more detailed than) Bybee (1985). But the vast majority of the literature on functional heads just adds more language-specific functional heads, without any constraints whatsoever, and often just in order to provide "landing sites". I don't see this as replacing X-bar theory, but as a recognition that the world is much richer, without a good idea of how to constrain it (except for Bybee-Cinque, as noted earlier). But I would go further: the idea of functional heads has created a lot of confusion, because it is no longer clear what the trees stand for: constituency? semantic scope? binding relations? prominence of argument roles? If there were any clear convergence among these kinds of notions, then functional heads would be justified, but I don't see this.

    4. In a very direct sense, the theory of extended projections replaces much of X-bar theory (and the rest is reduced to "bare phrase structure," or Merge plus labelling). X-bar theory was originally proposed by Chomsky in 1970 to capture parallels between clauses and nominalizations. Jackendoff 1977 took that much further, with three levels of functionally distinct specifiers. There were parallels between what was at the second bar level across categories and so on. What replaces that now are parallels between the T-domain in the clause and the D-domain in the noun phrase, as in Abney 1987; Grimshaw 1991 incorporated Emonds' observations about the shared properties of P and C, with P>D>N being analogous to C>T>V. Most of the things that were specifiers of a lexical projection in Jackendoff 1977 have been reinterpreted as functional heads in a corresponding extended projection, and instead of higher and higher bar levels, as in Jackendoff, you have higher and higher functional levels, Grimshaw's F-levels.

      In the 2005 version of her paper, Grimshaw applies the extended projection theory to Cinquean cartography. Subsequent work (some of which I mentioned earlier) has proposed theories of the functional hierarchy or hierarchies, rather than just listing; I have made some stabs at that myself. Most of those theories distinguish at least a C-domain, a T-domain, and a V-domain, with distinct properties, with analogous domains in the noun phrase. All of these theories have a lot in common, for example in their treatment of the difference between an adjunct and a functional head. The theories make clear predictions about a cluster of properties that are expected to converge on being a head versus being an adjunct, and they have a good track record. Non-GG work has some analogous categories, descriptively, but I don't know of any work which is not derived from GG which makes equally precise or good predictions. I also don't know of any work challenging the family of models of extended projections, as opposed to challenging assumptions specific to one version or another, of which we have plenty---a healthy situation.

  5. Returning to Norbert's original post, I really like the idea, laid out at length in Norbert’s “Whig history” series of posts in March-April 2015, that MP sets out to explain the GB era model and its results, rather than to replace them or repudiate them.

    It follows, as Norbert says, that it will be very hard to be excited about MP if you don't think that the GB canon of knowledge is descriptively accurate. You might expect there to be a cline of enthusiasm -- the more of what GG discovered between, say, 1977 and 1995 you take to be accurate, the more enthusiasm you might feel for MP. You might say Martin and Norbert occupy opposite ends of that spectrum.

    I think there are also a few people in the other two corners -- grumpy recalcitrant GBists who had faith in the old system but think the MP is a big mistake, and true believers who don’t command the GB-era facts but like the MP because of its conceptual appeal. But it might be right that most people fall somewhere on the cline predicted by Norbert’s “Whiggish” interpretation of the relation between GB and MP.

    Something that I find less helpful is the idea of a “tension” between “those mainly interested in the philological details of languages and those interested in the mental/cognitive/neuro implications of linguistic competence.” I’m not saying that the field of linguistics isn’t full of tensions, but I don’t feel that that accurately locates the interesting ones. I don’t feel that there’s any tension whatsoever between good theoretical work and the kind of solid descriptive work on individual languages that requires years of immersion and painstaking fieldwork. Individual theoreticians can be hostile to philology and vice-versa, but the hostility or tension isn’t intrinsic to the nature of the respective enterprises. Take Tomasello, for example -- he is hostile to GG, but not because he’s interested in philological details; quite the opposite.

    1. I agree with this characterization, Peter – it's true that I don't think that the "GB canon of knowledge" is descriptively accurate. It's full of bias from English, and full of hopes that have not been fulfilled, or have yet to be fulfilled. And I also agree that the tension between "philological(??) details" and "cognitive implications" is not interesting. In fact, I don't even understand what is the difference between "good theoretical work" and "solid descriptive work" – why should there be two different adjectives here? How can descriptive work be atheoretical, and how can theoretical work be undescriptive?

    2. Good, I'm glad that we agree about the relationship of description and theory and that I understand your position on the status of the GB canon. You haven't explicitly responded to my characterization of what we know about A-bar dependencies. Since you mention English bias here, I would like to point out that I personally find it impressive that so many facts about A-bar dependencies which were originally observed in English have been replicated in many and diverse languages. In fact, the evidence that A-bar dependencies are successive-cyclic was originally rather subtle, but then languages like Irish and Chamorro turned out to provide stunning morphological support for it.

    3. @Martin: In fact, I don't even understand what is the difference between "good theoretical work" and "solid descriptive work"

      You're right that one does not preclude the other, but it's also very clear that the goals --- and by extension the relevant data --- can be vastly different.

      As a computational linguist, I fall heavily on the theory side of the spectrum. There are many issues that the average linguist cares a lot about that I consider fairly unimportant. Whether there's a separate Asp head in language X and where exactly it goes in the structure is immaterial for the big picture issues that I care most about: computational complexity, expressivity, parsing, learnability, and implications for cognition. And I have a hunch that many linguists will find my work perhaps theoretical, but definitely not descriptive. So it often feels like there is little common ground.

      But just like Peter and you, I don't actually see much of a tension, because there is still sufficient overlap. A data point or phenomenon might seem irrelevant now but become crucial at a later point. When I started working on the subregular complexity of phonology, even minor details of stress assignment in Cairene Arabic, n-retroflexion in Sanskrit, or Korean vowel harmony suddenly became very important. And if you think that linguistic hierarchies exhibit certain abstract properties that tie into human cognition in general (as I have for a while), then functional hierarchies become very interesting, and the Asp head in language X might not be so irrelevant after all.
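      The subregular perspective mentioned above can be illustrated with a toy sketch. The snippet below checks a vowel-harmony-like constraint as a tier-based strictly 2-local pattern: project the vowels out of the string, then verify that adjacent vowels agree in class. The vowel classes and example words here are invented for illustration, not actual Korean (or any language's) data:

```python
# Toy illustration of a subregular (tier-based strictly 2-local) constraint:
# vowel harmony modeled as a ban on adjacent disagreeing vowels on the
# projected vowel tier. Vowel classes are hypothetical, for illustration only.

FRONT = set("ie")   # hypothetical "front" vowel class
BACK = set("ou")    # hypothetical "back" vowel class

def harmonic(word: str) -> bool:
    """Return True iff no two adjacent vowels on the vowel tier
    disagree in class (consonants are ignored entirely)."""
    tier = [c for c in word if c in FRONT | BACK]   # project the vowel tier
    return all(
        (a in FRONT) == (b in FRONT)                # adjacent vowels must agree
        for a, b in zip(tier, tier[1:])
    )

# e.g. harmonic("pilik") -> True (all front), harmonic("piluk") -> False (i..u)
```

The point of the example is that the check only ever looks at pairs of adjacent tier symbols -- a very weak computational device, far below full context-free power -- which is the kind of abstract property that the subregular program ties to claims about cognition.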