Tuesday, May 16, 2017

The wildly successful minimalist program

It’s that time of year: spring has sprung, classes are almost over, and all of those commitments you made to write papers three years ago and forgot about are coming due. I am in the midst of one such effort right now (due at the end of May). It’s one of those “compare different theories/frameworks” volumes and I have been asked to write on the Minimalist Program (MP). After ignoring the project for a good long time, I initially bridled against the fact that I had agreed to write anything. In order to extricate myself from the promise, I tried to convince the editor that the premise of the volume (that MP was a theory like the others) was false and so a paper on MP would not really be apposite. This tantrum was rejected. I then sulked. Finally, I decided that I would take the bull by the horns and argue that MP, contrary to what I perceive to be the conventional view, has been wildly successful in its own terms and that the reason for its widespread perceived failure is that most critics have refused to accept the premises of MP investigation. Why would they do so? There are several reasons, but the best one (and one that might even be correct) is that the premises for MP investigation (viz. that we know something about the structure of FL/UG and that something resembles GB) are shaky and so the project is premature. On this view the program is fine, it’s just that we’ve gotten a little ahead of ourselves.

This objection should sound familiar. It is what people who study specific languages and their Gs say about claims about FL and UG. We don’t know enough yet about particular Gs to address questions about FL/UG. Things are more complicated and we need time to sort these out.

I reject this. Things are always more complicated. Time is never right. IMO, GB is a pretty good theory and it is worth trying to see if we can derive some of its features in a more principled way. We will learn something even if we are not completely right about this (which is surely the case). In other words, GB is right enough (or, many of its properties will be part of whatever description turns out to be more accurate) and so trying to see how to derive its properties is a worthwhile project that could teach us something about FL/UG.

This, I should add, is the best reason to demur about MP (and as you can see, I am not sympathetic). Two others spring to mind: (i) MP sharpens the linguistics/languistics kulturkampf and (ii) MP privileges a kind of research that is qualitatively different from what most professionals commonly produce and so is suspect. 

I have beaten both these drums in the past, and I do so again here. I have convinced myself that the biggest practical problem for MP work is that it sharpens the contrast between the bio/cog and the philological perspectives on language. More specifically, MP only makes sense from the bio/cog perspective as it takes FL/UG as the object of inquiry. FL/UG is the explanandum. If you don’t think FL/UG exists (or you are not really interested in whether it exists) then MP will seem, at best, pointless and, at worst, mystical omphaloskepsis. It is an odd fact of life that many find their own interests threatened by those that do not share them. I suspect that MP’s greatest sin in the eyes of many is that it appears to devalue their own interest in language by promoting the study of the underlying faculty. This, of course, does not follow. Tastes differ, interests range. But there can be little doubt that one of Chomsky’s many vices is that, by convincing so many to be fascinated by the problems he has identified, he has robbed so many of confidence in their own. MP simply sharpens this: doing it at all means buying into the bio-cog program. Abandon hope all languists who enter here.

Second, furthering the MP project will privilege a kind of work distinct in style from that normally practiced. If the aim is unification then MP work will necessarily be quite theoretical and the relevance of this kind of work for the kinds of language facts that linguists prize somewhat remote, at least initially. Why? Because if a primary aim of MP is to deduce the basic features of GB from more fundamental principles then a good chunk of the hard work will be to propose such principles and see how to deduce the particular properties of GB from them. The work, in other words, will be analytic and deductive rather than descriptive and inductive. Need I mention again how little our community of scholars esteems such work?

If we put these two features of MP inquiry together, we end up with work that is hard core bio-mentalist and heavily deductive and theoretical in nature. Each feature suffices to generate skepticism (if not contempt) among many working linguists. This, at any rate, is what I argue in the paper that I avoided trying to write.

I cannot post the whole thing (or at least won’t do so today). But I am going to give you the intro stage setting (i.e. polemical) bits for your amusement. Here goes, and may you have a happy time with your own thoughtless commitments.


What is linguistics about? What is its subject matter? Here are two views.

One standard answer is “language.” Call this the “languistic (LANG) perspective.” Languists understand the aim of a theory of grammar to describe the properties of different languages and identify the common properties they share. Languists frequently observe that there are very few properties that all languages have in common. Indeed, in my experience, the LANG view is that there are almost no language universals that hold without exception and that languages can and do vary arbitrarily and limitlessly. LANGers assume that if there are universals, then they are of the Greenbergian variety, more often statistical tendencies than categorical absolutes.  

There is a second answer to the question, one associated with Chomsky and the tradition in Generative Grammar (GG) his work initiated. Call this the “linguistic (LING) perspective.” Until very recently, linguists have understood grammatical theory to have a pair of related objectives: (i) to describe the mental capacities of a native speaker of a particular language L (e.g. English) and (ii) to describe the meta-capacity that allows any human to acquire the mental capacities underlying a native speaker’s facility in a particular L (i.e. the meta-capacity required to acquire a particular G). LINGers, in other words, take the object of study to be two kinds of mental states, one that grammars of particular languages (i.e. GL) describe and one that “Universal Grammar” (UG) describes. UG, then, names not Greenbergian generalizations about languages but features of human mental capacity that enable humans to acquire GLs. For linguists, the study of languages and their intricate properties is useful exactly to the degree that it sheds light on both of these mental capacities. As luck would have it, studying the products of these mental capacities (both at the G and UG level) provides a good window on these capacities.

The LANG vs LING perspectives lead to different research programs based on different ontological assumptions. LANGers take language to be primary and grammar secondary. GLs are (at best) generalizations over regularities found in a language (often a more or less extensive corpus or lists of “grammaticality” judgments serving as proxy).[1] For LINGers, GLs are more real than the linguistic objects they generate, the latter being an accidental sampling from an effectively infinite set of possible legitimate objects.[2] On this view, the aim of a theory of a GL is, in the first instance, to describe the actual mental state of a native speaker of L and thereby to indirectly circumscribe the possible legit objects of L. So for LINGers, the mental state comes first (it is more ontologically basic), the linguistic objects are its products and the etiology of those that publicly arise (are elicited in some way) only partially reflects the more stable, real, underlying mental capacity. Put another way, the products are interaction effects of various capacities and the visible products of these capacities are the combination of their adventitious complex interaction. So the products are “accidental” in a way that the underlying capacities are not.

LANGers disagree. For them the linguistic objects (be they judgments, corpora, reaction times) come first, GLs being inductions or “smoothed” summaries of these more basic data. For LINGers the relation of a GL to its products is like the relation between a function and its values. For a LANGer it is more like the relation between a scatter plot and the smoothed distributions that approximate it (e.g. a normal distribution).

LINGers go further: even GLs are not that real. They are less real than UG, the meta-capacity that allows humans to acquire GLs. Why is UG more “real” than GLs? Because in a sense that we all understand, native speakers only accidentally speak the language they are native in. Basically, it is a truism universally acknowledged that any kid could have been native in any language. If this is true (and it is, really), then the fact that a particular person is natively proficient in a particular language is a historical accident. Indeed, just like the visible products of a GL result from a complex interaction of many more basic sub-capacities, a particular individual’s GL is also the product of many interacting mental modules (memory size, attention, the particular data mix a child is exposed to and “ingests,” socio-economic status, the number of hugs and more). In this sense, every GL is the product of a combination of accidental factors and adventitious associated capacities and the meta-capacity for building GLs that humans as a species come equipped with.

If this is right, then there is no principled explanation for why it is that Norbert Hornstein (NH) is a linguistically competent speaker of Montreal English. He just happened to grow up on the West Island of that great metropolis. Had NH grown up in the East End of London he would have been natively proficient in another “dialect” of English and had NH been raised in Beijing then he would have been natively proficient in Mandarin. In this very clear sense, then, NH is only accidentally a native speaker of the language he actually speaks (i.e. has acquired the particular grammatical sense (i.e. GL) he actually has) though it is no accident that he speaks some native language. At least not a biological accident for NH is the type of animal that would acquire some GL as a normal matter of course (e.g. absent pathological conditions) if not raised in feral isolation. Thus, NH is a native speaker of some language as a matter of biological necessity. NH comes equipped with a meta-capacity to acquire GLs in virtue of the fact that he is human and it is biologically endemic to humans to have this meta-capacity. If we call this meta-capacity the Faculty of Language (FL), then humans necessarily have an FL and necessarily have UG, as the latter is just a description of FL’s properties. Thus, what is most real about language is that any human can acquire the GL of any L as easily as it can acquire any other. A fundamental aim of linguistic theory is to explain how this is possible by describing the fine structure of the meta-capacity (i.e. by outlining a detailed description of FL’s UG properties).

Before moving on, it is worth observing that despite their different interests LINGers and LANGers can co-exist (and have co-existed) quite happily and they can fruitfully interact on many different projects. The default assumption among LINGers is that currently the best way to study GLs is to study their products as they are used/queried. Thus, a very useful way of limning the fine structure of a particular GL is to study the expressions of that GL. In fact, currently, some of the best evidence concerning GLs comes from how native speakers use GLs to produce, parse and judge linguistic artifacts (e.g. sentences). Thus, LINGers, like LANGers, will be interested in what native speakers say and what they say about what they say. This will be a common focus of interest and cross talk can be productive.

Similarly, seeing how GLs vary can also inform one’s views about the fine structure of FL/UG. Thus both LINGers and LANGers will be interested in comparing GLs to see what, if any, commonalities they enjoy. There may be important differences in how LINGers and LANGers approach the study of these commonalities, but at least in principle, the subject matter can be shared to the benefit of each. And, as a matter of fact, until the Minimalist Program (MP) arose, carefully distinguishing LINGer interests from LANGer interests was not particularly pressing. The psychologically and philologically inclined could happily live side by side pursuing different but (often enough) closely related projects. What LANGers understood to be facts about language(s), LINGers interpreted as facts about GLs and/or FL/UG.

MP adversely affects this pleasant commensalism. The strains that MP exerts on this happy LING/LANG co-existence are one reason, I believe, why so many GGers have taken a dislike to MP. Let me explain what I mean by discussing what the MP research question is. For that I will need a little bit of a running start.

Prior to MP, LING addressed two questions based on two evident rationally uncontestable facts (and, from what I can tell, these facts have not been contested). The first fact is that a native speaker’s capacities cover an unbounded domain of linguistic objects (phrases, sentences etc.). Following Chomsky (1964) we can dub this fact “Linguistic Creativity” (LC).[3] I’ve already adverted to the second fact: any child can acquire any GL as easily as any other. Let’s dub this fact “Linguistic Promiscuity” (LP). Part of a LINGer’s account for LC postulates that native speakers have internalized a GL. GLs consist of generative procedures (recursive rules) that allow for the creation of unboundedly complex linguistic expressions (which partly explains how a native speaker effortlessly deals with the novel linguistic objects s/he regularly produces and encounters).
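
For readers who like to see such things run, here is a toy sketch of how a single recursive rule yields an unbounded set of expressions. It is purely illustrative: the two rules, the three-name vocabulary and the function name `sentence` are invented for this example and are not a proposal about the GL of English.

```python
# Toy illustration only: a two-rule "grammar" invented for this sketch.
#   S  -> NP VP
#   VP -> "left" | "thinks that" S     (the second option re-invokes S)

def sentence(depth: int) -> str:
    """Build one sentence with `depth` levels of clausal embedding."""
    subjects = ["Mary", "John", "Sue"]           # hypothetical vocabulary
    subject = subjects[depth % len(subjects)]
    if depth == 0:
        return f"{subject} left"                 # non-recursive VP
    # Recursive VP: embed a full sentence inside the verb phrase.
    return f"{subject} thinks that {sentence(depth - 1)}"

for d in range(3):
    print(sentence(d))
```

Each increase in `depth` gives a longer, distinct sentence, so a finite rule set covers an open-ended domain of expressions, which is all that the appeal to recursive rules is meant to secure here.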

LINGers account for the second fact, LP, in terms of the UG features of FL. This too is a partial account. UG delineates the limits of a possible GL. Among the possible GLs, the child builds an actual one in response to the linguistic data it encounters and that it takes in (i.e. the Primary Linguistic Data (PLD)).

So two facts, defining two questions and two kinds of theories, one delimiting the range of possible linguistic expressions for a given language (viz. GLs) and the other delimiting the range of possible GLs (viz. FL/UG). As should be evident, as a practical matter, in addressing LP it is useful to have to hand candidate generative procedures of specific GLs. Let me emphasize this: though it is morally certain that humans come equipped with an FL and build GLs it is an empirical question what properties these GLs have and what the fine structure of FL/UG is. In other words, that there is an FL/UG and that it yields GLs is not really open for rational debate. What is open for a lot of discussion and is a very hard question is exactly what features these mental objects have. Over the last 60 years GG has made considerable progress in discovering the properties of particular GLs and has reasonable outlines of the overall architecture of FL/UG. At least this is what LINGers believe, I among them. And just as the success in outlining (some of) the core features of particular Gs laid the ground for discovering non-trivial features of FL/UG, so the success in limning (some of) the basic characteristics of FL/UG has prepared the ground for yet one more question: why do we have the FL/UG that we have and not some other? This is the MP question. It is a question about possible FL/UGs.

There are several things worth noting about this question. First, the target of explanation is FL/UG and the principles that describe it. Thus, MP only makes sense qua program of inquiry if we assume that we know some things about FL/UG. If nothing is known, then the question is premature. In fact, even if something is known, it might be premature. I return to this anon. 

Second, the MP question is specifically about the structure of FL/UG. Thus, unlike earlier work where discussions of languistic interest can be used to obliquely address LC and LP, the MP question only makes sense from a LING perspective. It is asking about possible FL/UGs and this requires taking a mentalistic stance. Discussing languages and their various properties had better bottom out in some claim about FL/UG’s limits if it is to be of MP relevance.  This means that the kind of research MP fosters will often have a different focus from that which has come before. This will lead LANGers and LINGers to a more obvious parting of the investigative ways. In fact, given that MP takes as more or less given what linguists and languists have heretofore investigated as basic, MP is not really an alternative to earlier theory. More specifically, MP can’t be an alternative to GB because, at least initially, MP is a consumer of GB results.[4] What does this mean?

An analogy might help. Think of the relationship between thermodynamics and statistical mechanics. The laws of thermodynamics are grist for the statistical mechanics mill, the aim being to derive the thermodynamic generalizations in a more principled atomic theory of mechanics. The right way to think of MP and earlier theory is in the same way. Take (e.g.) GB principles and see if they can be derived in a more principled way. That’s one way of understanding the MP program, and I will elaborate this perspective in what follows. Note, if this is right, then just as many thermodynamical accounts of, say, gas behavior will be preserved in a reasonable statistical mechanics, so too many GB accounts will be preserved in a decent MP theory of FL. The relation between GB and MP is not that between a true theory and a false one, but a descriptive theory (what physicists call an “effective” theory) and a more fundamental one.

If this is right, then GB (or whatever FL/UG is presupposed) accounts will mostly be preserved in MP reconstructions. And this is a very good thing! Indeed, this is precisely what we expect in science; results of past investigations are preserved in later ones with earlier work preparing the ground for deeper questions. Why are they preserved? Because they are roughly correct and thus not mimicking these results (at least approximately) is an excellent indication that the subsuming proposal is off on the wrong track. Thus, a sign that the more fundamental proposal is worth taking seriously is that it recapitulates earlier results and thus a reasonable initial goal of inquiry is to explicitly aim to redo what has been done before (hopefully, in a more principled fashion).

If this is correct, it should be evident why many might dismiss MP inquiry. First, it takes as true what many will think contentious and tries to derive it. Second, it doesn’t aim to do much more than derive “what we already know” and so does not appear to add much to our basic knowledge, except, perhaps, a long labored (formally involved) deduction of a long recognized fact.

Speaking personally, my own work takes GB as a roughly correct description of FL/UG. Many who work on refining UGish generalizations will consider this tendentious. So be it. Let it be stipulated that at any time in any inquiry things are more complicated than they are taken to be. It is also always possible that we (viz. GB) got things entirely wrong. The question is not whether this is an option. Of course it is. The question is how seriously we should take this truism.

So, MP starts from the assumption that we have a fairly accurate picture of some of the central features of FL and considers it fruitful to inquire as to why we have found these features. In other words, MP assumes that time is ripe to ask more fundamental questions because we have reasonable answers to less fundamental questions. If you don’t believe this then MP inquiry is not wrong but footling.

Many who are disappointed in MP don’t actually ask if MP has failed on its own terms, given its own assumptions. Rather, they challenge the assumptions. They take MP to be not so much false as premature. They take issue with the idea that we know enough about FL/UG to even ask the MP question. I believe that these objections are misplaced. In other words, I will assume that GBish descriptions of FL/UG are adequate enough (i.e. are right enough) to start asking the MP question. If you don’t buy this, MP will not be to your taste and you might be tempted to judge its success in terms of your interests rather than its own questions.

[1] There are few more misleading terms in the field than “grammaticality judgment.” The “raw” data are better termed “acceptability” judgments. Native speakers can reliably rank linguistic objects with regard to relative acceptability (sometimes under an interpretation). These acceptability judgments are, in turn, partial reflections of grammatical competence. This is the official LING view. LANGers need not be as fussy, though they too must distinguish data reflecting judgments in reflective equilibrium from more haphazard reactions. The reason that LANGers differ from LINGers in this regard reflects their different views on what they are studying. I leave it to the reader to run the logic for him/herself.
[2] The term “set” should not be taken too seriously. There is little reason to think that languages are sets with clear in/out conditions or that objects that GLs generate are usefully thought of as demarcating the boundaries of a language. In fact, LINGers don’t assume that the notion of a language is clear or well conceived. What LINGers do assume is that native speakers have a sense of what kinds of objects their native capacities extend to, that this is an open ended (effectively infinite) capacity, and that it is (indirectly) manifest in their linguistic behavior (production and understanding of linguistic objects).
[3] Here’s Chomsky’s description of this fact in his (1964:7):
…a mature native speaker can produce a new sentence of his language on the appropriate occasion, and other speakers can understand it immediately, though it is equally new to them. Most of our linguistic experience, both as speakers and hearers, is with new sentences; once we have mastered a language, the class of sentences with which we can operate fluently is so vast that for all practical purposes (and, obviously, for all theoretical purposes), we may regard it as infinite.
[4] Personally, I am a big fan of GB and what it has wrought. But MP style investigations need not take GB as the starting point for minimalist investigations. Any conception of FL/UG will do (e.g. HPSG, RG, LFG etc.). In my opinion, the purported differences among these “frameworks” (something that this edited collection highlights) have been overhyped. To my eye, they say more or less the same things, identify more or less the same limiting conditions and do so in more or less the same ways. In other words, these differing frameworks are largely notational variants of one another, a point that Stabler (2010) makes as well.


  1. It's interesting that you use the thermodynamics metaphor to describe the relation between MP and GB. Remko Scha, one of the pioneers of probabilistic grammars, used it often to describe the relation between the level of description of the theoretical linguist and the messy, stochastic cognitive and neural processes underlying it. Things like temperature and pressure are very real (and different from each other) at the macroscopic level, but disappear when you descend to the microscopic level (where there's just movement of molecules). Similarly, categories, rules, grammaticality, are very real at the linguistic level, but disappear when you zoom in further.

    It is in this sense only that I can understand an expression like "ontologically more basic": moving molecules are more basic than temperature. I therefore don't really see how we can understand the difference in world views between LINGers and LANGers in degrees of 'ontological basicness' of various linguistic concepts. I'd think it wouldn't be difficult to agree on the ontological status of the set of Greenbergian vs the set of Chomskyan universals -- the disagreement would be about whether or not these sets are empty and what they contain. And, importantly, about how we find out, but that would be a discussion about what is *epistemologically* more basic.

    1. Less a metaphor and more an analogy. But like all analogies it has to be handled carefully. Where I think it fits is that thermodynamics is a phenomenological theory. It captures generalizations that a more fundamental theory aims to explain. This is the way I think of the relation between GB and MP: the former sets the generalizations that the latter should derive in some principled manner. So, I don't really see the analogy the way Scha and you do. I think there are rules and (most likely) categories and, the Grammar being a real object, some conception of grammaticality makes sense. On the other hand, notions like binding domain or c-command or controller are, at best, descriptive terms, not basic terms of art.

      By ontologically more basic I mean that LINGers take Gs as more fundamental. Sentences have the features they have because they are objects generated by Gs with certain properties. G properties are more invariant, less context sensitive and etiologically more fundamental. Sentences are complex objects whose properties are the result of the interaction of many different sub-systems only one of which is the grammar. The problem, IMO, with Greenberg generalizations is that they are summaries of surface forms that language has and are likely to be tracking non-natural properties. Greenberg Universals are summaries of what we have seen. Chomsky universals are principles that determine what properties a G can have whether it is seen or not.

      I don't really think that the notion of epistemologically more basic makes much sense here. We use data (among other things) to divine the abstract properties of the underlying principles. What you see before your eyes is, perhaps, more epistemologically accessible. But if history is any guide, this kind of data is most often misleading. I think the same holds in linguistics. What you can "see" is likely to be misleading. It's the only way to go, but as things progress we start manufacturing the data so that it is refined enough to address the theoretical questions that deal with the fundamental questions. In sum, I follow Fodor here in warning against confusing ontological questions with epistemological ones. The latter are, at least as concerns the basics, not enlightening.

    2. I think it's more interesting than "Thermodynamics is a phenomenological theory, statistical mechanics is an explanatory theory". At some level, every theory is phenomenological. Underlying a theory that postulates molecules with positions in time, there is a more fundamental theory that uses atoms, quarks, waves, or what not, where the basic particles have no definite position anymore. So, the most interesting thing about the analogy is that it makes us aware that the basic building blocks we need to assume at one level of description may not be clearly delineated objects at a lower level.

      I think this is a point where I have often disagreed with your analysis: when you, e.g., call on neuroscientists to look for the stack when trying to understand how the brain processes grammar, you underestimate, I think, how difficult it might be to recognize whatever it is at the neural level that might look more or less like a stack memory at a linguistic level of description.

      On the other points, I think we largely agree. Yes grammars (whatever they are) are more basic than sentences. I just wondered whether this is really the point where the worldviews of LINGers and LANGers diverge.

      I also don't disagree with what you write about epistemology (although, as a rule of thumb, I prefer going in the reverse direction from Fodor :)). The reason I brought up epistemology is that I see disagreements about what counts as evidence, and what you do when the evidence is inconclusive, as the issues that define the major camps in linguistics. At some point, in every serious discussion, all sides agree that there still are great mysteries about how language is used, learned, and how it evolved... and then they fall back on some basic assumptions that they hold to be self-evidently true.

    3. Well, every theory but the last will not be fundamental. But I think that some theories do not pretend to be fundamental, and thermodynamics was one of these, I think. It established relations among real magnitudes without trying to explain why/how they held. But, I am no expert in these matters, so you may be right.

      I also agree that one interesting thing about the analogy is that magnitudes etc that the less fundamental level takes as given will be less clear cut at the more fundamental level. The way reduction/explanation works is usually by restricting the ontology and unifying what looks disparate. Lumping is the name of the game. Hence it is likely that the more fundamental levels will cut things up differently in order to unify them. I think that this will be so in linguistics as well, at least if my conception of minimalism turns out to be roughly on the right track.

      I also agree that finding stacks will not be easy. But it might not be impossible either if you are looking for one. Geneticists found the code that DNA embodies by looking for it. It was hard, but it is being done. I think that one of the more interesting criticisms of current neuroscience by Gallistel is that if you are looking for things like addresses, and read/write memory and variables then you should NOT look at connections because you won't/can't find them there. This is an interesting argument. So if we have evidence for these kinds of things (and we do) then this implies that physical models that cannot accommodate them are likely wrong. Ok, so they are. The problem is not so much that they are hard to find (though they probably are) but that they are impossible to find if you don't look for them.

      Last point: of course you fall back on things you take to be self-evidently true, or at least true enough to hold fixed. What else can you do? What I think is odd is that linguists do not really believe (or many don't) that we have made progress over the last 60 years. I find this incredible. Why do I say this? Because many are unwilling to take anything for granted. And for me this is terrible, because I think that doing Minimalism REQUIRES holding the results of the last 60 years as more or less right. Don't do this and you can do nothing at all. So, the problem is a failure of nerve. We need more guts!
