Wednesday, March 5, 2014

Minimalism's program and its theories

Is there any minimalist theory and if so what are its main tenets? I ask this because I have recently been reading a slew of papers that appear to treat minimalism as a theoretical “framework,” and this suggests that there are distinctive theoretical commitments that minimalist analyses make that render them minimalist (just as there are distinctive assumptions that make a theory GBish, see below). What are these and what makes these commitments “minimalist”? I ask this because it starts to address a “worry” (not really, as I don’t worry much about these things) that I’ve been thinking about for a while: the distinction between a program and a theory. Here’s what I’ve been thinking.

Minimalism debuted in 1993 as a program, in Chomsky’s eponymous paper. There is some debate as to whether chapter 3 of the “Black Book” (BB) was really the start of the Minimalist Program (MP) or whether there were already substantial hints about the nature of MP earlier on (e.g. in what became chapters 1 and 2 of BB).  What is clear is that by 1993, there existed a self-conscious effort to identify a set of minimalist themes and to explore these in systematic ways. These tropes divided into at least two kinds, when seen in retrospect.[1]

First, there was the methodological motif. MP was a call to critically re-examine the theoretical commitments of earlier theory, in particular GB.[2] The idea was to try and concretize methodological nostrums like “simple, elegant, natural theories are best” in the context of then extant syntactic theory. Surprisingly (at least to me)[3], Chomsky showed that these considerations could have a pretty sharp bite in the context of mid 90s theory. I still consider Chomsky’s critical analysis of the GB levels as the paradigm example of methodological minimalism. The paper shows how conceptual considerations impose different burdens of proof wrt the different postulated levels. Levels like PF (which interfaces with the sound systems (AP)) or LF (which interfaces with the belief (CI) systems) need jump no empirical hurdles (they are “virtually conceptually necessary”) in contrast to internal levels like DS and SS (which require considerable empirical justification). In this instance, methodological minimalism rests on the observation that whereas holding that grammars interface with sound and meaning is a truism (and has been since the dawn of time), postulating grammar internal levels is anything but. From this it trivially follows that any theory that postulates levels analogous to DS and SS faces a high burden of proof. This reasoning is just the linguistic version of the anodyne observation that saying anything scientifically non-trivial requires decent evidence. In effect, it is the observation that PF and LF are very uninteresting levels while DS and SS are interesting indeed.

One of the marvels, IMO, of these methodological considerations is that they led rather quickly to a total reconfiguration of UG (eliminating DS and SS from UG is a significant theoretical step) and induced a general suspicion of grammar internal constructs beyond the suspect levels. In addition to DS and SS the 93 paper cast aspersions on traces (replacing them with copies), introduced feature checking, and suggested that government was a very artificial primitive relation whose central role in the theory of grammar called for serious reconsideration.

These themes are more fully developed in BB’s chapter 4, but the general argumentative outlines are similar to what we find in chapter 3. For example, the reasoning developing Bare Phrase Structure has a very similar structure to that concerning the elimination of DS/SS. It starts with the observation that any theory of grammar must have a combination operation (merge) and then goes on to outline what is the least we must assume concerning the properties of such an operation given widely accepted facts about linguistic structures. The minimal properties require little justification. Departures from them do. The trick is to see how far we can get making only anodyne assumptions (e.g. grammars interface with CI/AP, grammars involve very simple rules of combination) and then requiring that what goes beyond the trivial be well supported before being accepted. So far as I can see, there should be nothing controversial in this form of argument or the burdens it places on theory, though there has been, and continues to be, reasonable controversy about how to apply it in particular cases.[4]

However, truth be told, methodological minimalism is better at raising concerns than delivering theory to meet them. So, for example, a grammar with Merge alone is pretty meager. Thus, to support standard grammatical investigation, minimalists have added technology that supplements the skimpy machinery that methodological minimalism motivates.

A prime example of such is the slew of locality conditions minimalists have adopted (e.g. minimality and phase impenetrability) and the feature inventories and procedures for checking them (Spec-X0, AGREE via probe-goal) that have been explored. Locality conditions are tough to motivate on methodological grounds. Indeed, there is a good sense in which grammars that include locality conditions of various kinds and features of various flavors licensed by different feature checking operations are less simple than those that eschew these. However, to be even mildly empirically adequate any theory of grammar will need substantive locality conditions of some kind. Minimalists have tried to motivate them on computational rather than methodological grounds. In particular, minimalists have assumed that bounding the domain of applicable operations is a virtue in a computational system (like a grammar) and so locality conditions of some variety are to be expected to be part of UG. The details, however, are very much open to discussion and require empirical justification.

Let me stress this. I have suggested above that there are some minimalist moves that are methodological defaults (e.g. no DS/SS, copies versus traces, some version of merge). The bulk of current minimalist technology, however, does not fall under this rubric. Its chief motivations are computational and empirical. And here is where we move from minimalism as program to minimalism as theory. Phase theory, for example, does not enjoy the methodological privileges of the copy theory. The latter is the minimal way of coding for the evident existence of non-local dependencies. The former is motivated (at best) in terms of the general virtues of local domains in a computational context and the specific empirical virtues of phase based notions of locality. Phase Theory moves us from the anodyne to the very interesting indeed. It moves us from program to theory, or, more accurately, theories, for there are many ways to realize the empirical and computational goals that motivate phases.

Consider an example: choosing between a minimalist theory that includes the first more local version of the phase impenetrability condition (PIC1) or the second more expansive one (PIC2). The latter is currently favored because it fits better with a probe-goal technology given data like inverse nominative agreement in Icelandic quirky case clauses. But this is hardly the only technology available and so the decision in favor of this version of the PIC is motivated on neither general methodological nor broadly computational grounds. It really is an entirely empirical matter: how well does the specific proposal handle the relevant data? In other words, lots of current phase theory is only tangentially related to the larger minimalist themes that motivate the minimalist program. And this is true for much (maybe most) of what gets currently discussed under the rubric of minimalism.

Now, you may conclude from the above that I take this to be a problem. I don’t. What may be problematic is that practitioners of the minimalist art appear to me not to recognize the difference between these different kinds of considerations. So for example, current minimalism seems to take Phases, PIC2, AGREE under Probe-Goal, and Multiple Spell Out (MSO) as defining features of minimalist syntax. A good chunk of current work consists in tweaking these assumptions (which heads are phases?, is there multiple agree?, must probes be phase heads?, are the heads relevant to AP MSO identical to CI MSO?, etc.) in response to one or another recalcitrant data set. Despite this, there is relatively little discussion (I know of virtually none) of how these assumptions relate to more general minimalist themes, or indeed to any minimalist considerations. Indeed, from where I sit, though the above are thought of as quintessentially minimalist problems, it is completely unclear to me how (or even if) they relate to any of the features that originally motivated the minimalist program, be they methodological, conceptual or computational. Lots of the technology in use today by those working in the minimalist “framework” is different from what was standard in GB (though lots only looks different, phase theory, for example, being virtually isomorphic to classical subjacency theory), but modulo the technology, the proposals have nothing distinctively minimalist about them. This is not a criticism of the research, for there can be lots of excellent work that is orthogonal to minimalist concerns. However, identifying minimalist research with the particular technical questions that arise from a very specific syntactic technology can serve to insulate current syntactic practice from precisely those larger conceptual and methodological concerns that motivated the minimalist program at the outset.

Let me put this another way: one of the most salutary features of early minimalism is that it encouraged us to carefully consider our assumptions. Very general assumptions led us to reconsider the organization of the grammar in terms of four special levels and reject at least two and maybe all level organized conceptions of UG. It led us to rethink the core properties of phrase structure and the relation of phrase structure operations to displacement rules. It led us to appreciate the virtues of the unification of the modules (on both methodological and Darwin’s Problem grounds) and to replace traces (and, for some (moi), PRO) with copies. It led us to consider treating all long distance dependencies regardless of their morphological surface manifestations in terms of the same basic operations. These moves were motivated by a combination of considerations. In the early days, minimalism had a very high regard for the effort of clarifying the proffered explanatory details. This was extremely salutary and, IMO, it has been pretty much lost. I suspect that part of the reason for this has been the failure to distinguish the general broad concerns of the minimalist program from the specific technical features of different minimalist theories, thus obscuring the minimalist roots of our theoretical constructs.[5]

Let me end on a slightly different note. Programs are not true or false. Theories are. Our aim is to find out how FL is organized, i.e. we want to find out the truth about FL. MP is a step forward if it helps promote good theories. IMO, it has. But part of minimalism’s charm has been to get us to see the variety of arguments we can and should deploy and how to weight them. One aim is to isolate the distinctive minimalist ideas from the others, e.g. the more empirically motivated assumptions. To evaluate the minimalist program we want to investigate minimalist theories that build on its leading ideas. One way of clarifying what is distinctively minimalist might be by using GB as a point of comparison. Contrasting minimalist proposals with their GBish counterparts would allow us to isolate the distinctive features of each. In the early days, this was standard procedure (look at BB’s Chapter 3!). Now this is rarely done. I suggest we start re-integrating the question “what would GB say” (WWGBS) back into our research methods (here) so as to evaluate how and how much minimalist considerations actually drive current theory. Here’s my hunch: much less than the widespread adoption of the minimalist “framework” might lead you to expect.

[1] Actually there is a third: in addition to methodological and computational motifs there exist evolutionary considerations stemming from Darwin’s Problem. I won’t discuss these here.
[2] Such methodological minimalism could be applied to any theory. Not surprisingly, Chomsky’s efforts were directed at GB, but his methodological considerations could apply to virtually any extant approach.
[3] A bit of confession: I originally reacted quite negatively to the 93 paper, thinking that it could not possibly be either true or reasonable. What changed my mind was an invitation to teach a winter course in the Netherlands on syntactic theory during the winter of 93. I had the impression that my reaction was the norm, so I decided to dedicate the two weeks I was teaching to defending the nascent minimalist viewpoint. Doing this convinced me that there was a lot more to the basic idea than I had thought. What really surprised me is that taking the central tenets even moderately seriously led to entirely novel ways of approaching old phenomena, including ACD constructions, multiple interrogation/superiority, and QR. Moreover, these alternative approaches, though possibly incorrect, were not obviously incorrect and they were different. To discover that the minimalist point of view could prove so fecund, given what appear to be such bare bones assumptions, still strikes me as nothing short of miraculous.
[4] As readers may know, I have tried to deploy similar considerations in the domains of control and binding. This has proven to be very controversial but, IMO, not because of the argument form deployed but due to different judgments concerning the empirical consequences. Some find the evidence in favor of grammar internal formatives like PRO to meet the burden of proof requirement outlined above. Some do not. That’s a fine minimalist debate.
[5] I further suspect that the field as a whole has tacitly come to the conclusion that MP was actually not a very good idea, but this is a topic for another post.


  1. One problem I have with minimalist conclusions is that many of them are not grounded in anything objective. Take for example "the [copy theory] is the minimal way of coding for the evident existence of non-local dependencies." Certainly, it is one way of so doing, but why is it more minimal than, say, type raising, composition, and application, or tree adjunction (i.e. second order substitution), or unification? Even if we play your game, we can set Merge(A,B) = {A,B} and Move(A) = Merge(A,A) = {A}, and then we recover exactly the MG derivation trees, which we know allow for non-local dependencies. However, this encoding has no copying at all. (Move=self-merge does not need to explicitly indicate what will move, as this is fixed given the rest of the expression.) That certainly seems more 'minimal' to me. But again, why should we prefer set notation to anything else? No one would claim that the choice of orthography for set brackets (round, curly, square) should matter. I would say this is because there is an obvious isomorphism between these different choices, and that we cannot distinguish between isomorphic models of our data. But there is also an obvious isomorphism between set notation and graph notation. How can we balk here? For me, part of the appeal of questions of computational complexity, and expressive power, lies in their representation independence. Here we can establish objective properties of our theories which do not hinge on arbitrary notational choices.

    1. Your questions highlight what I was trying to focus on, so thanks. As you may have noticed (though maybe not) I tried to set the Minimalist discussion in the context of GB. The original papers had the aim of conceptually slimming down GB. GB was taken as the starting point, and the argument was that the aims of GB could be accomplished without a lot of the grammar internal apparatus, SS, DS, traces, government etc. I found these arguments quite compelling, because I found GB to be a pretty good story. If you do not share this sentiment, then minimalism will tend to irritate you, as I see it has.

      Your question is whether these are more "minimal" than other approaches. I have no idea, for I have no idea what this could mean. Here it seems we are in the same boat. There is an interpretation of the program that I do not share that takes it as obvious that there is some absolute conception of "simple," "minimal," "elegant." Maybe there is but I don't know what that is. However, in the given GB context of yore, these notions had a real grip, and still do.

      You are right that we seem to prize very different things. I am not interested very much in "representation independence." I have no problem with observing that many of our "frameworks" are effectively notational variants. But expressive power issues do not really grab me. What does grab me is something that answers Plato's, Darwin's, etc. problems. I am interested in the systems to the degree that they shed light on these issues.

      Last point: If you are interested in how I think about these problems, I have written about it a bit in various places, e.g. the introduction to 'Move!' and 'A Theory of Syntax.' My take on those matters is not everyone's, but it is mine. What you will notice should you look at these is that I have always thought that minimalist injunctions only make sense in very specific settings. And that, IMO, allows them to have some bite.
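
For readers who want to see the set-based encoding from the first comment concretely (Merge(A,B) = {A,B} and Move(A) = Merge(A,A) = {A}), here is a minimal Python sketch. It assumes syntactic objects can be modeled as strings or frozensets; the function names `merge` and `move` and the toy lexical items are illustrative choices, not part of any formal proposal discussed above. It shows only that self-merge collapses to a singleton set, so nothing is copied.

```python
def merge(a, b):
    """Merge as bare set formation: Merge(A, B) = {A, B}."""
    return frozenset({a, b})


def move(a):
    """Move as self-merge: Move(A) = Merge(A, A) = {A}."""
    return merge(a, a)


# Toy lexical items (hypothetical, for illustration only).
the, book = "the", "book"

dp = merge(the, book)   # the set {the, book}
moved = move(dp)        # the singleton set {{the, book}}

print(len(dp))      # two members
print(len(moved))   # one member: self-merge adds no second occurrence
```

Because a set cannot contain the same member twice, `move(dp)` yields a one-element set rather than a structure holding two occurrences of `dp`. Whether this counts as "more minimal" than the copy theory is exactly the point under dispute in the thread; the sketch takes no side.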