
Monday, March 18, 2013

How I See The Minimalist Project


On reading through the comments to various posts, I have developed a feeling that my take on current minimalist research is different from the norm. Most likely mine is idiosyncratic. For that very reason, I want to outline how I see things so that others can compare their views to this one (should they want to, of course). Along the way, I will discuss what I think makes Minimalism different from earlier programs. Again, my suspicion is that I see things somewhat differently from others, and this may provoke a useful discussion. Let’s see.

Here’s how I interpret the minimalist project. It starts with the acknowledged success of GB. Some may want to stop reading here, for they do not think that GB has been successful. Good. If that is your judgment, then here’s a good place to stop reading. At any rate, GB was a success.[1] How so? It identified a dozen or so relatively robust non-trivial generalizations about linguistic dependencies. These generalizations reflect the innate structure of FL. Indeed, they are (a good part of what we know about) UG. The Minimalist Project takes it as given that these generalizations are indeed roughly accurate and proceeds to ask an additional question: why these generalizations and not others? In other words, it constitutes the project of trying to explain why FL has the GB properties it does. Hopefully, it is clear why the judged success of GB is a sine qua non for this project.

So, what is it to explain why something has the properties it has? Well, it amounts to asking whether a given description is fundamental. My physics friends have names for this (of course). They distinguish two kinds of theories: effective theories vs fundamental ones. The distinction is not meant to be invidious. Effective theories are very highly prized, even if not fundamental. Indeed, they are generally seen to be critical steps towards deeper theories (cf. discussion by Mitra here on thermodynamics and Landau’s theories). Here are some famous ones: Newtonian mechanics, classical thermodynamics, Maxwell’s theory of electromagnetism, the Standard Model of particle physics. Pretty fancy stuff, so saying that some theory is effective is hardly an insult. However, despite their grandeur, these theories are not believed to be fundamental. What are examples of more fundamental accounts? Well, Newton’s laws, as we all know, are a special case of Einstein’s that holds when the relevant velocities are small compared to the speed of light. In particular, Newton’s laws are limiting cases of Einstein’s. Another useful example is the relation between the Ideal Gas Laws and statistical mechanics. To my mind, Minimalism is to GB as statistical mechanics is to the Ideal Gas Laws. Just as the Ideal Gas Laws are a pretty good (though far from perfect) description of regularities that hold among pressure, temperature and volume, so too is GB a pretty good (though not perfect) description of the movement/binding/case/etc. dependencies we find in Natural Language grammars. And just as statistical mechanics aims to explain these regularities in terms of the aggregate mechanics of very small, very hard atom-like particles banging into one another, so too the Minimalist Program aims to isolate the fundamental operations (e.g. Merge) and principles (e.g. minimality) in terms of which it is possible to derive the GB-identified grammatical regularities. So, one minimalist project, in my mind the empirically most tractable one given standard linguistic techniques, is to show how the heterogeneous primitives, principles and operations of GB can be unified into a small fundamental core of basic operations and principles.[2]

In my humble (well, not so humble) opinion, this project has had some notable high points.  Let me describe some (and I do mean some). GB has various modules embodying distinctive relations, locality conditions and licensing conditions, viz. Case theory is different from binding theory which is different from control theory which is different from movement theory which is different from phrase structure theory. In my view, one minimalist achievement has been to outline ways of unifying these dependencies by showing how they are all underlyingly reflections of a small common core of operations, viz. Merge. For example:

·      Chomsky 1993 analyzed case as a special instance of movement (i.e. I-merge, case lives on A-chains).
·      Zwart, Idsardi and Lidz, Baltin, Reuland, Drummond, Hicks, and I have proposed that local anaphora and obligatory control live on A-chains and so anaphora and control are also special cases of an A-chain dependency (hence the product of I-merge).[3]
·      Kayne, and Drummond, Kush and Hornstein have proposed treating pronominal binding as a kind of A’-dependency (i.e. a product of I-merge).[4]
·      Kayne has analyzed Weak and Strong Crossover effects in terms of movement (aka, I-merge).
·      Boskovic and Richards have argued that Superiority effects reduce to a species of minimality (i.e. a condition on I-merge).
·      Pesetsky and Torrego have analyzed fixed subject effects in terms of derivational economy.
·      Chomsky 2004 has proposed that movement (aka I-merge) and phrase structure (aka E-merge) are actually products of a single Merge operation.

The achievement is theoretical in that this work unifies apparently disparate grammatical domains.[5] However, there have been novel empirical by-products as well, e.g. the scope effects of case discussed by Lasnik and Saito (building on earlier observations by Postal), backwards control and raising phenomena as discussed by Polinsky and Potsdam, and finite control phenomena discussed by Ferreira, Nunes, Rodrigues, Landau, Seely. The net effect of all of this has been to offer a coherent picture in which the core grammatical relations of GB come together as effects of Merge plus whatever general conditions regulate its application (e.g. extension, minimality, inclusiveness, phases, last resort, LCA, and economy). In other words, theta marking, case checking, local and non-local antecedence, raising, control, crossover, question formation, relativization, X’-theory, etc. are all special cases of Merge.[6] If on the right track, this provides fundamental Merge-based explanations for effective GB. And, if even roughly correct, this is a hell of a story, a story with interesting biolinguistic implications for the evolution of language and its neurological realization.[7] What I personally find remarkable is that the last 15 years of syntactic research have made a pretty decent case that this kind of picture is not at all implausible. Please note: showing that a very ambitious proposal is not implausible is a very respectable scientific goal. Of course, it would be even better if we could demonstrate (i.e. provide more and more evidence) that this unification is correct. All in good time!
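For readers who like to see things spelled out, here is a toy sketch of the idea that E-merge and I-merge are one operation. It is only an illustration under loud assumptions: syntactic objects are modeled as strings (lexical items) or unordered sets, and the helper names `merge`, `contains` and `merge_kind` are hypothetical, invented for this sketch, not anyone's official formalization. The point is just that a single set-forming operation yields "phrase structure" when its inputs are independent and "movement" when one input is already contained in the other.

```python
# Toy model (illustrative assumption, not a formalization from the literature):
# a syntactic object is a lexical item (str) or a frozenset built by merge.

def merge(a, b):
    """Form the unordered set {a, b}; this is the only structure-building step."""
    return frozenset({a, b})

def contains(obj, part):
    """Return True if part is obj itself or occurs somewhere inside obj."""
    if obj == part:
        return True
    if isinstance(obj, frozenset):
        return any(contains(member, part) for member in obj)
    return False

def merge_kind(a, b):
    """Label a merge step: 'internal' if one input is a proper part of the other,
    otherwise 'external'. The operation itself is the same in both cases."""
    if a != b and (contains(a, b) or contains(b, a)):
        return "internal"
    return "external"

# "what" is first merged as the object of "saw" (external merge) ...
vp = merge("saw", "what")
tp = merge("John", vp)
# ... and then merged again with the structure that already contains it.
cp = merge("what", tp)

print(merge_kind("saw", "what"))  # external
print(merge_kind("what", tp))     # internal
```

Nothing here captures extension, minimality, labels or linear order; those would be the "general conditions" regulating Merge, and they are deliberately left out of the sketch.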

Let me make a few ancillary points.

First, this kind of project of unification has antecedents in previous generative research. The unification of islands in terms of subjacency/barriers is a project of the very same sort. The unification of island phenomena has, I believe, been more or less vindicated, most recently in the careful work of Jon Sprouse. What Jon has shown is not only that islands are almost surely grammatical structure effects (and not performance complexity effects) but that the islands display a pretty unified acceptability profile when violated. Thus we find what a unified account of islands would lead you to expect. Nice. What I want to emphasize here is that it also provides a nice paradigm within generative research for the kind of minimalist project limned above.

Second, there remain large parts of earlier GB that have proven to be minimalistically refractory. For example, we don’t really have a good alternative account of the argument/adjunct asymmetries that the Lasnik-Saito theory was designed to track. There are some sketches, but they fall short, in my view, of being more than suggestive. There are also a large number of questions that GB ignored, e.g. we have no theory of grammatical features or categories. So, even were the unification proposed above entirely successful, there would remain many unasked and unanswered questions to investigate.

Third, the project of unification also prompts another question, the significance of which has, in my view, been misinterpreted.  There is a distinctively minimalist question: which features of FL are distinctively linguistic? This is the “beyond explanatory adequacy” (BEA) question. Note that it only gains traction in a post GB context, one in which the aforementioned project of unification has proven successful. Why? Because the fine structure of GB is obviously linguistically tuned.  The relations that the modules traffic in (government, c-command, specified subject condition, antecedent government) have no obvious (or non-obvious) analogues in other domains of cognition.  However, if the relevant modules can all be unified as (e.g. a species of) Merge, then the question of what remains of FL that is specifically linguistic gains traction. Are any of the basic operations (e.g. Merge) or conditions on its application (e.g. extension) specifically linguistic or are they simply the reflections in the domain of language of more general cognitive-biological-computational principles?

There are evolutionary reasons (weak reasons actually, but reasons nonetheless) to hope that there is not much that is cognitively distinctive about UG, for the more parochial FL’s principles and primitives, the more work we make for evolutionary accounts that must bridge the divide between the grammatically endowed, i.e. us, and the grammatically deprived, i.e. every other animal. As it is a mitzvah to lighten Darwin’s load to the greatest degree possible, all minimalists (mensches all) hope and pray that FL is more cognitively cosmopolitan than GB envisages. But (you knew this was coming, right?) this is a worthwhile project just in case a sufficient number of the peculiar properties we find in NL are actually explicable in these more general terms. Here’s what’s not interesting: declaring that Merge is the only distinctive linguistic operation and pronouncing that anything not reducible to Merge is thereby not a linguistic property, but the product of some as yet unidentified interface condition. This is a game we can be assured of winning and, hence, not very interesting. The declaration that anything that Merge can’t deal with isn’t properly part of FL but is rather an interface effect may, in fact, be right. However, relocating something to the interface is not in and of itself doing much of interest. To be interesting requires showing how the specific properties of the interface explain the “linguistic” properties of interest. Minus this last step, what we have is a kind of actuarial minimalism, which, in my view, is an obstacle to serious research. Minimalism is not an exercise in balancing the grammatical books, or at least, no version of minimalism that mainly worries about this is one that I am interested in.

Getting back to the "misinterpretation" alluded to a paragraph back, many have concluded that if minimalism is right then there is no distinctive FL module. However, this is to confuse distinctive with parochial. As Chomsky has emphasized, and I agree, it’s obvious that there is something special about our linguistic capacities. Nothing does language like we do. The problem is to understand the fine structure of this capacity. One hunch has been that this capacity lives on very specialized linguistic “circuits.” Minimalism is questioning this view, and to the degree that unification works, we might be able to whittle down (possibly to zero) what is specifically linguistic in this sense. However, even if all the basic operations and principles are cognitively-biologically-computationally generic, it is clear that they have been put together in humans (and it appears in humans alone) in a very special way. One job of generative grammar is to discover the fine structure underlying linguistic competence, i.e. the structure of UG. It is currently an open question how generic UG is. The minimalist bet is that it is, well, minimal. It is no mean feat to have shown that the project is both viable and plausible, but we are still far from being able to declare the bet ours. Furthermore, even if FL has no dedicated operations or basic principles, there is little reason to question that it is cognitively unique, and figuring out how it’s put together from recycled parts is still a very hard unanswered problem.

Let me end: From where I sit, Minimalism has been a raging success.  It has deepened our understanding of UG and has added new questions (or made old ones more prominent) to the research agenda.  However, like all progressive shifts in the larger Generative research program, seen correctly (i.e. from my perspective), minimalism has been far more conservative than it is often advertised to be. And this is a good thing! It’s the hallmark of a progressive research program that earlier results lay the foundations for novel deeper inquiries. Minimalism has built on and conserved and deepened our prior theories, just as one should expect from a thriving program. Respect for one’s elders is a mark of good science. So minimalists, when throwing out bathwater, look out for the babies.



[1] Again, let me make clear, that ‘GB’ here includes its kissing cousins LFG, HPSG, RG, etc.  To my mind there is very little that empirically or conceptually distinguishes these various “frameworks.” Indeed thinking of these as different “frameworks” is analogous to thinking that a ling paper written in French is done in a different “framework” from one written in English or Japanese.
[2] What I mean by “standard linguistic techniques” are the ones that syntacticians are wont to deploy in their daily work.  There are other minimalist questions (see below) that will require developing a different armamentarium, currently more common in biology, psychology, CS and who knows where else. 
[3] Or its functional equivalent, viz. the A-type Agree dependency.
[4] This and the treatment of local antecedence as an A-chain dependency have the effect of deriving the complementary distribution of reflexivization and pronominalization.
[5] I hope it goes without saying that there are many empirical “puzzles” yet to resolve, as is to be expected from a project of unification.
[6] Jan Koster in the comments has made fun of the “app” like nature of merge. I am not sure why. However, what’s interesting is not whether merge itself is trivial (frankly from one point of view, one hopes that it is) but whether this trivial “addition” can in the right context serve to unify the disparate operations and principles of GB. Showing that this is possible is definitely not trivial. Showing that it is actual would be an amazing achievement.
[7] The interested can see some slight discussion of this by me here.

10 comments:

  1. On my various browsers, something has gone wrong with some of the fancier coding here, lots of '

  2. Norbert,
    A comment about the following:

    "Zwart, Idsardi and Lidz, Baltin, Reuland, Drummond, Hicks, and I have proposed that local anaphora and obligatory control live on A-chains and so anaphora and control are also special cases of an A-chain dependency (hence the product of I-merge).[3]"

    As far as I know the essence of the idea in question was, much to Chomsky's dismay, first proposed by me some 30 years ago. It was published in Linguistic Inquiry 15 (1984), 417-459, under the title *On Binding and Control* and, with some modifications, in my book *Domains and Dynasties* (1987, Ch 3) as *Anaphoric and Non-Anaphoric Control.* I built on some observations by Edwin Williams and the idea was also picked up by Bouchard (1984). All of this was pre-Minimalist.

    As for Merge as an app (re: your note 6), I didn't say that. I said almost the opposite, namely that language is an app of Merge (no matter whether Merge is innate or not). In that view, saying that Merge is the Faculty of Language in the Narrow sense (FLN) is just as strange as saying that the engine of a car is the Car in the Narrow sense (CN). Primary linguistic functionality comes from the (invented) lexicon, not from Merge.

    Replies
    1. Jan is right about the pedigree. Sorry I forgot. I was alluding to those pushing HIS idea in a minimalist context. As excuse, please note that I note his precedence in my own work. So sorry.

      It also seems that I got the app allusion wrong. As Jan puts it now, I think we agree, language is what pops out when MERGE or something like it, gets added to the cognitive architecture we had before. The trick is to demonstrate this. That's the research project.

      Last point of agreement: merge is not FL, a point I tried to make in the post. If this is what the term 'FLN' was trying to say, then it simply confused matters. I read the point slightly differently, i.e. that what is strictly linguistic might be merge. But on the substance, I am happy with Jan's version.

      So sorry and mea culpa.

  3. Thanks, Norbert, you are officially forgiven!

  4. I have read this post with interest and, for the most part, it was what I would have expected. However, two statements surprised me a bit and [before jumping to conclusions] I hope to get some clarification on them.

    1. "Again, let me make clear, that ‘GB’ here includes its kissing cousins LFG, HPSG, RG, etc. To my mind there is very little that empirically or conceptually distinguishes these various “frameworks.” Indeed thinking of these as different “frameworks” is analogous to thinking that a ling paper written in French is done in a different “framework” from one written in English or Japanese."

    Paul Postal has [quite patiently] explained to me that and why a translation from GB to RG could not work like one from say English to German. You seem to think otherwise. Maybe you could provide a few examples demonstrating how such translations can be done?

    2. "One hunch has been that this capacity lives on very specialized linguistic “circuits.” Minimalism is questioning this view, and to the degree that unification works, we might be able to whittle down (possibly to zero) what is specifically linguistic in this sense. However, even if all the basic operations and principles are cognitively-biologically-computationally generic, it is clear that they have been put together in humans (and it appears in humans alone) in a very special way. "

    Here the second part [all operations are generic but they are put together in a 'human-specific' way] seems to be compatible with what 'connectionists' [e.g. Elman] or 'constructivists' [e.g. Tomasello] have been saying for decades. It even seems compatible with what Everett has proposed. Considering the strong opposition minimalists have to those views I'd love to get some more details here. Is 'how they have been put together' something that is 'innate' on the minimalist's view, or how else would the minimalist 'option' be different from the views I mention?

    Finally, it seems that linguistics blogs are suddenly the thing! I see there's a new, very interesting and informative one by Pieter Seuren and think it might be of interest to some of the 'regulars' http://pieterseuren.wordpress.com/2013/02/28/mickey-mouse-linguistics/

  5. Probably my biggest issue with the MP is the binarism assumption; there's a fair amount of evidence lying around that some kind of headed binary composition is important, but that's far from establishing that that's all there is.

    More generally, I think it's a mistake to attempt to deduce as much as possible from a small number of principles which don't have truly overwhelming empirical support. A sample bad effect would be the peculiar resistance that Minimalists seem to have to paying attention to case-stacking phenomena; this should have been on people's radar from the time of the appearance of Nick Evans' grammar of Kayardild & Franz Plank's 'Double Case' book in 1995, even more so after Rachel Nordlinger's 1998 book came out, but Baker's 2008 book has not a syllable about it, and the proposal he makes for concord seems to me to be hopelessly trashed by the phenomena.
    Since Baker is very typologically aware, I find it hard to understand this as anything other than theory-induced blindness.

    There are also of course plenty of successes; my preference would be to have had the original papers with their scary warnings about how speculative it all was, but no Chomsky, Hauser & Fitch (2002).

  6. Avery, re: "the peculiar resistance that Minimalists seem to have to paying attention to case-stacking phenomena", try:

    Richards, Norvin. 2012. Lardil ‘‘case stacking’’ and the timing of case assignment. Syntax. [DOI: 10.1111/j.1467-9612.2012.00169.x]

    Pesetsky, David. 2012. Russian case morphology and the syntactic categories. http://ling.auf.net/lingbuzz/001120 (the version to appear as an MIT Press LI Monograph has more material about Lardil in it)

    Levin, Theodore. Korean Nominative Case-Stacking: A Configurational Account. To appear in Japanese/Korean Linguistics 22 http://dl.dropbox.com/u/62481330/JK22%20Proceedings%20(pre-pub).pdf -- and other work in progress by him

    I'm sure there's more, but this is work on my personal radar screen, for obvious reasons.

  7. Thanks for the references. Note that these are all extremely recent, but it's excellent that it's finally starting to happen.

    Replies
    1. From the point of view of mathematical linguistics, case-stacking is massively important too, along with the Yoruba relative clause duplication effects and so on studied by Greg Kobele.

    2. I've noticed that Stabler is dealing with it now.
