Friday, January 6, 2017

Inchoate minimalism

Chomsky often claims that the conceptual underpinnings of the Minimalist Program (MP) are little more than the injunction to do good science. On this view the eponymous 1995 book did not break new ground, or announce a new “program” or suggest foregrounding new questions. In fact, on this view, calling a paper A Minimalist Program for Linguistic Theory was not really a call to novelty but a gentle reminder that we have all been minimalists all along and that we should continue doing exactly what we had been doing so well to that point. This way of putting things is (somewhat) exaggerated. However, versions thereof are currently a standard trope, and though I don’t buy it, I recently found a great quote in Language and Mind (L&M) that sort of supports this vision.[1] Sorta, kinda but not quite.  Here’s the quote (L&M:182):

I would, naturally, assume that there is some more general basis in human mental structure for the fact (if it is a fact) that languages have transformational grammars; one of the primary scientific reasons for studying language is that this study may provide some insight into general properties of mind. Given those specific properties, we may then be able to show that transformational grammars are “natural.” This would constitute real progress, since it would now enable us to raise the problem of innate conditions on acquisition of knowledge and belief in a more general framework….

This quote is pedagogical in several ways. First, it does indicate that, at least in Chomsky’s mind, GG from the get-go had what we would now identify as minimalist ambitions. The goal as stated in L&M is not only to describe the underlying capacities that make humans linguistically facile, but also to understand how these capacities reflect the “general properties of mind.” Furthermore, L&M moots the idea that understanding how language competence fits in with our mental architecture more generally might allow us to demonstrate that transformational grammars are “natural.” How so? Well, in the obviously intended sense that a mind with the cognitive powers we have would have a faculty of language in which the particular Gs we have would embody a transformational component. As L&M rightly points out, being able to show this would “constitute real progress.” Yes it would.

It is worth noting that the contemporary conception of Merge as combining both structure building and movement in the “simplest” recursive rule is an attempt to make good on this somewhat foggy suggestion. If by ‘transformations’ we intend movement, then showing how a simple conception of recursion comes with a built-in operation of displacement goes some distance in redeeming the idea that transformational Gs are “natural.”[2]
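To make the idea concrete, here is a toy sketch (my own illustration, not anything from the literature) of how a single set-forming Merge operation yields displacement for free: External Merge combines two independent objects, and Internal Merge, i.e. movement, is the very same operation applied to an object and one of its own subparts. The function names and the tuple-free set encoding are hypothetical conveniences.

```python
# Toy sketch: one Merge operation, two "uses".
# Syntactic objects are either atoms (strings) or frozensets of objects.

def merge(a, b):
    """The single structure-building operation: form the set {a, b}."""
    return frozenset([a, b])

def subparts(obj):
    """All syntactic objects contained in obj, including obj itself."""
    found = {obj}
    if isinstance(obj, frozenset):
        for child in obj:
            found |= subparts(child)
    return found

# External Merge: combine two independent objects.
vp = merge("ate", "what")          # {ate, what}
clause = merge("John", vp)         # {John, {ate, what}}

# Internal Merge (movement): re-merge an object already contained
# in the structure. Nothing new is stipulated -- it is the same
# merge() call, so 'what' now occupies two positions (displacement).
assert "what" in subparts(clause)
moved = merge("what", clause)      # {what, {John, {ate, what}}}
```

The point of the sketch is only that displacement does not require a second primitive: once Merge is defined over arbitrary syntactic objects, applying it to an object and one of its own subparts is a possibility that comes along automatically, which is the sense in which transformational grammars might be “natural.”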

Note several other points: The L&M quote urges a specific research strategy: if you are interested in general principles of cognition, then it is best to start the investigation from the bottom up. So even if one’s interest is in cognition in general (and this is clearly the L&M program), the right direction of investigation is not from, e.g., some a priori conception of learning to language, but from a detailed investigation of language to the implications of these details for human mental structure more generally. This, of course, echoes Chomsky’s excellent critiques of Empiricism and its clearly incorrect and/or vacuous conceptions of reinforcement learning.

However, the point is more general, I believe. Even if one is not Empiricistically inclined (as no right thinking person should be), the idea that a body of local doctrine concerning a specific mental capacity is an excellent first step into probing possibly more general capacities seems like excellent method. After all, it worked well in the “real” sciences (e.g. Galileo’s, Copernicus’ and Kepler’s laws were useful stepping stones to Newton’s synthesis), so why not adopt a similar strategy in investigating the mind/brain? One of GG’s lasting contributions to intellectual life was to demonstrate how little we reflexively know about the structure of our mental capacities. Being gifted linguistically does not imply that we know anything about how our mind/brain operates. As Chomsky likes to say, being puzzled about the obvious is where thinking really begins, and perhaps GG’s greatest contribution has been to make clear how complex our linguistic capacities are and how little we understand about their operating principles.

So is the Minimalist Program just more of the same, with nothing really novel here? Again, I think that the quote above shows that it is not. L&M clearly envisioned a future where it would be useful to ask how linguistic competence fits into cognition more broadly. However, it also recognized that asking such “how” questions was extremely premature. There is a tide in the affairs of inquiry and some questions at some times are not worth asking. To use a Chomsky distinction, some questions raise problems and some point to mysteries. The latter are premature and one aim of research is to move questions from the second obscure mystical column to the first tractable one. This is what happened in syntax around 1995; the more or less rhetorical question Chomsky broached in L&M in the late 60s became a plausible topic for serious research in the mid 1990s! Thus, though there is a sense in which minimalism was old hat, there is a more important sense in which it was entirely new, not as regards general methodological concerns (one always values simplicity, conciseness, naturalness etc) but in being able to ask the question that L&M first posed fancifully in a non-trivial way: how does/might FL fit together with cognition more generally?

So what happened between 1968 and 1995? Well, we learned a lot about the properties of human Gs and had plausible candidate principles of UG (see here for some discussion). In other words, again to use Chomsky’s framing (following the chemist Davy), syntax developed a “body of doctrine” and with this it became possible to use this body of doctrine to probe the more general question. And that’s what the Minimalist Program is about. That’s what’s new. Given some understanding of what’s in FL we can ask how it relates to cognition (and computation) more generally. That’s why asking minimalist questions now is valuable while asking them in 1967 would have been idle.

As you all know, there is a way of framing the minimalist questions in a particularly provocative way, one that fires the imagination in useful ways: How could this kind of FL with these kinds of principles have evolved? On the standard assumption (though not uncontroversial, see here on the “phenotypic gambit”) that complexity and evolvability are adversarial, the injunction to simplify FL by reducing its linguistically proprietary features becomes the prime minimalist project. Of course, all this is potentially fecund to the degree that there is something to simplify (i.e. some substantive proposals concerning what the operative FL/UG principles are) and targets for simplification became worthwhile targets in the early 1990s.[3] Hence the timing of the emergence of MP.

Let me end by riding an old hobbyhorse: Minimalism does not aim to be a successor to earlier GB accounts (and its cousins LFG, HPSG, etc.). Rather, MP’s goal is to be a theory of possible FL/UGs. It starts from the assumption that the principles of UG articulated from 1955 to the 1990s are roughly correct, albeit not fundamental. They must be derived from more general mental principles/operations (to fulfill the L&M hope). MP is possible because there is reason to think that GB got things roughly right. I actually do think that this is correct. Others might not. But it is only once there is such a body of FL/UG doctrine that MP projects will not be hopelessly premature. As the L&M quote indicates, MP-like ambitions have been with us for a long time, but only recently has it been rational to hope that they would not be idle.

[1] Btw, L&M is a great read and those of you who have never dipped in (and I am looking at anyone under 40 here) should go out and read it.
[2] And if we go further and assume that all non-local dependencies are mediated by ((c)overt) movement then all variety of transformations are the product of the same basic “natural” process. Shameless plug: this is what this suggests we do.
[3] Why then? Because by then we had good reasons for thinking that something like GB conception of UG was empirically and theoretically well-grounded. See here (and four following entries) for discussion.


  1. This post does a great job of zeroing in on what is, for me, one of the main causes for skepticism towards the Minimalist Program.

    The point of MP, as I see it, is to ask: given that language (or more accurately, the language faculty) has property X, why does it have that property? And to echo something that I said in the comments to this post, it looks to me like many of these "why" questions will prove extremely sensitive to minor perturbations in X. Where I depart from Norbert is that, from my vantage point, we are nowhere near nailing down the exact details on these Xs. We're still discovering new stuff all the time about Xs that were supposedly a done deal.

    This is a far cry from knowing nothing at all. Norbert reminds us, and I wholeheartedly agree, that generative linguistics has discovered tons of important and nontrivial generalizations (see Norbert's comment here, where he lists more than 25 of them). But the issue of minor perturbations looms large. Example: people thought they had a good handle on how syntactic islands work, and proceeded to try to explain why they work that way, based on the idea that islands are spellout domains. But then Minimal Compliance effects in Bulgarian come along, showing us that spellout domains cannot possibly account for syntactic islands. (Really condensed version: it turns out that whether a domain D is an island with respect to movement M depends not only on the identity and structure of D, but also on whether there is another, locality-obeying movement operation that happens to land in the same periphery as M.) So, it turns out that the "why" question was premature, and the resulting attempt to answer it was consequently off track.

    Like anything else, my assessment might be wrong. Time will tell. And far be it from me to tell someone else what to work on anyway. But if you encounter skepticism of MP among rank-and-file generativists like myself, this may be one of the main reasons.

    1. I can see the massive distance between linguists reading such a post and people in psychology and cognitive neuroscience doing so. From my vantage point, something like the MP is absolutely necessary if there is to be any meaningful connection between syntactic theory and psychology and cog neuro. So while I appreciate the concerns that linguists may raise with the iconoclastic character of the MP, at the same time, not vigorously pursuing MP means burrowing into a hole of irrelevance with respect to the rest of cognitive science. The MP has to be right in some sense, if syntactic theory is meant to make mentalistic/biological commitments. I have more elaborated thoughts on this to publish in the future that might be more convincing than my off-the-cuff statements here.

    2. @Omer: I think your point about perturbations is spot-on, but I'd say it's a problem with less MP-ish work, too. The problem is due to what I have called "sniping" around this corner of the internet before: linguists' desire to give a perfect answer, maximally informative, conceptually elegant, with no under- or overgeneration.

      My favored alternative to sniping is "carpet bombing". A good theory needs to have safe fallback positions, otherwise the chance is very high that a single piece of data can force you back to square one. Carpet bombing solves this by replacing the search for proposals with the search for classes of proposals. Rather than defining the class of natural languages, you define a superclass and try to monotonically narrow down that set as more and more data comes in. And rather than giving an analysis of a specific phenomenon, you describe the space in which the right analysis or analyses can be found and leave it open as to which one is really the correct one. That's why I like the computational approach so much: it gives you the flexibility to carve out well-defined classes without venturing into "anything goes" territory.

      I think the same strategy can be co-opted for traditional linguistic research. Many syntax papers make many more assumptions than they actually need, starting with things like VP-internal subjects, probe-goal rather than covert movement, and so on, when all that is needed is that a certain dependency holds between X and Y, irrespective of how that is encoded. I understand the goal: a shared base on which all work rests so seemingly unrelated ideas and generalizations can easily be linked together. More specific assumptions also make it easier to tackle very subtle and complicated data points. But you can have your cake and eat it too: carve out a general set of reliable core data and specify the minimal set of assumptions needed to make it work, then add on assumptions to get the more subtle and finicky stuff at the edges. Don't build one monolithic account to rule them all, create a modular system of assumptions that can be grown and shrunk as necessary.

      The price of such safety is specificity, of course. An approach that deliberately says "one of these proposals is right, though we won't commit to any specific one" is necessarily less specific than one that insists that even the way it writes feature values is cognitively real. So the question everybody has to ask themselves is: how much specificity do you want, and why?

      Personally, I believe that answers to the big questions asked by Minimalism (acquisition, evolution, computation) require way less specificity than what you'll find in your average syntax paper.

    3. I am not sure if the following agrees with Thomas's point or is making a different one. Here goes.

      There are many properties that have reasonable empirical backing. Many of these are accurate, I believe, but some may not be, as further inquiry will show. An MP proposal can choose which of these GBish generalizations it aims to derive. So, for example, that c-command regulates long distance dependencies, or that lowering rules don't obtain, or that intervention effects regulate G relations, or that locality matters for reflexivization and pronominalization and that the latter two are in complementary distribution, seem like relatively well ensconced features of Gs that a theory of FL/UG should aim to explain. Case and agreement? Well, these are apparently more controversial, i.e., what regulates these relations. Are there other principles, e.g. Minimal Compliance? Perhaps. But the fact that there are controversial features does not mean that some are not more controversial than others. Does anyone really doubt that c-command is a descriptively useful and central feature regulating G commerce? Or minimality in some form? MP is not an all or none affair. Some features will be amenable to explanation, some things will be "explained" that turn out to be false generalizations, and some generalizations will elude explication. That's the way science works, and not only in linguistics. It is still true that much of chemistry resists physical explanation despite quantum mechanics having "united" the two fields. That's life. That linguists should "wait" until we know the facts clearly is, IMO, a disastrous policy. That we should do what we can and be ready to be wrong seems like a better research method.

    4. @Omer: This might be making a similar point to what Norbert says at the end of his comment, but I'm wondering how you think one might evaluate our epistemic standing with respect to whether a given fact about the faculty of language is ripe for explanation. To me, at least, this seems really nontrivial, particularly given the possibility of accidental typological gaps in attested natural languages. I think I'm therefore inclined to agree with Norbert that "we should do what we can and be ready to be wrong". Nonetheless, I'm curious if you have thoughts on how one could evaluate whether we're certain enough to really think that fact X about the faculty of language really is a fact, rather than a misgeneralization. It's a fun/interesting question to think about. :)

  2. This comment has been removed by the author.

  3. @Adam: Great question. The first thing to note is that it's all a matter of degrees. Norbert thinks we weren't ready for MP-style explorations in the late 60s, but that by the mid 90s, we were. In contrast to Norbert, I think that the knowledge state in linguistics as of the mid 90s – or our current knowledge state, for that matter – represents a much smaller fraction of the total amount of "standard" linguistic discoveries that still await. (By "standard" I just mean discoveries gleaned by pre-MP-style explorations.) That is to say, the question of how one evaluates our epistemic standing with respect to whether a given fact about the faculty of language is or isn't ripe for explanation should also be posed to Norbert, since his evaluations of the 60s and 90s bodies of knowledge apparently differ.

    [Tangentially related comic(?) relief: the Doomsday Argument.]

    This brings me to my second point: I doubt that there is a cut-and-dried answer to your question – different people will have different research heuristics. So let me tell you mine (this was, perhaps, all you were asking in the first place). While I agree with Norbert that looking at only one language can in principle suffice (the whole "model organism" thing), my assessment of how well this has worked in practice is far less rosy than his. This is, perhaps, a reflection of the fact that I work on agreement and case, areas in which generative linguistics is only beginning to approach anything resembling an adequate theory, and I mean this again in the pre-MP sense of adequate. (Disclaimer: the preceding statement is probably biased; I have an obvious horse in the race.)

    And so in practice, I treat with suspicion any theory that has not been subjected to at least a modicum of crosslinguistic vetting. To reiterate, this is not because I think the alternative is wrong in principle, but because I think it has failed (occasionally, quite dramatically so) in practice.