Monday, August 4, 2014

Final comments on lecture 4

This ends the comments (here and here) on lecture 4.

The logic used to account for the EPP also covers Fixed Subject Condition effects (FSC).[1] Consider (2’) again:

(2’) *Who1 did John say that t1 saw Mary

The T needs to be labeled. If who moves, there will be nothing in Spec-T to agree with, and so labeling will fail. That’s Chomsky’s story. The obvious problem with this account is the absence of FSC effects when there is no overt complementizer (I pointed to this in the comments to lecture 3). Chomsky addresses this problem here. He proposes that a deleted C is no longer a phase. More exactly, to delete a C you must transfer the feature that says that C is a phase to T. In effect, C deletion makes T the phase head. So not only do we lower phi and tense features from C to T, but phase-headedness as well. This now ties FSCs to the presence or absence of an overt C.[2]

Observation 1: This story requires that “that” be deleted rather than not present at all. Were it never present, C could not transfer its features to T, and T has no features of its own (more below). Thus, to make this work, we need deletion operations in the syntax. A question that arises is how similar the operation deleting “that” is to more run-of-the-mill ellipsis operations. The latter are generally treated as simply dephoneticization processes. This will not suffice here. It must be that getting rid of phonetic content requires that all features of C lower to T. For those with long enough memories, this smells a little of the old notion of “L-contains.” At any rate, it’s worth observing that C deletion is not simply quieting the phonetics.

Observation 2: There are well-known dialectal variations regarding that-t effects in English. This suggests that deletion might be sufficient for transferring all of C’s features to T but it is not necessary. So, contrary to what Chomsky suggests, the explanation requires that we say something “special” about these FSCs in English. IMO, things are a little worse than this. As I mentioned in earlier comments, many speakers seem to allow violations of the FSC in English even with if/whether in the C position (or at least so report a third of my syntax undergrads). There is no problem accommodating this by allowing feature lowering as Chomsky suggests. But this is now decoupled from phonetic articulation. Unfortunately, in the relevant dialects, deleting “that” does not license null subjects, which one might have expected (1). Or more exactly, why doesn’t lowering all the features of C to T serve to strengthen T? Note that Cs can label just fine without anything in Spec-C helping them along. Given this, why shouldn’t lowering all the features of C onto T (including the “phase head feature”; see below) make T as independent as C? Dunno, but it doesn’t.

(1)  *John thinks (that) is a man here

In other words, the EPP and FSC do not really swing together, though they should if they were truly unified, one might suppose. 

Let’s put these questions to one side and continue. Chomsky then asks how we can get (2):

(2)  Who do you think t is kissing Sue

How do we label the lower “TP” if who moves? Chomsky says that it is labeled before C deletes and labels cannot be deleted. In other words, labels are indelible (think Lasnik and Saito). Now, Chomsky really doesn’t like this way of putting things. What he wants to say is that the CS has phase-sized memory (i.e. the CS can recall what operations have taken place within a phase). In other words, there is phase-level memory for all syntactic operations. By lowering the phase-hood to T from C, the next higher v* phase can “remember” that the lower T was given a label via agree, and so movement of the DP in Spec-T is ok. So, it is not that the labeling is indelible, but that all operations that happen within a phase are recollected in that phase. So, once labeled, memory tells us that it is always labeled.

This emphasizes the computational aspects of phase theory.  What’s important about phases is that they reduce memory demands of a computation. The reverse of this is that it allows some memory of previous operations to be retained.  This is quite definitely not a conceptual argument. There is no conceptual motivation for these assumptions. The motivations are computational. The question becomes how we bring information forward in a derivation, how forgetting can be computationally efficient etc.  Phases and the properties Chomsky relies on here are entirely of this variety.

Comment: Two things: this is very Barriers-ish in spirit. Phases are no longer fixed, but change in the course of the derivation (as Den Dikken was the first to propose). And call it what you want, indelibility is back. Moreover, just as in Barriers, T has a strange role in this system. It is not an inherent phase but can become one by inheritance. Sound familiar?[3] To me, this all has the feeling of a Rube Goldberg device, but this is partly a matter of taste. Some might think Barriers elegant. Go figure.

Let me make my unease clearer. It seems that Chomsky is not that happy with T and its various special properties within his system (again, just like T in Barriers). He in fact proposes that T has no properties of its own. It’s just there to receive properties from C. This makes T very similar to Agr in older MP, and recall that Chomsky argued that grammar internal formatives like Agr are to be eschewed. They cause DP headaches, as do any grammar internal formatives (viz. we need to explain how they and their properties got into FL). Worse, IMO, the special properties of T are critical in Chomsky’s explanations of the EPP and FSC effects. But this strikes me as a non-trivial problem for his account. Why doesn’t explaining these special properties of FL in terms of special properties of T amount to re-description rather than explanation? One of the salutary effects of MP has been to warn us about confusing the two. However, the more T is special, the more accounts of Spec-T effects (EPP and FSCs) are weakened. And from what I can tell, T’s special properties are critical in deriving the results Chomsky obtains.

There are other Barriers-like resonances here. Recall that in the Barriers framework, VP was never a barrier. Why? Because we could always adjoin to it and thereby void its barrierhood. In the present story, there is also a big asymmetry between C and v*. The latter never displays FSC or ECP effects. Why not? Because v* always loses its phase property. How? Because Chomsky assumes that when the root raises to v*, v* gets buried inside the raised root and thereby loses its phase properties. Though Chomsky does not discuss this, it raises the question of the role of v* in a phase-based theory if it always loses its phase property. Do v*s no longer induce PIC effects? It would be very odd to assume that the lower copy of the root inherited the phase property, like T inherits it from C. After all, this is the tail of a head chain and tails are generally grammatically inert. So, one conclusion could be that v* never induces PIC effects on this revised account. Of course, once again, we can add technical fixes to obviate this conclusion, but, at least for me, their motivations will be empirical not conceptual. This is not bad, but it does do some damage to the SMT line of reasoning Chomsky cherishes.

So, v* gets special treatment because of the properties of roots and T gets special treatment because it’s just, well, odd. The story may hang together, but it is hardly conceptually pretty, at least from where I sit.

Question: When R(V) raises to v* what happens to the phase property of v*? I assume that it is eliminated and this is why there are no EPP/FSC effects. Ok, does R(V) inherit the phase-hood property? This seems unlikely, as it is the tail of the head chain, but maybe.  If not, is the phase-hood of v* simply voided and what we are left with is one big C-phase?  If so, how does this enhance computational efficiency?[4]  

This post is already way too long. Let me end with three more highlights and maybe a remark or two.

Lecture 4 drops the idea that all grammatical action takes place at the phase head. In particular, Chomsky allows I-merge to apply completely independently of any relationship to C. He does this to get rid of the counter-cyclic movement he needed in PoP. Recall that counter-cyclic movement violates the NTC, which is a part of the conceptually best version of Merge. Chomsky really didn’t want to allow it, and he eliminates it here at the cost of abandoning the assumption that all grammatical operations take place at the phase head.

A consequence of this is that Chomsky has to abandon his explanation from PoP as to why Gs raise T to C but don’t raise DPs to C instead, given that they are equally close if SLOs are without labels. You may recall that he accounted for this by moving the subject to Spec-T counter-cyclically in derivations that applied “all at once” at the phase level.

I’m glad that Chomsky drops this now for two reasons. First, I never liked his earlier explanation (indeed I was part of a trio arguing that it didn’t work and was the wrong way to proceed (here)). Second, I never understood what “all at once” derivations meant. Ever.  I pretended to and taught it, but never got it.  Now I don’t have to, it seems. Curiously, Chomsky not only drops this analysis but states that it was always “artificial.” Yup.

Chomsky also finally abandons the last residues of Greed-based MP theories. Merge is free. If you follow him here, you can stop worrying about what motivates this or that movement. Note, this is conceptually the right move (and I have thought this for a long time and even said so publicly). Chomsky notes that E-merge is not subject to greed considerations and so I-merge shouldn’t be either, given that they are two instances of the very same operation. Again, yup. So much for probing and agreeing being a pre-condition for I-merge.

Does this mean that movement is never “for a reason”? Well it does mean that it is never for a local CS reason. Rather, we return once again to a “generate and filter” theory of computation, similar to what we had in GB. The main difference is that this time the filters are provided by Bare Output Conditions, in particular the requirement that all SLOs be labeled prior to Transfer and that MLA sometimes requires what is effectively a Spec-X configuration to provide a viable label. So, Probe/goal seems out and Spec-X is back. Plus ca change (and I know I’m missing a thingy under ‘c’).

This is enough. At least I’ve had enough. Let me once again end on a positive note. As you might have noticed, I am not (yet) convinced by Chomsky’s story here. I believe that he has mis-analyzed the role of labels in G.  His approach rests on the assumption that labels play no role in CS. They are only relevant to interface interpretation.  I currently believe that this is likely wrong. At the very least, I don’t really see how labels are required for CI (or SM) interpretive rules to apply. For CI, at least, all we need, so far as I can make out, is the branching structure and the contents of the individual atoms.  So the premise that motivates the MLA as a BOC seems (at least to me) very shaky.

However, though I am not that moved by Chomsky’s proposal, I am moved by his method and overall conception of the enterprise. I wholeheartedly agree that we should take Galileo’s Maxim as a strong boundary condition on theory. I completely agree that we should respect GG’s history and treat its generalizations as the targets for more principled explanation.[5] I also agree that the critics of GB that he mentions have contributed virtually nothing to our understanding of FL. I even believe that Chomsky’s efforts are an excellent illustration of what we should be aiming to do. I just am not moved by the details of his effort. But, really, that’s a small disagreement, among friends.

[1] Chomsky calls these ECP effects. I changed this to FSC effects to distinguish the subject/object asymmetry effects in the ECP from the argument/adjunct ones. Historically, the latter have proven more recalcitrant and were the ones that called forth the heavy ECP assumptions. Indeed, for some analyses, a good part of the subject/object asymmetries were relegated to head government effects (Rizzi, Aoun et al, Saito) more on the SM side of things than the CI.
[2] Transfer of the “phase feature” (PF) from C to T seems like a clear violation of the NTC. As stated, T that has no inherent PF receives one and thereby changes its grammatical powers. One can play around with definitions so that this violation of the NTC doesn’t count, but the definitions will not enjoy the conceptual naturalness that the conceptual versions of the SMT rely on.
[3] It seems that Chomsky is not that happy with T and its various properties. He in fact suggests that T has no properties of its own. It’s just there to receive properties from C. This makes T very similar to Agr in older MP, and recall that Chomsky argued that such grammar internal formatives are to be eschewed. They cause DP headaches, as do any grammar internal formatives. The special properties of T are critical in Chomsky’s explanations of the EPP and FSC effects. But this strikes me as a problem for his account. Why doesn’t explaining these special properties of FL in terms of special properties of T amount to re-description rather than explanation? One of the salutary effects of MP has been to warn us about confusing the two. However, the more T is special, the more accounts of Spec-T effects (EPP and FSCs) are weakened.
[4] And talking about memory load: what about the phase-hood of D? Chomsky really does not like to talk about D as a phase. He mentions from time to time that it might be one, but only reluctantly. But if it is not a phase, then the memory demands on phases can be arbitrarily big given the recursive proclivities of DPs. So, it would seem that on computational grounds, if v* is a phase then D should be one as well. Indeed, given that “clauses” are already bounded at C, why do we need to also computationally bound them at v*? Remember, what constitutes a phase needs to be coded into FL and this causes problems for DP. So, all in all, we want the fewest phases we can get away with.
[5] I would go further: I think that these lectures constitute a bit of a departure in Chomsky’s minimalist style. The targets of explanation are now very broad generalizations that GG has established, the aim being to explain their properties.  Earlier efforts were far more local, the targets of explanation being Existential Constructions in Icelandic.  I think that this more general project gets the explanatory grain more correct.


  2. Two quick points:

    1. Color me disappointed when it comes to Chomsky's apparent recommitment to the "generate-and-filter" architecture. There have always been meaningful conceptual arguments against this architecture (e.g. the fundamentally poor fit between this architecture and the realtime use of the grammar in parsing & production; we recently hashed this out on the blog, in the comments on this post).

    But if I am correct (shameless self-promotion!), there are also more conventional, "classic linguistic" arguments: generation-and-filtration simply does not get the facts right.

    2. Setting aside whether DP is a phase or not, there are other problems with the memory/efficiency argument for phases. Infinitive TPs can be unboundedly nested in one another, too (John seems to be likely to have appeared to have been...), and here no phases will come to the rescue to chop this structure down to size.

    [NB1: The issue here is the amount of structure that has to be kept in memory at once; it is independent of the (independently interesting) question of whether A-movement is successive-cyclic through each [Spec,TP] on its way to the final, finite one.]

    [NB2: One way out of this, for the infinitival scenario, is accepting Legate's (2003) position that even passive/unaccusative vP is a phase.]

  3. I agree on both points. Phases don't walk the walk when it comes to limiting memory resources if one adopts weak phases. And if one goes Legate, then the agreement facts cited for weak phases are hard to handle in a probe/goal theory. At any rate, the rhetoric is often ahead of the results.