One of the nice things about conferences is that you get to bump into people you haven’t seen for a while. This past weekend, we celebrated our annual UMD Mayfest (it was on prediction in linguistically sensitive psycho tasks) and, true to form, one of the highlights of the get-together was that I was able to talk to Masaya Yoshida (a syntax and psycho dual threat at Northwestern) about islands, subjacency, phases and the argument-adjunct movement asymmetry. At any rate, as we talked, we started to compare Phase Theory with earlier approaches to strict cyclicity (SC) and it again struck me how unsure I am that the newfangled technology has added to our stock of knowledge. And, rather than spending hours upon hours trying to figure this out solo, I thought that I would exploit the power of crowds and ask what the average syntactician in the street thinks phases have taught us above and beyond standard GB wisdom. In other words, let’s consider this a WWGBS (what would GB say) moment (here) and ask what phase-wise thinking has added to the discussion. To set the stage, let me outline how I understand the central features of phase theory and also put some jaundiced cards on the table, repeating comments already made by others. Here goes.
Phases are intended to model the fact that grammars are SC. The most impressive empirical reflex of this is successive cyclic A’-movement. The most interesting theoretical consequence is that SC grammatical operations bound the domain of computation, thereby reducing computational complexity. Within GB these two factors are the province of bounding theory, aka Subjacency Theory (ST). The classical ST comes in two parts: (i) a principle that restricts grammatical commerce (at least movement) to adjacent domains (viz. there can be at most one bounding node (BN) between the launch site and target of movement) and (ii) a metric for “measuring” domain size (viz. the unit of measure is the BN, and these are DP, CP, (vP), and maybe TP and PP). Fix the bounding nodes within a given G and one gets the locality domains that undergird SC. Empirically, A’-movement applies strictly cyclically because it must, given the combination of assumptions (i) and (ii) above.
Now, given this and a few other assumptions, it is also possible to model island effects in a unified way. The extra assumptions are: (iii) some BNs have “escape hatches” through which a moving element can move from one cyclic domain to another (viz. CP but crucially not DP); (iv) escape hatches can accommodate varying numbers of commuters (i.e. the number of exits can vary; English is thought to have just one, while multiple WH-fronting languages have many). If we add a further assumption - (v) DP and CP (and vP) are universally BNs but Gs can also select TP and PP as BNs – the theory allows for some typological variation. (i)-(v) constitute the classical Subjacency theory. Btw, the reconstruction above is historically misleading in one important way. SC was seen to be a consequence of the way in which island effects were unified. It’s not that SC was modeled first and then assumptions added to get islands, but rather the reverse; the primary aim was to unify island effects and a signal consequence of this effort was SC. Indeed, it can be argued (in fact I would so argue) that the most interesting empirical support for the classical theory was the discovery of SC movement.
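Since nothing hangs on the formalism, here is a toy sketch of how (i)-(v) interact. All node labels and the sample derivations are my own illustrations, not anything from the ST literature; the point is just that once (i) and (ii) are fixed, long movement is forced to proceed escape hatch to escape hatch:

```python
# Toy model of the classical Subjacency condition, assumptions (i)-(v) above.
# Node labels and the sample derivations are illustrative assumptions.

BOUNDING_NODES = {"DP", "CP"}  # a given G's selection; TP/PP optionally added per (v)

def bounding_crossed(step):
    """Bounding nodes crossed in a single movement step."""
    return [node for node in step if node in BOUNDING_NODES]

def subjacent(step):
    """(i): at most one bounding node may separate launch site and target."""
    return len(bounding_crossed(step)) <= 1

# One long swoop across two CPs violates (i)...
long_swoop = ["TP", "CP", "TP", "CP"]          # nodes crossed in one step
print(subjacent(long_swoop))                   # False

# ...but the same dependency built successive-cyclically, each step landing
# in a CP escape hatch per (iii), crosses one BN at a time: SC follows.
short_steps = [["TP", "CP"], ["TP", "CP"]]
print(all(subjacent(s) for s in short_steps))  # True
```

Note that island effects fall out of the same check: a DP has no escape hatch per (iii), so any step out of a DP that also crosses another BN is doomed.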
One of the hot debates when I was a grad student was whether long distance movement dependencies were actually SC. Kayne and Pollock and Torrego provided (at the time surprising) evidence that they were, based on SC inversion operations in French and Spanish. Chung supplied Comp agreement evidence from Chamorro to the same effect. This, added to the unification of islands, made ST the jewel in the GB crown, both theoretically and empirically. Given my general rule of thumb that GB is largely empirically accurate, I take it as relatively uncontroversial that any empirically adequate theory of FL must explain why Gs are SC.
As noted in a previous post (here), ST developed and expanded. But let’s leave history behind and jump to the present. Phase Theory (PT) is the latest model for SC. How does it compare with ST? From where I sit, PT looks almost isomorphic to it, or at least a version that extends to cover island effects does. A PT of this ilk has CP, vP and DP as phases. It incorporates the Phase Impenetrability Condition (PIC), which requires that interacting expressions be in (at most) adjacent phases. Distance is measured from one phase edge to the next (i.e. complements to phase heads are grammatically opaque, edges are not). This differs from ST in that the cyclic boundary is the phase/BN head rather than the MaxP of the phase/BN head, but this is a small difference technically. PT also assumes “escape hatches” in the sense that movement to a phase edge moves an expression from inside one phase into the next higher phase domain and, as in ST, different phases have different available edges suitable for “escape.” If we assume that Cs have different numbers of available phase edges and we assume that D has no such available edges at all, then we get a theory effectively identical to ST. In effect, we have traded phase edges for escape hatches and the PIC for (i).
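The isomorphism is easier to see in schematic form. Here is a toy rendering (position names and phase indices are my own illustrative assumptions) of the PIC in its looser, currently assumed version: the complement of a phase head is opaque from the next phase up, while the phase edge remains visible, just as escape hatches did in ST:

```python
# Toy rendering of the (weak) PIC; phase indices and position names are
# illustrative assumptions. A position is accessible from the current
# phase iff it is in that phase or at the edge of the immediately
# preceding one.

def accessible(position, current_phase, phase_of):
    in_phase = phase_of[position] == current_phase
    at_prior_edge = (phase_of[position] == current_phase - 1
                     and position.endswith("edge"))
    return in_phase or at_prior_edge

# Positions indexed by the phase (0 = vP, 1 = CP) they belong to:
phase_of = {"object_in_VP": 0, "vP_edge": 0, "subject": 1, "CP_edge": 1}

# From the CP phase, the vP-internal object is opaque (PIC)...
print(accessible("object_in_VP", 1, phase_of))  # False
# ...but the vP edge, the PT analogue of an escape hatch, is visible.
print(accessible("vP_edge", 1, phase_of))       # True
```

Swap “phase edge” for “escape hatch” and this check is just (i) again, which is the sense in which the two theories look nearly identical.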
There are a few novelties in PT, but so far as I can tell they are innovations compatible with ST. The two most distinctive innovations regard the nature of derivations and multiple spell out (MSO). Let me briefly discuss each, in reverse order.
MSO is a revival of ideas that go back to Ross, but with a twist. Uriagereka was the first to suggest that derivations progressively make parts of the derivation opaque by spelling them out (viz. spell out (SO) entails grammatical inaccessibility, at least to movement operations). This is not new. ST had the same effect, as SC progressively makes earlier parts of the derivation inaccessible to later parts. PT, however, makes earlier parts of the derivation inaccessible by disappearing the relevant structure. It’s gone, sent to the interfaces and hence no longer part of the computation. This can be effected in various ways, but the standard interpretations of MSO (due to Chomsky and quite a bit different from Uriagereka’s) have coupled SO with linearization conditions in some way (Uriagereka does this, as do Fox and Pesetsky, in a different way). This has the empirical benefit of allowing deletion to obviate islands. How? Deletion removes the burden of PF linearization, and if what makes an island an island are the burdens of linearization (Uriagereka) or frozen linearizations (Fox and Pesetsky), then as deletion obviates the necessity of linearization, island effects should disappear, as they appear to do (Ross was the first to note this (surprise, surprise) and Merchant and Lasnik have elaborated his basic insight for the last decade!). At any rate, interesting though this is (and it is very interesting IMO), it is not incompatible with ST. Why? Because ST never said what made an island an island, or more accurately, what made earlier cyclic material unavailable to later parts of the computation (i.e. it had no real theory of inaccessibility, just a picture), and it is compatible with ST that it is PF concerns that render earlier structure opaque. So, though PT incorporates MSO, it is something that could have been added to ST and so is not an intrinsic feature of PT accounts. In other words, MSO does not follow from other parts of PT any more than it does from ST.
It is an add-on; a very interesting one, but an add-on nonetheless.
Note, btw, that MSO accounts, just like ST, require a specification of when SO occurs. It occurs cyclically (i.e. either at the end of a relevant phase, or when the next phase head is accessed), and this is how PT models SC.
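For concreteness, here is a minimal sketch (the category labels are my own illustrative assumptions) of that cyclic timing: at each phase boundary the complement of the phase head is shipped to the interfaces and removed from the active workspace, which is how PT cashes out inaccessibility as literal absence:

```python
# Minimal sketch of cyclic multiple spell-out: at each phase boundary the
# phase head's complement is shipped to the interfaces and removed from
# the workspace. Category labels are illustrative assumptions.

def derive(phases):
    """phases: bottom-up list of (edge, complement) pairs.
    Returns (still-active material, spelled-out material)."""
    workspace, shipped = [], []
    for edge, complement in phases:
        workspace.append(edge)      # edges stay visible to later operations
        shipped.append(complement)  # SO: complement leaves the computation
    return workspace, shipped

active, gone = derive([("vP_edge", "VP"), ("CP_edge", "TP")])
print(active)  # ['vP_edge', 'CP_edge']  -- still manipulable
print(gone)    # ['VP', 'TP']            -- at the interfaces, opaque
```

The contrast with ST is visible in the second return value: ST merely marked such material off-limits, while MSO removes it from the workspace altogether.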
The second innovation is that phases are taken to be the units of computation. In Derivation by Phase, for example, operations are complex and non-markovian within the phase. This is what I take Chomsky to mean when he says that operations in a phase apply “all at once.” Many apply simultaneously (hence not one “line” at a time) and they have no order of application. I confess to not fully understanding what this means. It appears to require a “generate and filter” view of derivations (e.g. intervention effects are filters rather than conditions on rule application). It is also the case that SO is a complex checking operation in which features are inspected and vetted before being sent for interpretation. At any rate, the phase is a very busy place: multiple operations apply all at once; expressions are E- and I-merged, features checked and shipped.
This is a novel conception of the derivation, but again, it is not inherent in the punctate nature of PT. Thus, PT has various independent parts, one of which is isomorphic to traditional ST and others that are logically independent of one another and of the ST-like part. That which explains SC is the same as what we find in ST and is independent of the other moving parts. Moreover, the parts of PT isomorphic to ST seem no better motivated (and no worse) than the analogous features in ST: e.g. the question of why the BNs are just these has no worse an answer within ST than the question of why the phase heads are just those has within PT.
That’s how I see PT. I have probably skipped some key features. But here are some crowd-directed questions: What are the parade cases empirically grounding PT? In other words, what’s the PT analogue of affix hopping? What beautiful results/insights would we lose if we just gave PT up? Without ST we lose an account of island effects and SC. Without PT we lose…? Moreover, are these advantages intrinsic to minimalism, or could they have already been achieved in more or less the same form within GB? In other words, is PT an empirical/theoretical advance or just a rebranding of earlier GB technology/concepts (not that there is anything intrinsically wrong with this, btw)? So, fellow minimalists, enlighten me. Show me the inner logic, the “virtual conceptual necessity” of the PT system as well as its empirical virtues. Show me in what ways we have advanced beyond our earlier GB bumblings and stumblings. Inquiring minimalist minds (or at least one) want to know.
 This “history” compacts about a decade of research and is somewhat anachronistic. The actual history is quite a bit more complicated (thanks Howard).
 Actually, if one adds vP as a BN then Rizzi-like differences between Italian and English cannot be accommodated. Why? Because, once one moves into an escape hatch, movement is thereafter escape hatch to escape hatch, as Rizzi noted for Italian. The option of moving via CP is only available for the first move. Thereafter, if CP is a BN, movement must be CP to CP. If vP is added as a BN then it is the first available BN and, whether one moves through it or not, all CP positions must be occupied. If this is too much “inside baseball” for you, don’t sweat it. Just the nostalgic reminiscences of a senior citizen.
 vP is an addition from Barriers versions of ST, though how it is incorporated into PT is a bit different from how vP acted in ST accounts.
 There are two versions of the PIC, one that restricts grammatical commerce to expressions in the same phase and a looser one that allows expressions in adjacent phases to interact. The latter is what is currently assumed (for pretty meager empirical reasons IMO – Nominative object agreement in quirky subject transitive sentences in Icelandic, I think).
 As is well known, Chomsky has been reluctant to extend phase status to D. However, if this is not done then PT cannot account for island effects at all, and this removes one of the more interesting effects of cyclicity. There have been some allusions to the possibility that islands are not cyclicity effects, indeed not even grammatical effects. However, I personally find the latter suggestion most implausible (see the forthcoming collection on this edited by Jon Sprouse and yours truly: out sometime in the fall). As for the former, well, if islands are grammatical effects (and like I said, the evidence seems to me overwhelming), then if PT does not extend to cover these it is less empirically viable than ST. This does not mean that it is wrong to divorce the two, but it does burden the revisionist with a pretty big theoretical note payable.
 MSO is effectively a theory of the PIC. Curiously, from what I gather, current versions of PT have begun mitigating the view that SO removes structure by sending it to the interfaces. The problem is that such early shipping makes linearization problematic. It also necessitates processes by which spelled-out material is “reassembled” so that the interfaces can work their interpretive magic (think binding, which is across interfaces, or clausal intonation, which is also defined over the entire sentence, not just a phase).
 Nor is the assumption that lexical access is SC (i.e. the numeration is accessed in phase-sized chunks) an intrinsic feature of PT. This is roughly motivated on (IMO weak) conceptual grounds concerning SC arrays reducing computational complexity and on empirical facts about Merge over Move (btw: does anyone except me still think that Merge over Move regulates derivations?).