Monday, May 30, 2016

Crucial experiments and killer data

In the real sciences, theoretical debate often comes to an end (or at least changes direction sharply) when a crucial experiment (CE) settles it. How do CEs do this? They uncover decisive data (aka “killer data” (KD)) that, if accurate, shows that one possible live approach to a problem is empirically deeply flawed.[1] These experiments and their attendant KD become part of the core ideology and serve to eliminate initially plausible explanations from the class of empirically admissible ones.[2]

Here are some illustrative examples of CEs: the Michelson-Morley experiment (which did in the ether and ushered in special relativity (here)), the Rutherford gold foil experiment that ushered in the modern theory of the atom (here), the recent LIGO experiment that established the reality of gravitational waves (here), the Franklin x-ray diffraction pix that established the helical structure of DNA (here), the Aspect and Kwiat experiments that signaled the end of hidden variable theories (here), and (one from Wootton) Galileo’s discovery of the phases of Venus, which ended the Ptolemaic geocentric universe. All of these are deservedly famous for ending one era of theoretical speculation and initiating another. In the real sciences, there are many of these, and they are one excellent indicator that a domain of inquiry has passed from intelligent speculation (often lavishly empirically titivated) to real science. Why? Because only relatively well-developed domains of inquiry are sufficiently structured to allow an experiment to be crucial. To put this another way: crucial experiments must tightly control for wiggle room, and this demands both a broad, well-developed empirical basis and a relatively tight theoretical setting. Thus, if a domain has such, it signals its scientific bona fides.

In what follows, I’d like to offer some KDs in syntax, phenomena that, IMO, rightly terminated (or should, if they are accurate) some perfectly plausible lines of investigation. The list is not meant to be exhaustive, nor is it intended to be uncontroversial.[3] I welcome dissent and additions. I offer five examples.

First, and most famously, polar questions and structure dependence. The argument and effect are well known (see here for one elaborate discussion). But to quickly review, we have an observation about how polar questions are formed in English (Gs “move” an auxiliary to the front of the clause). Any auxiliary? Nope, the one “closest” to the front. How is proximity measured? Well, not linearly. How do we know? Because of (i) the unacceptability of sentences like (1) (which should be well formed if distance were measured linearly) and (ii) the acceptability of those like (2) (which is expected if distance is measured hierarchically).

1.     *Can eagles that fly should swim?
2.     Should eagles that can fly swim?

The conclusion is clear: if polar questions are formed by movement, then the relevant movement rule ignores linear proximity in choosing the right auxiliary to move.[4] Note, as explained in the above linked-to post, the result is a negative one. The KD here establishes that G rules forsake linear information. It does not specify the kind of hierarchical information it is sensitive to. Still, the classical argument puts to rest the idea that Gs manipulate phrase markers in terms of their string properties.[5]
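The contrast between the two candidate rules can be made concrete computationally. Below is a toy Python sketch (mine, not from the post; the bracketed constituent structure and the two rule functions are illustrative assumptions, not anyone's actual analysis) that runs a "front the linearly first auxiliary" rule and a "front the structurally highest auxiliary" rule over the declarative underlying (1)-(2):

```python
from collections import deque

# Toy constituent structure for "eagles that can fly should swim":
# the relative clause "that can fly" is embedded inside the subject,
# so 'can' is linearly first but hierarchically deeper than 'should'.
sentence = [["eagles", ["that", "can", "fly"]], "should", "swim"]

AUX = {"can", "should"}

def linear_first_aux(tree):
    """Flatten the tree and return the linearly first auxiliary."""
    def flatten(t):
        for x in t:
            if isinstance(x, list):
                yield from flatten(x)
            else:
                yield x
    return next(w for w in flatten(tree) if w in AUX)

def structurally_highest_aux(tree):
    """Breadth-first search: return the least deeply embedded auxiliary."""
    queue = deque((x, 0) for x in tree)
    best = None
    while queue:
        x, depth = queue.popleft()
        if isinstance(x, list):
            queue.extend((y, depth + 1) for y in x)
        elif x in AUX and (best is None or depth < best[1]):
            best = (x, depth)
    return best[0]

print(linear_first_aux(sentence))          # 'can'    -> derives the bad (1)
print(structurally_highest_aux(sentence))  # 'should' -> derives the good (2)
```

The linear rule picks 'can' and so generates the unacceptable (1); only the hierarchical rule picks 'should' and generates (2). This is just the KD in executable form: whatever the right rule is, it cannot be stated over the string.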

The second example concerns reflexivization (R). Is it an operation that targets predicates and reduces their adicities by linking their arguments, or is it a syntactic operation that relates nominal expressions? The former treats R as ranging over predicates and their co-arguments. The latter treats R as an operation that syntactically pairs nominal expressions regardless of their argument status. The KD against the predicate-centered approach is found in ECM constructions, where non-co-arguments can be R-related.

3.     Mary expects herself to win
4.     John believes himself to be untrustworthy
5.     Mary wants herself to be elected president

In (3)-(5) the reflexive is anteceded by a non-co-argument. So, ‘John’ is an argument of the higher predicate in (4), and ‘himself’ is an argument of the lower predicate ‘be untrustworthy’ but not of the higher predicate ‘believe.’ Assuming that reflexives in mono-clauses and those in examples like (3)-(5) are licensed by the same rule, this provides KD that R is not an argument-changing (i.e. adicity-lowering)[6] operation but a rule defined over syntactic configurations that relates nominals.[7]

Here's a third, more recondite, example that actually had the consequence of eliminating one conception of empty categories (ECs). In Some Concepts and Consequences of the Theory of Government and Binding (C&C) Chomsky proposed a functional interpretation of ECs.

A brief advertisement before proceeding: C&C is a really great book whose only vice is that its core idea is empirically untenable. Aside from this, it is a classic and still well worth reading.

At any rate, C&C is a sustained investigation of parasitic gap (PG) phenomena, and it proposes that there is no categorial difference among the various flavors of traces (A vs A’ vs PRO). Rather, there is only one EC, and the different flavors reflect relational properties of the syntactic environment the EC is situated in. This allows for the possibility that an EC can start out its life as a PRO and end its life as an A’-trace without any rule directly applying to it. Rather, if something else moves and binds the PRO, the EC that started out as a PRO will be interpreted as an A- or A’-trace depending on what position the element it is related to occupies (the EC is an A-trace if A-bound and an A’-trace if A’-bound). This forms the core of the C&C analysis of PGs, and it has the nice property of largely deriving the properties of PGs from more general assumptions about binding theory combined with this functional interpretation of ECs. To repeat, it is a very nice story. IMO, conceptually, it is far better than the Barriers account in terms of chain formation and null operators, which came after C&C. Why? Because the Barriers account is largely a series of stipulations on chain formation posited to “capture” the observed output. C&C provides a principled theory but is wrong; Barriers provides an account that covers the data but is unprincipled.

How was C&C wrong? Kayne provided the relevant KD.[8] He showed that PGs, the ECs inside the adjuncts, are themselves subject to island effects. Thus, though one can relate a PG inside an adjunct (which is an island) to an argument outside the adjunct, the gap inside the island is subject to standard island effects. So the EC inside the adjunct cannot itself be inside another island. Here’s an illustrative pair:

6.     Which book did you review before admitting that Bill said that Sheila had read?
7.     *Which book did you review before finding someone that read?

The functional definition of ECs implies that ECs that are PGs should not be subject to island effects, as they are not formed by movement. This proved to be incorrect, and the approach died. Killed by Kayne’s KD.

A fourth case: P-stranding and case connectedness effects in ellipsis killed the interpretive theory of ellipsis and argued for the deletion account. Once upon a time, the favored account of ellipsis was interpretive.[9] Gs generated phrase markers without lexical terminals. Ellipsis was effectively what one got with lexical insertion delayed to LF. It was subject to various kinds of parallelism restrictions, with the non-elided antecedent serving to provide the relevant terminals for insertion into the elided PM (i.e. the one without terminals), the insertion being subject to recoverability and to the requirement that it target positions parallel to those in the non-elided antecedent. Figuratively, the LF of the antecedent was copied into the PM of the elided dependent.

As is well-known by now, Jason Merchant provided KD against this position, elaborating earlier (ignored?) arguments by Ross. The KD came in two forms. First, that elided structures respect the same case marking conventions apparent in non-elision constructions. Second, that preposition stranding is permitted in ellipsis just in case it is allowed in cases of movement without elision. In other words, it appears that but for the phonology, elided phrases exhibit the same dependencies apparent in non-elided derivations. The natural conclusion is that elision is derived by deleting structure that is first generated in the standard way. So, the parallelism in case and P-stranding profiles of elided and non-elided structures implies that they share a common syntactic derivational core.[10] This is just what the interpretive theory denies and the deletion theory endorses. Hence the deletion theory has a natural account for the observed syntactic parallelism that Merchant/Ross noted. And indeed, from what I can tell, the common wisdom today is that ellipsis is effectively a deletion phenomenon.

It is worth observing, perhaps, that this conclusion also has a kind of minimalist backing. Bare Phrase Structure (BPS) makes the interpretive theory hard to state. Why? Because the interpretive theory relies on a distinction between structure building and lexical insertion, and BPS does not recognize this distinction. Thus, given BPS, it is unclear how to generate structures without terminals. But as the interpretive theory relies on doing just this, it would seem to be a grammatically impossible analysis in a BPS framework. So, not only is the deletion theory of ellipsis the one we want empirically, it also appears to be the one that conforms to minimalist assumptions.

Note that the virtue of KD is that it does not rely on theoretical validation to be effective. Whether deletion theories are more minimalistically acceptable than interpretive theories is an interesting issue. But whether they are or aren’t does not affect the dispositive nature of the KD wrt the proposals it adjudicates. This is one of the nice features of CEs and KD: they stand relatively independent of particular theories and hence provide a strong empirical check on theory construction. That’s why we like them.

Fifth, and now I am going to be much more controversial: inverse control and the PRO-based theory of control. Polinsky and Potsdam (2002) present cases of control in which “PRO” c-commands its antecedent. This, strictly speaking, should be impossible, for such binding violates principle C. However, the sentences are licit with a control interpretation. Other examples of inverse control have since been argued to exist in various other languages. If inverse control exists, it is KD for any PRO-based conception of control. As all but the movement theory of control (MTC) are PRO-based conceptions of control, if inverse control obtains then the MTC is the only theory left standing. Moreover, as Polinsky and Potsdam have argued since, that inverse control exists makes perfect sense in the context of a copy theory of movement if one allows top copies to be PF deleted. Indeed, as argued here, the MTC is what one expects in the context of a theory that eschews D-structure and adopts the least encumbered theory of merge. But all of this is irrelevant as regards the KD status of inverse control. Whether or not the MTC is right (which, of course, it is), inverse control effects present KD against PRO-based accounts of control given standard assumptions about principle C.

That’s it. Five examples. I am sure there are more. Send in your favorite. These are very useful to have on hand, for they are part of what makes a research program progressive. CEs and KDs mark the intellectual progress of a discipline. They establish boundary conditions on adequate further theorizing. I am no great fan of empirics. The data does not do much for me. But I am an avid consumer of CEs and KDs. They are, in their own interesting ways, tributes to how far we’ve come in our understanding and so should be cherished.

[1] Note the modifier ‘deeply.’ Here’s an interesting question that I have no clean answer for: what makes one flaw deep and another a mere flesh wound? One mark of a deep flaw is that it butts up against a bedrock principle of the theory under investigation. So, for example, Galileo’s discovery was hard to reconcile with the Ptolemaic system unless one assumed that the phases of Venus were unlike any of the other phases seen at the time. There was no set of calculations consistent with those most generally in use that could get you the observed effects. Similarly for the Michelson-Morley data. To reconcile these with the observations required fundamental changes to other basic assumptions. Most data are not like this. They can be reconciled by adding further (possibly ad hoc) assumptions or massaging some principles in new ways. But butting up against a fundamental principle is not that common. That’s why CEs and KD are interesting and worth looking for.
[2] The term “killer data” is found in a great new book on the rise of modern science by David Wootton (here). He argues that the existence of KD is a crucial ingredient in the emergence of modern science. It’s a really terrific book for those of you interested in these kinds of issues. The basic argument is that there really was a distinction in kind between what came after the scientific revolution and its precursors. The chapter on how perspective in painting fueled the realistic interpretation of abstract geometry as applied to the real world is worth the price of the book all by itself.
[3] In this, my list fails to have one property that Wootton highlighted. KDs as a matter of historical fact are widely accepted and pretty quickly too. Not all my candidate KDs have been as successful (tant pis), hence the bracketed qualifying modal.
[4] Please note the conditional: the KD shows that transformations are not linearly sensitive. This presupposes that Y/N questions are transformationally derived. Syntactic Structures argued for a transformational analysis of Aux fronting. A good analysis of the reasons for this is provided in Lasnik’s excellent book (here). What is important to note is that data can become KD only given a set of background assumptions. This is not a weakness.
[5] This raises another question that Chomsky has usefully pressed: why don’t G operations exploit the string properties of phrase markers? His answer is that PMs don’t have string properties as they are sets and sets impose no linear order on their elements.
[6] Note: that R relates nominals does not imply that it cannot have the semantic reflex of lowering the adicity of a predicate. So, R applies to John hugged himself to relate the reflexive and John. This might reduce the adicity of hug from 2-place to 1-place. But this is an effect of the rule, not a condition on the rule. The rule could not care less whether the relata are co-arguments.
[7] There are some theories that obscure this conclusion by distinguishing between semantic and syntactic predicates. Such theories acknowledge the point made here in their terminology. R is not an adicity-changing operation, though in some cases it might have the effect of changing predicate adicity (see note 6).
This, btw, is one of my favorite KDs. Why? Because it makes sense in a minimalist setting. Say R is a rule of G. Then, given Inclusiveness, it cannot be an adicity-changing operation, for this would be a clear violation of Inclusiveness (which, recall, requires preserving the integrity of the atoms in the course of a derivation, and nothing violates the integrity of a lexical item more than changing its argument structure). Thus, in a minimalist setting, the first view of R seems ruled out.
We can, as usual, go further. We can provide a deeper explanation for this instance of Inclusiveness and propose that adicity-changing rules cannot be stated given the right conception of syntactic atoms (this parallels how thinking of Merge as outputting sets thereby makes impossible rules that exploit linear dependencies among the atoms (see note 5)). How might we do this? By assuming that predicates have at most one argument (i.e. they are 1-place predicates). This is to effectively endorse a strong neo-Davidsonian conception of predicates in which all predicates are 1-place predicates of events and all “arguments” are syntactic dependents (see e.g. Pietroski here for discussion). If this is correct, then there can be no adicity-changing operations grammatically identifying co-arguments of a predicate, as predicates have no co-arguments. Ergo, R is the only kind of rule a G can have.
[8] If memory serves, I think that he showed this in his Connectedness book.
[9] Edwin Williams developed this theory. Ivan Sag argued for a deletion theory. Empirically, the two were hard to pull apart. However, in the context of GB, Williams argued that the interpretive theory was more natural. I think he had a point.
[10] For what it is worth, I have always found the P-stranding facts to be the more compelling. The reason is that all agree that at LF P-stranding is required. Thus the LF of To whom did you speak? involves abstracting over an individual, not a PP type. In other words, the right LF involves reconstructing the P and abstracting over the DP complement; something like (i), not (ii):
(i)             Who1 [you speak to x1]
(ii)           [To whom]1 [you speak x1]
An answer to the question given something like (i) is ‘Fred.’ An answer to (ii) could be ‘about Harry.’ It is clear that at LF we want a structure like (i) and not (ii). Thus, at LF the right structure in every language necessarily involves P-stranding, even if the language disallows P-stranding syntactically. This is KD for theories that license ellipsis at LF via interpretation rather than via movement plus deletion.



  2. Ooh, I like this game. Here are some of my favorites:

    • raising-to-ERG in Basque (Artiagoitia 2001, in Herschensohn et al. (eds.); Rezac, Albizu & Etxepare 2014, NLLT): KD for the inherent case theory of ergative case

    • multiple heads, in different clauses, all agreeing with the same argument in all available phi features (Polinsky & Potsdam 2001, NLLT): KD for the Activity Condition

    • ABS case in Basque in structures where agreement has been visibly disrupted (Etxepare 2006, ASJU (the International Journal of Basque Linguistics and Philology)): KD for the idea that structural case is assigned by agreement

    • A-movement of inherent-case marked noun phrases in Icelandic (Zaenen, Maling, & Thráinsson 1985, NLLT; among many others): KD for the idea that raising in Germanic is case-driven [[[NB: Like the Ross/Merchant stuff cited in the body of Norbert's post, this is a good example of how certain ideas -- in this instance, raising being case-driven -- continued to pervade the field long after their empirical expiration date had passed.]]]

    • sensitivity of the Person Case Constraint (PCC) to the relative hierarchical organization of the ABS and DAT arguments in 2-place unaccusatives in Basque (Albizu 1997, ASJU; Rezac 2008, NLLT): KD for the idea that the PCC is a morphological filter

    • the distribution of ACC in Sakha (Baker & Vinokurova 2010, NLLT): KD for the idea that accusative case is generally assigned by a head (e.g. v)

    1. (Just realized this comment could double as promotional material for NLLT; I assure you that was unintended!)

    2. Or promotional material for Basque (almost).

  3. Huybregts and Shieber's work on Dutch and Swiss German that was the KD for GPSG, and the idea that languages were even weakly context-free.

    (And Pirahã surely is KD for something but I am not sure what ...)

  4. Tangkic languages and the idea that agreement depends on local spec-head relationships (alive in Baker 2008, apparently finally abandoned now).

  5. I'm not sure about number 2, even after reading footnotes 6 and 7. What about versions of categorial grammar in which "expects __ to win" is a possible constituent, which gets its arity reduced by "herself"?

    1. This is what I meant by notes 6 and 7. There are ways of redefining notions like predicate so that what appears to be the predicate is actually not. This can be done by allowing predicates to "combine" into complex predicates etc. I have personally always viewed these moves as conceding the main point. The question then becomes what to make of these complex predicate forming operations. I am not a fan myself, but if you really want to preserve the semantic conception then you need a more expansive (i.e. non-intuitive) conception of predicates. Thus you need to circumvent the idea that in 'John believes Bill to be a fool' 'Bill' is an argument of 'believe.' It isn't. You cannot conclude from the former that John believes Bill. But if it is not an argument in this, then it cannot be one in 'John believes himself to be a fool.' And so on. So, yes, there are technical fixes, but they concede the main point, IMO. That's what I was getting at.

  6. I've also always been fond of the C&C view as opposed to the horror that is Barriers. I'm not sure Kayne's data is KD for the idea of a functional definition of ECs though (as opposed to the particular execution of that idea that Chomsky offers in C&C). You can imagine, for example, that movement and islands are not so tightly linked (if, say, movement has to be preceded by Agree, and Agree is constrained by locality, as Gillian and I said in our 2005 LI paper), and that might allow you to say that PGCs involve a Comp-->EC relation with no movement, where the nature of the tail of the constructed Agree-chain is determined by the value that Comp gives it plus its locally determined case properties. I quite like that idea as it allows you to reduce the non-overtness of PGCs to a general capacity for licensed pronouns to be non-overt.

    1. I think you can say this. But, as you note, it requires divorcing islands from movement and treating locality as a property of agreement in some form. That said, the Kayne observations are KD for the conjunction of the ideas that islands are diagnostics of movement AND that ECs are functionally determined.

      BTW, I agree that the barriers account is a horror. Sadly, from what I can tell, this is still the standard theory of PGs.

  7. I have some KD for the MTC:

    I find this whole discussion a bit hypocritical because advocates of the MTC like Hornstein have been ignoring the crucial data for years.

    1. I've known Hornstein for many years and I can attest that his remarks are almost always hypocritical. So you are right to call him out on this. However, I fear that you may have missed the point of his most recent self-serving comment. It is not that the MTC is correct, only that inverse control (IC) IF IT EXISTS (and the evidence is non-negligible IMO (in fact pretty good)) is KD for any PRO-based account of control. It is also true that the only non-PRO-based account of control out there is the MTC. Both points are correct. How is IC a KD? Because it requires that control structures in IC Gs systematically violate condition C, in particular the condition that prohibits an anaphor from c-commanding its antecedent. I take this to be a real deep no-no. If I am right, then PRO-based accounts must fail. But, to repeat, this does not mean that the MTC succeeds, only that it has one positive property that makes PRO-based accounts non-starters.

      The papers that you linked to (thx, btw) look interesting. I will have Hornstein take a careful look at them. However, he has informed me that he is probably not going to be moved by the papers. He says this so as not to be deemed a hypocrite. He would rather be known as stiff necked.

      He asks me, in all sincerity, to thank you for the remonstration.

  8. Dear Norbert,

    sorry, I wasn’t aware that you knew this Hornstein guy. ;-)

    I apologize for the harsh tone of my comment. I mean it. It is way too late for me to write commentaries. It is only that (in addition to me being physically tired) I’m a bit tired of this whole discussion about MTC and backward control, which has been going on for a while. As a syntactician working on German, I still find the arguments put forward by the MTC camp not very convincing, because they obviously make the wrong predictions for German (see e.g. the paper by Haider I linked). I'm fully aware that you weren’t writing about the MTC in general in your post, but let me rephrase what was going on in my mind:

    Do you consider the empirical evidence from German (but also from other languages) put forward by Haider and Kiss against the MTC as KD? After all, the paper by Tibor Kiss has been around for a while, and I think you - I mean Hornstein - even quote it in the 2010 book.

    What would be the CE for the MTC?

    These are not rhetorical questions, i.e. I'd be really interested in your answer.



    1. Long reply (1):
      I do not consider them dispositive (as you may have guessed). But right now I don't recall why. I will look at them again and see why I was unconvinced. But here are some of the reasons.

      First, the idea that control is effectively a selection issue strikes me as simply saying that there is no interesting grammar to control. Selection is diacritical. Of course a selection account can cover more data as there is no data that diacritics cannot cover.

      Second, if selection is restricted to head-to-head configurations, control CANNOT be a selection issue, for the controllee is not a head that is locally selectable by the higher predicate. Hence, strictly speaking, this is not even a standard head-to-head relation.

      Third, argument selection is but one part of the control problem. The other is distribution of PRO. Selection theories say nothing about this.

    2. Reply (2)
      Fourth, I believe that adjunct control has virtually all the same properties as complement control. If this is so, then a selection theory of control cannot be right as there is no selection relation between a predicate and adjuncts that modify it.

      Fifth, though I have not addressed every problem aimed at the MTC, I have addressed more than a few. You may not like the answers, but they have been addressed. For example, colleagues and I have provided a principled reason for the Visser facts. Indeed, it also explains why you cannot have control of a 'there' expletive into an adjunct. It relies on the idea that the position that raising targets, unlike the position that control does, is outside the theta domain of the relevant predicate. This may be wrong, but there is an argument. Is it right? Dunno.

      Sixth, I am tired of 'promise.' I've talked about this a bunch of times. But what really tires me is the apparent belief that stipulating the facts explains anything. There are many accounts of 'promise' in the literature, all of which I am fine with. Take Larson's for example. Ok with me. But I have also discussed this in the book with Jairo and Cedric. I still like that account.

      Last, what is a KD? It is not counter-data. Any non-stipulative theory will have problems. But some are very much worse than others. The ones that are really bad involve overturning deep principles of the system. Most recalcitrant data can be stipulated away, or the theory can be amended to allow for the messy facts. This is not so for KD. Here the data require a wholesale revision of deep principles. Overturning principle C seems to me in that league. So IF IC then no principle C, and this is a BIG BIG BIG problem.

      What would be the CE for the MTC? Good question. I have been quite moved by the parallels between cases of finite raising and finite control. So languages like Brazilian Portuguese that allow hyper-raising also allow finite control. This seems like a nice prediction of the theory, and I always thought it was one of its best features. Because of this I take the German stuff as a serious problem (or would if I recalled it (I will reread the two papers)). The MTC ties antecedent selection and PRO distribution (OC, of course) very tightly together. If we could find that they don't fall together in clear cases then this would be a big problem. Say that we had evidence that the element that purportedly moved from some position was not the antecedent of the implicit argument there. This would be terrible.

      Here's another: I personally think that the MTC requires a more general theory of all binding as movement. So, if we could show that this fails (not merely that there are some problems, there are always some) then I would also think that this argues against the MTC.

      Very last point: I have never actually cared about control and have never really worried about whether the MTC was a good theory of control. I care about whether there can be a good minimalist account of control. My argument has been that IF this is possible, it will look like the MTC. Why? Because the MTC follows with virtually no additional stipulations once you give up D-structure completely (see the last chapter of the 2010 book). So my interest has not been in control but in the feasibility of a principled minimalist account. It is quite possible IMO that minimalism is a dead end, and if so then the MTC should go down the toilet with it. I am not yet ready to say this, but I have no problem believing it. What I am sure of is that the standard theories I have seen have little to recommend them on minimalist grounds, which either means that they are of no linguistic (note the 'i') interest or that they are not the right kinds of theories if minimalism is on the right track. You can guess where I have placed my bets.

      Oh, don't worry: Hornstein did not take anything you said earlier to heart (though careful how you read this. It's what a hypocrite might say).

  9. @Oliver. I'm wary of starting an MTC debate here, but the Haider paper is a little frustrating in that it doesn't reference a lot of the MTC literature that's attempted to deal with some of the problems it raises. For example, with regard to section 3, non-subject-oriented OC into adjuncts is discussed on p. 98 of Move!. And surely at least some of the discussion of passives in section 5.2 of the Boeckx, Hornstein and Nunes monograph has got to be relevant to section 2, but the discussion there is never mentioned.

  10. Doesn't this discussion indicate that syntax is for the most part too young a science for KD/CE? I mean, I can imagine the same kind of to and fro emerging on case 4 (theories of ellipsis) if, say, Mark Steedman started chipping in (e.g. deletion advocates like myself have argued that case matching needs to be stipulated as a condition on remnants and not silent structure, and that stipulation can be bought by DI theories; P-stranding could probably be treated similarly, given that the theory of P-stranding is still up for grabs). Very local claims like the ones Omer cited above are probably good cases of KD, although even with those I feel like there might be someone lurking with a bunch of null operators/projections which they can argue for the existence of if cornered.

    1. @Gary: I agree with your sentiment that there's sometimes a troubling amount of wiggle room when it comes to this type of thing (cf. "a bunch of null operators/projections"). A few notes about that, though:

      1. Not all arguments are easily susceptible to such countermaneuvers. The raising-to-ERG stuff, for example, is supported by evidence from idiomatic readings being preserved under raising (see, in particular, the Rezac et al. 2014 paper). And finding a way around that would entail undoing almost everything we thought we knew about idiomatic readings when it comes to raising vs. base-generation/prolepsis.

      2. Even when countermaneuvers are available, one should judge them with a critical eye. So, for example: suppose you have an entire framework based on strict ordering statements of the form x<y, z<w, and so forth. And then someone comes along and shows that your ordering statements lead to paradoxes (e.g. x<y, y<z, but z<x). And suppose the response to this is to proliferate more categories for the ordering statements to operate upon ("actually, there's x and x', one of which precedes z and one of which follows z"). A discerning eye will notice that the framework in question has crossed the line from predictive theory to descriptive jargon. (Whether the field as a whole does or doesn't recognize this is of course immaterial to the substance of the matter. It's a scientific question, not a sociological one.)

      3. Sometimes one stumbles (through dumb luck, in my case) onto an argument that is not susceptible even to the "bunch of null operators/projections" gambit. Such is the case for the argument from K'ichean against 'crashes' / 'checking' theories of agreement / fully interface-driven approaches to syntax. (I won't bore the readers of this blog with the details; they can be found in ch. 5 of my 2014 monograph, in particular 87ff.) Let me stress that I did not set out with some top-down foresight that I would be able to argue from this data against a null-expletive alternative. But sometimes one gets lucky!

      Finally, let me say that I'm quite sure that Gary himself is already aware of everything I've written here (though whether he agrees or not is another matter); but I thought this stuff belonged in the conversation.

    2. Yeah for sure I agree with all of the above when it comes to assessing countermaneuvers, but I suppose my point is that all that assessing and weighing of arguments is an indicator that nothing's really getting KILLED by killer data. I mean the MTC is a case in point: I'll bet both sides have felt like they've landed the game-changing KD at various points, and yet this debate rages on. Rutherford's experiment it ain't.

      btw re rethinking what we think about idioms and raising, that doesn't seem so crazy in the weird and wacky world of relative clauses and tough constructions...

  11. KD for the claim that the CSC is not a constraint on movement: in languages with productive resumptive pronouns, an RP can save any island except a CSC island. I think the initial observation goes back to Grant Goodall, using data from Carol Georgopoulos' 1985 paper "Variables in Palauan Syntax".

  12. A nice contribution, Norbert. Just two notes.

    1. The 'KD' about polar questions shows conclusively that the grammar is sensitive to hierarchical structure and that this (normally) overrides linear order -- no doubt about that. But it does not logically follow that linearity cannot be part of FL at all (which is how you seem to talk about it). There are various known linear order effects, such as agreement with closest conjunct.

    2. About reflexives: this argument seems to rest on the assumption that a pronoun is either reflexive or pronominal. But crosslinguistic reality shows a variety of elements and strategies, short, medium and long distance anaphora, etc. I'm not defending any particular view on reflexivity here, but it might be that co-argument theory is only about the SD strategy ('real' reflexivity), in addition to which there are other ways of linking nominal referents (with somewhat different properties, including sensitivity to perspective, probably). The confusing thing is that in English the element 'himself' can be used for multiple purposes, but not every language is like that. Given these complications, it may not be so clear anymore that ECM data are KD against the argument-changing theory. It first needs to be (re)established that all data involving reflexive strategies are essentially one and the same phenomenon (once the zero hypothesis, but there's reason for doubt). Correct me if I'm wrong.

    1. Good points. I agree with your first point completely. As you know, since Larson and Reinhart there has been an effort to do away with linear effects in G. I believe that the field has largely concluded that this effort has been successful. I have no opinion on this myself, but it does seem true that the effects of linearity are mainly at the edges (but this is a value judgment). That said, your point is right on.

      Your second point is also well put. IMO, the English data are pretty conclusive wrt "real" reflexivity. The ECM cases don't look like logophors or emphatic reflexives. If these are indeed real reflexives, then the co-argument theory seems to me dead. That does not imply that things are entirely clear even given this. I agree that there are various kinds of anaphoric expressions (and maybe many kinds of anaphoric relations), but thinking that local reflexives are local in virtue of being co-arguments seems to me a non-starter.