In two earlier posts (here
and here)
I outlined an argument that revolves around the following point: MP forces a
concentration on properties of FL/UG and forces a LING (vs LANG) conception of
the study of language. Prior to MP, LING and LANG interpretations of the study
of language could happily co-exist, each readily and happily interpreting the
results of the other’s perspective without threatening one’s favored point of
view. I then reviewed how it is that the
Merge Hypothesis (MH) advances the goal of explaining some of the basic
features of FL/UG by showing them to be by-products of a simple, fundamental
recursive procedure. In my view, this result is breathtaking and, given the
assumption that earlier GG was empirically roughly correct, it provides an
excellent model of how to realize MP ambitions, thereby demonstrating that the
big question that MP poses for itself can be fruitfully addressed and answered.
In this post, I want to push this line of argument further,
much further. As the old saying goes, you can never know how far you can go
until you go a wee bit further. The idea is to entertain an Extended Merge Hypothesis (EMH) which
takes non-local dependencies to “live on” Merge generated structures (aka,
chains). The aim is to unify all
non-local dependencies and treat them as chain dependencies. If doable, this
will have the effect of unifying several otherwise independent modules of the
grammar and showing their properties to effectively reduce to chain properties.
As chain properties are just Merge properties, this can be understood as a first
step towards reducing all the laws of GB to products of Merge.
The EMH is much less well grounded than the MH, IMO. It
includes a hobbyhorse of mine as a special case: the movement theory of
construal relations (including both control and binding). I sketch how this
might look. Do I believe it? Well, who cares really? That’s a question about personal psychology.
Do I think that this is the right way for an MPer to proceed? Yup. If the goal
is to explain the properties of FL/UG in the most efficient manner possible,
then some extension of MH is the obvious way to go. And, IMO, one should never
do the less obvious when the obvious is staring you in the face. So, it is
worth doing. Here goes.
The Extended Merge Hypothesis: Explaining more features of FL/UG
There are proposals in the MP literature that push the MH
line of argument harder still. Here’s what I mean. MH unifies structure
building and movement and in the process explains many central properties of
FL/UG. It accomplishes this by reducing phrase building and movement to
instances of Merge (i.e. E and I-Merge respectively). We can push this
reductive/unificational approach more aggressively by reducing other kinds of
dependencies to instances of E or I-Merge. More specifically, if we take seriously
the MP proposal that Merge is the unique
fundamental combinatoric operation
that FL/UG affords, then the strongest minimalist hypothesis is that every grammatical dependency must be
mediated by some instance of Merge. Call this the “Extended Merge Hypothesis”
(EMH). In what follows, I would like to review some of the MP literature to see
what properties EMH might enjoy. My aim is to suggest that this radical
unification has properties that nicely track some fundamental features of
FL/UG. If this is correct, then it suggests that relentlessly expanding the
reach of Merge beyond phrase structure and movement to include construal and
case/agreement dependencies has interesting potential payoffs for those
interested in MP questions. Once again, it will pay to begin with GB as a
jumping off point.[1]
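To make the two flavors of Merge concrete, here is a toy Python sketch. It is an illustration under simplifying assumptions, not a formalization from the MP literature: syntactic objects are modeled as strings or frozensets, and the helper names (merge, contains, e_merge, i_merge, occurrences) are hypothetical. It shows that I-Merge re-merges something already inside the structure, so the result contains two occurrences of the same element, which is all a "chain" amounts to here.

```python
def merge(a, b):
    """Combine two syntactic objects into the unordered set {a, b}."""
    return frozenset([a, b])


def contains(obj, target):
    """True if target is obj itself or sits somewhere inside obj."""
    if obj == target:
        return True
    if isinstance(obj, frozenset):
        return any(contains(part, target) for part in obj)
    return False


def e_merge(a, b):
    """External Merge: the two objects are independent of one another."""
    assert not contains(a, b) and not contains(b, a)
    return merge(a, b)


def i_merge(root, part):
    """Internal Merge: re-merge a proper part of root with root itself."""
    assert part != root and contains(root, part)
    return merge(part, root)


def occurrences(obj, target):
    """Count how many times target occurs in obj (the links of its chain)."""
    if obj == target:
        return 1
    if isinstance(obj, frozenset):
        return sum(occurrences(part, target) for part in obj)
    return 0


# A schematic raising structure (labels and functional heads omitted).
clause = e_merge("seems", e_merge("to", e_merge("John", "leave")))
raised = i_merge(clause, "John")

n_before = occurrences(clause, "John")  # one occurrence before movement
n_after = occurrences(raised, "John")   # two occurrences after I-Merge
```

Nothing beyond the single merge operation is used to create the second occurrence; the only difference between the two cases is where the merged element comes from.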
GB is a strongly modular theory in the sense that it
describes FL/UG as containing many different kinds of operations and
principles. Thus, in GB, we distinguish construal rules like Binding and
Control, from movement rules like Wh-movement and Raising. We classify case
relations as different from movement dependencies and both from theta forming
dependencies. The primitives are different and, more importantly, the
operations and principles that condition them are different. The internal
modularity of GB style FL/UGs complicates them. This is, theoretically
speaking, unfortunate, especially in light of the fact that the different
dependencies that the modules specify share many properties in common. That
they do so is something an MP account (indeed, any account) would like to
explain. EMH proposes that all the different dependencies in GB’s various
modules are actually only apparently
distinct. In reality, they are all different instances of chains formed by
I-merge.[2]
To put this slightly differently, all of the non-local dependencies GB
specifies “live on” chains formed by I-Merge. Let’s consider some examples.
(i) Case Theory
(Chomsky 1993) re-analyzes case dependencies as movement
mediated. The argument is in two steps.
The first is a critical observation: the GB theory of case
is contrived in that it relies on a very convoluted and unnatural notion of
government. Furthermore, the contrived nature of the government account
reflects a core assumption: accusative case on the internal argument of a
transitive verb (sisterhood to a case assigning head) reflects the core
configuration for case licensing. Extending sisterhood so that it covers what
we see in nominative case and in ECM requires “generalizing” sisterhood to
government, with the resulting government configuration itself covering three
very distinct looking configurations (see (9)).
(9a) is the configuration for accusative case, (9b) for nominative and
(9c) for ECM. It is possible to define a technical notion that treats these
three configurations as all instances of a common relation (viz. government),
but the resulting definition is very baroque.[3]
The obtuseness of the resulting definition argues against treating (9a) as the
core case precisely because the resulting unified theory rests on a
gerrymandered (and hence theoretically unsatisfying) conception of government.
(9) a. [ V nominal]
b. [ Nominal [ T0-finite…
c. [V [ Nominal [ T0-non-finite…
The second step in the argument is positive: case theory can
be considerably streamlined and unified if we take the core instance of case
assignment to be exemplified by what we find with nominatives. If this is the
core case configuration then case is effectively a spec-head relation between a
case marked nominal and a head that licenses it. Generalizing nominative case
configurations to cover simple accusative objects and ECM subjects requires
treating case as a product of movement (as with nominatives).[4]
Thus, simplifying the case module rests on analyzing case dependencies as
products of I-merge (i.e. as non-local relations between a case assigning head
and a nominal that has
(A-)moved to the specifier of this head).[5]
The canonical case configuration is (10), with h0 being a case
assigning head and the nominal being the head of an I-merge generated A-chain.[6]
(10) [Nominal [ h0…
There is some interesting empirical evidence for this view.
First, it predicts that we should find a correlation between case and
movability. More specifically, if some position resists movement, then this
should have an impact on case. And this
seems to be correct. Consider the paradigm in (11):
(11) a. John believes him to be tall
b. *John believes him is tall
c. John was believed t to be tall
d. *John was believed t is tall
Just as A-movement/raising from the subject position of a
finite clause is prohibited, so too is accusative case. Why? Because accusative
case requires A-movement of him in
(11b) to the case head that sits above believe
and this kind of movement is prohibited, as (11d) illustrates.[7]
There is a second prediction. Case should condition binding.
On the current proposal, movement feeds case. As movement broadens an
expression’s scope, it thereby increases its binding potential. This second prediction
is particularly interesting for it ties together two features of the grammar
that GB approaches to accusative case keep separate. Here’s what I mean.
With regard to nominative case assignment, it is well-known
that movement “for” case (e.g. raising to subject) can expand an expression’s
binding potential. John can bind himself after raising to subject in
(12b) but not without moving to the matrix Spec T (12a). Thus some instances of
movement for case reasons can feed binding.
(12) a. *It seems to himself1 [(that) John1 is happy]
b. John1 seems to himself1 [ t1 to be happy]
However, the standard GB analysis of case for accusatives
has the V assign case to the nominal object in its base position.[8]
Thus, whereas case to nominative subjects is a Spec-head relation, case to
canonical objects is under sisterhood. If,
however, we unify nominative and accusative case and assimilate the latter to
what we find with nominatives, then movement will mediate accusative case too.
If this involves movement to some position above the external argument’s base
position (recall, we are assuming PISH) then accusative case is being assigned
in a configuration something like (13). Different epochs of MP have treated
this VP external Spec position differently, but the technical details don’t
really matter. What does matter is that accusative case is not assigned in the nominal’s base position, but rather in a higher
position at the edge of the VP complex. So conceived, accusative case, like
nominative, is expected to expand a nominal’s binding domain.
(13) [ Nominal1 [V external argument [V V…
There is evidence supporting this correlation between case
value and scope domain.[9]
Here is some comparative data that illustrates the point. (14) shows that it is
possible for an embedded ECM subject to bind an anaphor in a matrix adjunct,
whereas a nominative embedded subject cannot.
(14) a. The lawyer proved [the men1 to be guilty] during each other1’s trials
b. The lawyer proved [the men1 were guilty] during each other1’s trials
(14a) has a sensible reading in which the during phrase modifies the matrix
predicate proved. This reading is
unavailable in (14b), the only reading being the silly one in which the during phrase modifies were guilty. This is expected if
licensing accusative case on the ECM subject requires moving it to the edge of
the higher matrix VP (as in (13)). In contrast, licensing nominative leaves the men in the embedded spec T and hence
leaves the matrix during phrase
outside its c-command domain prohibiting binding of the reciprocal. Thus we see
that case value and scope domain co-vary as the MP story leads us to expect.[10]
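The c-command reasoning behind (14) can be checked mechanically on toy trees. The sketch below is hedged throughout: the bracketings are simplified assumptions (no labels, no matrix subject, the adjunct attached to the matrix VP), the string "during-phrase" stands in for the whole adjunct, and the function names are hypothetical. An element is taken to c-command whatever is contained in its sister.

```python
def contains(tree, target):
    """True if target is tree itself or occurs somewhere inside it."""
    if tree == target:
        return True
    if isinstance(tree, tuple):
        return any(contains(child, target) for child in tree)
    return False


def c_commands(tree, a, b):
    """True if some occurrence of a has a sister that contains b."""
    if not isinstance(tree, tuple):
        return False
    left, right = tree
    if left == a and contains(right, b):
        return True
    if right == a and contains(left, b):
        return True
    return c_commands(left, a, b) or c_commands(right, a, b)


# (14a), assumed structure: the ECM subject has a higher copy at the
# edge of the matrix VP, above the adjunct.
ecm = ("the men",
       (("proved", ("the men", "to be guilty")), "during-phrase"))

# (14b), assumed structure: the nominative subject stays inside the
# embedded finite clause.
finite = (("proved", ("the men", "were guilty")), "during-phrase")

ecm_binds = c_commands(ecm, "the men", "during-phrase")
finite_binds = c_commands(finite, "the men", "during-phrase")
```

On these assumed structures only the raised ECM subject has the adjunct in its c-command domain, which is the contrast the text describes.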
In sum, unifying case under I-merge rather than government
leads to a nicer looking theory and makes novel
predictions concerning the interaction of case and binding.
(ii) Control
Consider next (obligatory) complement control, as
exemplified in (15):
(15) a. John1 hopes [PRO1 to go to grad school]
b. John persuaded Mary1 [PRO1 to go to grad school]
Here are two salient properties of these constructions: (i)
PRO is restricted to non-finite subject positions and (ii) PRO requires a local
c-commanding antecedent. There are GB proposals to account for the first
property in terms of binding theory (the so-called “PRO theorem”) but by the
early 1990s, its theoretical inadequacies became apparent and PRO’s
distribution was thereafter restricted to the subject of non-finite
clauses by stipulation.[11]
As regards selecting the appropriate antecedent, this has remained the
province of a bespoke control module with antecedent selection traced to
stipulated properties of the embedding predicate (i.e. the controller is a
lexical property of hope and persuade). I believe that it is fair to
say that both parts of GB control theory contain a fair bit of ad hocery.
Here’s where MP comes to the rescue. A unified more
principled account is available by treating construal relations as “living” on
chains (in the case of control, A-chains) generated by I-merge. On this view,
the actual structure of the sentences in (15) is provided in (16) with the
controller being the head of an A-chain with links/copies in multiple theta
positions (annotated below).
(16) a. [ John [ T [Johnθ [ hopes [ John [to [ Johnθ [ go to grad school]]]]]]]]
b. [John T [ John [persuade [Maryθ [ Mary to [ Maryθ [go to grad school]]]]]]]
The unification provides a straightforward account for both
facts above: where PRO is found and what its antecedent must be. PROs
distribute like links in A-chains. Antecedents for PRO are heads of the chains
that contain them. Thus, PRO can appear in positions from which A-movement is
licit. Antecedents will be the heads of such licit chains. Observe that this
implies that PRO has all the properties of a GB A-trace. Thus it will be part
of a chain with proximate links, these links will c-command one another and
will be local in the way that links in A-chains are local. In other words, a
movement theory of control derives the features of control constructions noted
above.
We can go further: if we assume that Merge is the only way to establish grammatical
dependencies, then control configurations must
have such properties. If PRO is a “trace” then of course it requires an
antecedent. If it is a trace, then of course the antecedent must c-command it.
If it is an A-trace, then of course the antecedent must be local. And if it is
an A-trace then we can reduce the fact that it (typically)[12]
appears in the subject position of non-finite clauses to the fact that A-movement
is also so restricted:
(17) a. John seems t to like Mary
b. *John seems t will like Mary
c. John expects PRO to like Mary
d. *John expects PRO will like Mary
In sum, if we reduce control dependencies to A-chain
dependencies and treat control structures as generated via I-merge it is
possible to derive some of its core distributional and interpretive properties.[13]
Indeed, I would go further, much further.
First, at this moment, only
this approach to control offers a possible explanation
for properties of control constructions. All other approaches code the relevant
data, they do not, and cannot explain
them. And there is a principled reason
for this. All other theories on the market treat PRO as a primitive lexical
element, rather than the residue of grammatical operations, and hand pack the
properties of control constructions into the feature specifications of this
primitive lexical element. The analysis amounts to showing that checking these
features correlates with tracking the relevant properties. The source of the
features, however, is grammatically exogenous and arbitrary. The features
posited are exactly those that the facts require, thereby allowing for other
features were the facts different. And this robs these accounts of any
explanatory potential. From a minimalist perspective, one from which the
question of interest is why FL/UG has the properties it appears to have and not
others, this treatment of control is nugatory.
Second, the movement approach to control has a very
interesting empirical consequence in the context of standard MP theories. Recall that the copy theory is a consequence
of Merge based accounts of movement (see here
section 1). If control is the product of I-merge then control chains, like
other A-chains, have copies as links. If so, part of any G will be procedures
for phonetically “deleting” all but one of the copies/occurrences. So the
reason that PRO is phonetically null is that copies in A-chains are generally
phonetically null.[14]
Importantly, whatever the process that “deletes” copies/occurrences will apply
uniformly to “A-traces” and to PRO as these are the same kinds of things.
There is well-known evidence that this is correct. Consider
contraction effects like those in (18). Wanna
contraction is licensed in (18b) across an A-trace and in (18a) across a PRO,
but not in (18c) across an A’-trace. This supports the claim that PRO is the
residue of A-movement.
(18) a. They want PRO to kiss Mary → They wanna kiss Mary
b. They used t to live in the attic → They usta live in the attic
c. Who do they want t to vanish from the party → *Who do they wanna vanish from the party
The I-Merge analysis of control also predicts a possibility
that PRO based accounts cannot tolerate. Consider: an I-Merge based account of
displacement needs a theory of copy/occurrence pronunciation to account for the
fact that most copies/occurrences in many languages are phonetically null. So
as part of any I-Merge theory of displacement we need a theory of copy deletion. A
particularly simple one allows higher copies as well as lower ones to delete,
at least in principle.[15]
This opens up the following possibility: there are control configurations in
which “PRO” c-commands its antecedent.[16]
Thus, the movement theory of control in conjunction with standard assumptions
concerning deletion in the copy theory of movement allows for the possibility of
control constructions which apparently violate Principle C. And these appear to
exist. It is possible to find languages
in which the controllee c-commands its controller.[17]
In other words, configurations like (19b) with the standard control
interpretation are perfectly fine and have the interpretation of control
sentences like (19a). Both kinds of sentences are derivable given assumptions
about I-merge and copy deletion. They derive from the common underlying (19c)
with either the bottom copy removed (19a) or the top (19b). On this view, the
classical control configuration is simply a limiting case of a more general set
of possibilities, that but for phonetic expression, have the same underlying
properties.[18]
(19) a. DP1 V [PRO1 VP]
b. PRO1 V [DP1 VP]
c. DP1 V [DP2 VP]
This kind of data argues against
classical PRO based accounts (decisively so, in my opinion), while being
straightforwardly compatible with movement approaches to control based on
I-merge.
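The copy-deletion logic can be sketched in a few lines of Python. This is only a schematic illustration under stated assumptions: the underlying string is a flat list of words, the positions of the two copies are given by hand, and pronounce is a hypothetical helper, not a proposal about how the phonology actually decides which copy survives. The English words are placeholders for the abstract schema; English itself only allows the first output.

```python
def pronounce(words, copy_positions, keep):
    """Spell out words, silencing every copy whose position is not in keep."""
    silent = set(copy_positions) - set(keep)
    return " ".join(w for i, w in enumerate(words) if i not in silent)


# One underlying structure with two copies of the same nominal.
underlying = ["Mike", "wants", "Mike", "to", "eat"]
copies = [0, 2]

forward = pronounce(underlying, copies, keep=[0])      # higher copy survives
backward = pronounce(underlying, copies, keep=[2])     # lower copy survives
both = pronounce(underlying, copies, keep=[0, 2])      # both copies survive
```

The three outputs differ only in which occurrences are pronounced; the underlying chain is the same in each case, which is the point of treating forward and backward control as variants of one configuration.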
One last point and I will move on. Given standard MP
assumptions, something like the movement theory of control is a virtual
inevitability once one dispenses with PRO. MP theories have dispensed with
D-structure as a level of representation, and with this the prohibition against
a DP moving into multiple theta positions. Thus, there is nothing to prevent
DPs from forming control chains via I-merge given the barest MP assumptions. In
this sense, control as movement is predicted as an MP inevitability. It is
possible to block this implication, but only by invoking additional ad hoc
assumptions. Not only is control as movement compatible with MP, it is what we will find unless we try to
specifically avoid it. That we find it, argues for the reduction of control to
I-merge.[19]
(iii) Principle A effects
The same logic reduces principle A-effects to I-merge. It’s
been a staple of grammatical theory since LGB that A-traces have many of the
signature properties of reflexives, as illustrated by the following paradigm:
(20) a. *John seems [t is intelligent]
b. *John believes [himself is intelligent]
c. John seems [to be intelligent]
d. John believes [himself to be intelligent]
e. *John seems it was told t that Sue is intelligent
f. *John wants Mary to tell himself that Sue is intelligent
LGB accounted for this common pattern by categorizing
A-traces as anaphors subject to principle A. Thus, for example, in LGB-land the
reason that A-movement is always upwards, local and to a c-commanding position
is that otherwise the traces left by movement are unbound and violate principle
A. What’s important for current purposes is to observe that LGB unifies
A-anaphoric binding and movement. The current proposal that all grammatical dependencies are
mediated by Merge has the LGB unification as a special case if we assume that
A-anaphors are simply the surface reflex of an underlying A-chain. In other
words, the data in (20) follow directly if reflexives “live on” A-chains. Given
standard assumptions concerning I-merge this could be theoretically
accommodated if “copies” can convert to reflexives in certain configurations (as
in (21)).[20]
(21) [John believes [John (→ himself) to be intelligent]]
Like cases of control, reflexives are simply the
morphological residue of I-merge generated occurrences/copies. Put another way,
reflexives are the inessential morphological detritus of an underlying process
of reflexivization, the latter simply being the formation of an A-chain
involving multiple theta links under I-Merge.
If correct, this makes a prediction: reflexives per se are inessential for
reflexivization. There is evidence in favor of this assumption. There are
languages in which copies can stand in place of reflexive morphemes in
reflexive constructions. Thus, structures like (22a) have reflexive
interpretations, as witnessed by the fact that they license sloppy identity
under ellipsis (22b).[21]
(22) a. John saw John (=John saw himself)
b. John saw John and Mary too (=and Mary saw Mary)
Note that given standard assumptions regarding GB binding
theory examples like (22) violate principle C.
We also have apparent violations of principle B where
a pronoun locally c-commands and antecedes another pronoun (structure in (23)):
(23) Pronoun likes pronoun and Mike too (= and Mike likes Mike)
These puzzles disappear when these are seen as the surface
manifestations of reflexivization chains under I-merge. The names and pronouns
in object positions in (22) and (23) are just pronounced occurrence/copies.
There is a strict identity condition on the copies in copy reflexive
constructions, again something that an I-merge view of these constructions
would lead one to expect. Interestingly, we find similar copies possible in
“control” structures:
(24) a. Mike wants Mike to eat
b. The priest persuaded Mike Mike to go to school
This is to be expected if indeed both Reflexive and Control
constructions are mediated by I-merge as proposed here.[22]
(iv) Conclusion
Let me sum up. Section 1 showed that we can gain explanatory
leverage on several interesting features of FL/UG if we assume that Merge is
the fundamental operation for combining lexical atoms into larger hierarchical
structures. In this section I argued that one can get leverage on other
fundamental properties if we assume that all
grammatical dependencies are mediated by Merge. This implies that non-local
dependencies are products of I-merge. This section presented evidence that case
dependencies, control and reflexivization “live on” A-chains formed by I-merge.
I have shown that this proposal covers much of the conventional data in a
straightforward way and that it is
compatible with data that goes against the conventional grain (e.g. backwards
control, apparent violations of principle B and C binding effects). Moreover,
all of this follows from two very simple assumptions: (i) that Merge is the
basic combinatoric operation FL/UG makes available and (ii) that all
grammatical dependencies are mediated via Merge. We have seen that the second
assumption underwrites I-merge analyses of case, control and reflexivization,
which in turn explain some of the key features of case, control and reflexivization
(e.g. case impacts scope, PRO occupies the subject position of non-finite
clauses and requires a local c-commanding antecedent and that languages that
appear to violate conditions B and C are not really doing so). Thus, the Merge
hypothesis so extended resolves some apparent paradoxes, accounts for some
novel data, covers the standard well trodden empirical ground and (and this is the key part) explains why it is that the FL/UG
properties GB identified hold in these cases. The Extended Merge Hypothesis
(EMH) explains why these constructions have the general properties they do by
reducing them to reflexes of the A-chains they live on generated by I-merge. If
this is on the right track, then the EMH goes some way towards answering the
question that MP has set for itself.
[1]
But first a warning: many MPers would agree with the gist of what I outlined in
section 1. What follows is considerably more (indeed, much more) controversial.
I don’t think that this is a problem, but it is a fact. I will not have the
space (or, to be honest, the interest) in defending the line of argument that
follows. I have written about this elsewhere and tried to argue that, for
example, large parts of the rules of construal can be usefully reduced to
I-merge. Many have disagreed. For my point here, this may not be that important.
My aim here is to see how far this line of argument can go, showing that it is
also the best way to go is less important than showing that it is a plausible
way to proceed.
[2]
This MP project clearly gains inspiration from the unification of islands under
Subjacency, still, in my opinion, one of the great leaps forward in syntactic
understanding.
[3]
Defining government so that it could do all required of it in GB was a lively
activity in the 80s and 90s.
[4]
At least if we adopt the Predicate Internal Subject Hypothesis which assumes
that subjects of finite clauses move to Spec T from some lower predicate
internal base position in which the nominal’s theta role is determined. For
discussion see Hornstein et al. 2005.
[5]
This abstracts away from the issue of assignment versus checking, a distinction
I will ignore in what follows.
[6]
If we assume that structures are labeled and that labels are heads then (10)
has the structure in (10’) and we can say that the nominal merges with h0
in virtue of merging with a labeled projection of h. I personally believe that
this is the right view, however, this is not the place to go into these
matters.
(10’)
[h Nominal [h h0…
[7]
That case and movement should correlate is implicit in GB accounts as well.
Movement in raising and passive constructions is “for” case. If movement is
impossible, the case filter will be violated. However, the logic of the GB
account based on government is that movement “for” case was the special case. The
core case licensing configuration did not require it. Chomsky’s 1993 insight is
that if one takes the movement fed licensing examples as indicative of the
underlying configuration a more unified theory of case licensing was possible.
Later MP approaches to case returned to the earlier GB conception, but, in my
view, at a significant cost. Later theory added to Merge an additional G
operation, AGREE. AGREE is a long distance operation between a probe and a c-commanded
goal. It is possible to unify case licensing configurations using AGREE.
However, one loses the correlation between movement and scope unless further
assumptions are pressed into service.
Why the shift from the
earlier account? I am not sure. So far as I can tell, the first reason was
Chomsky’s unhappiness with Spec-X0 relations (Chomsky took these to
be suspect in a way that head-complement relations are not (I have no idea
why)) and became more suspicious in a label free syntax. If labels are not
syntactically active, then there isn’t a local relation between a moved nominal
and a case licensing head in a Spec-head configuration. So, if you don’t like
labels, you won’t like unifying case under Spec-head. Or, to put this more
positively (I am after all a pretty positive fellow), if you are ok with labels
(I love them) then you will find obvious attractions in the Spec-head theory.
[8]
As is also the case for AGREE based conceptions, see previous note.
[9]
This reprises the analysis in Lasnik and Saito 1991, which is in turn based on
data from Postal. For a more elaborate discussion with further binding data see
Hornstein et al. 2005, pp. 133ff.
[10]
Analogous data for the internal argument obtain as well:
(i)
John criticized the men during each other’s
trials
I leave unpacking the derivations as an exercise.
[11]
Usually via a dedicated diacritic feature (e.g. null case) but sometimes even
less elegantly.
[12]
I say “typically” for A-movement is not always so restricted and it appears
that in these Gs neither is control. See Boeckx et al. chapter 4 for
discussion.
[13]
Again, space prohibits developing the argument in full detail. The interested
reader should consult the Boeckx et al. presentation.
[14]
My own view is that this is probably a reflex of case theory. See Haddad and
Potsdam for a proposal along these
lines.
[15]
We will soon see that in some languages many copies can be retained, but let’s
put this aside for the moment.
[16]
As Haddad and Potsdam note, there are actually four possibilities: The higher copy
is retained, the lower, either the higher or lower, or both. Haddad and Potsdam
provide evidence that all four possibilities are in fact realized, a fact that
provides further support for treating Control as living on I-Merge generated
A-chains plus some deletion process for copies.
[17]
For discussion, see Boeckx et al. and the review in Haddad and Potsdam.
[18]
Observe, for example, that control is still a chain relation linking two theta
positions, the embedded one being the subject of a non-finite clause.
[19]
There are many other properties of control constructions that an I-Merge
account explains (e.g. the Principle of Minimal Distance). For the curious,
this is reviewed in Boeckx et al.
[20]
This partly resurrects the old Lees-Klima theory of reflexivization, but
without many of the problems. For discussion see Lidz and Idsardi 1998 and
Hornstein 2001.
[21]
See Boeckx et al. 2008 and references therein for discussion.
[22]
This proposal also predicts that backwards reflexive constructions should be
possible, and indeed, Polinsky and Potsdam (2002) argue that these exist in
Tsez.
The argument based on contraction effects in (18) doesn't seem very compelling. If we were already convinced that control arose from movement of some kind, then the fact that you can "contract across a PRO" would tell us that it's A-movement rather than A-bar-movement. But I think most would say that the natural interpretation of (18b) and (18c) is that A-bar-movement blocks contraction, not that A-movement creates the possibility of contraction. (Admittedly, it might be hard to tell the difference.) From there it seems more like we should say that this pattern is "consistent with" the claim that PRO is the residue of A-movement, rather than saying that it supports it. Am I missing a step?
I agree. But it also argues against an EC that is case marked, as wh-traces are. The latter block contraction. As you know, one popular theory for the distribution of PRO reduces it to case theory. This is an argument that this is wrong. Moreover, if we agree that "PRO" is an ec then IF it is an A-trace (rather than base generated or A'-trace or case marked) then we expect it to function like other A-traces wrt contraction. It does. That was the only point I was trying to make.
@Norbert: aren't you presupposing the explanandum here? The contraction facts* only argue against an EC that is case marked on the assumption that lack of case on the A-trace in (18b) is the relevant factor that distinguishes it from the A-bar trace in (18c). But that is rather premature, don't you think? For one thing, case doesn't properly distinguish A-movement from A-bar movement in the first place. There is A-movement out of case-marked positions and A-bar movement feeding case assignment (see, e.g., one of the papers in the Syntactic Structures anniversary volume you just co-edited). Second, even if it were the case that case reliably distinguished A- from A-bar movement (again, it doesn't), it still wouldn't follow that that is the relevant distinction between (18b) and (18c). It could very well be that the relevant factor is whether displacement happens before or after certain cyclic spellout nodes (an idea that goes back to Bresnan in the early 70s). So, A-bar displacement would happen too late to redo the prosodic organization in the embedded clause; A-movement happens before that prosodic structure is finalized. This would not distinguish the PRO and A-trace theories of control at all, since PRO, being a base-generated EC, is as "early" as anything else in the derivation; the odd man out is A-bar movement, that is too late and cannot redo the pre-displacement, non-contracted prosody. So I'm with Tim: these contraction facts* don't really bear on the distinction between the PRO and A-movement theories of control, as far as I can tell.
* – I seem to remember it being a settled matter that accounting for the distinction between (18a/b) and (18c) in terms of the nature of the empty category is wrong; and that it had been considered a settled matter that the difference lies in larger sentence-level prosodic organization. (Though I can't, off the top of my head, remember the relevant citations.) If so, the whole discussion is moot – though that would be just one more path to the conclusion that this does not bear on theories of control.
Yes it does presuppose this, but not fatally I don't think. The argument is that we have clear cases of A'-movement of the standard variety blocking contraction and we have clear cases where A-movement does not. What now to do with PRO? Is it more like an A or an A' trace? Well, were it an A-trace then we would expect to see contraction across it, which we do. Were it LIKE an A'-trace we would not. Ok, how are A-traces different from A'-traces? Well, in simple cases, the difference is that the latter is from a case position while the former is from a non-case position. So, IF we take PRO's distribution to follow from case, then it is reasonable to expect it to pattern like an A'-trace. This does not follow apodictically but it makes sense to so group them. If we group PRO with A-traces then we can ask on what basis we are doing so. Here's one answer: they ARE A-traces. Here's another: for some reason despite their not being A-traces they act like they were. I like the first answer.
That said, I think these sandhi arguments are not super strong, they just follow the expected logic and so are there for free grabbing.
Last point: you are quite right: if the contraction data (see Anderson on this among others) does not exploit the trace apparatus at all, then this argument is moot. I am assuming that something like the old analysis has some merit. If this is false, then the whole thing is a red herring.
Let me add one more thing: the idea that PRO based theories and MTC are on a methodological par is one that I would argue against. There is no coherent account of PRO in contemporary MP accounts except for the MTC (IMO). I have discussed this before in other venues, but the main observation is that given something like bare phrase structure there is no room for PRO anymore, except as a lexical primitive. But as a lexical primitive it cannot possibly explain anything. Every theory of control in GG has treated it as a G internal expression so as to derive the properties of control configurations from G first principles. Treating PRO as a lexical primitive blocks this line of argument and, for that reason alone, strikes me as a very bad idea. In that context, there is only really one option: it is either an A or A' trace. And given those options...