The intricacies of A’-syntax are among the glories of GB.[1] The unification of Ross’s islands in terms of subjacency and the discovery of ECP dependencies (especially the adjunct/argument distinction), coupled with wide-ranging investigations of these effects in a large variety of languages, marked a high point in Generative Grammar. This all changed with the Minimalist (M) “Revolution” (yes, these are scare quotes). Thereafter, island and ECP effects mostly fell off the hot-topics list (compare post-M work with that done in the 80s and early 90s, when it seemed that every other paper/book was about A’-dependencies and their island/ECP restrictions). Moreover, though early M was chock full of discussions of Superiority, an A’-effect, it was mainly theoretically interesting for the light it threw on Minimality and Shortest Move/Attract rather than for how it bore on islands or the ECP.
Indeed, from where I sit, the bulk of the interesting work within M has been on A- rather than A’-dependencies.[2] Moreover, whereas there has been interesting research aiming to unify various grammatical modules, Subjacency and the ECP have resisted theoretical integration, at least in any interesting version. It is possible, indeed easy, to translate bounding theory or barriers into phase terminology.[3] However, nothing particularly insightful is gained in doing this. It is also possible to unify islands with Minimality, given the right use of features placed in appropriate edge positions, but IMO little has been gained to date in so proceeding. So island and ECP effects, once the pride of theoretical syntax, have become a backwater, and a slightly embarrassing one, for three related reasons.
First, though it is pretty easy to translate Subjacency (viz. bounding theory) into phase terms, the translation simply duplicates the peccadillos of the earlier approaches: we stipulated bounding nodes, we now stipulate (strong) phases; we stipulated escape hatches (C yes, D no), we now stipulate phase edges (both which phases have any to use and how many they have).
Second, ad hoc as this is, it is benign compared to the problems the ECP throws up. For example, the ECP is conceptually a trace-licensing requirement. Where does this leave us when we replace traces with copies, as M does? Do copies need licensing? Why, if they are simply different occurrences of a single expression? Moreover, how do we code the difference between adjuncts and arguments? What makes the former so restricted when compared to the latter?
Last, the obvious redundancy between Subjacency and the ECP raises serious M questions. Both involve the same island-like configurations, yet they are entirely different licensing conditions. Talk about redundancy! One of Subjacency or the ECP is bad enough, but both? Argh!!
So, A’-syntax raises M issues, and a natural hope is to dispose of these problems by placing them in someone else’s trash bin. And there have been several attempts to do just this, e.g. by Kluender & Kutas, Sag & Hofmeister, and Hawkins, among others. The idea has been to treat island effects as a reflection of processing complexity, the latter arising when parsers try to relate elements outside an island (fillers) to positions (gaps)
within an island. It is well known that
filler/gap dependencies impose a memory/storage cost as the process of relating
a filler to a gap requires keeping the filler “live” until it’s discharged in
the appropriate position. Interestingly, there is independent psycho-ling
evidence that the cost of keeping elements active can depend on the details of
the parse quite independently of whether islands are involved (e.g. beginnings
of finite clauses induce load, as does the parsing of definites).[4]
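To make the storage-cost idea concrete, here is a minimal sketch in Python (my own toy illustration, not a model from the processing literature): a wh-filler is stored when encountered and held “live” until a gap discharges it, and the number of words parsed while it is held serves as a crude proxy for memory load.

```python
# Toy active-filler bookkeeping (illustrative only, not a published model).
def filler_gap_cost(tagged_words):
    """tagged_words: list of (token, tag) pairs, tag in {'filler', 'gap', ''}."""
    active_filler = None
    cost = 0
    for token, tag in tagged_words:
        if tag == "filler":
            active_filler = token      # store the filler
        elif tag == "gap" and active_filler is not None:
            active_filler = None       # discharge the filler at the gap
        elif active_filler is not None:
            cost += 1                  # one more word with the filler held live
    return cost

# "What did you say that John read __?"
sentence = [("what", "filler"), ("did", ""), ("you", ""), ("say", ""),
            ("that", ""), ("John", ""), ("read", ""), ("__", "gap")]
print(filler_gap_cost(sentence))  # -> 6 words parsed with the filler live
```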
Island effects, on this view, are just the sum total of these island-independent
processing costs. In effect, Islands are just structures where these other costly
independently manifested requirements converge. If true, this idea could, with some work, let M off the island
hook.[5]
Wouldn’t that be nice?
It would be, but I personally doubt that this strategy will work out. The main problem is that it seems very hard to explain the unacceptability profiles of island effects in processing terms. A recent volume, Experimental Syntax and Island Effects (of which I am co-editor, though Jon Sprouse did all the really heavy lifting and deserves all the credit), reviews the basic issues. The main take-home message is that, when considered in detail, the cited complexity inducers (e.g. definiteness) do not eliminate the structural contributions of islands to perceived acceptability, though they can modulate them (viz. the super-additive effects of islands remain even if the severity of the unacceptability can be manipulated).
Many of the papers in the volume address these issues in detail (see especially those by Jon Sprouse, Matt Wagers, and Colin Phillips). The book also contains good representatives of the processing “complexity” alternative, and the interested reader is encouraged to take a look at the papers (WARNING: being a co-editor forbids me, in good conscience, from advocating purchase, but I believe that many would consider this book a perfect holiday gift even for those with no interest in the relevant intellectual issues, e.g. it’s really heavy and would make a perfect paperweight or doorstop).
A nice companion piece to the papers in the above volume, one that I recently read, seconds the conclusion that island effects have a structural source. The paper (here) is by Yoshida, Kazanina, Pablos and Sturt (YKPS), and it explores the problem in a very clever way. Here’s a quick review.
YKPS start from the assumption that if the problem is one of the processing complexity of islands, then any dependency into an island that is computed online (as filler/gap dependencies are) should show island-like properties, even if the dependency is not a product of movement. They identify forward cataphora (e.g. his₁ managers revealed that [island the studio that notified Jeffrey Stewart₁ about the new film] selected a novel for the script) as one such dependency. YKPS show that the indicated referential dependency is computed online just as filler/gap dependencies are (both are very greedy in fixing the dependency). However, in contrast to movement dependencies, pronoun resolution in forward cataphora does not exhibit island effects. The argument is easy to follow and the conclusion strikes me as pretty solid, but read it and judge for yourself. What I liked about it is that it is a classic example of a typical linguistic argument form: YKPS identify a dog that doesn’t bark. If parsing complexity is the relevant variable, then it needs to explain both why some dependencies exhibit island effects and, just as importantly, why some do not. In other words, negative data counts! The absence of island effects is as much a datum as their presence, though it is often ignored.[6]
As YKPS put it:
Complexity accounts, which
attribute island effects to the effect of processing complexity of the online
dependency formation process, need to explain why the same complexity does not affect (my emphasis, NH) the
formation of cataphoric dependencies. (17)
So, it seems to me that islands are here to stay, even if
their presence in UG embarrasses minimalists.
Three points and I end. First, the argument that YKPS present is another nice example of how psycho-techniques can be used to advance syntactic ends. How so? Well, it is critical to YKPS’s point that forward cataphora involves the same kind of processing strategy (active filler) as the regular filler/gap dependencies one finds in movement, despite the two dependencies being entirely different grammatically. This is what makes it possible to compare the two kinds of processes and to conclude from their different behavior wrt islands that structural effects cannot be reduced to parsing complexity (a prima facie very reasonable hypothesis, and one that might even be nice were it true!).[7]
Second, the complexity theory of islands pertains to Subjacency effects. The far harder problem, as I mentioned earlier, involves ECP effects. Indeed, were Subjacency effects reduced to complexity effects, the presence of ECP effects in the very same configurations would become even more puzzling, at least to me. At any rate, both problems remain, and both await a decent M analysis.
Third, let me end with some personal intellectual history. I taught a course on the old GB A’ material with Howard Lasnik this semester (a great experience, thx Howard) and have become pretty convinced that finding a way simply to recapitulate ECP and island effects in M terms is by no means trivial. To see this, I invite you to try to translate the GB theories into an M-acceptable idiom. Even this is pretty hard to do, and a simple translation still leaves one short of an M-acceptable account. Conclusion? This is still a really juicy research topic for the unificationally inclined, i.e. a great Minimalist research topic.
[1] I take GB to be the logical culmination of work that first developed as the Extended Standard Theory. Moreover, I here, again, take GB to be one of several kissing cousins, such as GPSG, LFG, and HPSG.
[2] This is a bird’s-eye evaluation, and there are notable exceptions to this coarse generalization. Here is one very conspicuous exception: how ellipsis obviates island effects. Lasnik and Merchant have turned this into a small but very productive industry. The main theoretical effect has been to make us reconsider what makes an island islandy. The ellipsis effects have revived an interpretation with some roots in Ross: it is not the illicit dependency itself that matters but its phonological realization. Islands, on this view, are PF rather than syntactic effects. At any rate, this is really interesting stuff, which has led us to understand island effects in new ways.
[3] At least if one allows D to be a phase, something that some (e.g. Chomsky) have only grudgingly accepted.
[4] Rick Lewis has some nice models of this based on empirical work by Gibson.
[5] Of course, more work needs doing. For example, one needs to explain why ellipsis obviates these processing effects (see note 2).
[6] Note 4 indicates another bit of negative data that needs explanation on the complexity account. One might think, for example, that having to infer structure would add to complexity and thus increase the unacceptability of island violations, contrary to what we in fact find.
[7] Very reasonable indeed, as witnessed by Chomsky’s extensive efforts in “On Wh-Movement” to argue against the supposition that island effects are simple complexity effects.
I am actually very fond of deriving island constraints from processing effects. Not so much the proposals that are currently out there, but the basic idea.
The Specifier Island Constraint, for example, does lower parsing complexity, because you never have to guess whether a mover came from within a specifier or a complement. From this perspective it also makes sense that there is nothing like a Complement Island Constraint: sometimes you do not have any specifier at all, so things must have come from inside the complement. Basically, if you want to cut down on the number of decision points, ignoring specifiers can work without exception; ignoring complements cannot.
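To illustrate the decision-point arithmetic, here is a toy sketch in Python (my own illustration with made-up tree labels, not Thomas's actual model): pruning specifier subtrees shrinks the set of candidate gap sites at every node, whereas an analogous complement-pruning rule would sometimes leave no candidates at all.

```python
# Toy count of candidate gap sites (illustrative only; labels are made up).
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    label: str
    specifier: Optional["Node"] = None
    complement: Optional["Node"] = None

def gap_sites(node: Optional[Node], skip_specifiers: bool = False) -> List[str]:
    """All positions a parser would have to consider as a mover's origin."""
    if node is None:
        return []
    sites = [node.label]
    if not skip_specifiers:
        sites += gap_sites(node.specifier, skip_specifiers)
    sites += gap_sites(node.complement, skip_specifiers)
    return sites

tree = Node("vP",
            specifier=Node("DP", complement=Node("NP")),
            complement=Node("VP", complement=Node("CP")))

print(gap_sites(tree))                        # ['vP', 'DP', 'NP', 'VP', 'CP']
print(gap_sites(tree, skip_specifiers=True))  # ['vP', 'VP', 'CP']
```

Note that the VP in this toy tree projects no specifier, so a rule that ignored complements instead would sometimes leave the parser with nothing to search, which mirrors the point about why there is no Complement Island Constraint.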
Another curious point is that the processing accounts of islands assume that the constraints are not part of the grammar (or maybe I've read a very skewed sampling so far). This does not follow. Processing reasons could just as well nudge the learner in a direction where they prefer grammars with island constraints that reduce processing complexity.
I like some of these too. My favorite, surprise surprise, is the Berwick and Weinberg proposal that subjacency relates to efficient parsing with an LR(k) parser. We also know (Colin Phillips’s work shows this, some of which is reprised in the volume mentioned above) that island effects have online processing effects, in that filled-gap effects do NOT occur within islands unless grammatically licensed via parasitic gaps. This shows a tight connection between parsing and grammar. I like these results, though I am not sure they indicate that grammatical constraints are DERIVED from parsing constraints; still, there is an important functional relation here, and it is interesting.
The islands-as-grammar agnostics take a more extreme (and so more interesting) view: that wrt islands there is no structure-specific grammar at all. Or more exactly, what one finds in islands is what one finds more generally in parsing non-islands, and island effects (including specifier islands, I would assume) result from the confluence of effects piling up at island boundaries. This is a non-structural view of islands, in that there is nothing particularly special about these structures except that many things pile up there. This is what Sprouse and others argue against, and their argument, if correct, has grammatical implications, for it implies that islands are structural conditions, sensitive to structure that islands have and other sentences don’t. I suspect that you know all of this, but I thought it would be nice to take advantage of your comment to reiterate it all.
To be clear, there is indeed much evidence that real-time comprehension processes respect island constraints. But that certainly does not motivate a reductionist account of islands. Nor does it provide any evidence for the notion that things like subjacency constraints are in some way a consequence of efficient parsing needs. In just the same way, if I demonstrate that comprehenders are very sensitive to subject-verb agreement, you would not (I hope) conclude that subject-verb agreement is either an epiphenomenon of parsing or motivated by efficient parsing. It bears emphasizing, because I've been surprised at how often people assume that if we have some findings about X in language processing then we must be pursuing a reductionist account of X. The relevant work is summarized in the Sprouse/Hornstein volume in a section that is specifically dedicated to findings that do NOT bear on the reductionist question.
The distinction that Norbert and Thomas are making is one that we have referred to as “reductionist” vs. “grounded” accounts of islands. They are very different claims, though they are sometimes mistaken for one another. Reductionist accounts of islands claim that the effects are really epiphenomena. These are quite testable accounts. Grounded accounts claim that the effects are things-that-it-would-be-nice-for-a-grammar-to-have. Those accounts are very hard indeed to test.
“if I demonstrate that comprehenders are very sensitive to subject-verb agreement, you would not (I hope) conclude that subject-verb agreement is either an epiphenomenon of parsing or motivated by efficient parsing”
Indeed I wouldn't, but not for the reason you are suggesting. For the argument that I sketched above, the crucial issue isn't processing sensitivity but parser performance. Those are two very different things (if you buy into my idiosyncratic terminology). Processing sensitivity is an empirical question and hence subject to experimental investigation. Parser performance is a theoretical issue: given a parsing model, how does this model's performance scale with respect to grammar size, the choice of constraints and movement types, etc.? This is the realm of proofs and theorems. So if you can prove that some island constraints improve parsing performance, you've got a nice argument for why we find island constraints even if they're not part of UG.
Of course the argument hinges on some assumptions that are subject to empirical investigation: 1) Is your parsing model a good representation of the parser(s) humans use? 2) Is your measure of performance relevant for the problems the human parser has to solve, and does it correlate with perceived sentence difficulty? Still, those aren't problems with the validity of the argument as such but rather with the applicability of the models and definitions being employed.
[A quick side remark to illustrate 2): Berwick and Weinberg have pointed out some real issues with the CS notion of parsing performance, which gives string length a lot more weight than grammar size --- if you're interested in parsing programs with thousands of lines of code, that is indeed the right perspective to take, but not so much for natural language sentences, which hardly ever have more than 30 words.]
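For a concrete instance of the asymmetry (a standard textbook result about CKY recognition for context-free grammars in Chomsky normal form, not anything specific to Berwick and Weinberg's own discussion), the running time is

\[
T_{\mathrm{CKY}}(n, G) \in O\!\left(|G| \cdot n^{3}\right),
\]

where n is the string length and |G| the grammar size. CS practice typically treats |G| as a constant and reports O(n^3); with sentences capped at roughly 30 words but grammars of realistic size, it is arguably the |G| factor that matters for the human case.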
Anyways, you guys have gotten me really interested in the book, so I'll get myself a copy as an early Xmas gift.
I agree, Thomas. Thanks. I was responding to Norbert's (partially retracted) suggestion that some of our experiments had shown something that I do not think they show.
OIC. Without proper threading it's sometimes easy to misconstrue comments --- it would be really nice if Blogger would finally lift the arbitrary restriction to two levels of embedding. But it's good to know that you agree; it suggests that my view is not completely bonkers ;)
DeleteThere seems to be a fairly clear split between island effects and the ECP in terms of how easy it is to ground these in parsing considerations. As Thomas and Norbert have pointed out, there are lots of "minimal search" type reasons why parsing might be aided by limits on possible dependency paths. On the other hand, the adjunct/argument asymmetry seems less amenable to an explanation in these terms. This is particularly so because it appears to be independent of the distinction between A'-movements which leave an argument structure gap and those which don't (which is the distinction you might expect the parser to be interested in). This is what I take the following paradigm to show. Sentence (1) is ambiguous, depending on whether how many books binds a variable in the object position or (I guess) a variable within the object position. The ambiguity disappears in (2), presumably because the ECP views extraction of how many books as argument extraction in (a) and adjunct extraction in (b). The embedded question blocks adjunct extraction but not argument extraction. From a parsing point of view all of these cases should look like argument extraction, since read is missing an object.
(1)
How many books did you say that John read?
a. For how many books x did you say that John read x?
b. For which n did you say that John read n books?
(2)
How many books did you ask John why he read?
a. ?For how many books x did you ask John why he read x?
b. *For which n did you ask John why he read n books?
That's an interesting point, Alex. I'm fond of that paradigm, but hadn't thought about it in the context of this particular debate before. The one reservation that I'd have about your argument is whether the real-time interpretive mechanism really does treat both types of extraction as equivalent. If its goal is purely syntactic ("find me a gap"), then your argument is quite right. But if the interpretive difference is adopted from the start of the sentence, then the argument is less straightforward.
It pays to consider another version of the same effect. In sentences like (1) it is possible to get a pair/list answer (e.g. John a pizza, Sue a pie, Sam a salami):
Delete(1) What did you say that everyone brought
But this is not possible once there is a wh-island. In this case the only available reading is a singleton, e.g. a dessert:
(2) What did you ask if everyone brought
So it seems that the "dependent" reading is blocked into wh-islands while the "wide scope" reading is permitted.
A question: what evidence do we have that filler/gap dependencies look for semantically specific gaps of this variety, as you suggested to Alex?
Interesting. I'm not sure that we have any evidence on the grain size of the interpretations that people attribute to wh-phrases upon first encounter. Your paradigm is interestingly different. In Alex's paradigm, one could claim that there's an interpretive ambiguity that is already apparent at the "how many" phrase, and so a comprehender could make a commitment right away. But in your examples that seems less plausible, as the interpretive difference only becomes relevant once the universal quantifier is reached.