This continues the saga from the previous post.
The LK theory had a good run. However, two particularly thorny
problems strongly urged revision. Here
we briefly outline the main problems with LK approaches. We then discuss the
properties of the GB Binding Theory (BT) that came to replace it, with
particular emphasis on how it addressed these problems.
There are two problems with LK approaches. The first is how to
analyze sentences like (5a). On analogy with examples like John kissed himself (derived from underlying John kissed John), they should have underlying structures like
(5b). However, (5a) does not mean what (5b) does, and this serves to gut the
basic LK approach to reflexivization.[1]
Analogous examples in (6) argue against the LK analysis of pronominalization.
(5) a. Everyone kissed himself
    b. Everyone kissed everyone
(6) a. Everyone thinks that he is tall
    b. Everyone thinks that everyone is tall
It is worth considering how these examples pose a problem
for the LK analysis. First, given LK’s background assumptions, it appears that
the anaphoric morphemes are semantically significant. Specifically, it is plausible that (5a)/(6a)
differ in meaning from (5b)/(6b) precisely because the semantic contributions
of himself and he are different from that of everyone. Adverting to the semantic contributions of
the specific anaphors requires reneging on the assumption that binding
dependencies are morpheme blind, and suggests that a morpheme centered analysis
of binding is the right way to proceed. In other words, rather than being
grammatical afterthoughts, the licensing requirements of anaphoric morphemes
drive the grammar. Reflexives and bound pronouns are elements that require
grammatical validation and grammatical processes cater to their licensing requirements.
Second, once binding theory becomes morpheme centered,
economy is no longer required (or desired). Reflexivization and
Pronominalization must be ordered with (1) before (2) because Pronominalization
applies to the same inputs (aka Structural Descriptions (SD)) as
Reflexivization. Were they unordered, the grammar would generate illicit bound
pronoun constructions. However, ordering
(1) before (2) is nugatory if they have different SDs. Requiring reference to
specific reflexive and pronoun morphemes in the statement of the operations
(as the data in (5) and (6) suggest is necessary) automatically results in
rules with different SDs, and this obviates strictly ordering
reflexivization before pronominalization. In other words, once binding theory
aims to license reflexive morphemes
and bound pronoun morphemes, the
principles that do so can apply without reference to the applicability of other
rules, i.e. the binding principles can be stated unconditionally rather than
as conditions whose applicability is relative to that of others.
In addition to the problems posed by (5) and (6), there is a
second problem with the LK account, forcefully noted in Lasnik 1976. Say it is
correct that rules like (1) and (2) prevent bound pronouns from appearing where
reflexives do. This still leaves open
the possibility that referential pronouns might appear with the requisite
interpretation. Recall that LK does not regulate (co-)reference, merely
binding. So what is wrong with (7a), where the pronoun is not the product of
any grammatical process? This kind of
base generated deictic pronoun is what we find in sentences like (7b). So what
prevents generating such a pronoun in (7a) with the same reference as John? As Lasnik 1976 notes, (7a) with
the co-referential interpretation of the pronoun is rather unacceptable and
this is left unexplained so long as the co-reference is not the result of
binding but is “accidental.” We can patch up the LK account by adding another disjointness rule, something to the
effect that pronouns cannot have local binders (along the lines of Principle B)
but, and this is the critical point, this patch seems to render the
Pronominalization rule superfluous as the anti-binding Principle B alone
suffices to account for the distribution of both bound and unbound pronouns. In
effect, pronouns can appear anywhere they are not prohibited.
(7) a. John loves him
    b. Mary saw him
Lasnik’s (1976) proposal has an interesting theoretical
effect: it reinterprets what the grammar tracks. For LK, the grammar licenses
antecedence and establishes antecedent-anaphor dependencies. However, the
disjointness rule does not establish antecedence but anti-antecedence. This
results in a somewhat schizophrenic grammatical theory of bound anaphora.[2]
Reflexives are treated more or less as in LK with the grammar requiring that a
reflexive have an appropriate local antecedent. In effect, all reflexives are
anaphoric and require antecedents and the grammar functions to ensure this
result. Pronouns, in contrast, do not
require antecedents and what the grammar does is ensure that if there is an antecedent it is not too
local. Importantly, this collapses bound
and non-bound pronouns into one class. Indeed, it’s this move that allows
Lasnik (1976) to address the puzzle he identified. For LK, grammars code for
anaphoric dependency. After Lasnik (1976), grammars regulate the distribution of
classes of morphemes regardless of interpretation, with some later extra-grammatical
process (we might now say interface processes) determining which
pronouns are understood as bound and which not.
So with this as background, let’s consider the general
features of the GB Binding Theory (BT).
It consists of three principles:
(8) A. An anaphor (e.g. a reflexive) must be bound in its domain.
    B. A pronoun cannot be bound (must be free) in its domain.
    C. An R-expression cannot be bound.
Domains have been variously characterized, but for the nonce
we can assume that the domain is the minimal (finite) clause containing the
anaphor/pronoun/R-expression. There are several things to observe about these
principles. [3]
First, they are morpheme centered. The inputs to the rules
are expressions that fall into one of three classes, anaphors (+a,-p),
pronominals (+p,-a) and R-expressions (-a,-p).[4]
Furthermore, the rules state licensing requirements for these expressions. In other words, differing binding
requirements restrict the distribution (and interpretation) of lexical items
with differing feature constitutions. Note that this effectively rejects the LK
distinction between lexical and grammatical formatives. Reflexives and bound pronouns are just as
lexical as cat and run, though their features call forth
grammatical licensing in ways that more “ordinary” lexical items do not.
Second, the principles are unconditional in the sense that
neither A nor B is ordered or needs to be ordered with respect to the
other. What accounts for the
complementarity of reflexives and pronouns is the fact that both elements meet
inverse licensing requirements in the same local domain. Where anaphors must be bound, pronouns must not be. There is no sense within
the standard GB version of BT that pronouns and anaphors or their licensing
requirements are in an economy relation or that one kind of dependency is
preferred to the other. Both apply where
they can and the presence or absence of either has no grammatical effect on the
other.
A thought experiment makes the contrast with LK evident.
Imagine that an epidemic struck, wiping out the rule of Reflexivization. In an
LK grammar, Pronominalization would then apply to yield sentences like John likes him with a bound
interpretation. Given the same epidemic in a GB grammar,
this time wiping out A, such sentences would still be prohibited, for
pronominals would still be subject to B.
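The thought experiment can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than a faithful formalization: the rule statements for (1) and (2) are simplified to a single transitive clause, the function names are made up for the occasion, and "coindexed" stands in for whatever the indexing algorithm delivers. The point is only to show that removing a rule changes the output of an ordered LK-style system, while removing a filter leaves the other GB-style filter untouched.

```python
from typing import Callable, List, Optional

# --- LK-style toy: ordered rewrite rules over an underlying clause ---------
# Assumption: both rules share one structural description (identical
# clause-mate NPs) and differ only in the morpheme they insert.
Rule = Callable[[str, str], Optional[str]]

def reflexivization(subject: str, obj: str) -> Optional[str]:
    """Toy rule (1): rewrite an identical clause-mate object as a reflexive."""
    return "himself" if subject == obj else None

def pronominalization(subject: str, obj: str) -> Optional[str]:
    """Toy rule (2): rewrite an identical object as a pronoun."""
    return "him" if subject == obj else None

def lk_derive(subject: str, verb: str, obj: str, rules: List[Rule]) -> str:
    """Apply the rules in order; the first one whose SD is met rewrites obj."""
    for rule in rules:
        rewritten = rule(subject, obj)
        if rewritten is not None:
            return f"{subject} {verb} {rewritten}"
    return f"{subject} {verb} {obj}"

# --- GB-style toy: unordered filters over an indexed surface form ----------
Principle = Callable[[str, bool], bool]

def principle_a(obj_form: str, coindexed: bool) -> bool:
    """A reflexive object must be coindexed with the local subject."""
    return obj_form != "himself" or coindexed

def principle_b(obj_form: str, coindexed: bool) -> bool:
    """A pronoun object must not be coindexed with the local subject."""
    return obj_form != "him" or not coindexed

def gb_licensed(obj_form: str, coindexed: bool, principles: List[Principle]) -> bool:
    """A form is licensed only if every active principle is satisfied."""
    return all(p(obj_form, coindexed) for p in principles)

if __name__ == "__main__":
    # Healthy LK grammar: Reflexivization bleeds Pronominalization.
    print(lk_derive("John", "likes", "John", [reflexivization, pronominalization]))
    # After the epidemic: Pronominalization now yields a bound pronoun.
    print(lk_derive("John", "likes", "John", [pronominalization]))
    # GB grammar without A: "John1 likes him1" is still excluded by B.
    print(gb_licensed("him", True, [principle_b]))
```

In the LK toy the order of the list does real work, since the two rules compete for the same input. In the GB toy the list is just a set of filters, and dropping one has no effect on what the other excludes.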
Third, the principles apply regardless of how the morphemes
are semantically interpreted. This is clearest in the case of B. The grammar
does not single out any particular antecedence relation. B states that pronouns
cannot appear in certain configurations; it does not specify or distinguish any
DP as antecedent. In fact, the
co-indexing that is part of the system has no univocal semantic interpretation.
This is what allows B to accommodate Lasnik’s (1976) worries. The indexing in
(8') violates B regardless of whether it is interpreted as binding or
co-reference.
(8') *John1 likes him1
This has the effect of enriching the interpretive systems,
as some indexations are interpreted as binding dependencies and some are not.
Which are which falls beyond the purview of the grammar despite the fact that
they have consequences for acceptability that do not seem particularly
“semantic.” Consider one example. Weak Crossover Effects (WCO) are restricted
to bound pronouns, e.g. ones where the antecedent is a quantified DP. Thus
there is a contrast in (9a,b): his can be interpreted as referring to John but not as
bound by everyone.
(9) a. his1 photos distressed John1
    b. *his1 photos distressed everyone1
The standard description of this is that WCO constrains
binding relations but not (co-) reference. The indexing in (9a) is interpreted
as co-reference while the one in (9b) cannot be so interpreted as everyone is not a referring
expression. However, as binding is
illicit in WCO configurations, the indexing in (9b) cannot receive a binding interpretation either. So
the indexing has no good interpretation and the structure is
unacceptable.[5]
If this is correct, then BT requires supplementation to distinguish those
indexings that will be interpreted as bindings from those that will not. Why?
Because the indexing above is purely syntactic and it tracks a mixed bag of
interpretive possibilities. This is just
another way of saying that BT per se
accounts for the distribution of certain morphemes, not the dependencies that
they enter into.
Another consequence of the GB re-invention of the binding
theory is Principle C. As noted above, LK had no corresponding principle.
Rather, for LK Principle C effects are by-products of how Reflexivization and
Pronominalization are stated. Once the
idea is adopted that reflexives and pronouns are real inputs to semantic interpretation
and not mere morphological dress-up, the LK strategy for dealing with Principle
C effects is no longer viable and an explicit statement is required.
As has long been noted, Principle C is somewhat odd. First,
it is not bounded in any way, unlike A and B, which apply within circumscribed
domains. Second, at least initially, it
was taken to apply to what on the surface appear to be entirely different kinds
of cases. The examples in (10) do not violate Principles A or B. Consequently, if these are to be excluded an
additional binding principle is required. Furthermore, whereas (10c,d)
plausibly pertain to the structural conditions imposed on antecedents and their
dependent anaphors, (10a,b) do not appear to involve anaphoric elements at
all. To collect all these cases under
the same principle it is critical to add another category of expressions,
R-expressions, to the inventory of elements regulated by the Binding Theory.[6] Like Principle B, Principle C does not
specify licit anaphoric dependencies but blocks illicit ones. It is a negative,
rather than a positive, rule of grammar.[7]
(10) a. *John1 likes John1
     b. *John thinks that John is tall
     c. *John1 expects himself1 to like John1
     d. *He1 thinks that John1 is tall
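To see how the three principles in (8) sort these examples, here is a minimal sketch that treats them as filters over indexed strings. It rests on loud simplifications that are mine, not the theory's: linear precedence stands in for c-command (so it only makes sense for simple right-branching clauses and says nothing about possessor cases like (9)), the domain is the minimal finite clause, and clauses are encoded as a bare embedding depth. The `DP` class and the `binders` and `violations` helpers are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DP:
    form: str
    kind: str    # "anaphor", "pronoun", or "r-expression"
    index: int   # syntactic index; carries no semantic interpretation here
    clause: int  # depth of the minimal finite clause holding the DP (0 = matrix)

def binders(sentence: List[DP], i: int) -> List[DP]:
    """Earlier coindexed DPs in the same or a higher clause.

    Assumption: precedence plus clause depth approximates c-command.
    """
    target = sentence[i]
    return [d for d in sentence[:i]
            if d.index == target.index and d.clause <= target.clause]

def violations(sentence: List[DP]) -> List[str]:
    """Return the names of the binding principles the indexing violates."""
    found = []
    for i, dp in enumerate(sentence):
        all_binders = binders(sentence, i)
        local = [b for b in all_binders if b.clause == dp.clause]
        if dp.kind == "anaphor" and not local:
            found.append("A")   # anaphor lacks a binder in its domain
        if dp.kind == "pronoun" and local:
            found.append("B")   # pronoun is bound in its domain
        if dp.kind == "r-expression" and all_binders:
            found.append("C")   # R-expression is bound somewhere
    return found

# *John1 likes him1
john_likes_him = [DP("John", "r-expression", 1, 0), DP("him", "pronoun", 1, 0)]
# (10a) *John1 likes John1
ex_10a = [DP("John", "r-expression", 1, 0), DP("John", "r-expression", 1, 0)]
# (10c) *John1 expects himself1 to like John1 (one finite clause throughout)
ex_10c = [DP("John", "r-expression", 1, 0), DP("himself", "anaphor", 1, 0),
          DP("John", "r-expression", 1, 0)]
# (10d) *He1 thinks that John1 is tall
ex_10d = [DP("He", "pronoun", 1, 0), DP("John", "r-expression", 1, 1)]
# (6a) Everyone1 thinks that he1 is tall
ex_6a = [DP("Everyone", "r-expression", 1, 0), DP("he", "pronoun", 1, 1)]

if __name__ == "__main__":
    for name, ex in [("John1 likes him1", john_likes_him), ("10a", ex_10a),
                     ("10c", ex_10c), ("10d", ex_10d), ("6a", ex_6a)]:
        print(name, violations(ex))
```

Note that the checker never asks whether an index is read as binding or as co-reference. That is the sense in which these filters regulate the distribution of morpheme classes rather than the dependencies they enter into.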
Ok, let’s take a step back now and consider these two
approaches. As should be evident, the theoretical intuitions behind BT are very
different than those behind LK. They
differ not only technologically (LK uses transformations, is derivational in
spirit and has morpheme rewrite rules while BT has indexing algorithms and is
representational in spirit and uses filters stated at various grammatical
levels) but in what they take the subject matter of binding to be. They differ along three critical dimensions:
1. Do binding principles have a natural
semantic interpretation?
2. Are binding principles economy principles
(or absolute)?
3. Are binding principles morpheme (or dependency)
centered?
LK answers yes to (1) and (2) and no to (3). GB answers no
to (1) and (2) and yes to (3). What of minimalist approaches? I don’t know, but
my hunch is that LK approaches have some features that are worth reconsidering
in an MP context. How so?
Well, first, the main problem with LK accounts noted above
disappears in the context of theories that distinguish copies due to I-merge from
those that result from multiple selections of the same expression from the
lexicon. The LK theory did not (and could not) distinguish these two
possibilities and so the fact that everyone
loves himself does not mean the same as everyone
loves everyone was sufficient to sink the LK approach. However, as you
know, this does not hold for MP accounts so
long as binding tracks movement. If it does, then we can distinguish the
two cases above syntactically.
As noted, this requires endorsing a movement theory of
binding. The main obstacle to this in earlier GG theories was D(eep) Structure.
Given DS, there could be no movement between “theta” positions and so the
technology MP provides to distinguish everyone1
loves everyone2 from everyone1 loves
everyone1 could not be applied. However, if we eschew DS (as MP stories do)
then it is in principle possible to
move between “theta” positions and so distinguish these two kinds of chains. We
can then restrict LK reasoning to the second kind of chain without empirical
hazard. So, the elimination of DS restrictions is a pre-requisite for
revivifying the LK approach, and this is precisely what MP theories allow.
Last, we must allow Gs that differentially spell out copies.
The LK theory, recall, treats the morphological differences between reflexives
and bound pronouns as syntactically very superficial. What counts are the
underlying chains/dependencies, not the morphemes that express them. Imo, this
is likely a good thing. Let me explain why.
The aim of MP is to explain why we have the UG principles we do. In other words, the aim is to
explain the structure of FL/UG. Morpheme centered Gs cannot support this
kind of project. Why not? Because they effectively stipulate G requirements by packing them into the idiosyncratic
feature make-ups of specific lexical items. Why must anaphors be locally bound?
Because they have features that require that they be locally bound. Why do they
have these features? Well, because they are reflexives and reflexives inherently
have such features by stipulation.
This explanation is circular in the worst sense: the circle is very very tight.
Note that such stipulations are particularly problematic
for those with minimalist fish to fry. They are not, for example, worrisome for Plato Problem kinds of issues. So,
if A-anaphors are innately part of any lexicon then their binding requirements
need not be learned.[8]
However, if one wants to go beyond
explanatory adequacy, then such stipulations stink. They defeat the MP project
from the get-go, as they stipulate what we want to have explained.[9] In
other words, from an MP perspective the problem with classical binding theory
is that it is based on a series of morphological stipulations concerning
specific lexemes, and stipulations like these prevent explanation. So, if your goal is to explain why the binding theory looks the way it
does, then you don’t want morpheme centered accounts of binding like the ones
we find in GB. Of course, this goal might be unattainable and morpheme based
accounts might be the best we can do, but…
Ok, basta! This has gone on far too long. Let me just
suggest that earlier GG theories had some properties that are worth
re-examining, most particularly the idea that some morphemes are just by-products of grammatical
processes rather than being the causal engines behind them. This is not a new
idea, but MP has given it, imo, a new lease on life and part of this lease
involves rethinking the idea that all formatives are created equal and have an
equal purchase in interpretation at the interfaces.
[1]
Recall that in LK accounts the pre-transformational phrase marker was the sole input to
semantic interpretation. However, even later approaches which allowed
information from several grammatical levels to contribute to semantic
interpretation would not have been able to incorporate an LK analysis which
sensibly captured the meaning of examples like (5a) and (6a).
[2]
This is curiously mirrored in Lasnik’s paper where the appendix deals with bound
pronouns and the body of the paper with co-reference.
[3]
There is a third reason for moving from an LK approach to a morpheme centered
one. In the mid 1970s there was a well motivated theoretical move to
dramatically simplify structural descriptions (SDs) and Structural Changes
(SCs). This made rules that included the insertion of specific morphemes less
natural/desirable. The main problem was that such morpho-phonological intruders
complicated the move to simple rules like Move
alpha anywhere. Flash forward 40 years and the analogue of LK rules finds a
natural home: it results from processes that spell out copies/occurrences. See
below.
[5]
Strong Crossover effects yield a similar problem. Variables are the
semantically quintessential anaphors.
They are not referring expressions and require binders. As such, one might think that they would
qualify as anaphoric (+a,-p) elements. In fact the earliest versions of trace
theory categorized wh-traces/variables as anaphors. However, so categorizing
variables leads to a big empirical problem. Sentences like (i) are wrongly
expected to be interpretable as (ii).
(i) Who1 does he1 think t1 is intelligent
(ii) Who1 t1 thinks he1 is intelligent
This conclusion can be finessed by cataloging residues
of movement, t1, as an
R-expression subject to principle C.
This works, but it also highlights the fact that ‘R’ does not mean
“referential” in the naïve sense of the term.
[6]
One of the authors suspects that the attractiveness of the GB theory of PRO was
in part the result of filling an available cell in the required feature matrix.
If anaphors are +a,-p and pronouns are -a,+p, and R-expressions are -a,-p then
there should be something that is +a,+p. PRO was the proposed missing
link.
[7]
There are some problems with this way of dealing with the data in (10). First,
it is not clear that (10a,b) are really as unacceptable as (10c,d). Indeed
there are languages where analogous sentences seem perfectly well formed and
express anaphoric dependencies (cf. Boeckx, Hornstein and Nunes 200x for some
discussion). Second, it is not particularly clear what an R-expression is.
Among the elements that fell under the category are traces (interpreted as
bound variables), names, definite descriptions, demonstratives etc. What all these expressions have in common
is unclear. Variables are prototypical anaphors. Names are prototypical
non-anaphors. Definite descriptions can
be used anaphorically or not and semantically and grammatically they share many
properties with pronouns. Nonetheless, they too are categorized as
R-expressions. The category seems to be
a catch-all with entry requirements to the fraternity driven entirely by empirical
necessity. This results in negligible explanatory force.
[8]
These kinds of theories, however, do require some non-trivial explication of
how morpho-phonological agreement implicates semantic dependency. That
two expressions have the same features need imply nothing about whether/how
these features have semantic significance.
[9]
Incidentally, this is why PRO based accounts of control should also be MP
suspect. Assuming PRO with its special licensing requirements allows one to track
control facts but does not explain why control exists.