This post will be pretty free form, involving more than a
little thinking out loud (aka rambling). It will maunder a bit and end pretty
inconclusively. If this sort of thing is not to your liking, here would be a
good place to stop.
I’ve recently read an interesting paper by Epstein, Kitahara and
Seely (EKS) (here) on a question that I’ve been thinking about off and on for
about a decade (sounds longer than 10 years, eh?). The question: to what degree are licit formal
dependencies of interacting expressions functions of the substantive
characteristics of the dependent elements? This is a mouthful of a sentence,
but the idea is pretty simple: we have lots of grammatical dependencies; how
much do they depend on the specific properties of the specific lexical/functional
items involved?[1] Let me give a couple of illustrations to clarify
what I’m trying to get at.
Take the original subjacency condition. It prohibited two
expressions from interacting if one is within an island and the other is
outside that island. So in (1) Y cannot move to X:
(1) […X…[island…Y…]…]
Now, we can list islands by name (e.g. CNPC, WH-island,
Subject Islands etc.) or we can try to unify them in some way. The first
unification (due to Chomsky) involved two parts: the first a specification of
how far is too far (at most one bounding node between X and Y), the
second an identification of the bounding nodes (BN) (DP and CP, optionally TP
and PP etc.). Now, the way I always understood things is that the first part of
the “definition” was formal (i.e. the same principle holds regardless of the BN
inventory), the second substantive (i.e. the attested dependencies depend on
the actual choice of BNs). Indeed, Rizzi’s famous paper (actually the one
limned in the footnotes, rather than the one in the text) was all about how to
model typological differences via small changes in the inventory of BNs for a
given grammar. So, the classical theory
of subjacency comprises a formal part that does not care about the actual
categories involved and a substantive part that cares a lot.
Later theories of islands cut things up a little
differently. So, for example, one intriguing feature of Barriers was its ambition to eliminate the substantive part of
subjacency theory. Rather than actually
listing the BNs, Barriers tried to
deduce the class of BNs from general formal properties of the phrase marker. Roughly speaking, complements are porous,
while non-complements are barriers.[2]
Complementation is itself an abstract formal dependency, largely independent of
the contents of the interacting expressions.
I say “largely independent” for in Barriers
it was critical that there be some form of L-marking that was itself dependent
on theta marking. However, the L-marking relation was very generic and applied
widely to many different kinds of expressions.
Cut to the present and phases: phases have returned to the
original conception of BNs. Of course we now call them phase heads rather than
BNs, and we include v as well as C
(an inheritance from Barriers) but
what is important is that we list
them.[3]
The grammar functions as it does because v
and C are points of transfer and they
are points of transfer because they are phase heads. Thus, if you are not a
phase head you are not a point of transfer. However, theoretically, you are a
phase head because you have been so listed. BTW, as you all know, unless D is included in this inventory, we
cannot code island effects in terms of phases.
And as you also all know, the phase-based account of islands is no more principled than the older
subjacency account.[4] However, this is not my topic here. All I want to observe is how substantive
assumptions interact with formal ones to determine the class of licit
dependencies and how some accounts have a “larger” substantive component than
others. I also want to register a minimalist observation (by no means original)
that the substantive assumption about
the inventory of Phases/BNs raises non-trivial minimalist queries: “why these?”
being the obvious one. [5]
Let’s contrast this case with Minimality. This, so far as I can tell, is a purely
formal restriction, even in its relativized form. It states that in a
configuration like (2), with X, Y, and Z of the same type (i.e. sharing the same
relevant features), Y cannot interact with X over an intervening Z. For (2) the
actual feature specifications do not matter. Whatever they are, minimality will block interaction in these
cases. This is what I mean by treating it as a purely formal condition.
(2) …X…Z…Y…
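To make the "purely formal" point concrete, here is a toy sketch (mine, not from the literature): the condition can be stated over feature sets without ever naming the features. The representation (Python sets of feature labels, and the labels "wh" and "neg" themselves) is entirely illustrative.

```python
# Toy sketch of relativized minimality for configuration (2) ...X...Z...Y...
# X (the probe), Y (the goal) and Z (the intervener) are represented only
# by their feature sets; the condition never inspects what the features are.

def blocked(x_feats, z_feats, y_feats):
    """Z blocks the X-Y relation iff Z shares the features relevant
    to that relation (the intersection of X's and Y's features)."""
    relevant = x_feats & y_feats
    return bool(relevant & z_feats)

# The labels below are placeholders; any labels behave the same way,
# which is what makes the condition purely formal.
assert blocked({"wh"}, {"wh"}, {"wh"})       # same-type intervener: blocked
assert not blocked({"wh"}, {"neg"}, {"wh"})  # different-type Z: no blocking
```

Swapping "wh" for any other label leaves the predictions unchanged, which is just the symmetry point made above.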
So, we now have two different examples. Let's get back to
the question posed in EKS: we all assume a universal base hypothesis with the
rough structure C-T-v-V; to what degree does this base hierarchical order
follow from formal principles? Note, the
base theory names the relevant heads
in their relevant hierarchical order, the question is to what degree do formal
principles force this order. EKS
discuss this and argue that, given certain current assumptions about phases, we
can derive the fact that theta domains are nested within case domains, and they
suggest that the same reasoning can apply to the upper C-T part of the base
structure. Like I said, the paper is interesting and I recommend it. However, I would like to ask EKS's question in
a slightly different way, stealing a trick from our friends in physics (recall,
I am deep green with physics envy).
Among the symmetries physicists study is one in
which different elements are swapped for one another. Thus, as Carroll (here) noted
concerning nuclear structure: “In 1954, Chen Ning Yang and Robert Mills came up
with the idea that this symmetry should be promoted to a local symmetry – i.e.,
that we should be allowed to “rotate” neutrons and protons into each other at
every point in space (154).” They did this to consider whether the strong force
really cared about the obvious differences between protons and neutrons. Let’s
try a similar trick within the C-T-v-V domain, this time “rotating” theta and
case markers into each other, to see whether the ordering of elements in the
base really affects what kinds of formal dependencies we find.
More specifically, consider the basic form of the sentence,
restricting attention to the dependencies within TP:
(3) [CP
C [TP …T…[vP Subj v [VP V Obj]]]]
In (3) Subj gets theta from v and case from T. Object gets theta
from V and case from v. So there is a T-Subj relation, a Subj-v relation, a
v-Obj relation, and a V-Obj relation.
Does it matter to these relations and further derivations that the specific features
noted are checked by the indicated heads? To get a handle on this, imagine that
we systematically swapped case and theta assignments above (i.e. rotated case and
theta into each other so that T assigns theta to Subj and v assigns case, v
assigns theta to Obj and V assigns case etc.): what would go wrong? If nothing
goes wrong, then the actual labels here make no difference, and the formal
properties do not determine the actual substantive order.
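The swap itself is trivial to state. Here is a minimal sketch (mine, not EKS's) of the rotation, with the T/v/V assignments from (3) written out as triples; the point is just that the operation exchanges the substantive labels while leaving the formal head-dependent relations untouched.

```python
# Toy sketch of the case/theta "rotation" over the assignments in (3).
# Each triple is (assigning head, feature assigned, dependent); the triples
# and labels are illustrative shorthand, not a serious formalization.

standard = [
    ("T", "case", "Subj"),   # T assigns case to the subject
    ("v", "theta", "Subj"),  # v assigns the external theta role
    ("v", "case", "Obj"),    # v assigns case to the object
    ("V", "theta", "Obj"),   # V assigns the internal theta role
]

def rotate(assignments):
    """Swap case and theta while keeping the head-dependent pairs fixed."""
    swap = {"case": "theta", "theta": "case"}
    return [(head, swap[feat], dep) for head, feat, dep in assignments]

rotated = rotate(standard)
# After rotation T assigns theta, v assigns case to Subj and theta to Obj,
# and V assigns case; the formal configuration is unchanged, and rotating
# twice restores the original system.
assert rotate(rotated) == standard
```

Whether the rotated system converges is of course the empirical question; the code only shows that the formal relations (which head relates to which dependent) survive the swap intact.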
To sound puffed up and super scientific we might say that
the formal properties are symmetric wrt the substantive features of case and
theta assignment. Note, btw, we already think this way for theta and case values. The grammatical operations are
symmetric with respect to these (i.e. they don’t care what the actual theta
role or case value is). We are just
extending this reasoning one step further by asking about assignment as well as
values.
Observe that things can go "wrong" in various ways: we could
get lots of decent-looking derivations honoring the formal restrictions, but the
derivations either under- or over-generate.
For example, if T assigns the external theta role, then transitive small
clauses might be impossible if small clauses have no structure higher than v. This seems false. Or, if this is right, then
we might expect expletives to always sit under neg in English, as they cannot
move to Spec T, this being a theta position. Again, this seems wrong. So, there
seem to be, at least at first blush, empirical consequences of making this
rotation. However, the "look" of the
system is not that different if this is the only
kind of problem, i.e. the resulting system is language-like even if not exactly identical
to what we actually find. In other words, it's a possible UG, just not our UG. There is clearly a minimalist
question lurking here.
A second way things could go wrong is that we do not in
general get convergent derivations. EKS argue that certain phase-based accounts
have this more expansive consequence. The problem is not a little over/under
generation, the problem is that we can barely get a decent derivation at all. In
our little thought experiment this means that rotating the case and theta
assignments results in impossible UGs. This would
be a fascinating result, with obvious minimalist interpretations.
Both kinds of "problems" are interesting. The first shows
that our UG deeply cares about the substantive heads having the specific
properties they do. The second suggests that there is a very strong tie between
the basic structure of the clause and the formal universals we have.
I have no worked out answer to the ‘what goes wrong?’
question (though if you get one I would love to hear about it). Note that I
have abstracted away from everything but what is assumed to be syntactically
relevant: case and theta "features." I have also assumed that how these
features are assigned is symmetrical: that both are assigned in the same way.
If this is false, then this might be the source of the substantive base order
noted (e.g. if theta were assigned only under merge while case could be assigned under agree). However,
right now I am satisfied to leave my version of the EKS question open.
Let me end with two further random observations:
First, whatever answer we find, I find the question to be
really cool and novel and exciting! We may need to stipulate all sorts of things, but until we start
asking the kinds of questions EKS pose, we won't know what is principled and
what is not.
Second, as noted, the answer to this question will have
significant implications for the Minimalist Program (MP). To date, many minimalists (e.g. me) have
concentrated on trying to unify the various dependencies as a prelude to
explaining their properties in generic cognitive/computational terms. However, if the C-T-v-V base structure is part of
FL/UG then it presents a different kind of challenge to MP, given the apparent
linguistic specificity of the universals.
Few notions are more linguistically parochial than BN or phase head or
C-T-v-V. It would be nice if some of this
followed from architectural features of the grammar as EKS suggest, or from the
demands of the interface (e.g. think Heim’s tri-partite semantic structures),
or something else still. The real challenge to MP arises if these kinds of
substantive universals are brute. At any rate, it seems to me that substantive
universals present a different kind of challenge to MP and so they are worth
thinking about very carefully.
That’s enough rambling.
[1]
I know, all the rage lately has been to pack all interesting grammatical
properties into functional heads, specific lexical content being restricted to
roots. The question still arises: do we need to know the special properties of
the involved heads, be they functional or not, to get the right class of
grammatical dependencies?
[2]
This idea was not original to
Barriers. Cattell and Cinque, I believe,
had a similar theoretical intuition earlier.
[3]
I think that we can all agree that Chomsky's attempt to relate the selection of
C and v to semantic properties has not been a singular success. Moreover, his
rationalization leaves out D, without which extending phases to cover island
effects is impossible. If phases do not explain island phenomena, then their utility
is somewhat circumscribed, given the view over the last 30 years that cyclicity
and islandhood are tightly connected. Indeed, one might say that the central
empirical prediction of the subjacency account was successive cyclic movement.
All the other stuff was there to code Ross’s observations. Successive cyclicity
was a novel (and verified) consequence.
[4]
Boeckx and Grohmann (2007) have gone through this in detail and so far as I can
tell, the theoretical landscape has stayed pretty much the same.
[5]
There are attempts to give a “Barriers” version of phases (e.g. Den Dikken) and
attempts to argue that virtually every “Max P” is a phase in order to finesse
this minimalist problem.
I think what's interesting is that even the structural definition of minimality is substantive, to the extent that it takes certain relations to be primitive. I'm not sure there's a principled reason why we could say that a particular set of categories (Cat = {C, D, ...}) is more substantive than a particular class of relations (Immediate Dominance, at the minimum). On purely mathematical grounds, a set of categories surely is simpler than a whole class of relations. But cognitively this probably isn't the case -- I could imagine that a class of relations would result from specific cognitive components being used in particular ways, a sort of epiphenomenon. I think that kind of story is a lot harder to tell for sets of things. I think that's probably one of the bigger mysteries of language that we haven't begun to approach -- why is it that these kinds of things are what language uses? Why are the "substantive" universals, in the usual sense, probably not the right answer?
I think of Substantive Universals as being like constants in a physical theory. They have certain specific properties that look, from where we sit now, to be arbitrary. This seems less true (or I hope it is less true) for the formal universals, which (I hope) can be unified with other more generic cognitive principles/operations. If this analogy is roughly right, then the question is whether the specific values we have for these constants are derivable from the properties of the computational system (a project familiar from contemporary physics (which, btw, looks like it has failed)). Maybe some are, maybe others aren't. For those that are, great. For those that aren't, we should ask whether they are actually part of FL/UG or maybe just appropriated from the rest of cognition for linguistic purposes. There is a tendency to see grammatical features as internal to FL/UG e.g. animacy, phi-features, etc. But is this necessary? I don't know.