So what’s classical (viz. GB) Case Theory (CCT) a theory of?
Hint: it’s not primarily about overt morphological case, though given some
ancillary assumptions, it can be (and has been) extended to cover standard
instances of morphological case in some languages. Nonetheless, as originally
proposed by Jean-Roger Vergnaud (JRV), it has nothing whatsoever to do with
overt case. Rather, it is a theory of (some of) the filters proposed in
“Filters and Control” by Chomsky and Lasnik (F&C).
What do the F&C filters do? They track the distribution
of overt nominal expressions. (Overt) D/NPs are licit in some configurations
and not in others. For example, they shun the subject positions of non-finite
clauses (modulo ECM), they don’t like being complement to Ns or As, nor
complements to passivized verbs. JRV’s proposal, outlined in his famous letter
to Chomsky and Lasnik, is that it is possible to simplify the F&C theory if
we reanalyze the key filters as case effects; specifically if we assume that
nominals need case and that certain heads assign case to nominals in their
immediate vicinity. Note that JRV understood the kind of case he was proposing
to be quite abstract. It was certainly not something evident from the surface
morphology of a language. How do I know? Because the F&C filters, and hence
JRV’s CCT, were used to explain the distribution of all nominals in English and
French, and these two languages display very little overt case morphology on most
nominals. Thus, if CCT was to supplant filters (which was the intent) then the
case at issue had to be abstract. The upshot: CCT always trucked in abstract
case.
So what about morphologically overt case? Well, CCT can
accommodate it if we add the assumption that abstract case, which applies
universally to all nominals in all Gs to regulate their distribution, is morphologically
expressed in some Gs (a standard GG maneuver). Do this and abstract case can
serve as the basis of a theory of overt morphological case. But, and this is
critical, the assumption that the mapping from abstract to concrete case can be
phonetically pretty transparent is not a central feature of the CCT.
I rehearse this history because it strikes me that lots of
discussion of case nowadays thinks that CCT is a theory of the distribution of
morphological case marking on nominals. Thus, it is generally assumed that a
key component of CCT assigns nominative case to nominals in finite subject
positions and accusative to those in object slots etc. From early on, many
observed that this simple morphological mapping paradigm is hardly universal.
This has led many to conclude that CCT must be wrong. However, this only
follows if this is what CCT was a theory of, which, I noted above, it was not.
Moreover, and this is quite interesting actually, so far as
I can tell the new case theorists (the ones that reject the CCT) have little to
say about the topic CCT or F&C’s filters tried to address. Thus, for
example, Marantz’s theory of dependent case (aimed at explaining the
morphology) is weak on the distribution of overt nominals. This suggests that
CCT and the newer Morphological Case Theory (MCT) are in complementary
distribution: what the former takes as its subject matter and what the latter
takes as its subject matter fail to overlap. Thus, at least in principle, there
is room for both accounts; both a theory of abstract case (CCT) and a theory of
morphological case (MCT). The best theory, of course, would be one in which
both types of case are accommodated in a single theory (this is what the
extension of the CCT to morphology hoped to achieve). However, were these two
different, though partially related, systems, this would be an acceptable result
for many purposes.
Let’s return to the F&C filters and the CCT for a
moment. What theoretically motivated them? We know what domain of data they
concerned themselves with (the distribution of overt nominals).
But why have any filters at all?
F&C was part of the larger theoretical project of
simplifying transformations. In fact, it was part of the move from
construction-based G rules to rules like move alpha (MA). Pre-MA, transformations were
morpheme sensitive and construction specific. We had rules like relative clause
formation and passive and question formation. These rules applied to factored
strings which met the rules’ structural conditions (SD). The rules applied to these
strings to execute structural changes (SC). The rules applied cyclically, could
be optional or obligatory, and could be ordered wrt one another (see here
for some toy illustrations). The theoretical simplification of the
transformational component was the main theoretical
research project from the mid 1970s to the early-mid 1980s. The simplification
amounted to factoring out the construction specificity of earlier rules,
thereby isolating the fundamental displacement (aka, movement) property. MA is
the result. It is the classical movement transformations shorn of their
specificity. In technical terms, MA is a transformation without specified SDs
or SCs. It is a very, very simple operation and was a big step towards the
Merge-based conception of structure and movement that many adopt today.
How were filters and CCT part of this theoretical program?
Simplifying transformations by eliminating SDs and SCs makes it impossible to
treat transformations as obligatory. What would it mean to say that a rule like
MA is obligatory? Obliged to do what exactly? So adopting MA means having
optional movement transformations. But optional movement of anything anywhere
(which is what MA allows) means wildly overgenerating. To regulate this
overgeneration without SDs and SCs requires something like filters. Those in
F&C regulated the distribution of nominals in the context of a theory in
which MA could freely move them around (or not!). Filters make sure that nominals
vacate the wrong places and end up in the right ones. You don’t move
for case, strictly speaking. Rather, the G allows free movement (it’s not for
anything, as there are no SDs that can enforce movement) but penalizes
structures that have nominals in the wrong places. In effect, we move the power
of SDs and SCs from the movement rules themselves and put them into the
filters. F&C (and CCT, which rationalized them) outline one type of filter;
Rizzi’s criterial conditions provide another variety. Theoretically, the cost
of simplifying the rules is adding the filters.
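The division of labor just described (free, optional movement plus an output filter) can be caricatured in a few lines of Python. This is only a toy sketch under loud assumptions: the `Slot` representation, the `gets_case` flag, and the function names are all hypothetical conveniences, not anything proposed in F&C or by JRV. It ignores phrase structure, traces, government, and locality, and it allows only a single movement step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slot:
    """A position in a toy structure (hypothetical representation)."""
    name: str
    gets_case: bool                  # assumption: a case-assigning head is nearby
    occupant: Optional[str] = None   # overt nominal in this position, if any


def move_alpha(structure):
    """Free, optional movement: do nothing, or move any one nominal
    into any empty slot. No SDs or SCs restrict what moves where."""
    candidates = [structure]  # not moving is always an option
    for i, src in enumerate(structure):
        if src.occupant is None:
            continue
        for j, dst in enumerate(structure):
            if dst.occupant is not None:
                continue
            moved = list(structure)
            moved[i] = Slot(src.name, src.gets_case, None)
            moved[j] = Slot(dst.name, dst.gets_case, src.occupant)
            candidates.append(tuple(moved))
    return candidates


def case_filter(structure):
    """Toy case filter: every overt nominal must sit in a case position."""
    return all(s.gets_case for s in structure if s.occupant is not None)


def licit_outputs(structure):
    """Overgenerate with move_alpha, then discard what the filter penalizes."""
    return [c for c in move_alpha(structure) if case_filter(c)]


# Toy raising configuration: "__ seems [John to be happy]"
raising = (
    Slot("matrix finite subject", gets_case=True),
    Slot("embedded non-finite subject", gets_case=False, occupant="John"),
)
survivors = licit_outputs(raising)
```

Here `move_alpha(raising)` yields two candidates, one with John left in the non-finite subject position and one with John in the matrix subject position, and only the latter passes `case_filter`. Nothing forces the movement; the unmoved structure is simply filtered out, which is the sense in which the nominal does not move "for" case.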
So, we moved from complex to simple rules at the price of Gs
with filters of various sorts. Why was this a step forward? Two reasons.
First, MA lies behind Chomsky’s unification of Ross’s
Islands via Subjacency Theory (ST) (and, IMO, is a crucial step in the
development of trace theory and the ECP). Let me elaborate. Once we reduce
movement to its essentials, as MA does, then it is natural to investigate the
properties of movement as such, properties like island sensitivity (a.o.).
Thus, ‘On Wh Movement’ (OWM) demonstrates that MA as such respects islands. Or,
to put this another way, ST is not construction specific. It applies to
all movement dependencies regardless of
the specific features being related. Or, MA serves to define what a movement
dependency is and ST regulates this operation regardless of the interpretive
ends the operation serves, be it focus or topic, or questions, or
relativization or clefts or… If MA is involved, islands are respected. Or, ST
is a property of MA per se, not the
specific constructions MA can be “part” of.
Second, factoring MA out from movement transformations
and replacing SDs/SCs with filters focuses attention on the question of where these
filters come from. Are they universal (part of FL/UG) or language specific? One
of the nice features of CCT was that it had the feel of a (potential) FL/UG
principle. CCT Case was abstract. The relations were local (government). Gs as
diverse as those found in English, French, Icelandic and Chinese looked like
they respected these principles (more or less). Moreover,
were CCT right, it did not look easily learnable, given
that it was empirically motivated by negative
data. So, simplifying the rules of G led to the discovery of plausible universal
features of FL/UG. Or, more cautiously, it led to an interesting research
program: looking for plausible universal filters on simple rules of derivation.
What should we make of all of this today in a more
minimalist setting? Well, so far as I can tell, the data that motivated the
F&C filters and the CCT, as well as the theoretical motivation of
simplifying G operations, are still with us. If this is so, then some residue of
the CCT reflects properties of FL/UG. And this generates a minimalist question:
Is CCT linguistically proprietary? Why Case features at all? How, if at all, is
abstract case related to (abstract?) agreement? What, if anything, relates CCT
and MCT? How is case discharged in a model without the government relation? How
is case related to other G operations? Etc. You know the drill.
IMO, we have made some progress on some of these questions (e.g. treating case
as a by-product of Merge/Agree) and no progress on others (e.g. why there is
case at all).
However, I believe research has been hindered, in part, by forgetting what CCT
was a theory of and why it was such a big step forward.
Before ending, let me mention one more property of abstract
case. In minimalist settings abstract case freezes movement. Or, more
correctly, in some theories case marking a nominal makes it ineligible for
further movement. This “principle” is a reinvention of the old GB observation
that well-formed chains have one case (marked on the head of the chain) and one
theta role (marked on the foot). If this is on the right track (which it might
not be) the relevant case here is abstract. So, for example, a quirky subject
in a finite subject position in a language like Icelandic can no more raise
than can a nominative marked subject. If we take the quirky case marked subject
to be abstractly case marked in the same way as the nominative is, then this follows
smoothly. Wrt abstract case (i.e. ignoring the morphology) both structures are
the same. To repeat, so far as I know, this application of abstract case was
not a feature of CCT.
To end: I am regularly told that CCT is dead, and maybe it
is. But the arguments generally brought forward in obituary seem to me to be at
right angles to what CCT intended to explain. What might be true is that
extensions of CCT to include
morphological case need re-thinking. But the original motivation seems intact
and, from what I can tell, something like CCT is the only
theory around to account for these classical data.
And this is important. For if this is right, then minimalists need to do some
hard thinking in order to integrate the CCT into a more friendly setting.