So what’s classical (viz. GB) Case Theory (CCT) a theory of? Hint: it’s not primarily about overt morphological case, though given some ancillary assumptions, it can be (and has been) extended to cover standard instances of morphological case in some languages. Nonetheless, as originally proposed by Jean-Roger Vergnaud (JRV), it has nothing whatsoever to do with overt case. Rather, it is a theory of (some of) the filters proposed in “Filters and Control” by Chomsky and Lasnik (F&C).
What do the F&C filters do? They track the distribution of overt nominal expressions. (Overt) D/NPs are licit in some configurations and not in others. For example, they shun the subject positions of non-finite clauses (modulo ECM), and they don’t like being complements to Ns or As, or to passivized verbs. JRV’s proposal, outlined in his famous letter to Chomsky and Lasnik, is that it is possible to simplify the F&C theory if we reanalyze the key filters as case effects; specifically, if we assume that nominals need case and that certain heads assign case to nominals in their immediate vicinity. Note that JRV understood the kind of case he was proposing to be quite abstract. It was certainly not something evident from the surface morphology of a language. How do I know? Because the F&C filters, and hence JRV’s CCT, were used to explain the distribution of all nominals in English and French, and these two languages display very little overt case morphology on most nominals. Thus, if CCT was to supplant filters (which was the intent) then the case at issue had to be abstract. The upshot: CCT always trucked in abstract case.
So what about morphologically overt case? Well, CCT can accommodate it if we add the assumption that abstract case, which applies universally to all nominals in all Gs to regulate their distribution, is morphologically expressed in some Gs (a standard GG maneuver). Do this and abstract case can serve as the basis of a theory of overt morphological case. But, and this is critical, the assumption that the mapping from abstract to concrete case can be phonetically pretty transparent is not a central feature of the CCT.
I rehearse this history because it strikes me that lots of discussion of case nowadays thinks that CCT is a theory of the distribution of morphological case marking on nominals. Thus, it is generally assumed that a key component of CCT assigns nominative case to nominals in finite subject positions and accusative to those in object slots etc. From early on, many observed that this simple morphological mapping paradigm is hardly universal. This has led many to conclude that CCT must be wrong. However, this only follows if this is what CCT was a theory of, which, I noted above, it was not.
Moreover, and this is quite interesting actually, so far as I can tell the new case theorists (the ones that reject the CCT) have little to say about the topic that CCT or F&C’s filters tried to address. Thus, for example, Marantz’s theory of dependent case (aimed at explaining the morphology) is weak on the distribution of overt nominals. This suggests that CCT and the newer Morphological Case Theory (MCT) are in complementary distribution: what the former takes as its subject matter and what the latter takes as its subject matter fail to overlap. Thus, at least in principle, there is room for both accounts; both a theory of abstract case (CCT) and a theory of morphological case (MCT). The best theory, of course, would be one in which both types of case are accommodated in a single theory (this is what the extension of the CCT to morphology hoped to achieve). However, were these two different, though partially related, systems, this would be an acceptable result for many purposes.
Let’s return to the F&C filters and the CCT for a moment. What theoretically motivated them? We know what domain of data they concerned themselves with (the distribution of overt nominals). But why have any filters at all?
F&C was part of the larger theoretical project of simplifying transformations. In fact, it was part of the move from construction-based G rules to rules like move alpha (MA). Pre-MA, transformations were morpheme sensitive and construction specific. We had rules like relative clause formation and passive and question formation. These rules applied to factored strings which met the rules’ structural descriptions (SDs). The rules applied to these strings to execute structural changes (SCs). The rules applied cyclically, could be optional or obligatory and could be ordered wrt one another (see here for some toy illustrations). The theoretical simplification of the transformational component was the main theoretical research project from the mid 1970s to the early-mid 1980s. The simplification amounted to factoring out the construction specificity of earlier rules, thereby isolating the fundamental displacement (aka, movement) property. MA is the result. It is the classical movement transformations shorn of their specificity. In technical terms, MA is a transformation without specified SDs or SCs. It is a very very simple operation and was a big step towards the merge-based conception of structure and movement that many adopt today.
How were filters and CCT part of this theoretical program? Simplifying transformations by eliminating SDs and SCs makes it impossible to treat transformations as obligatory. What would it mean to say that a rule like MA is obligatory? Obliged to do what exactly? So adopting MA means having optional movement transformations. But optional movement of anything anywhere (which is what MA allows) means wildly overgenerating. To regulate this overgeneration without SDs and SCs requires something like filters. Those in F&C regulated the distribution of nominals in the context of a theory in which MA could freely move them around (or not!). Filters make sure that these vacate the wrong places and end up in the right ones. You don’t move for case strictly speaking. Rather the G allows free movement (it’s not for anything as there are no SDs that can enforce movement) but penalizes structures that have nominals in the wrong places. In effect, we move the power of SDs and SCs from the movement rules themselves and put them into the filters. F&C (and CCT which rationalized them) outline one type of filter, Rizzi’s criterial conditions provide another variety. Theoretically, the cost of simplifying the rules is adding the filters.
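For those who like to see the division of labor spelled out, here is a minimal toy sketch of the generate-then-filter architecture just described. Everything in it is a hypothetical illustration, not anything proposed in F&C or by JRV: the position names, the `POSITIONS` table, and the functions `move_alpha`, `case_filter` and `derive` are all invented for the example, and the assumption that each position simply is or is not a case position is a drastic simplification.

```python
# Toy model: free movement plus an output filter.
# All names and values here are hypothetical illustrations.

# Each position records whether a case assigner governs it (a toy assumption).
POSITIONS = {
    "matrix_subject": True,          # subject of a finite clause
    "embedded_subject": False,       # subject of a non-finite clause (no ECM)
    "complement_of_passive": False,  # complement of a passivized verb
}


def move_alpha(base_position):
    """Free, optional movement: the nominal may stay in place or land
    in any other position. The rule has no SD or SC of its own."""
    others = [p for p in POSITIONS if p != base_position]
    return [base_position] + others


def case_filter(position):
    """Filter on outputs: an overt nominal is licit only in a case position."""
    return POSITIONS[position]


def derive(base_position):
    """Overgenerate with move_alpha, then discard outputs the filter rejects."""
    return [p for p in move_alpha(base_position) if case_filter(p)]


print(derive("embedded_subject"))
```

Note that nothing in `move_alpha` forces the nominal to move. The only surviving output for a nominal that starts in the non-finite subject slot is the one in which it happens to have ended up in the finite subject position, which is the sense in which the work of the old SDs and SCs has been relocated to the filter.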
So, we moved from complex to simple rules at the price of Gs with filters of various sorts. Why was this a step forward? Two reasons.
First, MA lies behind Chomsky’s unification of Ross’s Islands via Subjacency Theory (ST) (and, IMO, is a crucial step in the development of trace theory and the ECP). Let me elaborate. Once we reduce movement to its essentials, as MA does, then it is natural to investigate the properties of movement as such, properties like island sensitivity (a.o.). Thus, ‘On Wh Movement’ (OWM) demonstrates that MA as such respects islands. Or, to put this another way, ST is not construction specific. It applies to all movement dependencies regardless of the specific features being related. Or, MA serves to define what a movement dependency is and ST regulates this operation regardless of the interpretive ends the operation serves, be it focus or topic, or questions, or relativization or clefts or… If MA is involved, islands are respected. Or, ST is a property of MA per se, not the specific constructions MA can be “part” of.
Second, factoring MA out of movement transformations and replacing SDs/SCs with filters focuses attention on the question of where these filters come from. Are they universal (part of FL/UG) or language specific? One of the nice features of CCT was that it had the feel of a (potential) FL/UG principle. CCT Case was abstract. The relations were local (government). Gs as diverse as those found in English, French, Icelandic and Chinese looked like they respected these principles (more or less). Moreover, were CCT right, then it did not look easily learnable, given that it was empirically motivated by negative data. So, simplifying the rules of G led to the discovery of plausible universal features of FL/UG. Or, more cautiously, it led to an interesting research program: looking for plausible universal filters on simple rules of derivation.
What should we make of all of this today in a more minimalist setting? Well, so far as I can tell, the data that motivated the F&C filters and the CCT, as well as the theoretical motivation of simplifying G operations, are still with us. If this is so, then some residue of the CCT reflects properties of FL/UG. And this generates a minimalist question: Is CCT linguistically proprietary? Why Case features at all? How, if at all, is abstract case related to (abstract?) agreement? What, if anything, relates CCT and MCT? How is case discharged in a model without the government relation? How is case related to other G operations? Etc. You know the drill. IMO, we have made some progress on some of these questions (e.g. treating case as a by-product of Merge/Agree) and no progress on others (e.g. why there is case at all). However, I believe research has been hindered, in part, by forgetting what CCT was a theory of and why it was such a big step forward.
Before ending, let me mention one more property of abstract case. In minimalist settings abstract case freezes movement. Or, more correctly, in some theories case marking a nominal makes it ineligible for further movement. This “principle” is a reinvention of the old GB observation that well formed chains have one case (marked on the head of the chain) and one theta role (marked on the foot). If this is on the right track (which it might not be) the relevant case here is abstract. So, for example, a quirky subject in a finite subject position in a language like Icelandic can no more raise than can a nominative marked subject. If we take the quirky case marked subject to be abstractly case marked in the same way as the nominative is, then this follows smoothly. Wrt abstract case (i.e. ignoring the morphology) both structures are the same. To repeat, so far as I know, this application of abstract case was not a feature of CCT.
To end: I am regularly told that CCT is dead, and maybe it is. But the arguments generally brought forward in obituary seem to me to be at right angles to what CCT intended to explain. What might be true is that extensions of CCT to include morphological case need re-thinking. But the original motivation seems intact and, from what I can tell, something like CCT is the only theory around to account for these classical data. And this is important. For if this is right, then minimalists need to do some hard thinking in order to integrate the CCT into a more friendly setting.
 Nor, as I recall, did people think that it was likely to be true. It was understood pretty early on that inherent/quirky case (I actually still don’t understand the difference, btw) does not transparently reflect the abstract case assigned. Indeed, the recognized difference between structural case and inherent case signaled early on that whatever abstract case was morphologically, it was not something easily read off the surface.
 Indeed, Distributed Morphology might be the form that such a hybrid theory might take.
 Actually, there was a debate about whether only overt nominals were relevant. Lasnik had a great argument suggesting that A’-traces also need case marking. Here is the relevant data point: * The man1 (who/that) it was believed t1 to be smart. Why is this relative clause unacceptable even if we don’t pronounce the complementizer? Answer: the A’-t needs case. This, to my knowledge, is the only data against the idea that case exclusively regulates the distribution of overt nominal expressions. Let me know if there are others out there.
 Well, if you care about overgeneration. If you don’t, then you can do without filters or CCT.
 Whether this is an inherent property of movement rather than, say, overt movement, was widely investigated in the 1980s. As you all know, Huang argued that ST is better viewed as an SS filter rather than part of the definition of MA.
 I should add, that IMO, this project was tremendously successful and paved the way for the Minimalist Program.
 Curiously, the idea that case and agreement are effectively the same thing was not part of CCT. This proposal is a minimalist one. Its theoretical motivation is twofold. First, it tries to reduce case and agreement to a common “mystery,” one being better than two. Second, if case is a feature of nominals then probes are not the sole locus of uninterpretable features. Case is the quintessential uninterpretable feature. CCT understood it to be a property of nominals. This sits uncomfortably with a probe/goal theory in which all uninterpretable features are located in probes (e.g. phase heads). One way to get around this problem is to treat case as a by-product of the “real” agreement operation initiated by the probe.
From what I gather, the idea that case reduces to agreement is currently considered untenable. This does not bother me in the least given my general unhappiness with probe/goal theories. But this is a topic for another discussion.
 Reducing nominal distribution to syntactic selection is not a theory as the relevant features are almost always diacritical.