Wednesday, June 20, 2018

Classical and dependent case

This is a long rambling post. It has been a way for me to clarify my own thoughts. Here goes.

Every year I co-teach the graduate syntax 2 course at UMD. It follows Howard Lasnik’s classic intro to syntax that takes students from LSLT/Syntactic Structures to roughly the early 1990s, just before the dawn of the Minimalist Program (MP) (personal note: it is perhaps the greatest intro class I have ever had the pleasure to sit in on). Syntax 2 takes the story forward starting with early MP and going up to more or less the present day. 

For most of the last 10 years, I have taught this course with someone else. The last several years I have had Omer Preminger as wingman (or maybe, more accurately, I have been his wingman) and it has been delightfully educational. He is a lot of fun to argue with. I am not sure that I have come out victorious even once (actually I am but my self-confidence doesn’t allow such a bald admission), but it has been educational and exhilarating. And because he is an excellent teacher, I learn at least as much as the other students do. Why do I mention this? It’s all in service of allowing me to discuss something that arose from Omer’s class discussion of case theory over the last several years and that I hope will provoke him to post something that will help me (and I suspect many others) get clear on what the “other” theory of case, Dependent Case Theory (DCT), takes its core empirical and theoretical content to be.

To help things along, I will outline one version of DCT, observe some obvious problems for that interpretation and note another version (spurred by Omer’s remarks in the Syntax 2 class). By way of contrast, I will outline what I take the core features of classical case theory (CCT) to be and, at the end, I will explain why I personally am still a partisan of CCT despite the many problems that beset it.[1] So here goes. I start with CCT as background theory.

CCT is primarily a theory of the distribution of lexical nominals (though what “counts” as lexical is stipulated: phonetically overt material always does, some traces might (A’-traces) and some traces never do (A-traces and maybe PRO)). The theory takes the core licensing relation to be of the X0-YP variety, with some set of heads licensing the presence of some set of DPs. CCT, as Vergnaud conceived it, is intended, in the first instance, to unify/reduce the various filters that Chomsky and Lasnik (1977) proposed. The core assumptions are that (i) nominals require case and (ii) certain heads have the power to assign case. Together these two assumptions explain why we find nominals in positions where heads can case mark them.

Overt morphological case marking patterns no doubt influenced Vergnaud. However, given that CCT was intended to apply to languages like English and French and Chinese where many/most nominals show no apparent case morphology, it was understood from the get-go that CCT’s conception of case was abstract, in the sense that it is present and regulates the distribution of lexical nominals even when it is morpho-phonologically invisible. This abstractness, and the fact that case lives on a head-XP relation, are the two central characteristics of the CCT in what follows.

GB extended the empirical reach of CCT beyond the distribution of nominals by making a second assumption: that abstract case can be morphologically realized. In particular, if it is assumed that in grammars/languages with richer morphological case the core structural cases (nominative, accusative, etc.)[2] can be made visible, then in these languages case theory can explain why some expressions have the morpho-phonological shapes they do. Adding this codicil allows CCT to claim that case marked nominals look as they do in virtue of overtly expressing the abstract cases nominals bear universally.[3]

In sum, CCT covers two domains of data: the original explicandum is the distribution of lexical nominals, the second is the morpho-phonological forms found in richer case languages. Importantly, this second empirical domain requires an extra assumption concerning the relationship between abstract case and its morpho-phonological (henceforth ‘overt’) expression. Usefully, these two parts of the theory allow for the investigation of abstract case by tracking the properties of the overtly marked nominals in those languages where it can be seen (e.g. the pronominal system in English).

CCT has other relevant (ahem) subtleties. 

First, it primarily concerns structural case. This contrasts with inherent/lexical case, which is tied to particular thematic or semantic roles and is idiosyncratically tied to the specific lexical predicates assigning the roles.[4] For example, some languages, most famously Icelandic, have predicates that require that their nominal arguments have a particular overt case marking (dative or genitive or even accusative).

CCT has almost always distinguished structural from lexical case, though the two bear some similarities. For example, both are products of head-XP relations (lexical case piggybacking on theta roles which, like case, are assigned by heads). That said, CCT has generally put lexical case to one side and has concentrated on the structural ones, those unrelated to any thematic function.[5]

Second, MP slightly revamped the CCT. The contemporary version takes structural case to always be a dependency between a functional head and an XP, while lexical case, as the name implies, is always mediated by a lexical head. This differs from earlier GB accounts where nominative is assigned by a functional head (finite T) but accusative is assigned by a lexical V (or P) which also, typically (modulo ECM constructions), theta marks the nominal it case assigns. Thus, MP assimilates accusative to the nominative configuration and takes structural accusative, like nominative, to involve a non-local dependency between a functional head (one that does not theta mark the dependent) and a nominal that is not a complement. The relevant functional head for accusative is Agr or some flavor of v. What’s crucial is that in MP accounts the case assigner is always significantly “higher” in the phrase marker and so the dependency must be mediated by AGREE or I-Merge (aka, Move). The most interesting consequence of this amendment to the CCT is that case should affect scope.[6] Surprisingly, there is rather good evidence that it does. As Lasnik and Saito showed (reanalyzing data from Postal), case dependency affects the domain of a nominal’s scope.[7] The dependency between case and scope is one of the deeper discoveries of MP in my opinion, though I will not pursue that claim here.[8]

That’s the sum of what is relevant about CCT for this post. There are well-known problems (wager verbs being the most conspicuous (see the discussion section of the earlier post for elucidation)) but IMO the main virtue of CCT is that it combines quite smoothly with a Merge based conception of FL/UG. I will say something about this at the end, but first it’s time to look at the DCT.

First off, the DCT is actually not that new. A version was mooted by the Brandeis Bunch of Yip, Maling and Jackendoff in the mid 1980s.[9] However, it didn’t catch on much until Alec Marantz revived a version in 1991. Since then, I would wager that the DCT has become the “standard” approach to morphological case, though, as we shall see, exactly what it claims has been (at least to me) elusive.

Here are some points that are fairly clear. 

First, it is exclusively a theory of overt case marking. It has no intention of accounting for nominal distributions. So, DCT is a theory of what CCT takes to be an add-on.[10] This is not in itself a problem for either account, but it is worth noting (yet again).

So, DCT is a theory of the overtly witnessed case values (or at least it is so for Marantz and Bobaljik and most other advocates (though see Omer’s take below)). The standard version thus aims to explain the distribution of the overt morpho-phonological case marking found on nominals (in many, though not all, languages). An important virtue is that it extends to both nominative-accusative case marking and ergative-absolutive case marking systems. Consequently, investigations of DCT have enriched our appreciation of the complexity of case marking patterns cross linguistically.

A second key feature: the standard version of DCT (e.g. Marantz, Bobaljik) treats case marking as a non-syntactic process. More concretely, it is part of post syntactic processing. Omer and others have argued against this conception. But as originally mooted, DCT operations are outside the syntax.

Third, DCT’s core idea is that overt case marking is an inter-nominal dependency. Case marking reflects a relation between nominals within a specified domain. It is not primarily a head-nominal relation. So, for example, accusative is what one finds on DPs that have another non-case-valued nominal higher up in the same domain. Overt case, then, is an XP-YP relation (not an X0-YP relation), with the case value of the dependent nominal stemming from the presence of another non-dependent, non-valued nominal in the same domain. The advertising that generally accompanies DCT stresses that for DCT (in contrast to CCT) case is not abstract, but very concrete in that it concerns itself with the overt case that we can hear/see on nominals, not some abstract feature that only sometimes overtly surfaces. The strongest version of this idea is that the purview of the DCT is the full panoply of morpho-phonologically realized cases. However, this is very clearly too strong. Let me explain.

First, like CCT, DCT agrees that there is an important distinction between inherent/lexical case that is semantically/thematically restricted and structural case that isn’t. Like CCT, it takes the former to be effectively an X0-YP dependency with X0 being a theta role assigner. As lexical case is a by-product of theta assignment, this supposition, like the one made in CCT, is not unreasonable. The important point is that like CCT, DCT focuses on a subset of the cases overtly expressed. So, like CCT, the dependent part of the DCT is at best a theory of overt structural case.

A strong version of a DCT approach to structural case would have it that all dependent case values in a language are the product of a DP-DP relation holding within a circumscribed domain. So, for example, if accusative is the realization of dependent case in language L then, modulo some quirky lexical case instances, we would find accusative case on dependent DPs alone (nominative in L being what we find on otherwise non-valued non-dependent cases). This would be a strong theory. Unfortunately, it looks to be false.

In particular, we know that there exist languages in which overt structural accusative case occurs on DPs in structures where the DP that carries it cannot be a dependent case. Examples include the accusative we find in for infinitives (For her to leave would be terrible), the one we find in acc-ing gerunds (her kissing him would be terrible), and the one found in languages like Latin, which have acc-infinitive constructions. In all these examples, the overt case of the DP looks like the case we find on the object of transitive verbs, which is the canonical dependent case value.

Moreover, these are clearly structural cases as they are in no way thematically restricted. And it is clear the accusative we find here is not dependent on any other DP in its domain (we can always find them, for example, on the sole argument of an unaccusative in these settings). So there exists a class of morpho-phonologically overt accusatives that are not overtly marked accusative in virtue of being dependent. What then marks them overtly accusative? Well, the obvious answer is some local head (for or the head of the gerund/infinitive). Thus, DCT must allow that there are two ways of assigning accusative values, only one of which reflects its “dependent” status. Or to put this less charitably, DCT presupposes that the core case marking relation identified by CCT also plays a role in accounting for overt structural case. What DCT denies, then, is that all accusatives are marked by heads. Some are dependent cases, some are marked by heads. On this construal of the DCT, it appears that CCT is not so much wrong as incomplete.
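
The mixed picture just described can be made concrete with a toy Python sketch. Everything in it is an illustrative assumption rather than anyone's published formulation: the function name `assign_case`, the representation of a domain as a list of DPs ordered from highest to lowest, the `head_assigned` and `lexical` parameters, and the decision that any higher DP without lexical case counts as a case competitor.

```python
def assign_case(domain, head_assigned=None, lexical=()):
    """Toy case assignment for ONE domain in a nom-acc language.

    domain        -- DP names ordered from structurally highest to lowest
    head_assigned -- hypothetical map from a DP to the structural case that
                     a local head (e.g. `for`) gives it
    lexical       -- DPs carrying quirky/lexical case from a theta assigner
    Returns {dp: (case value, source of that value)}.
    """
    head_assigned = head_assigned or {}
    result = {}
    for i, dp in enumerate(domain):
        # Assumption: a higher DP is a competitor unless it bears lexical case.
        has_competitor = any(h not in lexical for h in domain[:i])
        if dp in lexical:
            result[dp] = ("lexical", "head")
        elif dp in head_assigned:
            result[dp] = (head_assigned[dp], "head")
        elif has_competitor:
            result[dp] = ("acc", "dependent")
        else:
            result[dp] = ("nom", "unmarked")
    return result


# "She kissed him": accusative as dependent case.
transitive = assign_case(["she", "him"])
# "For her to leave ...": accusative on a sole argument, so a head must supply it.
for_infinitive = assign_case(["her"], head_assigned={"her": "acc"})
print(transitive)
print(for_infinitive)
```

Both `him` and `her` come out with the value "acc", but only the second member of each pair records whether that value was dependent or head-assigned, and nothing in the overt form reveals which it was.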

Moreover, it appears that within DCT case is no less abstract than it is in CCT as there is, at best, an indirect relation between a specific overt value (e.g. accusative) and being a dependent or a non-dependent case.

Nor is this problem restricted to subjects, or so I have been reliably told. Here’s what I mean. I gave a paper at the German LSA this past spring and one of the perks of doing so was hearing Julie Legate give a paper on argument structure and passives. She provided scads of data that showed that in languages that systematically violate Burzio’s generalization we can nonetheless find “objects” that bear accusative case. In other words, there exist many nom-acc languages where we find accusative case sitting on DPs in object positions in sentences where there is every reason to believe that there are no additional arguments (i.e. 1-place predicates where the internal argument is marked accusative). If this is right, then we cannot analyze the accusative as an instance of dependent case and must, instead, trace the witnessed value to some head that provides it (a CCTer would suggest v). 

So, neither subjects nor objects in general can be assumed to bear a dependent case value in virtue of being a dependent case. The generalization seems to be not that DCT is a theory of overt morpho-phonological structural case in general, but only of overt structural case that we find most typically in transitive constructions (ones where Burzio’s generalization holds). In other words, the purview of the theory seems, at least to me, much narrower than, and different from, what I thought it was from reading the general advertising.

Let me repeat an important point that Omer repeatedly made in class: there is a good sense in which DCT is no less abstract than CCT. Why? Because we cannot go from a witnessed case value to the conclusion that the DP that carries it is dependent. Rather we can only assume that if a DP is dependent it will carry the dependent case value, though what that overt value is remains a free parameter (so far as I can tell).[11]

So, DCT is effectively a theory of abstract case that defines a mapping between some DPs in multiple DP domains (the dependent ones) and some case values (e.g. acc in nom-acc Gs). At best, the theory explains what values one finds in transitive clauses (ones with at least two nominals) modulo some specification of the relevant domains.[12]

This last point is worth emphasizing. For illustration consider a sentence like (1):

(1)  DP1 V [IP DP2 Infinitive V DP3]

What case should we expect here? Well, DP3 should get dependent case if DP2 is in its domain and DP2 is unvalued (at least at the point that DP3’s value is determined). So, if this is a nom-acc language, we should expect to find DP3 bearing acc case. What of DP2? This will depend on whether DP1 is in the same domain as DP2. If it is, and DP1 has no case value, then DP2 will also bear acc case. In effect what we will have here is an instance of what we find in English ECMs. DP1 will surface with nominative case.

However, what if DP2 is not in the same domain as DP1? If it isn’t then it would get assigned nominative case. What we would then find is something that does not arise in English, the sentential analogue of He expects he to kiss her. So, to block this we need a story that forces DP2 to be in the same domain as DP1, maybe by forcing it to move. What forces this? So far as I know, nothing. We can, for example, stipulate that it is in the same domain or that it move (e.g. by putting some EPP feature on a functional (phase) head in the higher clause), but there is nothing that forces either conclusion. So this would have to be an extra feature. Recall that DCT is not a theory of DP distribution. There is no analogue of the case filter that forces movement. And even if there were, DP2 could be assigned nominative case if it did not move. So why must DP2 bear acc case? Because we are assuming it moves, or because the domain of case assignment is the whole clause and case is assigned bottom up, or…[13] All of these are free parameters in DCT. At any rate, there is nothing in the theory that in principle prohibits nominative case in ECM constructions. It all depends on how the relevant domains are constructed.[14]
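
How much hangs on the choice of domains in (1) can be seen in a toy Python sketch. It is a deliberately simplified illustration under stated assumptions, not an implementation of any particular DCT proposal: the function `dependent_case` is hypothetical, each domain is just a list of DPs ordered from highest to lowest, a DP may sit in two domains at once (standing in for raising to the edge of the lower one), lexical case is ignored, and "acc" is taken to be the dependent value.

```python
def dependent_case(domains):
    """Toy dependent-case assignment over a list of domains.

    Each domain lists its DPs from structurally highest to lowest.
    A DP gets "acc" if some domain contains a higher DP alongside it;
    otherwise it keeps the unmarked value "nom".
    """
    cases = {}
    for domain in domains:
        for position, dp in enumerate(domain):
            if position > 0:
                # A higher DP shares this domain: dependent case.
                cases[dp] = "acc"
            else:
                # Highest DP here: unmarked, unless already made dependent.
                cases.setdefault(dp, "nom")
    return cases


# Parse A: DP2 is also in the matrix domain with DP1 (the ECM pattern).
ecm = dependent_case([["DP2", "DP3"], ["DP1", "DP2"]])
# Parse B: DP2 stays in the embedded domain only.
low = dependent_case([["DP2", "DP3"], ["DP1"]])
print(ecm)
print(low)
```

On parse A, DP2 and DP3 both come out "acc" and DP1 "nom". On parse B, DP2 comes out "nom", which is the He expects he to kiss her pattern. The function itself is indifferent between the two; the difference lies entirely in which domains it is handed.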

There are other questions as well. For example, given that we can get the same surface forms either via an X0-DP relation or an XP-YP relation, learnability issues will most likely arise. Whenever we can do things in various ways, the child will in any given circumstance have to choose, presumably based on PLD. So there are potential learnability issues with this kind of mixed theory.

In addition, there are theoretical puzzles: why do the very same case values surface in these very different ways? Are there languages that distinguish the class of morphological cases assigned by CCT mechanisms from those assigned by DCT mechanisms? If there aren’t, why not?[15]

None of these questions are unanswerable (I would assume), but it did surprise me how much narrower the scope of the DCT was than I had supposed it to be before Omer walked us through the details. It is basically a theory of the distribution of overt case in transitive clauses. It is not a general theory of overt case morphology (as sometimes advertised), or at least the distinctive features are not. From where I sit, it’s a pretty modest part of UG even if correct and it leaves large chunks of CCT potentially intact.

Given all of this, I would argue that the empirical advantages of the DCT had better be pretty evident if we are to add it to CCT. Here is what I mean. Given that it presupposes mechanisms and relations very like those found in CCT, the argument that we need DCT ones in addition to those of CCT must show either that CCT cannot handle the data that DCT addresses or that DCT handles it in a far superior manner. The arguments I have seen fall into the second disjunct. It is actually quite hard to argue that no version of CCT could accommodate the data (after all, there is always a possible (ad hoc) head around to do the CCT dirty work). So, the argument must be that even though CCT can do it, it does not do it nicely or elegantly. But given that adding the DCT to the required mechanics of CCT creates a theoretically more redundant account, the superiority of DCT accounts had better be overwhelming to pay the cost of such theoretical enrichment.

As a personal matter, I have not been convinced that the cost is worth it, but others will no doubt differ in their judgments. It seems to me, however, that given that DCT requires something like CCT to get some of the case values right, even a cumbersome CCT account might not be theoretically worse off than a DCT account that incorporates a CCT component.

That said, none of this is why I favor the CCT accounts. The main reason is that I understand how a CCT could be incorporated into a minimalist Merge based account of FL/UG.  Here’s what I mean.

There is a version of MP that I am very fond of. It adopts the following Fundamental Principle of Grammar (FPG):

FPG: For expressions A and B to grammatically interact, A and B must merge

In other words, the only way for expressions to grammatically interact is via Merge. They locally interact under E-merge and non-locally under I-merge (which, as you all know, is one and the same operation). Thus, e.g. selection or subcategorization or theta marking is under E-merge and antecedence or control or movement is under I-merge. I like the FPG. It’s simple. It even strikes me as natural as it enshrines the importance of constituency at its very core.

What’s important here is that it easily accommodates the CCT. A head A case marks a nominal B only if A and B Merge. In MP accounts, this would be I-merge. 

In contrast, I don’t see how to make DCT fit with this picture. The inter-nominal dependency that marks the most fundamental relation is not plausibly a chain relation formed under I-merge.  So, it is not a possible legit relation at all if we buy into something like the FPG. As I have so purchased, the DCT would induce in me a strong sense of caveat emptor.

I know, I know; one theorist’s modus ponens is another’s modus tollens. But if you like the FPG (which I do) then you won’t like replacing CCT with DCT (which I don’t).

Ok, enough. There are many reasons to get off this long post at many points. The real intent here has been to provoke DCTers (Omer!) to specify what they take the theory to be about and whether they agree that it is quite abstract and that it requires something like CCT as a sub-part. And if this is so, why we should be happy about it. So, I hope this provoked, and thx for the course Omer. It was terrifically enlightening.

[1] I’ve reviewed CCT before (here) so some of this will be redundant. That’s the nice thing about blog posts. They can be. We would never see anything similar in a refereed article where all contents are novel and nobody ever chews their food twice.
[2] Actually what goes into the “etc” is not set. Is dative structural? It can be. Genitive? Can be, but need not be. The clean ones in nom-acc languages are nominative and accusative, though even these can come from another source as we shall presently note.
[3] This does not require assuming that FL have but one morpho-phono realization scheme. There is room for an erg-abs as well as a nom-acc schema. But as these matters are way beyond my competence, I leave them with this briefest mention.
[4] I frankly have never fully grasped the distinction (if there is one) between inherent and lexical case. However, what is critical here is that it is semantically restricted, being the overt manifestation of a semantic role/value. This is decidedly not what we see with structural case, which seems in no way restricted thematically (or semantically for that matter).
[5] FWIW, IMO, structural case (both abstract and overt) is the really weird one. Were all cases lexical there would be nothing odd about them. They would just be the outward form of a semantic dependency. But this is precisely what structural case is not. And so why nominals bear or need them is quite unclear. At least it is to me. In fact, I would call it one of the great MP conundra: why case?
[6] This consequence is obvious if one assumes that case is discharged under I-merge. It is less obvious if case is discharged via Agree. The original Lasnik and Saito analysis assumed the former.
[7] We find similar data in many other languages (e.g. Japanese) where we can predict a DP’s scope properties from the case values it bears.
[8] Interestingly, this data has been largely ignored by many mainstream minimalists for it remains inexplicable in theories where features “lower” from v to V and case is executed via Agree rather than I-merge.
[9] At the time, they were all Brandeis faculty.
[10] And as such, DCT should be in principle compatible with CCT. However, though this should be possible, my impression is that devotees of the DCT have been disinclined to treat the two in this way. I do not know why.
[11] It is not even clear to me that one can conclude that all dependent cases will realize a single value. This does not follow from the theory so far as I can tell.
[12] Of course, not all clauses with two nominals include a subject, but these are niceties that I don’t want to get into here.
[13] Note, if we could treat the relevant domain as the whole clause, then there would be no need for DP2 to move at all and the correlation between scope and case that Lasnik and Saito discussed would theoretically evaporate.
[14] Note that the acc feature in ECM in CCT is a product of the particular head we have. The analogue of this stipulation in DCT is the size of the relevant domain. One might try to get around this theoretical clunkiness by assuming the relevant domains are phases (though deciding what is and isn’t a phase is no small theoretical feat nowadays either). But what matters is that the embedded clause is not phase-like in (1). Were it such, the problem would reappear. By breaking the link between a theory of the distribution of DPs and their values, DCT has little to say about such cases except that something else must determine when the two subjects are in the same local domain relevant for determining dependent case.
[15] E.g. there are lexical cases (e.g. instrumental, comitative, ablative) that seem to be exclusively lexical. Are there any dependent cases that are exclusively dependent?
            Note that this is theoretically similar to an observation standardly made wrt resumptive pronouns. Why do they always assume the shape of “regular” bound pronouns? One answer is that they are bound pronouns, i.e. that there is no separate class of resumptives, precisely because if there were we might expect them to have a distinctive morphology somewhere. The absence of a distinction argues for an absence of the category. Similar logic applies in the domain of dependent case vs head assigned case.


  1. I know very little about DCT --- basically all I know is based on this FG paper by Sabine Laszakovits on the computational properties of MGs with DCT. Assuming that I didn't get things completely wrong, though, I think some of the problems with DCT aren't all that much of an issue:

    To my computational eye, the core of DCT seems to be that "the distribution of morphologically case-marked DPs is contingent on the shape of the closest c-commanding node in the derivation tree that can control case". This is the central dependency that a computational device has to be sensitive to, without that you can't get DCT off the ground. But once you have that, you can do a lot more than what standard DCT does with it.

    The standard version of DCT only considers DPs as case-controllers and then builds a daisy chain of dependent cases. But computationally, it's no less natural to broaden the set of potential acc-assigners from nom-DPs to also C-head for, unaccusative V, quirky-case V, and so on. And one can just as well lump in lexical case and have it controlled by Ps, Vs, and so on.

    What I find really interesting is that this system does not seem to require new computational mechanisms. As far as I can tell, the dependencies should be TSL over derivation trees, just like Merge and Move. So if we modify your FPG such that A and B can only interact using mechanisms that can be computed by the abstract machinery used for Merge/Move, DCT is a natural fit.

    Perhaps I'm abstracting away too much of what is considered the theoretical core of DCT; perhaps I got something very basic horribly wrong; and I'm sure that even this generalized version of DCT faces some thorny empirical issues that I'm unaware of; but at least for my own thinking about case as a computational problem, DCT has proved more productive than CCT.

  2. Well, it seems like I have been pretty much browbeaten into chiming in (kidding!), so here goes. In no particular order:

    1. I agree wholeheartedly that the DCT bears a considerable burden of proof. (I argued at the last NELS that its expressive power is a proper superset of the expressive power of the CCT, in fact, and so I am in complete agreement with you on that.) But it was my impression during our latest round of Syntax 2 that you agreed that the Baker & Vinokurova data from Sakha – data in which a DP got dependent case by virtue of raising into, e.g., an unaccusative clause, whose own inventory of functional heads was demonstrably unable to assign such case – met this threshold, even for a skeptic like you!

    2. I was not at Legate's talk which you mentioned here, but the way it's described above, it seems like a complete red herring. There is no such thing as "accusative" (on any theory, as far as I can tell). "Accusative" is almost always a descriptive label assigned by the first people to document a language's grammar who also happen to know Latin. The primitives are "dependent case", "X0-assigned case", "unmarked case", and so forth. So as best I can tell, all Legate has shown is that, in some languages, the case that has been descriptively labeled "accusative" is different in its actual, grammatical nature from the case that has been descriptively labeled "accusative" in, say, Icelandic or Sakha. That is interesting in its own right, but I don't see how on earth that bears on the correctness of the DCT. Unless and until opponents of the DCT have accounts of case in Icelandic, Sakha, Shipibo, etc. etc., I really fail to see what we're even arguing about.

    3. You talked about ruling out They expect he to kiss me. Let me say that if your theory of case rules out this sentence, then it's almost certain that you have the wrong theory of case at hand. That's because sentences that have exactly this structure from the purview of case theory are often just fine. Cf. Szabolcsi's work on overt nominative controllees in Hungarian, but also basically any instance of control in Icelandic, where you can show that the controllee, while not overt, bears the exact same case value you argued needs to be ruled out here. Granted, these sentences are not identical to They expect he to kiss me, but they are identical as far as case is concerned, so your theory of case better not rule this out. The answer is to be found elsewhere.

    4. On correlations between case and scope: the DCT is well-positioned to capture this, because height (and therefore, scope) is correlated with your ability to get into a sufficiently local configuration with the next-higher-caseless nominal, and thus with the ability to receive (or assign) dependent case.

    5. Finally, while I (obviously) agree with your claim that the DCT cannot possibly be a theory of overt case forms, some readers may find this curious. So if I may, let me refer such readers to ex. (6) and the surrounding discussion, on p. 2 here.

    1. On Omer's 2. If we have a case in a superficially nominative-accusative language that is the standard direct object case, in almost all situations it patterns as we expect a "dependent case" to pattern, but in a special construction (Milena calls it an active existential) the verb (which is any regular transitive verb) is active but has no projected external argument and yet the object still has the standard direct object case, that is an interesting situation. It means that not all standard direct object cases are covered by dependent case theory. (It also means Burzio's generalization is wrong.) Again, this is exactly what we expect if there's no innate theory of case.

    2. I'm confused about Omer's point 3 (well, among many other things here). If DCT is not supposed to be a theory of overt case forms, and the problem with 'They expect he to kiss me' (i.e. the kind of distributional issues that CCT concentrated on) is also thought to be unrelated to it, then what is it a theory of?

    3. @Tim: It's a theory of syntactic case features, that morphology may then expone or not, syncretically or not, etc. etc., on a per language basis. As for the sentence you're asking about, the string underdetermines the structure because of the possibility of raising-to-obj. I was talking about specifically the parse where the DP has stayed low, with unmarked case, exponed as "he": that sentence should not be ruled out for case reasons, because of the crosslinguistic facts mentioned above.

    4. I'm still confused. If the theory of case features has this many moving parts, and makes the relation between the observable morphology and the features so indirect, then how do we know that the relevant Hungarian sentence is identical as far as case is concerned? It seems like there is some pre-theoretical notion of case that's being held fixed as a way to get a foothold, but I don't know what that is.

    5. This comment has been removed by the author.

    6. @Tim: Not sure what you mean. The claim about Hungarian is relatively simple, and goes as follows. There is one morphological form, call it NOM, associated with subjects of finite clauses. There is another, call it ACC, associated with direct objects. Hungarian allows the subject of an embedded nonfinite clause, when it has not moved out of that clause, to bear NOM rather than ACC (under certain circumstances, largely related to focus; see Szabolcsi's pair of papers from 2009 for details). Of course we have no direct access to the primitives of the theory – this, I think, is par for the course in linguistics and probably cogsci more generally – and so any theory that compares "Hungarian nominative" to "English nominative" is on the hook to show that there is ontological identity between the two, first. (And I am in the camp that believes that English "nominative" is not ontologically identical to what we call "nominative" in most other NOM-ACC languages; but I suspect that most readers are not, so let's abstract away from that here.)

  3. Let me take this opportunity to reveal my current thinking on case. I have now convinced myself of a framework whereby merge is innate, but any (other) language-specific innate properties are highly suspect and require significant evidence. (This is due to Noam’s writings on evolution finally sinking in, and due to my accumulating knowledge about the extent of language variation.) Case, both the distribution of noun phrases and the case morphology, is not universal, varies considerably across languages, and so must be learned. (See Charles Yang’s 2016 book on how.) We have merge, and thereby hierarchy, so the child can make generalizations on case distribution based on that. We have meaning (theta-roles) and lexical items, so the child can make generalizations on case distribution based on that. There’s no need for an innate distinction between structural vs inherent/lexical case. (And indeed ongoing work by my student and former student, Milena Šereikaitė and Einar Sigurdsson, shows that it’s not a strict dichotomy, as we might expect.) Could it be the case that in some languages the child makes generalizations based on the number of DPs in a domain, while in other languages the child makes generalizations based on the relationship with a functional head? I see no reason why not.

    1. Thx. Let me say that to the degree that Case is not part of FL/UG then nothing that I say is relevant. If we just learn case from the PLD then we can safely ignore it in our minimalist musings as it has nothing to tell us about the structure of FL/UG (which is what a minimalist theory is a theory of). I could live with this. My point was different: IF case theory is grounded in FL/UG then CCT fits better with MP ideas than DCT does and given that DCT is a much richer account and requires CCT as a subpart (something that Omer seems to agree with) then we will need a lot of high quality evidence before we should accept adding DCT to our FL/UG repertoire. For me at least, the threshold has not yet been crossed.

      Btw, the post has already done what I wanted it to do: generate more public discussion and enlightenment. So thx to everyone for playing.

  4. As a side note on the raising to accusative in Sakha: I suggested the proleptic object analysis to Mark, commenting on an earlier version of the paper. Their paper agrees that this kind of structure is possible for some cases, but not all, showing NPI data in (42), (43). Notice, however, that they don’t give such NPI data for the crucial matrix unaccusative examples. Until we have such data, I’m reserving judgement on the Sakha.

  5. This comment has been removed by the author.

  6. @JAL: 

    Re: my original 2, my point is that terms like "standard direct object case" are not the terms we should be speaking in. Icelandic also has quirky "accusative" subjects, and that doesn't make anyone rush to give those instances of accusative an analysis as dependent case. So Lithuanian (it's Lithuanian, right?) has a functional head that can assign a case value that gets spelled out in a manner syncretic with its dependent case. Concluding from this that DCT is not the right case theory (or worse, that there is no case theory at all) seems to me to be wildly unwarranted. In fact, I would dare say that there has to be a case theory, otherwise we would see many more case patterns than we actually see. As Norbert and I like to point out in Syntax 2, it is extraordinarily easy to think up case systems that would be way more communicatively useful than anything that is actually attested (for example: a system with only inherent cases, incl. for Agent, Patient, and so forth, so that you can read the thematic roles off of the case morphology). That we never see that needs to be given by some theory, with all due respect to Chomsky's recent musings to the contrary.

    Re: a proleptic object analysis of Sakha, I hardly see how it would help you. So now you're looking at unaccusative clauses that cannot assign "ACC" to their single argument but can do so to a proleptic argument precisely and only when they contain a proleptic argument coindexed with a gap in an embedded adjunct clause. Fine. How does that make life any easier for the CCT? Surely, you are not going to tell me that the presence of a proleptic object indexed with an adjunct licenses "transitive v" on what is still the morphologically anticausative member of a morphologically-marked causative alternation.

  7. "Could it be the case that in some languages the child makes generalizations based on the number of DPs in a domain, while in other languages the child makes generalizations based on the relationship with a functional head? I see no reason why not"

    Yes, and furthermore I'm not sure the theoretical divide between "Case through Merge with a functional head" and "Case through mapping between multiple DP domains" is unbridgeable. After all, the number of DPs in a domain is a computational quantity. If Case-as-mapping holds, it has to be accessible to narrow syntax, and unless we appeal to something I am unaware of, the most likely explanation is that this quantity is computed through an interaction of some sort with a functional head. Conversely, if Case-as-Merge holds, then something must keep track of the Merge past narrow syntax, and you would thus expect dependency effects. Especially if one grants that the *absence* of co-valuation is a quantity that is accessible to narrow syntax, the two mechanisms thus appear to me more like two sides of the same coin than two opposite accounts.

  8. Why would it have to be "transitive v" that licenses a proleptic object in Sakha? Other languages use prepositions/postpositions, it could be an applicative head, etc etc. The point would be that it's no longer an instance of raising to accusative with the matrix being an unaccusative predicate. (Btw, I also object to the "help you" comment. In principle, I think that anything's possible as long as there's evidence to the child to posit it. But for any particular analysis, I still demand proper argumentation.)

    I think you should look at the Lithuanian data before speculating. (Milena's writing up the paper as we type!) There's no reason at all internal to the language to think that this is different from the regular object case; but saying that there's a head that assigns the case is exactly the point.

    I think you're exactly wrong that it's easy to think up case systems that aren't attested. The one example you provide is interesting in that it ignores hierarchy completely. Leaving that aside, what would it look like? So one case for agents and another for patients. Some more semantic-y cases. How many semantic distinctions do you want to make for subjects/objects? You'll need to set up the input properly so that you expect theta-role to be the generalization the child posits. Then, are we sure that this doesn't exist? (I'd look in the active-stative / split-S languages.) More generally, I'd like to remind us all that attested patterns are going to be determined by the generalizations posited by the child in the acquisition process, and by historical accident (both linguistic and non-linguistic, e.g. guns/germs).

  9. The "help you" was short for "help you maintain a CCT-compatible analysis" – and I still don't see how it does. Why does licensing of proleptic arguments by this applicative head depend on an adjunct clause with a gap in it? Look, to paraphrase something I once heard Ad Neeleman say (and which I imagine you already know): outside of mathematics, you seldom are in a position to refute a theory; all you can do is make the opposing theory have to contort itself in such a way that impartial observers will realize that the point of ridiculousness has been reached. The kind of maneuvers required to reconcile the CCT with the Sakha facts (or the well-known Icelandic ones, for that matter) strike me as having reached that point, if not well past it. YMMV.

    I'm very interested in looking at the Lithuanian data and Milena's analysis of it. But I'd like to remind us all that the expressive power of the CCT is a proper subset of that of the DCT, and so as long as the analysis there is phrased in terms of the CCT, it cannot, by definition, be an argument against the DCT. Only a non-contorted analysis of Sakha, Icelandic, Shipibo etc. in terms of the CCT can do that.

    As Baker & Bobaljik (in the Oxford Handbook of Ergativity) point out, not even active-stative / Split-S languages actually realize the "case X if and only if Agent" ideal, since even, say, Batsbi / Tsova Tush loses that correlation once one goes beyond intransitives. That no language realizes that ideal is telling. We have plenty of modifiers that are licensed exclusively in the presence of Agents, so children apparently have no problem acquiring those. Yet not so for a given case marking. And yes, most (all?) typological arguments are only as good as the "accidental history" retort, but I don't think that's a good reason not to take the typological landscape seriously.

  10. Indeed, proleptic objects often require an associated position lower in the sentence so that that part is "about" the proleptic object. That's a question about proleptic objects, not about Sakha case. Notice, however, that that's an assumption you're making about the language (or you have data I don't have). We aren't actually shown in the paper whether e.g. `Keskil became sad Aissen' meaning `Keskil became sad about Aissen' is grammatical in the language or not. Again, I don't know what the facts will turn out to be for this language; I'm pointing out that we don't know enough yet to call this raising to dependent accusative.

    Again, you haven't convinced me about the all-inherent case that "no language realizes that ideal", but even if no language does, that's certainly not enough evidence to force us to posit an innate case theory. Not even close.

  11. Notice that I was talking about the much narrower "case X if and only if DP is an Agent" ideal, and apparently even that doesn't exist, furnishing a clear contrast between the typology of case and that of, say, adverbs.

    As for "but even if no language does, that's certainly not enough evidence to force us to posit an innate case theory. Not even close." – I'm honestly not sure what this assertion is supposed to rest on. Something delivers the aforementioned distinction between case markers and adverbials, and it ain't guns and germs. Whatever delivers that, I'm gonna go ahead and call "case theory." You can assert that that's not enough evidence _for you_, but I don't see how that's an argument one way or another.

  12. "But I'd like to remind us all that the expressive power of the CCT is proper subset of that of the DCT" -- only for those like you who have decided that you can take traditional dependent case theory and add traditional case assignment under c-command by functional heads, and still call that dependent case theory. At that point, we agree that anything goes, and it's just confusing nomenclature.

  13. At this point, I think the discussion is becoming less useful. We've made our positions clear; I'm dropping out. Time for someone else to chime in.

  14. What I found striking about Baker & Vinokurova's story about Sakha is that they assume the need of a unified rule for Sakha Accusative case, and use this to argue for DCT. But at the same time, Baker (2015: 12) admits that the Sakha Dative case cannot be subsumed under a single rule – i.e. presumably that there are two distinct cases here. So why shouldn't Sakha have two distinct case features that happen to be exponed identically? One can be subsumed under CCT, the other under something else. Especially if one thinks that only merge is innate (as JAL does), this makes DCT dispensable.

    1. @Martin: If I understood you correctly, you are asserting that the demand for a unified treatment of (what is pre-theoretically called) "accusative" in Sakha plays a crucial role in B&V's argument, logically speaking. If this is indeed your assertion, I think you are mistaken.

      Suppose we grant that (what is pre-theoretically called) "accusative" in Sakha can be a disjunction of things. It is still the case that one of the things in this disjunction has the distribution that the DCT predicts (assigned to any noun phrase that has come into sufficiently close structural proximity with another caseless noun phrase), and that its competitors have to contort themselves to capture. See, in particular, the much discussed raising-to-"accusative" data. Whether that thing is indeed the same as the "accusative" one finds on direct objects in simple, monoclausal transitives in Sakha is a separate matter – but, lo and behold, the behavior of the latter is also exactly what the DCT predicts. (In particular, as it regards the correlation between specificity of the direct object, its position relative to low adverbs, and its case morphology.) So now, _given these facts_, there is an argument that (what is pre-theoretically called) "accusative" in Sakha is a unified category. But that was not a premise of the argument for the need for the DCT.

  15. @Omer: I still have trouble understanding whether the DCT "predicts" anything or rather "lets us expect" some things (predicting would seem to imply excluding certain logical possibilities). Now I surely agree that the CCT has to "contort itself" to capture some of these things, but this then merely means that we have to give up the more restrictive view for a less restrictive view (which, in Baker's view at least, allows both "classical" and dependent case assignment, plus perhaps more). More generally, I'm wondering whether one can have both: a restrictive theory AND elegant accounts of language-particular facts couched in terms of the representations allowed by the theory. (My own view is closer to JAL's: only merge is innate, and restrictiveness may come from diverse sources, including functional adaptation.)

  16. This comment has been removed by the author.

  17. @Martin: I 100% agree with you that the DCT is less restrictive than the CCT. In fact I have argued that any serious version of the DCT is a proper superset, in terms of its expressive power, of the CCT (see here). Is it so restrictive as to exclude any logically possible patterns? It depends on the level of description you are talking about. Knowing a little bit about your own work, I imagine that you mean something relatively surface-oriented by "pattern"; if so, I think the answer might be "no." But I think that particular answer was already "no" even on the CCT since, given the premises of the latter, you could always posit a phonologically-null functional head that comes and goes with the observed morphology. The DCT is certainly in no worse shape than that. However, I would caution against jumping from this to the conclusion that there are no predictions to be had here. The predictions are about what the 'predicates' are that are available to the case calculus. So, _c-commanded by another caseless noun phrase_ is an available predicate, as is _c-commanded by a designated functional head_; but, say, _assigned thematic role X_ is predicted to be unavailable as a relevant predicate. And as Baker & Bobaljik show, this seems to be precisely the landscape that one finds. (See my discussion above, with JAL, about the conspicuous under-attestation of the "case X if-and-only-if Agent role" pattern; even languages that have superficially been described in these terms seem not really to behave that way upon closer examination.)

    Depending on your scientific taste, these may not be the kind of predictions that you're interested in, which is fair enough. I find them interesting. And I'm also not denying that even if the typological claims hold, grammar is but one possible source for them. As usual in the sciences, if a fleshed-out counterproposal is presented, we could then weigh the relative pros and cons of each.

  18. I'm interested in all kinds of predictions that can be tested not only in principle, but also in practice, e.g. with an ERC grant of €2.5m (of the sort that Ian Roberts and Giuseppe Longobardi once got). Could one test DCT if one had a lot of money? Because of the many "moving parts" (mentioned by Tim Hunter), that is not so clear to me (and as you note, CCT was not any better, because it had the same amount of moving parts). But I'm glad that you're interested in explaining gaps in the typological distribution, such as the apparent absence of languages with pure inherent case. I think this is probably functionally motivated as well, but I can't go into that here. What is not clear to me though: Wouldn't DCT actually allow such languages?

    1. Not really, except as a fairly significant coincidence. The reason is as follows: thematic roles, properly construed, are semantic entities. Syntax does not traffic in them, except fairly indirectly. So we often talk (somewhat sloppily) about little v being "the assigner of Agent"; but the truth of the matter is that there are always instances of little v that assign other roles (Causer, and maybe even certain instances of Instrument and/or Experiencer). And similarly, nothing in the theory prevents the assignment of a semantically-indistinguishable role elsewhere in the structure. And so, even if little-v in some language assigns inherent case to the argument it introduces, this would not give rise to the "case X if-and-only-if Agent role" pattern, not unless a bunch of other coincidences happened in the grammar.

      And I'd take issue with your suggestion that moving parts preclude predictions. Given an observed (surface) case pattern, the premises of a particular case theory will generate particular entailments about what the underlying syntax of that language can and cannot look like. We then have other tools at our disposal (I'm sure you're familiar with them) to investigate whether the underlying syntax does or doesn't look in that predicted way. I suspect that, at this juncture, you might point out that there are also "moving parts" at these other junctures, and this is true. That is par for the course, I think, for any scientific endeavor that seeks to explain the causal forces behind a certain observed pattern, and, as such, I don't think it's fundamentally different from a functionalist explanation, which will also have its own moving parts. (Different kinds of parts, of course, which move in different kinds of ways; but moving parts nonetheless.) The two approaches cannot be distinguished on this front, I don't think, and instead must be distinguished in the usual ways (empirical success, explanatory oomph, conciseness, etc. etc.)

  19. You are right that there is always some movement of some of the parts, but ideally, we'd have a few somewhat settled parts after pursuing an approach for a while. It depends of course on one's personal expectations and hopes whether one remains optimistic. At the moment, it seems that linguists have settled on two types of things that everyone is fairly happy with: descriptive generalizations (as found in reference grammars and in the early parts of many generative papers), and general typological concepts such as "ergative", "SOV", "focus", or "personal pronoun". The latter are used all the time in generative work, even though the claim is often that they are not "ontologically uniform" (= do not correspond to the same entity at the level of presumed UG). Now if someone claimed that there is a functional explanation that does not need more than this general-typological level (that everyone understands, even if they don't all think that it is "real"), it seems to me there are fewer moving parts, and it's easier to refute such an explanation.

    1. This comment has been removed by the author.

    2. Hmm, I'm not sure I agree with you on some of this (e.g. how much we should value consensus among everyone who self-identifies as a "linguist" as a valid criterion; or what the right timeline is for expecting convergence on what the fundamental parts are; physics, it seems to me, is still waiting...). But I also doubt we will convince one another in the comment section of this blog, and I hope that at least having the conversation in a public forum like this might serve some greater good. (It's certainly useful for me.) Thanks for engaging.

  20. I find this almost painful to read. For me, as a non-native English speaker, "I expect him to kiss her" is clearly modelled on "I expect him".

    You always expect someone. Not a clause (unlike in my native language). And then you expect someone. To do something.

    I simply don't see what insights a theory like DCT can offer.

    Cases are weird. Cases merge in the history of languages. New cases get created. Cases get lost. Some verbs flip case use in their history (like the verb "like" in English). In some languages, some verbs can be used in more than one case configuration. And so on.

    1. "I expect him to kiss her" is an AcI, almost certainly calqued from the Latin one given its absence in German, which only allows "I expect that he will kiss her" and restricts the AcI to very few verbs like "see".