
Monday, June 25, 2018

Physics envy blues

These are rough days for those who suffer from Physics Envy (e.g. ME!!!!). It appears (at least if the fights in the popular press are any indication) that there is trouble in paradise and that physicists are beset with “a growing sense of unease” (see here, p. 1).[1] Why? Well, for a variety of reasons it seems. For some (e.g. BA) the problem is that the Standard Theory in QM (ST) is unbelievably well grounded empirically (making “predictions verified to within one-in-ten-billion chance of error” (p. 1)) and yet there are real questions that ST has no way of addressing (e.g. “Where does gravity come from? Why do matter particles always possess three, ever heavier copies, with peculiar patterns in their masses, and why does the universe contain more matter than antimatter?” (p. 1-2)). For others, the problem seems to be that physicists have become overly obsessed with (mathematical) beauty and this has resulted in endless navel gazing and a failure to empirically justify the new theory (see here).[2]

I have no idea if this is correct. However, it is not a little amusing to note that part of the problem appears to be that ST has just been too successful! No matter where we point the damn thing it gives the right empirical answer, up to 10 decimal places. What a pain!

And the other part of the problem is that it appears that the ways that theorists had hoped to do better (super-symmetry, string theory etc.) have not led to novel empirical results. 

In other words, the theory that we have that is very well grounded doesn’t answer questions to which we would love to have answers and the theories that provide potential answers to the questions we are interested in have little empirical backing. This hardly sounds like a crisis to me. Just the normal state of affairs when we have an excellent effective theory and are asking ever more fundamental questions. At any rate, it’s the kind of problem that I would love to see in linguistics.

However, this is not the way things are being perceived. Rather, the perception is that the old methods are running out of steam and that this requires new methods to replace them. I’d like to say a word or two about this.

First off, there seems to be general agreement that the strategy of unification in which “science strives to explain seemingly disparate ‘surface’ phenomena by identifying, theorizing and ultimately proving their shared ‘bedrock’ origin” (BA:2) has been wildly successful heretofore. Or as BA puts it in a more restrained manner, it “has yielded many notable discoveries.” So the problem is not with the reasonable hope that such theorizing might bear fruit but with the fact that the current crop of ambitious attempts at unification has not been similarly fecund. Again as BA puts it: “It looks like the centuries-long quest for top-down unification has stalled…” (BA:2). 

Say that this is so. What is the alternative? Gelman in discussing similar issues titles a recent post “When does the quest for beauty lead science astray?” (here). The answer should be, IMO, never. It is never a bad idea to look for beautiful theories because beauty is what we attribute to theories that explain (i.e. have explanatory oomph) and as that is what we want from the sciences it can never be misleading to look for beautiful theories. Never.

However, beauty is not the only thing we want from our theories. We want empirical coverage as well. To be glib, that’s part of what makes science different from painting. Theories need empirical grounding in addition to beauty. And sometimes you can increase a theory’s coverage at the expense of its looks and sometimes you can enhance its looks by narrowing its data coverage. All of this is old hat and if correct (and how could it be false really) then at any given time we want both theories that are pretty and also cover a lot of empirical ground. Indeed, a good part of what makes a theory pretty is how it covers the empirical ground. Let me explain.

SH identifies (here) three dimensions of theoretical beauty: Simplicity, Naturalness and Elegance (I have capitalized them here as they are the three Graces of scientific inquiry).[3]

Theories are simple when they can “be derived from a few assumptions.” The fewer axioms the better. Unification (showing that two things that appear different are actually underlyingly the same) is a/the standard way of inducing simplicity. Note that theories with fewer axioms that cover the same empirical ground will necessarily have more intricate theorems to cover this ground. So simpler theories will have greater deductive structure, a feature necessary for explanatory oomph. There is nothing more scientifically satisfying than getting something for nothing (or, more accurately, getting two (better still some high N) for the price of one). As such, it is perfectly reasonable to prize simpler theories and treat them as evident marks of beauty.[4]

Furthermore, when it comes to simplicity it is possible to glimpse what makes it such a virtue in a scientific context. Simpler theories not only have greater deductive structure, they are also better grounded empirically in the sense that the fewer the axioms, the more empirical weight each of them supports. You can give a Bayesian rendition of this truism, but it is intuitively evident. 

The second dimension of theoretical beauty is Naturalness. As SH points out, naturalness is an assumption about types of assumptions, not the number. This turns out to be a notion best unpacked in a particular scientific locale. So, for example, one reason Chomsky likes to mention “computational” properties when talking about FL is that Gs are computational systems so computational considerations should seem natural.[5] Breakthroughs come when we are able to import notions from one domain into another and make them natural. That is why a good chunk of Gallistel’s arguments against neural nets and for classical computational architectures amounts to arguing that classical CS notions should be imported into cog-neuro and that we should be looking to see how these primitives fit into our neuro picture of the brain. The argument with the connectionist/neural net types revolves around how natural a fit there is between the digital conception of the brain that comes from CS and the physical conception that we get from neuroscience. So, naturalness is a big deal, but it requires lots of local knowledge to get a grip. Natural needs an index. Or, to put this negatively, natural talk gets very windy unless grounded in a particular problem or domain of inquiry.

The last feature of beauty is Elegance. SH notes that this is the fuzziest of the three. It is closely associated with the “aha-effect” (i.e. explanatory oomph). It lives in the “unexpected” connections a theory can deliver (a classic case being Dirac’s discovery of anti-matter). SH notes that it is also closely connected to a theory’s “rigidity” (what I think is better described as brittleness). Elegant theories are not labile or complaisant. They stand their ground and break when bent. Indeed, it is in virtue of a theory’s rigidity/brittleness that it succeeds in having a rich deductive structure and explanatory oomph. Why so? Because the more brittle/rigid a theory is the less room it has for accommodating alternatives, and the fewer things a theory makes possible, the more it explains when what it allows as possible is discovered to be actual.

We see this in linguistics all the time. It is an excellent argument in favor of one analysis A over another B that A implies C (i.e. A and not-C are in contradiction) whereas B is merely consistent with C (i.e. B and not-C are consistent). A less rigid B is less explanatory than a more rigid A, and hence A is the superior explanatory account. Note that this reasoning makes sense only if one is looking at a theory’s explanatory features. By assumption, a more labile account can cover the same empirical ground as a more brittle one. The difference is in what they exclude, not what they cover. Sadly, IMO, the lack of appreciation of the difference between ‘covering the data’ and ‘explaining it’ often leads practitioners to favor “flexible” theories, when just the opposite should be the case. Favoring flexibility makes sense if one takes the primary virtue of a theory to be regimented description. It makes no sense at all if one’s aims lie with explanation.

SH’s observations make it clear (at least to me) why theoretical beauty is prized and why we should be pursuing theories that have it. However, I think that there is something missing from SH’s account (and, IMO, Gelman’s discussion of it (here)). It doesn’t bind the three theoretical Graces as explicitly to the notion of explanation as it should. I have tried to do this a little in the comments above, but let me say a bit more, or say the same things one more time in perhaps a slightly different way.

Science is more than listing facts. It trucks in explanations. Furthermore, explanations are tied to the why-questions that identify problems that explanations are in service of elucidating. Beauty is a necessary ingredient in any answer to a why-question but what counts as beautiful will heavily depend on what the particular question at issue is. What makes beauty hard to pin down is this problem/why-question relativity. We want simple theories, but not too simple. What is the measure? Well, stories just as complicated as required to answer the why-question at issue. Ditto with natural. Natural in one domain wrt one question might be unnatural in another wrt a different question. And of course the same is the case with brittle. Nobody wants a theory that is so brittle that it is immediately proven false. However, if one’s aim is explanation then beauty will be relevant and what counts as beautiful will be contestable and rightly contested. In fact, most theorizing is argument over how to interpret Simple, Natural and Elegant in the domain of inquiry. That’s what makes it so important to understand the subtleties of the core questions (e.g. in linguistics: Why linguistic creativity? Why linguistic promiscuity? How did FL/UG arise?). At the end of the day (and, IMO, also at sunrise, high noon and sunset) the value of one’s efforts must be judged against how well the core questions of the discipline have been elucidated and their core problems explained. And this is a messy and difficult business. There is no way around this.

Actually, this is wrong. There is a way around this. One can confuse science with technology and replace explanation with data regimentation and coverage. There is a lot of that going around nowadays in the guise of Big Data and Deep Learning (see here and here). Indeed some are calling for a revamped view of what the aim of science ought to be; simulation replacing explanation and embracing the obscurantism of overly complex uninterpretable “explanations.” For these kinds of views, theoretical beauty really is a sideshow. 

At any rate, let me end by reiterating the main point: beauty matters, it is always worth pursuing, it is question relative and cannot really ever be overdone. However, there is no guarantee that the questions we most want answered can be, or that the methods we have used successfully till now will continue to be fruitful. It seems that physicists feel that they are in a rut and that much of what they do is not getting traction. I know how they feel. But why expect otherwise? And what alternative is there to looking for beautiful accounts that cover reasonable amounts of data in interesting and enlightening ways? 


[1] This post is by Ben Allanach, a Cambridge University physicist. I will refer to this post as ‘BA.’
[2] Sabine Hossenfelder (SH) has written a new book on this topic, Lost in Math, that is getting a lot of play in the popular science blogosphere lately. Here is a strong recommendation from one of our own, Shravan Vasishth, who believes that “she deserves to be world famous” for her views. 
[3] The Three Graces were beauty/youth, mirth and elegance (see here). This is not a bad fit with our three. Learning from the ancients, it strikes me that adding “mirth” to the catalogue of theoretical virtues would be a good idea. We not only want beautiful theories, we should also aim for ones that delight and make one smile. Great theories are not somber. Rather they tickle one’s fancy and even occasionally generate a fat smile, an “ah damn” shake of the head and an overall good feeling. 
[4] SH contrasts this discussion of simplicity with Occam’s, claiming that the latter is the injunction to choose the simpler of two accounts that cover the same ground. The conception SH endorses is “absolute simplicity,” not a relative one. I frankly do not understand what this means. Unification makes sense as a strategy because it leads to a simpler theory relative to the non-unified one. Maybe what SH is pointing to is the absence of the ceteris paribus clause in the absolute notion of simplicity. If so, then SH is making the point we noted above: simpler theories might be empirically more challenged and in such circumstances Occam offers little guidance. This is correct. There is no magic formula for deciding what to opt for in such circumstances, which is why Occam is such a poor decision rule most of the time.
[5] See his discussion of Subjacency in On Wh Movement, for example.

Thursday, June 21, 2018

Sad news: a linguistically prominent primate passes

Tim Stowell just notified me that Koko the gorilla just died (here). The obit is a bit over the top. Koko did not really teach us much about language and her skills were vastly over-hyped. However, I have a very warm spot in my heart for her as I once cross-dressed and played her in a skit given at MIT for Chomsky's birthday. For about 5 minutes Koko debated Chomsky to a standstill (or that is the way Koko saw it, reports differ). This even let me get to know Koko from the inside as it were (I recommend inhabiting a Gorilla Suit if you ever get the chance). Elan Dresher played Chomsky and Amy Weinberg played Penny Patterson. The names were changed, of course, to protect the innocent. At any rate, she has passed. RIP.

Wednesday, June 20, 2018

Classical and dependent case

This is a long rambling post. It has been a way for me to clarify my own thoughts. Here goes.

Every year I co-teach the graduate syntax 2 course at UMD. It follows Howard Lasnik’s classic intro to syntax that takes students from LSLT/Syntactic Structures to roughly the early 1990s, just before the dawn of the Minimalist Program (MP) (personal note: it is perhaps the greatest intro class I have ever had the pleasure to sit in on). Syntax 2 takes the story forward starting with early MP and going up to more or less the present day. 

For most of the last 10 years, I have taught this course with someone else. The last several years I have had Omer Preminger as wingman (or maybe, more accurately, I have been his wingman) and it has been delightfully educational. He is a lot of fun to argue with. I am not sure that I have come out victorious even once (actually I am but my self-confidence doesn’t allow such a bald admission), but it has been educational and exhilarating. And because he is an excellent teacher, I learn at least as much as the other students do. Why do I mention this? It’s all in service of allowing me to discuss something that arose from Omer’s class discussion of case theory over the last several years and that I hope will provoke him to post something that will help me (and I suspect many others) get clear what the “other” theory of case, Dependent Case Theory (DCT), takes its core empirical and theoretical content to be. 

To help things along, I will outline one version of DCT, observe some obvious problems for that interpretation and note another version (spurred by Omer’s remarks in the Syntax 2 class). By way of contrast, I will outline what I take the core features of classical case theory (CCT) to be and, at the end, I will explain why I personally am still a partisan of CCT despite the many problems that beset it.[1] So here goes. I start with CCT as background theory.

CCT is primarily a theory of the distribution of lexical nominals (though what “counts” as lexical is stipulated (phonetically overt material always does, some traces might (A’-traces) and some traces never do (A-traces and maybe PRO))). The theory takes the core licensing relation to be of the X0-YP variety, some set of heads licensing the presence of some set of DPs. CCT, as Vergnaud conceived it, is intended, in the first instance, to unify/reduce the various filters that Chomsky and Lasnik (1977) proposed. The core assumptions are that (i) nominals require case and (ii) certain heads have the power to assign case. Together these two assumptions explain why we find nominals in positions where heads can case mark them. 

Overt morphological case marking patterns no doubt influenced Vergnaud. However, given that CCT was intended to apply to languages like English and French and Chinese where many/most nominals show no apparent case morphology, it was understood from the get-go that CCT’s conception of case was abstract, in the sense that it is present and regulates the distribution of lexical nominals even when it is morpho-phonologically invisible. This abstractness, and the fact that case lives on a head-XP relation are the two central characteristics of the CCT in what follows.

GB extended the empirical reach of CCT beyond the distribution of nominals by making a second assumption. Assume that abstract case can be morphologically realized. In particular, if it is assumed that in grammars/languages with richer morphological case, the core structural cases (nominative, accusative, etc.)[2] can be made visible, then in these languages case theory can explain why some expressions have the morpho-phonological shapes they do. Adding this codicil allows CCT to claim that case marked nominals look as they do in virtue of overtly expressing the abstract cases nominals bear universally.[3]

In sum, CCT covers two domains of data: the original explicandum is the distribution of lexical nominals, the second is the morpho-phonological forms found in richer case languages. Importantly, this second empirical domain requires an extra assumption concerning the relationship between abstract case and its morpho-phonological (henceforth ‘overt’) expression. Usefully, these two parts of the theory allow for the investigation of abstract case by tracking the properties of the overtly marked nominals in those languages where it can be seen (e.g. the pronominal system in English).

CCT has other relevant (ahem) subtleties. 

First, it primarily concerns structural case. This contrasts with inherent/lexical case, which is tied to particular thematic or semantic roles and is idiosyncratically tied to the specific lexical predicates assigning the roles.[4] For example, some languages, most famously Icelandic, have predicates that require that their nominal arguments have a particular overt case marking (dative or genitive or even accusative). 

CCT has almost always distinguished structural from lexical case, though the two bear some similarities. For example, both are products of head-XP relations (lexical case piggy backing on theta roles which, like case, are assigned by heads). That said, CCT has generally put lexical case to one side and has concentrated on the structural ones, those unrelated to any thematic function.[5]

Second, MP slightly revamped the CCT. The contemporary version takes structural case to always be a dependency between a functional head and an XP while lexical case, as the name implies, is always mediated by a lexical head. This differs from earlier GB accounts where nominative is assigned by a functional head (finite T) but accusative is assigned by a lexical V (or P) which also, typically (modulo ECM constructions), theta marks the nominal it case assigns. Thus, MP assimilates accusative to the nominative configuration and takes structural accusative, like nominative, to involve a non-local dependency between a functional head (one that does not theta mark the dependent) and a nominal that is not a complement. The relevant functional head for accusative is Agr or some flavor of v. What’s crucial is that in MP accounts the case assigner is always significantly “higher” in the phrase marker and so the dependency must be mediated by AGREE or I-Merge (aka, Move). The most interesting consequence of this amendment to the CCT is that case should affect scope.[6] Surprisingly, there is rather good evidence that it does. As Lasnik and Saito showed (reanalyzing data from Postal), case dependency affects the domain of a nominal’s scope.[7] The dependency between case and scope is one of the deeper discoveries of MP in my opinion, though I will not pursue that claim here.[8]

That’s the sum of what is relevant about CCT for this post. There are well-known problems (wager verbs being the most conspicuous (see the discussion section of the earlier post for elucidation)) but IMO the main virtue of CCT is that it combines quite smoothly with a Merge based conception of FL/UG. I will say something about this at the end, but first it’s time to look at the DCT.

First off, the DCT is actually not that new. A version was mooted by the Brandeis Bunch of Yip, Maling and Jackendoff in the mid 1980s.[9] However, it didn’t catch on much until Alec Marantz revived a version in 1991. Since then, I would wager that the DCT has become the “standard” approach to morphological case, though, as we shall see, what it claims exactly has been (at least to me) elusive.

Here are some points that are fairly clear. 

First, it is exclusively a theory of overt case marking. It has no intention of accounting for nominal distributions. So, DCT is a theory of what CCT takes to be an add-on.[10] This is not in itself a problem for either account, but it is worth noting (yet again). 

So, DCT is a theory of the overtly witnessed case values (or at least it is so in Marantz and Bobaljik and most other advocates (though see Omer’s take below)). The standard version thus aims to explain the distribution of the overt morpho-phonological case marking found on nominals (in many, though not all, languages). An important virtue is that it extends to both nominative-accusative case marking and ergative-absolutive case marking systems. Consequently, investigations of DCT have enriched our appreciation of the complexity of case marking patterns cross linguistically.

A second key feature: the standard version of DCT (e.g. Marantz, Bobaljik) treats case marking as a non-syntactic process. More concretely, it is part of post syntactic processing. Omer and others have argued against this conception. But as originally mooted, DCT operations are outside the syntax.

Third, DCT’s core idea is that overt case marking is an inter-nominal dependency. Case marking reflects a relation between nominals within a specified domain. It is not primarily a head-nominal relation. So, for example, accusative is what one finds on DPs that have another non-case valued nominal higher up in the same domain. Overt case, then, is an XP-YP relation (not an X0-YP relation) with the case value of the dependent nominal stemming from the presence of another non-dependent non-valued nominal in the same domain. The advertising that generally accompanies DCT stresses that for DCT (in contrast to CCT) case is not abstract, but very concrete in that it concerns itself with the overt case that we can hear/see on nominals, not some abstract feature that only sometimes overtly surfaces. The strongest version of this idea is that the purview of the DCT is the full panoply of morpho-phonologically realized cases. However, this is very clearly too strong. Let me explain.

First, like CCT, DCT agrees that there is an important distinction between inherent/lexical case that is semantically/thematically restricted and structural case that isn’t. Like CCT, it takes the former to be effectively an X0-YP dependency with X0 being a theta role assigner. As lexical case is a by-product of theta assignment, this supposition, like the one made in CCT, is not unreasonable. The important point is that like CCT, DCT focuses on a subset of the cases overtly expressed. So, like CCT, the dependent part of the DCT is at best a theory of overt structural case.

A strong version of a DCT approach to structural case would have it that all dependent case values in a language are the product of a DP-DP relation holding within a circumscribed domain. So, for example, if accusative is the realization of dependent case in language L then modulo some quirky lexical case instances we would find accusative case on dependent DPs alone (nominative in L being what we find on otherwise non-valued non-dependent cases). This would be a strong theory. Unfortunately, it looks to be false.

In particular, we know that there exist languages in which overt structural accusative case occurs on DPs in structures where the DP that carries it cannot be a dependent case. For example, the accusative we find in for-infinitives (For her to leave would be terrible) or the one we find in acc-ing gerunds (her kissing him would be terrible) or in languages like Latin which have acc-infinitive constructions. In all these examples, the overt case of the DP looks like the case we find on the object of transitive verbs, which is the canonical dependent case value. 

Moreover, these are clearly structural cases as they are in no way thematically restricted. And it is clear the accusative we find here is not dependent on any other DP in its domain (we can always find them, for example, on the sole argument of an unaccusative in these settings). So there exists a class of morpho-phonologically overt accusatives that are not overtly marked accusative in virtue of being dependent. What then marks them overtly accusative? Well, the obvious answer is some local head (for or the head of the gerund/infinitive). Thus, DCT must allow that there are two ways of assigning accusative values, only one of which reflects its “dependent” status. Or to put this less charitably, DCT presupposes that the core case marking relation identified by CCT also plays a role in accounting for overt structural case. What DCT denies, then, is that all accusatives are marked by heads. Some are dependent cases, some are marked by heads. On this construal of the DCT, it appears that CCT is not so much wrong as incomplete. 

Moreover, it appears that within DCT case is no less abstract than it is in CCT as there is, at best, an indirect relation between a specific overt value (e.g. accusative) and being a dependent or a non-dependent case.

Nor is this problem restricted to subjects, or so I have been reliably told. Here’s what I mean. I gave a paper at the German LSA this past spring and one of the perks of doing so was hearing Julie Legate give a paper on argument structure and passives. She provided scads of data that showed that in languages that systematically violate Burzio’s generalization we can nonetheless find “objects” that bear accusative case. In other words, there exist many nom-acc languages where we find accusative case sitting on DPs in object positions in sentences where there is every reason to believe that there are no additional arguments (i.e. 1-place predicates where the internal argument is marked accusative). If this is right, then we cannot analyze the accusative as an instance of dependent case and must, instead, trace the witnessed value to some head that provides it (a CCTer would suggest v). 

So, neither subjects nor objects in general can be assumed to bear a dependent case value in virtue of being a dependent case. The generalization seems to be not that DCT is a theory of overt morpho-phonological structural case in general, but only of overt structural case that we find most typically in transitive constructions (ones where Burzio’s generalization holds). In other words, the purview of the theory seems, at least to me, much narrower and different than I thought it was from reading the general advertising.

Let me repeat an important point that Omer repeatedly made in class: there is a good sense in which DCT is no less abstract than CCT. Why? Because we cannot go from a witnessed case value to the conclusion that the DP that carries it is dependent. Rather we can only assume that if a DP is dependent it will carry the dependent case value, though what that overt value is remains a free parameter (so far as I can tell).[11]

So, in effect, DCT is a theory of abstract case that defines a mapping between some DPs in multiple DP domains (the dependent ones) and some case values (e.g. acc in nom-acc Gs). At best, the theory explains what values one finds in transitive clauses (ones with at least two nominals) modulo some specification of the relevant domains.[12]

This last point is worth emphasizing. For illustration consider a sentence like (1):

(1)  DP1 V [IP DP2 Infinitive V DP3]

What case should we expect here? Well, DP3 should get dependent case if DP2 is in its domain and DP2 is unvalued (at least at the point that DP3’s value is determined). So, if this is a nom-acc language, we should expect to find DP3 bearing acc case. What of DP2? This will depend on whether DP1 is in the same domain as DP2. If it is, then if DP1 has no case value then DP2 will also bear acc case. In effect what we will have here is an instance of what we find in English ECMs. DP1 will surface with nominative case. 

However, what if DP2 is not in the same domain as DP1? If it isn’t then it would get assigned nominative case. What we would expect to find is something that does not arise in English, the sentential analogue of He expects he to kiss her. So, to block this we need a story that forces DP2 to be in the same domain as DP1, maybe by forcing it to move. What forces this? So far as I know, nothing. We can, for example, stipulate that it is in the same domain or that it move (e.g. by putting some EPP feature on a functional (phase) head in the higher clause), but there is nothing that forces either conclusion. So this would have to be an extra feature. Recall that DCT is not a theory of DP distribution. There is no analogue of the case filter that forces movement. And even if there were, DP2 could be assigned nominative case if it did not move. So why must DP2 bear acc case? Because we are assuming it moves or because the domain of case assignment is the whole clause and case is assigned bottom up or….[13] All of these are free parameters in DCT. At any rate, there is nothing in the theory that in principle prohibits nominative case in ECM constructions. It all depends on how the relevant domains are constructed.[14]
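For readers who like to see such things run, here is a toy sketch of the reasoning about (1). It is my own illustration, not anything proposed in the DCT literature: the function name assign_case, the representation of a domain as a list of DP labels ordered from highest to lowest, and the bottom-up order of evaluation are all assumptions made for the example. It assumes a nom-acc language in which a DP receives dependent (acc) case when some higher DP in the same domain is still unvalued, and everything left over gets unmarked (nom) case.

```python
def assign_case(domains, lexical=None, dependent="acc", unmarked="nom"):
    """Toy dependent-case assignment (hypothetical helper, nom-acc setting).

    domains: list of case domains, outermost first; each domain is a list
             of DP labels ordered from structurally highest to lowest.
             A DP may sit in more than one domain (e.g. at a domain edge).
    lexical: optional dict of DPs that already carry lexical/inherent case.
    """
    case = dict(lexical or {})
    # Assumption: case is computed bottom up, innermost domain first.
    for domain in reversed(domains):
        for i in range(len(domain) - 1, -1, -1):
            dp = domain[i]
            if dp in case:
                continue  # already valued, lexically or in a lower domain
            # Dependent case: some higher DP in this domain is still unvalued.
            if any(higher not in case for higher in domain[:i]):
                case[dp] = dependent
    # Whatever remains unvalued surfaces with unmarked case.
    for domain in domains:
        for dp in domain:
            case.setdefault(dp, unmarked)
    return case


# Three ways of carving up the domains for the structure in (1).
whole_clause = assign_case([["DP1", "DP2", "DP3"]])
dp2_at_edge = assign_case([["DP1", "DP2"], ["DP2", "DP3"]])
separate = assign_case([["DP1"], ["DP2", "DP3"]])
```

With the whole clause as one domain, or with DP2 placed at the edge so that it also sits in the higher domain, DP2 comes out acc, as in English ECM. With two disjoint domains DP2 comes out nom, the analogue of He expects he to kiss her. Nothing in the procedure itself chooses among these settings, which is the point made above: the outcome depends entirely on how the domains are stipulated.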

There are other questions as well. For example, given that we can get the same surface forms either via an X0-DP relation or an XP-YP relation, learnability issues will most likely arise. Whenever we can do things in various ways then in any given circumstance the child will have to choose, presumably based on PLD. So there are potential learnability issues with this kind of mixed theory.

In addition, there are theoretical puzzles: why do the very same case values surface in these very different ways? Are there languages that distinguish the class of morphological cases assigned by CCT mechanisms from those assigned by DCT mechanisms? If there aren’t, why not?[15]

None of these questions are unanswerable (I would assume), but it did surprise me how much narrower the scope of the DCT was than I had supposed it to be before Omer walked us through the details. It is basically a theory of the distribution of overt case in transitive clauses. It is not a general theory of overt case morphology (as sometimes advertised), or at least the distinctive features are not. From where I sit, it’s a pretty modest part of UG even if correct and it leaves large chunks of CCT potentially intact.

Given all of this, I would argue that the empirical advantages of the DCT had better be pretty evident if we are to add it to CCT. Here is what I mean. Given that it presupposes mechanisms and relations very like those found in CCT, the argument that we need DCT ones in addition to those of CCT must show either that CCT cannot handle the data that DCT addresses or that DCT handles it in a far superior manner. The arguments I have seen fall into the second disjunct. It is actually quite hard to argue that no version of CCT could accommodate the data (after all, there is always a possible (ad hoc) head around to do the CCT dirty work). So, the argument must be that even though CCT can do it, it does not do it nicely or elegantly. But given that adding the DCT to the required mechanics of CCT creates a theoretically more redundant account, the superiority of DCT accounts had better be overwhelming to pay the cost of such theoretical enrichment.

As a personal matter, I have not been convinced that the cost is worth it, but others will no doubt differ in their judgments. What seems clear to me, however, is that given that DCT requires something like CCT to get some of the case values right, even a cumbersome CCT account might not be theoretically worse off than a DCT account that incorporates a CCT component.

That said, none of this is why I favor the CCT accounts. The main reason is that I understand how a CCT could be incorporated into a minimalist Merge-based account of FL/UG. Here’s what I mean.

There is a version of MP that I am very fond of. It adopts the following Fundamental Principle of Grammar (FPG):

FPG: For expressions A and B to grammatically interact, A and B must merge

In other words, the only way for expressions to grammatically interact is via Merge. They locally interact under E-merge and non-locally under I-merge (which, as you all know, is one and the same operation). Thus, e.g. selection or subcategorization or theta marking is under E-merge and antecedence or control or movement is under I-merge. I like the FPG. It’s simple. It even strikes me as natural as it enshrines the importance of constituency at its very core.

What’s important here is that it easily accommodates the CCT. A head A case marks a nominal B only if A and B Merge. In MP accounts, this would be I-merge. 

In contrast, I don’t see how to make DCT fit with this picture. The inter-nominal dependency that marks the most fundamental relation is not plausibly a chain relation formed under I-merge.  So, it is not a possible legit relation at all if we buy into something like the FPG. As I have so purchased, the DCT would induce in me a strong sense of caveat emptor.

I know, I know; one theorist’s modus ponens is another’s modus tollens. But if you like the FPG (which I do) then you won’t like replacing CCT with DCT (which I don’t).

Ok, enough. There are many reasons to get off this long post at many points. The real intent here has been to provoke DCTers (Omer!) to specify what they take the theory to be about and whether they agree that it is quite abstract and that it requires something like CCT as a sub-part. And if this is so, why we should be happy about it. So, I hope this provoked, and thx for the course Omer. It was terrifically enlightening.



[1]I’ve reviewed CCT before (here) so some of this will be redundant. That’s the nice thing about blog posts. They can be. We would never see anything similar in a refereed article where all contents are novel and nobody ever chews their food twice.
[2]Actually what goes into the “etc” is not set. Is dative structural? It can be. Genitive? Can be, but need not be. The clean ones in nom-acc languages are nominative and accusative, though even these can come from another source as we shall presently note.
[3]This does not require assuming that FL have but one morpho-phono realization scheme. There is room for an erg-abs as well as a nom-acc schema. But as these matters are way beyond my competence, I leave them with this briefest mention.
[4]I frankly have never fully grasped the distinction (if there is one) between inherent and lexical case. However, what is critical here is that it is semantically restricted, being the overt manifestation of a semantic role/value. This is decidedly not what we see with structural case, which seems in no way restricted thematically (or semantically for that matter).
[5]FWIW, IMO, structural case (both abstract and overt) is the really weird one. Were all cases lexical there would be nothing odd about them. They would just be the outward form of a semantic dependency. But this is precisely what structural case is not. And so why nominals bear or need them is quite unclear. At least it is to me. In fact, I would call it one of the great MP conundra: why case?
[6]This consequence is obvious if one assumes that case is discharged under I-merge. It is less obvious if case is discharged via Agree. The original Lasnik and Saito analysis assumed the former.
[7]We find similar data in many other languages (e.g. Japanese) where we can predict a DP’s scope properties from the case values it bears. 
[8]Interestingly, this data has been largely ignored by many mainstream minimalists for it remains inexplicable in theories where features “lower” fromto V and case is executed via Agree rather than I-merge.
[9]At the time, they were all Brandeis faculty.
[10]And as such, DCT should be in principle compatible with CCT. However, though this should be possible, my impression is that devotees of the DCT have been disinclined to treat the two in this way. I do not know why.
[11]It is not even clear to me that one can conclude that all dependent cases will realize a single value. This does not follow from the theory so far as I can tell.
[12]Of course, not all clauses with two nominals include a subject, but these are niceties that I don’t want to get into here.
[13]Note, if we could treat the relevant domain as the whole clause, then there would be no need for DP2 to move at all and the correlation between scope and case that Lasnik and Saito discussed would theoretically evaporate.
[14]Note that the acc feature in ECM in CCT is a product of the particular head we have. The analogue of this stipulation in DCT is the size of the relevant domain. One might try to get around this theoretical clunkiness by assuming the relevant domains are phases (though deciding what is and isn’t a phase is no small theoretical feat nowadays either). But what matters is that the embedded clause is not phase-like in (1). Were it such, the problem would reappear. By breaking the link between a theory of the distribution of DPs and their values, DCT has little to say about such cases except that something else must determine when the two subjects are in the same local domain relevant for determining dependent case.
[15]E.g., there are lexical cases (e.g. instrumental, comitative, ablative) that seem to be exclusively lexical. Are there any dependent cases that are exclusively dependent?
            Note that this is theoretically similar to an observation standardly made wrt resumptive pronouns. Why do they always assume the shape of “regular” bound pronouns? One answer is that they are bound pronouns, i.e. that there is no separate class of resumptives, precisely because if there were we might expect them to have a distinctive morphology somewhere. The absence of a distinction argues for an absence of the category. Similar logic applies in the domain of dependent case vs head assigned case.