
Wednesday, June 15, 2016

Case & agreement: beware of prevailing wisdom

Someone recently told me a (possibly apocryphal) story about the inimitable Mark Baker. The story involves Mark giving a plenary lecture somewhere on the topic of case. To open the lecture, possibly-apocryphal-Mark says something along the following lines:
Those of you who don't work on case probably have in your heads some rough sketch of how case works. (e.g. Agree in person/number/gender between a designated head and a noun phrase, resulting in that noun phrase being case-marked.) What you need to realize is that basically nobody who actually works on case believes that this is how case works.
Now, whether or not this is really how it all went down, possibly-apocryphal-Mark has a point. In fact, I'm here to tell you that his point holds not only of case, but of agreement, too.

In one sense, this situation is probably not all that unique to case & agreement. I'm sure presuppositions and focus alternatives don't actually work the way that I (whose education on these matters stopped at the introductory stage) think they work, either. The thing is, no less than the entire feature calculus of minimalist syntax is built on this purported model of case & agreement. [If you don't believe me, go read "The Minimalist Program" again; you'll find that things like the interpretable-uninterpretable distinction are founded on the (supposed) behavior of person/number/gender and case (277ff.).] And it is a model of case & agreement that – to repeat – simply doesn't work.

So what model am I talking about? I'm really talking about a pair of intertwined theories of case and of agreement, which work roughly as follows:
  1. there is a Case Filter, and it is implemented through feature-checking: each noun phrase is born with a case feature that, were it to reach the interfaces (PF/LF) unchecked, would cause ungrammaticality (a.k.a., a "crash"); this feature is checked when the noun phrase enters into an agreement relation with an appropriate functional head (T0, v0, etc.), and only if this agreement relation involves the full set of nominal phi features (person, number, gender)
  2. agreement is also based on feature-checking: the aforementioned functional heads (T0, v0, etc.) carry "uninterpretable person/number/gender features"; if these reach the interfaces (PF/LF) unchecked, the result is – you guessed it – ungrammaticality (a.k.a., a "crash"); these uninterpretable features get checked when they are overwritten with the valued person/number/gender features found on the noun phrase
Thus, on this view, case & agreement live in something of a happy symbiosis: agreement between a functional head and a noun phrase serves to check what would otherwise be ungrammaticality-causing features on both elements.
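To make the moving parts of (1)-(2) concrete, here is a minimal sketch in Python (all names hypothetical; this is the textbook checking calculus just described, not anyone's actual implementation): a derivation "crashes" iff some case feature on a noun phrase, or some uninterpretable phi feature on a functional head, reaches the interfaces unchecked, and a single full phi-Agree step checks both at once.

```python
# A toy rendering of (1)-(2); names and representations are hypothetical.
from dataclasses import dataclass

@dataclass
class NounPhrase:
    person: int
    number: str                  # 'sg' or 'pl'
    gender: str
    case_checked: bool = False   # (1): born with an unchecked case feature

@dataclass
class FunctionalHead:            # T0, v0, ...
    label: str
    uphi_checked: bool = False   # (2): uninterpretable person/number/gender

def agree(head: FunctionalHead, dp: NounPhrase) -> None:
    """Full phi-Agree: values the head's uphi and checks the DP's case."""
    head.uphi_checked = True     # uphi overwritten by the DP's valued phi
    dp.case_checked = True       # Case Filter satisfied via full phi-Agree

def crashes(heads, dps) -> bool:
    """(1)+(2): any unchecked feature at PF/LF means ungrammaticality."""
    return any(not h.uphi_checked for h in heads) or \
           any(not d.case_checked for d in dps)

# The "happy symbiosis": one Agree step rescues both elements at once.
T = FunctionalHead('T0')
subj = NounPhrase(person=3, number='sg', gender='f')
agree(T, subj)
assert not crashes([T], [subj])
```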

From the vantage point of 2016, however, I think it is quite safe to say that none of this is right. And, in fact, even the Abstractness Gambit (the idea that (1) and (2) are operative in the syntax, but morphology obscures their effects) cannot save this theory.

What follows builds heavily on some of my own work (though far from exclusively so; some of the giants whose shoulders I am standing on include Marantz, Rezac, Bobaljik, and definitely-not-apocryphal Mark Baker) – and so I apologize in advance if some of this comes across as self-promoting.

––––––––––––––––––––

Let's start with (1). Absolutive (=ABS) is a structural case, but there are ABS noun phrases that could not possibly have been agreed with, living happily in grammatical Basque sentences. How do we know they could not possibly have been agreed with (not even "abstractly")? Because we know that (non-clitic-doubled) dative arguments in Basque block agreement with a lower ABS noun phrase, and we can look specifically at ABS arguments that have a dative coargument. (Indeed, when the dative coargument is removed or clitic-doubled, morphologically overt agreement with the ABS – impossible in the presence of the dative coargument – becomes possible.)

So if an ABS noun phrase in Basque has a dative coargument, we know that this ABS noun phrase could not have been targeted for agreement by a head like v0 or T0 (because they are higher than the dative coargument). Notice that this rules out agreement with these heads regardless of whether that supposed agreement is overt or not; it is a matter of structural height, coupled with minimality. The distribution of overt agreement here serves only to confirm what our structural analysis already leads us to expect.
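For concreteness, here is a compact sketch of that structural argument, under the assumptions stated above (that v0/T0 probe downward and that the dative sits above the ABS argument); the labels are purely illustrative:

```python
def probe(ccommand_domain):
    """Minimality: Agree targets the closest DP, overt or 'abstract'."""
    return ccommand_domain[0] if ccommand_domain else None

# Dative coargument present: it is closer, so the ABS is never agreed with...
assert probe(['DAT', 'ABS']) == 'DAT'
# ...and yet the sentence, ABS argument included, is perfectly grammatical.

# Remove (or clitic-double away) the dative, and the ABS argument becomes
# the closest DP; agreement with it is now possible, and indeed overt.
assert probe(['ABS']) == 'ABS'
```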

And yet despite the fact that it could not have been targeted for agreement, there is our ABS noun phrase, living its life, Case Filter be damned. [For the curious, note that this is crucially different from seemingly similar Icelandic facts, which Bobaljik (2008) suggests might be handled in terms of restructuring. That is because whether the embedded predicate is ditransitive (=has a dative argument) or monotransitive (=lacks one) cannot, to the best of my knowledge, affect the restructuring possibilities of the embedding predicate one bit.]

If you would like to read more about this, see my 2011 paper in NLLT, in particular pp. 929 onward. (That paper builds on the analysis of the relevant Basque constructions that was in my 2009 LI paper, so if you have questions about the analysis itself, that's the place to look.)

––––––––––––––––––––

Moving on to (2): it, too, is demonstrably false. This can be shown using data from the K'ichean languages (a branch of Mayan). These languages have a construction in which the verb agrees either with the subject or with the object, depending on which of the two bears marked features. So, for example, Subj:3sg+Obj:3pl will yield the same agreement marking (3pl) as Subj:3pl+Obj:3sg will. It is relatively straightforward to show that this is not an instance of Multiple Agree (i.e., the verb does not "agree with both arguments"), but rather an instance of the agreeing head looking only for marked features, and skipping constituents that don't bear the features it is looking for. Just as an interrogative C0 will skip a non-[wh] subject to target a [wh] object, the verb in this construction will skip a [sg] (i.e., non-[pl]) subject to target a [pl] object.

This teaches us that 3sg noun phrases are not viable targets for the relevant head in K'ichean. Ah, but now you might ask: "What if both the subject and the object are 3sg?" The facts are that such a configuration is (unsurprisingly) fine, and an agreement form which is glossed as "3sg" shows up in this case (so to speak; it is actually phonologically null). That's all well and good; but what happened to the unchecked uninterpretable person/number/gender features on the head? Remember, they couldn't have been checked, because everything is now 3sg. And if 3sg things were viable targets for this head, then you could get "3sg" agreement in a Subj:3sg+Obj:3pl configuration, too – by simply targeting the subject – but in actuality, you can't. [This line of reasoning is resistant even to the "but what about null expletives?" gambit: if the uninterpretable phi features on the head were checked by a null expletive, then either the expletive is formally plural or formally singular. If it is singular, then we already know it could not have been a viable target for this head; if it is plural, and it has been targeted for agreement, then we predict plural agreement morphology, contrary to fact. Thus, alternatives based on a null expletive do not work here.]
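Here is a schematic sketch of the relativized-probing pattern just described (the feature representations are hypothetical), including the crucial all-3sg configuration, where the failed search yields default morphology rather than a crash:

```python
def agree_with_marked(args, marked='pl'):
    """Return the first argument bearing the marked feature, else None."""
    for dp in args:              # the subject is searched before the object
        if dp['num'] == marked:
            return dp
    return None                  # no viable goal: failure, not a crash

def spell_out(goal):
    return '3pl' if goal else '3sg'   # '3sg' = the null default exponent

# Subj:3sg+Obj:3pl and Subj:3pl+Obj:3sg both yield 3pl marking:
assert spell_out(agree_with_marked([{'num': 'sg'}, {'num': 'pl'}])) == '3pl'
assert spell_out(agree_with_marked([{'num': 'pl'}, {'num': 'sg'}])) == '3pl'

# Subj:3sg+Obj:3sg: the probe finds nothing, and the sentence is fine
# anyway -- the unvalued probe is simply spelled out as the default.
assert spell_out(agree_with_marked([{'num': 'sg'}, {'num': 'sg'}])) == '3sg'
```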

What about Last Resort? It is entirely possible that the grammar has an operation that swoops in should any "uninterpretable features" have made it to the interface unchecked, and deletes the offending features. But now ask yourself this: what prevents this operation from swooping in and deleting the features on the head even when there was a viable agreement target there for the taking (e.g. a 3pl nominal)? That is, why can't you gratuitously fail to agree with an available target, and just have the Last Resort operation take care of your unchecked features later? The only possible answer is that the grammar "knows that this would be cheating": it makes sure the Last Resort is just that, a last resort, by keeping track of whether you could have agreed with a nominal; only if you couldn't have are you then eligible for the deletion of offending features. Put another way, the compulsion to agree with an available target is not reducible to the state of the relevant features once they reach the interfaces; it is obligatory independently of such considerations. You see where this is going: if this bookkeeping / independent obligatoriness is in place anyway, uninterpretable features become 100% redundant. They bear exactly none of the empirical burden (i.e., there is not a single derivation in the entire grammar that would be ruled out by unchecked features, rather than by an illicit application of the Last Resort operation).
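The redundancy argument can be put in quasi-algorithmic terms. In the toy calculus below (a sketch of the reasoning above, not anyone's published proposal), grammaticality is fully determined by the obligatoriness condition, and an interface crash condition for unchecked features would never get to rule anything out:

```python
def viable_goal_available(args):
    """Is there a marked ([pl]) nominal the probe could have targeted?"""
    return any(dp['num'] == 'pl' for dp in args)

def grammatical(args, probe_agreed):
    # Obligatoriness: you may fail to agree only if you couldn't have.
    return probe_agreed or not viable_goal_available(args)

# Gratuitously failing to agree with an available 3pl goal is out...
assert not grammatical([{'num': 'sg'}, {'num': 'pl'}], probe_agreed=False)
# ...while failed agreement in an all-3sg clause is fine. Note that no
# appeal to unchecked "uninterpretable features" is needed anywhere.
assert grammatical([{'num': 'sg'}, {'num': 'sg'}], probe_agreed=False)
```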

Bottom line: there is no grammatical device of any efficacy that corresponds to this notion of "uninterpretable person/number/gender feature."

––––––––––––––––––––

At this juncture, you might wonder what, exactly, I'm proposing in lieu of (1-2). The really, really short version is this: agreement and case are transformations, in the sense that they are obligatory when their structural description is met, and irrelevant otherwise. (Retro, ain't it?) To see what I mean, and how this solves the problems associated with (1) and (2), I'm afraid you'll have to read some of my published work. In particular, chapters 5, 8, and 9 of my 2014 book. Again, sorry for the self-promotional nature of this.

––––––––––––––––––––

Epilogue:

Every practicing linguist has, in their head, a "toy theory" of various phenomena that are not that linguist's primary focus. This is natural and probably necessary, because no one can be an expert in everything. The difference, when it comes to case and especially when it comes to agreement, is that these phenomena have been (implicitly or explicitly, rightly or wrongly) taken as the exemplar of feature interaction in grammar. And so other members of the field have (implicitly or explicitly) taken this toy theory of case & agreement as a model of how their own feature systems should work.

And lest you think I have constructed a straw-man, let me end with an example. If you follow my own work, you know that I have been involved in a debate or two recently where my position has amounted to "such and such phenomenon X is not reducible to the same mechanism that underlies agreement in person/number/gender." What strikes me about these debates is the following: if A is the mechanism that underlies agreement, these (attempted) reductions are not reductions-to-A at all; they are reductions-to-the-LING-101-version-of-A (e.g. Chomsky's Agree), which – to paraphrase possibly-apocryphal-Mark – nobody who works on agreement thinks (or, at least, nobody who works on agreement should think) is a viable theory of agreement.

Now, it is logically possible that a feature calculus that was invented to capture agreement in person/number/gender (e.g. Agree), and turns out to be ill-suited for that purpose, is nevertheless – by sheer coincidence – the right theory for some other phenomenon (or set of phenomena) X. But even if that turns out to be the case, because the mechanism in question doesn't account for agreement in the first place, there is no "reduction" here at all.


Thursday, July 18, 2013

Why Morphology?


For quite a while now, I’ve been wondering why natural language (NL) has so much morphology. In fact, if one thinks about morphology with a minimalist mindset, one gets to feeling a little like I. I. Rabi did regarding the discovery of the muon. His reaction? “Who ordered that?” So too with morphology: what’s it doing, and why is there both so much of it in some NLs (Amerindian languages) and so little of it in others (Chinese, English)? One thing seems certain: look around and you can hardly miss the fact that this is a characteristic feature of NLs in spades!

So what’s so puzzling? Two things. First, it’s absent from artificial languages, in contrast to, say, unbounded hierarchy and long-distance dependency (think operator-variable binding). Second, it’s not obviously functionally necessary (say, to facilitate comprehension). For example, there is no obvious reason to think that Chinese or English speakers have more difficulty communicating with one another than do Georgian or Athabaskan speakers, despite the comparative dearth of overt morphology in Chinese and English. In sum, morphology does not appear to be conceptually or functionally necessary, for otherwise we (I?) might have expected it to be even more prevalent than it is. After all, if it’s really critical and/or functionally useful, then one might expect it to be everywhere, even in our constructed artificial languages. Nonetheless, it’s pretty clear that NLs hardly shy away from morphological complexity.

Moreover, it appears that kids have relatively little problem tracking it. I have been told that whereas LADs (language acquisition devices, aka: kids) omit morphology in the early stages of acquisition (e.g. ‘He go’), they don’t produce illicit “positive” combinations (e.g. ‘They leaves’). I have even been told that this holds true for languages with rich determiner systems and noun classes and fancy intricate verbal morphology: it seems that kids are very good at correctly classifying these and quickly master the relevant morphological paradigms.  So, LADs (and LASs; Language Acquisition Systems) are good at learning these horrifying (to an outsider or second language learner) details and at deploying them effectively as native speakers. So, again, why morphology?

Unburdened by any knowledge of the subject matter, I can think of four possible reasons for morphology’s ubiquity within NLs. I should add that what follows is entirely speculative and I hope that this post motivates others to speculate as well. I would love to have some ideas to chase down. So here goes.

The first possibility is that visible morphology is a surface manifestation of a deeper underlying morphology. This is a pretty standard Generative assumption going back to the heyday of comparative syntax in the early 80s. The first version of this was Jean-Roger Vergnaud’s (terrific) theory of abstract case. The key idea is that all languages have an underlying abstract case system that regulates the distribution of nominal expressions.  If we further assume that this abstract system can be phonetically externalized, then the seeds of visible morphology are inherent in the fundamental structure of FL. The general principle then is that abstract morphemes (provided by UG) are wont to find phonetic expression (are mapped to the sensory and motor systems (S&M)), at least some of the time.

This idea has been developed repeatedly. In fact, the following is still not an unheard-of move: we find property P overtly in grammar G, and we then assume that something similar occurs in all Gs, at least covertly. This move is particularly reasonable in the context of the “Greed”-based grammars characteristic of early minimalism. If all operations are “forced,” and the forcing reduces to checking abstract features, then, using the logic of abstract case theory, we should not be surprised if a GL expresses these features phonetically.

Note that if something like this is correct (observe the if), then the existence of overt morphology is not particularly surprising, though the question remains of why some Gs externalize these abstracta while others remain phonetically more mum. Of late, however, this Greed-based approach has dimmed somewhat (or at least that’s my impression), and generate-and-filter models of various kinds are again being actively pursued. So…

A second way to try to explain morphology piggybacks on Chomsky’s recent claims that Gs are not pairings of sound and meaning but pairings of meanings with sound. His general idea is that whereas the mapping from lexical selection to CI is neat and pretty, the externalization to the S&M systems is less straightforward. This comports with the view that the first real payoff to the emergence of grammar was not an enhancement of communication but a conceptual boost expanding the range of cognitive computations in the individual, i.e. thinking and planning (see here). Thus externalization via S&M is a late add-on to an already developed system. This “extra” might have required some tinkering to allow it to hook onto the main lexicon-to-CI system, and that tinkering is manifest as morphology. In effect, then, morphology is a close relative of Chomsky and Halle’s old readjustment rules. From where I sit, some of the work in Distributed Morphology might be understood in this way (it packages the syntax in ways palpable to S&M), though I am really no expert in these matters, so beware anything I say about the topic. At any rate, this could be a second source for morphology: a kluge to get Gs to “talk.”

I can think of a third reason for overt morphology that is at right angles to these sorts of more grammatically based considerations. There are really two big facts about human linguistic facility: (i) the presence of unbounded hierarchical Gs and (ii) the huge vocabulary speakers have. Though it’s nice to be able to embed, it’s also nice to have lots of words. Indeed, if travelling to a foreign venue where residents speak V and given the choice of 25 words of V plus all of GV or 25,000 words of V plus just the grammar of simple declaratives (and maybe some questions), I’d choose the second over the first hands down. You can get a lot of distance on a crappy grammar (even no grammar) and a large vocabulary. So, here’s the thought: might morphology facilitate vocabulary development? Building a lexicon is tough (and important) and we do it rapidly, very rapidly. Might overt morphology aid this process, especially if word order in a given language (and hence the PLD of that language) is not all that rigid? It could aid this process by providing stable landmarks near which content words can be found. If transitional probabilities are a tool for breaking into language and the speech stream (as Chomsky proposed in LSLT, and as was later rediscovered by Aslin, Saffran, and Newport), then morphological landmarks whose transitional probabilities vary at different rates than those of the expressions that sit within them might serve to focus LADs and LASs on the stuff that needs learning: content words. On this story, morphology exists to make word learning easier by providing frames within a sentence for the all-important lexical content material.
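For concreteness, here is a toy illustration of the landmark idea, with entirely made-up “data”: a fixed morphological frame recurs across utterances, so the transitional probability out of a content word into the frame is high and stable, while the probability out of the frame into any particular content word is low and variable.

```python
# Toy corpus (hypothetical): 'ta ... ka' is a fixed morphological frame;
# the content word inside it varies from utterance to utterance.
from collections import Counter

corpus = [
    "ta lumi ka", "ta pexo ka", "ta nuvi ka",
    "ta lumi ka", "ta suro ka",
]

bigrams, unigrams = Counter(), Counter()
for utt in corpus:
    words = utt.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def trans_prob(w1, w2):
    """P(w2 | w1), estimated from the toy corpus."""
    return bigrams[(w1, w2)] / unigrams[w1]

# Out of the frame marker, probability is spread across content words;
# out of any content word, the frame marker follows with certainty.
print(trans_prob('ta', 'lumi'))   # 0.4 (low, varies word to word)
print(trans_prob('lumi', 'ka'))   # 1.0 (stable landmark)
```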

There is a second version of this kind of story that I would like to end with. I should warn you that it is a little involved. Here goes. Chomsky has long identified two surprising properties of NLs. The first is unbounded hierarchical recursion; the second is our lexical profligacy. We not only can combine words, but we have lots of words to combine. A typical vocabulary is in the 50,000-word range (depending on how one counts). How do we do this? Well, assume that, at the very least, each new vocabulary item consists of some kind of tag (i.e. a sound or a hand gesture). In fact, for simplicity, say that acquiring a word is simply tagging it (this is Quine’s “museum myth,” which, like many myths, may in fact be true). Now this sounds like it should be fairly easy, but is it? Consider manufacturing 50,000 semantically arbitrary tags (remember, words don’t sound the way they do because they mean what they do, or vice versa). This is hard. To do this effectively requires a combinatoric system: indeed, something very like a phonology, which is able to combine atomic units into lexical complexes. So, assume that to have a large lexicon we need something like a combinatoric phonology, and that the products of this system are the atoms that the syntax combines into further hierarchically structured complexes. Here’s the idea: morphology mediates the interactions of these two very different combinatoric systems. Meshing word structure and sentence structure is hard because the modes of combination of the two kinds of systems are different. Both kinds play crucial (and distinctive) roles in NL, and when they combine, morphology happens! So, on this conception, morphology is not for lexical acquisition, but exists to allow words with their structures to combine into phrases with their structures.
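A quick back-of-the-envelope calculation shows why a combinatoric phonology makes the tagging problem tractable (the inventory size of 30 is just an illustrative assumption):

```python
# With a modest phoneme inventory, combinatorics trivially covers a
# 50,000-word lexicon; without combination, each tag must be sui generis.
PHONEMES = 30   # illustrative inventory size

def possible_tags(max_len):
    """Number of distinct phoneme strings of length 1..max_len."""
    return sum(PHONEMES ** k for k in range(1, max_len + 1))

print(possible_tags(2))   # 930     -- far too few
print(possible_tags(3))   # 27,930  -- already in the right ballpark
print(possible_tags(4))   # 837,930 -- vastly more than needed
```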

The four speculations above are, to repeat, all very speculative and very inchoate. They don’t appear to be mutually inconsistent, but this may be because they are so lightly sketched. The stories are most likely naïve, especially so given my virtually complete ignorance of morphology and its intricacies. I invite those of you who know something about morphology to weigh in. I’d love to have even a cursory answer to the question.