Faculty of Language: lexical acquisition

As Chomsky has repeatedly emphasized, natural language (NL) has two distinctive features: it’s hierarchically recursive and it contains a whole bunch of lexical items (LI) with rather distinctive properties when compared to what one finds in animal communication systems (e.g. they are not stimulus bound and they are very conceptually labile). A natural question that arises is whether these two features are related to one another: does the fact that NLs are hierarchically recursive (actually generated by Gs that are recursive and produce hierarchical linguistic objects, or the corresponding I-language is such, but let’s forget the niceties for now) have any causal bearing on the fact that NL lexicons are massive, or vice versa?[1] The only proposal that I know of that links these two facts together causally is Lila Gleitman’s hypothesis that vocab acquisition leverages syntactic knowledge. The syntactic bootstrapping thesis is that Gs facilitate (and hence accelerate) vocab acquisition in LADs. By “vocab” I mean “open” class contentful items like ‘shoe’ and ‘rutabaga’ rather than “closed” class items like ‘in’ or ‘the’ or the plural ‘-s’ morpheme. NLs have a surprisingly large number of open class “words” when compared to animal communication systems (at least 3 orders of magnitude larger) and it is reasonable to ask why this is so.[2]

Gleitman’s syntactic bootstrapping thesis provides a possible answer: without syntactic leverage, acquiring words is a very, very arduous process (see here, here, and here for discussion about how first words are acquired), and as only humans have syntax, only they will have large vocabs. Oh, btw, by acquisition I intend something pretty trivial: tagging a concept with a label (generally phonological, but not exclusively so, think ASL). I don’t think that this is all there is to vocab acquisition,[3] but it turns out that even this is surprisingly difficult to accomplish (contrary to intimations in the philo of language literature; see Quine on the ‘museum myth’) and it requires lots of machinery to pull off (e.g. a way of generating labels, namely, something like a phonology, and a way of identifying the things that need tagging).

I mention all of this because I just recently heard a lecture by Anne Christophe (see slides here) that goes into some details about the possible mechanics behind this leveraging process, and that bears on a question I have been wondering about for a long time: why do NLs have so much phonologically overt morphology? For any English speaker, morphology seems like a terrible idea (it’s a mess and a pain to learn, you should hear my German). However, Christophe argues that morphology and closed class items serve to facilitate the labeling process that drives open class vocab acquisition in humans. In other words, her work (and here I mean the work from her lab, which has many contributors, as the slides make clear) sketches the following picture: closed class items (and I am including morphological stuff here) provide excellent environments for the identification and tagging of open class expressions. And as closed class items are in fact very closely tied to (correlated with) syntactic structure (think Jabberwocky!!), morphology, broadly construed, is what enables kids to leverage syntax to build vast lexicons. If correct, this forges a close causal connection between the two distinctive properties NLs display: large vocabs and syntactic structure. Let’s call this the Gleitman-Christophe thesis (GCT).

What’s the evidence? Effectively, morphology provides relatively stable landmarks between which content words sit, thus allowing for easy identification and tagging. In other words, morphology allows the LAD to zero in on content words by locating them in relatively fixed morpho-syntactic frames. And very young LADs can use this information, for they are known to be very good at acquiring this fixed syntactic scaffolding using non-linguistic distributional (and statistical) methods (see slide 18).[4] Closed class items are acquired early (they are ubiquitous, frequent and short) and it has been shown that kids can and do exploit them to “find” content words. Christophe reports on new work that shows how useful this can be for categorizing words into initial semantic classes (e.g. distinguishing Ns (canonically things) from Vs (canonically eventish)). She then goes on to describe how acquiring a few words can further enhance the process of acquiring yet more vocab. The first few words act as “seeds” that, once acquired, serve to further leverage the acquisition process. In sum, Christophe describes a process in which prosody (which Christophe discusses in the first part of the talk), morphology and syntax greatly facilitate word acquisition, which in turn enables yet more word acquisition. We thus get a virtuous circle anchored in morpho-syntax, which is in turn anchored in epistemologically prior pre-linguistic capacities to find fixed phono-morpho-syntactic landmarks that allow LADs to quickly fix on new words.[5] This all provides good evidence in favor of the GCT. It also allows one to begin to formulate one answer to my earlier question: Why Morphology?
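To make the frame-plus-seeds idea a bit more concrete, here is a toy sketch in Python. It is my own illustration and not anything taken from Christophe’s slides or models: the closed class inventory, the seed words, the mini corpus and the nonce words ‘blick’ and ‘daxes’ are all made up, and the only “frame” used is the immediately preceding closed class item. The learner first notes which closed class items precede seed words of a known category, then guesses the category of a novel word from the closed class items that precede it.

```python
from collections import Counter, defaultdict

# Hypothetical toy inventory of closed class items the learner already tracks.
FUNCTION_WORDS = {"the", "a", "she", "he"}
# Hypothetical "seed" words whose category the learner is assumed to know.
SEEDS = {"dog": "N", "ball": "N", "eats": "V", "runs": "V"}


def learn_frames(sentences):
    """Count, for each closed class item, the categories of seeds that follow it."""
    frames = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, word in zip(words, words[1:]):
            if prev in FUNCTION_WORDS and word in SEEDS:
                frames[prev][SEEDS[word]] += 1
    return frames


def guess_category(word, sentences, frames):
    """Guess a category for a word from the closed class items that precede it."""
    votes = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, cur in zip(words, words[1:]):
            if cur == word and prev in frames:
                votes.update(frames[prev])  # add the seed counts for this frame
    return votes.most_common(1)[0][0] if votes else None


# Made-up mini corpus; "blick" and "daxes" are nonce words.
corpus = [
    "the dog runs",
    "she eats a ball",
    "he runs",
    "a dog eats the ball",
    "the blick runs",
    "she daxes a ball",
]

frames = learn_frames(corpus)
print(guess_category("blick", corpus, frames))  # N
print(guess_category("daxes", corpus, frames))  # V
```

Nothing hangs on the details. The point is only that a handful of seeds plus a few frequent closed class landmarks is already enough to sort novel items into rough N-ish and V-ish bins, which is the kind of leverage the GCT is after.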

So what’s morphology for? (This is a dumb question really, for it could be for lots of things. However, thinking functionally is natural for ‘why’ questions. See below for a just so story.) Well, among other things, it is there to support large vocabs, and this, I would suggest, is a very big deal. I once suggested the following thought experiment: I/you am/are going to Hungary (I/you speak NO Hungarian) and am/are made the following offer: I/you can have a vocab of 50k words and no syntax whatsoever or a perfect syntax and a vocab of 10 words. Which would I/you prefer? I/(you?) would take door number 1. One can get a long way on 50k words. Nor is this only for purposes of “communication” (though were there an indirect tie between communicative enhancement and morpho-syntax I would be ok with that). Phenomenologically speaking, tagging a concept has an effect not unlike making an implicit assumption explicit. And explicitness is a very good way to enhance thought (indeed, it feels like it allows one to entertain novel thoughts). Having a word for something allows it to be conceptually accessible and salient in a way that having the concept inchoately does not. In fact, I am tempted to say that having a word for something can change the way you think (i.e. it can affect cognitive competence, not just enhance performance).[6] So, tagging concepts really helps and tagging a lot of concepts really really helps. Thus, if you are thinking of putting in an Amazon order for a syntax, I would suggest asking for one that also supports large scale vocab acquisition (tagging concepts), and the GCT argues that such a syntax would come with phonologically overt morphology and phonologically overt closed class items that our pre-linguistic proclivities can exploit to build large lexicons quickly. In other words, if you want an NL that is good for thinking (and maybe also for communication) get one that has the kinds of relations between morphology and syntax that we actually find.

Note that this view of morphology leaves (completely?) open the question of how morphology functions inside UG. It is consistent with this view that G operations are driven by feature checking requirements, some of which become realized overtly in the phonology (this is characteristic of early Minimalist proposals). It is also consistent with the view that they are not (e.g. that overt features are mere by-products of grammatical operations rather than drivers thereof; this is what we find in the EPP based conceptions of grammatical operations in later minimalism). It is consistent with the idea that morphology exists to fit syntax to phonology (readjustment rules), or with the idea that it does not (i.e. it’s functionally useless ornamentation). All the GCT requires is that there be reliable correlations between overt markers and the syntax so as to allow the LAD to leverage the syntax in order to acquire a content rich lexicon.

If this is right, then it might serve as the start of an account of why there is so much overt morphology and/or so many closed class items (very frequent items that function to limn syntactic structure) in NLs. In fact, it suggests that there should be no NL that eschews both, though what the mix needs to be can be quite open. So Chinese and English don’t have much morphology, but they have classifiers (Chinese) or determiners and verbal morphology (English) and these can serve to do the lexicon building job (Anne C tells me that all the stuff done on French discussed in the slides replicates in Mandarin).

As an aside, IMO, one of the puzzles of morphology is why some languages seem to have so much (Georgian) and some seem to have so little (English, Chinese). If morphology is that important in the grammar either functionally (e.g. for processing) or grammatically (e.g. it drives operations) then why should some NLs express so much of it overtly and some have almost none at all that is visible? The GCT offers a possible way out: morphology per se is not what we should be looking at. Rather it is morphology plus closed class items, items that give you fixed frames for triangulating on content “words.” There needs to be a sufficiency of these to undergird vocab acquisition, but the mix need not be crucial (in fact it is not even clear how much of both is sufficient, or if there may be a cost in having too much, or even if these queries make any sense).

Let me end here. NLs are stuffed with what appears to be “useless” (and, to an English speaker, cumbersome) morphology. And useless it may be from a purely grammatical point of view (note I say may, leaving the question open). But the GCT suggests that overt grammar dependent fixed points can be very useful for building lexicons given our pre-linguistic capacities. And given the virtues of a good sized lexicon for thinking and communicating, a syntax that can support this should have advantages over one that doesn’t (that’s the just so story, btw). If correct, and the data collected so far is non-trivial, this is nice to know, for it serves to possibly bridge two big facts about NLs, facts that to date seem (or more accurately, seemed to me) to be entirely independent of one another.

[1] Nothing I say touches on the fact that NL lexical items have very distinctive properties, at least when compared to symbols in animal systems. In other words, the lexicon presents two puzzles: (i) why is it so big? (ii) why do human lexical items function so differently from non-human ones? What follows tries to say something about the first, but leaves the second untouched. Chomsky has discussed some of these distinctive features and Paul Pietroski has a forthcoming book that discusses the topic of lexicalization.

[2] So far as I know, no animal communication system has any closed class expressions, which, if they have no syntax, would not be a surprise given Gleitman’s conjecture.

[3] Paul Pietroski has a terrific forthcoming book on what lexicalization consists in “semantically” and I intend my views on the matter to closely track his. However, for present purposes, we can ignore the semantic side of the word acquisition process and concentrate solely on the process of phonetically tagging LIs.

[4] That kids are really good at identifying morphology has always surprised me. This is far less the case in second language acquisition if my experience is anything to go on. At any rate, it seems that kids rarely make “errors” of commission morphologically. If they screw up, which is surprisingly infrequently, it manifests as errors of omission. Karin Stromswold has work from a while ago documenting this.

[5] This might also provide a model for how to think of the thematic hierarchy. Is this part of UG or not? Well, one reason for thinking not is that it is very hard to define theta roles so that they apply across a broad class of verbs. Dowty showed how hard it is to define ‘agent’ and ‘patient’ etc. and Grimshaw noted that it is largely irrelevant for syntactic concerns anyhow. When does it matter? Really, theta roles matter for UTAH. We need to know where a DP starts its derivational life. If this is its sole role, then what one needs theta roles for is not semantic interpretation in general but for priming the syntax. In this case, all one needs are a few thematically well behaved verbs (like ‘eat,’ ‘hug,’ ‘hit,’ ‘kiss’) to get the syntax off the ground. Once flying, we don’t need thematic information any more, for there arise other ways, some of them morpho-syntactic, to figure out where a DP came from (think case, or fixed word order position, or agreement patterns). At any rate, as in the morphology case, the thematic hierarchy need not be part of UG to be linguistically very important to the LAD.

[6] Think of this as a very weak (almost truistic) version of the Sapir-Whorf hypothesis.

Tuesday, November 18, 2014

Why Morphology? (part deux)
