Wednesday, December 12, 2012

Does Anyone Ever Learn Anything?

Let’s follow Jimmy and Judy from birth to about five.  At birth they say precious little. At five you can’t shut them up. What happened in these five years? Answer: they learned their native language, English say. Obvious no?  Yes. But is it right? Did Judy and Jimmy learn English?  Well, to paraphrase a recent political celebrity, it all depends on what you mean by learn and English.  It seems undeniable that Judy and Jimmy developed a capacity absent (or at least invisible[1]) at birth and this capacity can be exercised to converse with some natives, the English speaking ones, but not others, the Mandarin speaking ones. However, does this imply that they learned English? 

Linguists have long understood that labels like English, French, Swahili, Mandarin, etc. are more convenience terms than terms of art (here’s where linguists mention the Weinreich quip that a language is just a dialect with an army and a navy; real cognoscenti adding sotto voce that dialects are just idiolects with epaulettes). However, recent research suggests that we have been far too cavalier about the first half of this doublet.  We can all agree that Judy and Jimmy acquired (competence in) English, but did they learn English? Recent (and, as we shall see, not so recent) research into word learning suggests that here we need to look before we leap, something, it appears, that kids do not do, at least when it comes to early word acquisition. I’ve already discussed some of the research by MSTG (here) which argues (very convincingly in my view) that the early stages of lexical acquisition do not involve the careful statistical weighing of competing alternatives but instead involve jumping to a conclusion mentally clutched with fierce determination and, with time, forgotten if incorrect, only to set up another ill-supported jump into the lexical abyss (boy was that fun to write!). MSTG support this conclusion by considering learning situations less factitious than the contrived set-ups near and dear to the psych lab.  When kids and adults are asked to consider more natural (and hence visually busy) filmed vignettes in which the targets of lexical labeling are not clearly segregated and identified on pristine picture cards, they acquire word meanings all at once or not at all. This result is important for several reasons.

First and foremost, it provides a concrete illustration of why we should not equate acquisition with learning.  MSTG provide diagnostics of learning (multiple hypotheses, statistical weighing of these alternatives, gradual convergence on the right result) and argue that learning so understood fails to hold in more realistic contexts of lexical acquisition. Specific conclusion: in at least one demonstrable case, acquisition does not equal learning.  More general conclusion: it is an empirical question whether in any given acquisition context it is true that learning (now understood to be one mechanism among others for the acquisition of knowledge) is taking place.  In other words, Jimmy and Judy certainly acquired English but whether they learned it is entirely open for empirical grabs. Chomsky’s repeated suggestion that we understand language acquisition as a species of growth rather than learning makes an analogous point.  MSTG make the case more crisply, I believe, by providing clear diagnostics of learning and showing that there are core cases of “learning” where these signature properties of learning are demonstrably absent.

Second, MSTG provide a rationale for why learning doesn’t hold in their examined cases. Commenting on this (here) I observed that the MSTG results suggest that in such busy contexts the prerequisites for “cross situational learning” do not exist and this is why an alternative acquisition strategy is employed. Following MSTG’s lead, I even suggested that learning requires the kind of structured hypothesis space provided by the contrived set-ups that MSTG’s more realistic vignettes argue against. This constituted a kind of compromise position: learning applies where acquirers have well-structured hypothesis spaces, and leaping to conclusions holds where they do not.[2]  However, (and learn from this you soft-hearted open-minded intellectual compromisers out there) once again Norbert’s natural generosity of spirit and desire for intellectual group-hug kumbaya moments led him astray. It seems that even this compromise position concedes too much to learning mechanisms. In a companion paper, Trueswell, Medina, Hafri and Gleitman (TMHG) extend the MSTG results to include the more stylized learning environments in which options are clearly marked and lexical targets (aka referents) are crisply identified.[3]  Even in the artificial setting of the experimental psych lab, kids and adults do the darndest things!

More specifically, like MSTG, TMHG identify the quiddity of “cross situational learning” with the following mechanism:

…keeping track of multiple hypotheses about a word’s meaning across successive learning instances, and gradually converg[ing] on the correct meaning via an intersective statistical process (128).

What they demonstrate is that even in simple stylized psych lab contexts when acquirers “are placed in the initial novice state of identifying word-to-referent mappings across learning instances, evidence for such a multiple hypothesis tracking procedure is strikingly absent (128),” and learners don’t “track the cross-trial statistics in order to build the correct mappings between words and referents (129).”  What acquirers do is “make a single conjecture upon hearing the word used in context and carry that conjecture forward to be evaluated for consistency with the next observed context (129).”  If confirmed, the guess is retained; if disconfirmed, acquirers guess again as if de novo.
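For readers who think better in code, here is a toy sketch of the two mechanisms being contrasted. It is my own illustration, not TMHG's model: the function names, the representation of a trial as a (word, scene) pair, and the uniform random guess are all assumptions made for the sake of the example.

```python
import random
from collections import Counter, defaultdict


def cross_situational(trials):
    """Toy cross-situational learner (an assumption, not TMHG's model).

    Keeps a running count of every referent seen with each word and
    returns, per word, the referent with the highest co-occurrence count.
    """
    counts = defaultdict(Counter)
    for word, scene in trials:
        # Every candidate referent in the scene is tracked across trials.
        counts[word].update(sorted(scene))
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def propose_but_verify(trials, rng=None):
    """Toy propose-but-verify learner (an assumption, not TMHG's model).

    Holds exactly one conjecture per word. A conjecture survives if it is
    present in the next scene; otherwise it is dropped and a fresh guess
    is drawn from the current scene alone, with no memory of past scenes.
    """
    rng = rng or random.Random(0)
    conjecture = {}
    for word, scene in trials:
        guess = conjecture.get(word)
        if guess is None or guess not in scene:
            # No guess yet, or the old guess was disconfirmed: guess de novo.
            conjecture[word] = rng.choice(sorted(scene))
        # Otherwise the guess is confirmed and simply retained.
    return conjecture


trials = [
    ("dax", {"dog", "ball"}),
    ("dax", {"dog", "cup"}),
    ("dax", {"dog", "shoe"}),
]
print(cross_situational(trials))
print(propose_but_verify(trials))
```

The design difference is all in the state each learner carries: the first keeps a table of counts over every alternative, the second keeps a single guess and nothing else. That is why, in the second, accuracy on a given word can only jump from wrong to right in one step rather than creep upward.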

TMHG show this (as do MSTG) by considering the dynamics of knowledge fixation: “how learning accuracy unfolded across learning instances (130).” This is very interesting stuff and I strongly recommend that you look at the details. However, just as interesting is the little bit of history that TMHG review. TMHG acknowledge tapping into a long history of criticism of this traditional gradual/comparative conception of learning.  In the late 1950s and early 1960s first Irvin Rock (1957) and then William Estes (1960) used closely analogous arguments to demonstrate that verbal learning was not learning at all but was one-trial guessing. They did not fare so well. As Roediger and Arnold (RA) (2012) put it “the verbal learning establishment rose up to smite down these new ideas (2).”  It is instructive to read the RA paper, for it shows the weakness of the counter-arguments used against Rock and Estes that nonetheless carried the day.  Just like today, learning was less a hypothesized mechanism for acquisition than a definitional truth about it.[4] 

The TMHG paper ends with an interesting paradox that I would like to briefly discuss as well. Initial word learning is slow and laborious (35-140 words at 14 months) but unbelievably rapid thereafter (12,000 words by age 6, i.e. roughly 7 words per day for a little under 5 years).  Why the change? TMHG note other work (Lila’s syntactic bootstrapping hypothesis) which proposes that “the acquisition of syntax and other linguistic knowledge by toddlers and young preschoolers during this time period provides a rich database of additional constraints that permit the learning of many additional words.” It is conceivable that with this knowledge in place, cross situational learning might finally become operative, though as TMHG correctly observe, this is decidedly an empirical question and it is “plausible that a propose-but-verify word learning procedure is at work all along the course of word learning throughout most of the life cycle (150).”  It would be interesting in the extreme, in my view, were either conclusion correct.
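The "roughly 7 words per day" figure is easy to check on the back of an envelope. The sketch below assumes 365.25 days per year and, for simplicity, ignores the 35-140 words already in place at 14 months; the function name is mine.

```python
def words_per_day(total_words, start_months, end_years):
    """Average daily acquisition rate between two ages.

    Assumes 365.25 days per year and ignores any starting vocabulary.
    """
    months = end_years * 12 - start_months
    days = months * 365.25 / 12
    return total_words / days


# 12,000 words between 14 months and 6 years (58 months, a little under 5 years).
rate = words_per_day(12_000, start_months=14, end_years=6)
print(round(rate, 1))  # about 6.8 words per day
```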

If the latter conclusion proved true (propose and verify all the way down), then language acquisition might have nothing to do with learning in the technical sense. Of course, this is consistent with grammar acquisition being a case of learning, i.e. perhaps we don’t learn words but we do learn parameters. Maybe, but I’d be very skeptical. If even word acquisition isn’t an instance of learning then it seems to me that the burden of proof that any area of language acquisition involves learning would be pretty high.

If, on the other hand, the former option is correct, (viz. that learning only kicks in when grammar is there to buttress it) then its role in accounting for language acquisition would, in my opinion, be quite modest.  Yes, learning plays a role, but really most of the action lies with the constraints that grammars place on the process. This does not mean that we should not study the extras that learning might be adding (though remember the possibility mooted in the prior paragraph), but I doubt that these cognitive titivations will generate much excitement if they only operate in restricted hypothesis spaces. Learning is interesting when there are lots of options that need sifting, not so much when the range of possible end states is highly restricted. 

MSTG and TMHG show us that terminology matters. If we call acquisition “learning” we’ve loaded the research dice. If MSTG and TMHG are right (and I’d bet quite a bit that they are: any suckers out there?) it looks like we’ve repeatedly loaded them to come up snake eyes. It’s time we cut our losses and open our minds to the possibility that ‘language learning’ like ‘Justice Roberts,’ ‘military intelligence,’ and ‘western civilization’ is an oxymoron.

[1] I include this hedge for every week another (usually French speaking) psychologist shows that the youngest kids seem to have the most prodigious knowledge. It seems that we have nothing to teach those little know-it-alls. 
[2] A kind of learning as first resort, jumping to conclusions as last resort, method. Cf. TMHG p. 130.
[3] It is doubtful that the meaning of a lexical item is simply its referent for reasons that Chomsky has belabored (sadly, quite unsuccessfully) over the years. There is far more structure to lexical items than what they refer to. However, for current purposes, this only further dramatizes the inadequacy of learning as a mechanism for lexical acquisition. 
[4] TMHG also cite another paper by Gallistel and friends that I have not yet read but will try to get hold of and blog about when I do. It argues (cited in TMHG 151), that “in most subjects, in most paradigms, the transition from a low level of responding to an asymptotic level is abrupt.” Oh my. I’ll keep you posted.

