Tuesday, December 18, 2012

Lila on Learning (and Norbert gets chastised)

Dear Norbert,

I hesitate to say ANYTHING in response to your last post because its positive tone leaves me glowing; it would be wonderful if everyone thought the same about our findings and our new work. But there are some foundational issues where you and I diverge, and they're worth a good vetting, I think. I really was totally surprised at your equating slow/probabilistic/associationistic learning with "learning in general," i.e., your suggestion that if some change in an organism's internal state of knowledge isn't attributable to the action of this very mechanism, then it isn't learning at all, but "something else." My own view of what makes something learning is much more prosaic and excludes the identification of the mechanism: if you come to know something via information acquired from the outside, it's learning. So acquiring knowledge of Mickey Mantle's batting average in 1950 or finding out that the pronunciation of the concept 'dog' in English is "dog" both count as learning, to me. At the opposite extreme of course are changes due entirely, so far as we know, to internal forces, e.g., growing hair on your face or replacing your down with feathers (though let's not quibble, some minimal external conditions have to be met, even there). The in-between cases are the ones to which Plato drew our attention, and which maybe Noam is most responsible in the modern era for reviving:

“But if the knowledge which we acquired before birth was lost by us at birth, and afterwards by the use of the senses we recovered that which we previously knew, will not that which we call learning be a process of recovering our knowledge, and may not this be rightly termed recollection?”
(Plato Phaedo [ca. 412 BCE])

I take UG to be such a case: something postulated as an internal state, probably a logically necessary one, that is implicit in the organism before knowledge (say, information about English or Urdu) is obtained from the outside, and which guides and helps organize acquisition of that language-specific knowledge. I have always believed that most language acquisition is consequent on and derivative from this structured, preprogrammed basis, yet something crucially comes from the outside and yields knowledge of English, above and beyond the framework principles and functions of UG. Syntactic bootstrapping, for example, is meant to be a theory describing how knowledge of word meaning is acquired within (and "because of") certain pre-existing representational principles, for example that -- globally speaking -- "things" surface as NP's (further divided into the animate and inanimate) and "events/states" as clauses, so the structure "NP gorps that S" would be licensed for "gorp" iff its semantics is to express a relation between a sentient being and an event, e.g., "knowing" but not "jumping."

The problem we have in mind to address in the recent work you've been discussing, to my delight, is: how do you ever discover where the NP's etc. are in English, i.e., that its subjects precede its verbs, roughly speaking? This knowledge comes from outside (it is not true of all languages) and has to be acquired to concretize the domain-specific procedure in which you learn word meanings, in part at least, by examining the structures for which they're licensed. I have argued that, at earliest stages, you can't make contact with this preexisting knowledge just because you don't know, e.g., where the subject of the sentence is. To find out, you have to learn a few "seed words" via a domain-general procedure (available across all the species of animals we know, perhaps barring the paramecia). That procedure has almost always (since Hume, anyhow) been conceived as association (in its technical sense). As I keep mentioning, success in this procedure is vanishingly rare (it is horrible) because of the complexity and variability of the external world, though reined in to some extent by some (domain-specific) perceptual-conceptual biases (see Markman and Wachtel, inter alia). Apparently, you can only acquire a pitiful set of whole-object concrete concept labels by this technique. Though it is so restrictive, we take it as crucial: it is the only possibility that keeps "syntactic bootstrapping" from being absolutely circular. It provides the enabling data for SB, enough nouns to hint as to where the subject is; hence, given this knowledge of "man" and "flower" and the observation of a man watering a flower, you learn not only the verb "water" but the fact that English is SVO.

So back to the point:  I think word learning starts with a domain-general procedure that acquires "dog" by its cooccurrence with dog-sightings, given innate knowledge of the concept ‘dog,’ and one learns (yes) that English is SVO as a second step, and as above.  This early procedure gives you concretized PS representations, domain-specific, language-specific ones, that allow you to infer "a verb with mental content" from appearance in certain clausal environments.   That's my story.  

What my argument with you is now about is the contrapositive to Plato. I am asking: "If you have to have information from the outside, information as to the licensing conditions for "dog" (and, ultimately, "bark"), is acquiring that information not rightly termed learning?" I think it is. But the mechanism turns out (if we're right) to be more like brute-force triggering than like probabilistic compare-and-contrast across instances. The exciting thing, as you mention, is that others through the last century (e.g., Rock, Bower, several others) insisted that learning in quite different areas might work that way too. Most exciting I think is the work starting in the 1940's and continued in the exquisite experimental work of Randy Gallistel, showing that it is probably true of the wasps, the birds and the bees, and the rats as well, even when they are learning stupid things in the laboratory (well, not Mickey Mantle's batting average, but, at least, where the food is hidden in the maze).


  1. I have no conceptual problem using ‘learning’ as you propose, but I believe that it will be terminologically better to restrict the term to mechanisms of certain sorts. There is a useful distinction between "triggering" and "shaping." Both are methods for fixing beliefs responsive to "outside" information. They focus, however, on very different mechanisms. As we truck in mechanisms when we try to understand what's going on in human minds, I think that noting this distinction terminologically is worthwhile. Let's restrict 'learning' to shaping effects. Moreover, I suspect that 'learning' has already been appropriated by the "bad guys." I prefer keeping terminology clean: shaping or triggering (or a bit of both). This at least allows for the issues to be clear, and we already have a neutral term, viz. ‘acquisition.’

    There is a second reason I prefer this. One reasonably good argument in favor of rich nativist structure is very quick, one-trial learning. Or, more correctly, if acquisition is gradual and incremental, then a good account for this would be to postulate learning. I assume that's why classical learning curves led people to (wrongly) surmise that this was a poster child example of learning. The naturalness of this conclusion is why Randy and Rock and Estes had to show that this kind of curve would also arise by averaging over individuals, all of whom were engaged in one-trial acquisition. This observation serves to block the reasonable inference that was made. The classical learning theorists were wrong about the facts, but right about the form of the argument: gradualism does suggest a classical conception of learning in which beliefs are shaped by environmental inputs.

    My upshot: 'learning' is larded with bad vibes in my view. Better to discard the term, use the neutral 'acquisition,' and then argue about mechanisms: shaping or triggering? Classical Learning or Fast Mapping? Our “disagreement” is just semantics, but sometimes what you call something restricts how you think about it, and thinking of acquisition in terms of learning has made thinking of alternative scenarios very difficult.
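The averaging point in the comment above (a smooth group curve arising from learners who each acquire in a single trial) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a reconstruction of any published model: the names `simulate_learner` and `group_average`, the fixed per-trial acquisition probability, and the parameter values are all hypothetical choices made for illustration.

```python
import random

def simulate_learner(n_trials, p_acquire, rng):
    """Return one learner's per-trial performance (0 or 1).

    All-or-none assumption: the learner knows nothing until a single
    trial on which acquisition happens, and is perfect from then on.
    """
    acquired = False
    curve = []
    for _ in range(n_trials):
        if not acquired and rng.random() < p_acquire:
            acquired = True
        curve.append(1 if acquired else 0)
    return curve

def group_average(curves):
    """Average performance across learners, trial by trial."""
    n = len(curves)
    return [sum(trial) / n for trial in zip(*curves)]

rng = random.Random(0)  # fixed seed so the run is reproducible
N_LEARNERS, N_TRIALS, P_ACQUIRE = 500, 20, 0.2  # illustrative values only

curves = [simulate_learner(N_TRIALS, P_ACQUIRE, rng) for _ in range(N_LEARNERS)]
mean_curve = group_average(curves)

# Each individual curve is a step function, yet the group mean rises
# gradually, tracking 1 - (1 - p)^t.
for trial, value in enumerate(mean_curve, start=1):
    expected = 1 - (1 - P_ACQUIRE) ** trial
    print(f"trial {trial:2d}  group mean {value:.2f}  expected {expected:.2f}")
```

Under these assumptions every individual record is a single jump from 0 to 1, so the gradual shape of the averaged curve says nothing by itself about whether any one learner was shaped incrementally.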

  2. Just to add an Aristotelian twist to the Platonic typology... I believe that in some butterfly species, wing color can depend on the ambient temperature during the larval period, in ways that correlate with differing utility in different seasons: wet-dry, cool-warm. This seems like a case worth thinking about: a severely constrained kind of variation within a species, allowing each of two "settings of a parameter" to carry information (in Shannon's sense) regarding a contingent property of the environment. Here, I take it, nobody wants to say that larvae are learning the temperature, much less learning what color wings to have, even if they end up matching their peers in this regard. Moving to a case in which a contingent feature of the environment arguably has a more cognitive effect, suppose that young bees can very quickly use a local cue (say, the arc of the sun) to set a "latitude value" when kick-starting whatever system lets them rely on sun position (along with other information) when navigating or communicating. Then one question is whether there is any cash value in describing this process as learning. Another question is where knowledge of language lies on the butterfly-wing/bee-navigation/baseball-trivia scale.

  3. I realize that this post is from quite some time ago, but I'd be keen to hear (from anyone who eventually comes across this belated comment) what the domain-general mechanisms are supposed to be in Lila's story. She's not very clear on them in this post, and in reading a paper of hers on the topic I haven't gotten a clearer sense.

    From a generativist philosophy of mind, I'm not convinced any domain-general mechanisms follow (at least at the psychological level of explanation).