One of the things that seems to bug many about FL/UG is the
supposition that it is a domain specific module dedicated to ferreting out
specifically linguistic information. Even those who have been reconciled to
the possibility that minds/brains come chock full of pre-packaging seem loath
to assume that these natively provided innards are linguistically
dedicated. This antipathy has even come to afflict generativists of the MP
stripe, for there is believed to be an
inconsistency between domain specificity and the minimalist ambition to
simplify UG by removing large parts of its linguistically idiosyncratic
structure. I have suggested elsewhere (here)
that this tension is only apparent, and that it is entirely possible to pursue
the MP cognitive leveling strategy without abandoning the idea that there is a
dedicated FL/UG as part of human biological endowment. In a new paper,
Gallistel and Matzel (G&M) (here) argue that domain specificity is the biological default
once one rejects associationism and adopts an information processing model of
cognition. Put more crudely, allergy to domain specificity is just another
symptom of latent empiricism (i.e. a sad legacy of associationism).
I, of course, endorse G&M’s
position that there is nothing inconsistent between accepting functionally
differentiated modules and the
assumption that these are largely constructed using common basic operations. And
I, of course, love G&M’s position that once one drops any associationist
sympathies (and I urge you all to do this immediately, for your own intellectual
well-being!), then the hunt for general learning mechanisms looks, at least in
biological domains, ill-advised. Or put
more positively: once one adopts an information processing perspective, then
domain specificity seems obviously correct. Let’s consider G&M’s points in
a little detail.
G&M contrast associationist (A) and information
processing (IP) models of learning and memory. The G&M paper is divided
into two parts, more or less. The first several pages comprise a concise
critique of associationist/neural net models in which learning is “the rewiring
of a plastic nervous system by experience, and memory resides in the changed
wiring (170).” The second part develops the evidence for an IP perspective on
neural computations. The IP models contrast with A-models in distinguishing the
mechanisms for learning (whose function is to “extract potentially useful
information from experience”) and those for memory (whose function is to
“carr[y] the acquired information forward in time in a computationally
accessible form that is acted upon by the animal at the time of retrieval”)
(170). Here are some of their central
points.
A-models are “recapitulative.” What G&M (170) intend
here is that learning consists in finding the pattern in the data (see here):
“An input that is part of the training input, or similar to it, evokes the
trained output, or an output similar to it.”
IP models are “in no way recapitulations of the mappings (if any) that
occurred during the learning.” This is the classical difference between
rationalist vs empiricist conceptions of learning. A-models conceive of
environmental input as adapting “behavior” to environmental circumstances. IP-models
conceive of learning as building a “representation of important aspects of the
experienced world.”
A-models gain a lot of their purchase within neuroscience (and psychology) by appearing to link so directly to a possible neural mechanism: long-term potentiation (LTP). However, G&M argue vigorously that LTP support for A-models entirely evaporates when the evidence linking LTP to A-models is carefully evaluated. G&M walk us slowly through the various disconnects between standard A-processes of learning and LTP function: their time scales are completely different (“…temporal properties of LTP do not explain the temporal properties of behaviorally measured association formation (172)”); their persistence (i.e. how long the changes in LTP vs associations last) is entirely different, so “LTP does not explain the persistence of associative learning (172)”; their reactivation schedules are entirely different (i.e. if L is learned and then extinguished, L is reacquired more quickly, but “LTP is neither more easily [induced] nor more persistent than it was after previous inductions”); nor do LTP models provide any mechanism for solving the encoding problem (viz. A-learning is mediated by the comparison of different kinds of temporal intervals, and there is no obvious way for LTP nets to do this), except by noting that what gets encoded is emergent (which amounts to punting on the encoding problem rather than addressing it).
In short, there is no
support for A-models from neural LTP models. Indeed, the latter seem entirely
out of synch with what’s needed to explain memory and learning. As G&M put
it: “…if synaptic LTP is the mechanism of associative learning - and more
generally, of memory - then it is disappointing that its properties explain
neither the basic properties of associative learning nor the essential
properties of a memory mechanism (173).” So much for the oft-insinuated claim
that connectionist models are preferable because they are neurally plausible
(indeed, obvious!).
General conclusion: A-models have no obvious support from
standard LTP models and these standard LTP models are inadequate for handling
the simplest behavioral data. In effect, A- and LTP-accounts are the wrong kinds
of theories (not wrong in detail, but in conception, and hence without much (if
any) redeeming scientific value)
if one is interested in understanding the neural bases of cognition.
So what’s the right approach? IP-models. In the last parts
of the paper G&M go over some examples.
They note that biologically plausible IP models will all share some
important features:
1. They will involve domain specific computations. Why? “Because no general purpose computation could serve the demands of all types of learning (175),” i.e. domain specificity is the natural expectation for IP models of neuro-cognition.

2. The different computations will apply the same “primitive operations” in achieving functionally different results (175).[1]

3. The IP approach to learning mechanisms “requires an understanding of the rudiments of the different domains in which the different learning mechanisms operate” (175). So, for example, figuring out whether A is the cause of B, or A is the edge of B, will involve different computations from each other and from those that mediate the pairing of meanings with sounds.

4. Though the neuroscience is at a primitive stage right now, “…if learning is the result of domain-specific computations, then studying the mechanism of learning is indistinguishable from studying the neural mechanisms that implement computations (175).”
Note that this will hold as much in the domain of language
as in navigation and spatial representation.
In other words, once one dumps Associationism (as one must, since it is
empirically completely inadequate and intellectually toxic), then domain
specificity is virtually ineluctable. There exist no interesting general purpose
learning systems (just as there is no general sensing mechanism, as Gallistel
has been wont to observe). That’s the G&M message. Cognitive
computation, if it’s to be neurally based, will be quite specifically tailored
to the cognitive tasks at hand, even if built from common primitive circuits.
The most interesting part of G&M, at least to me, was
the review of the specific neural cells implicated in an animal’s capacity to locate
itself in space and move around within it. It seems that neuroscientists
are finding “functionally specialized neurons [that] signal abstract properties
of the animal’s relation to its spatial environment (185).” These are
genetically controlled and, as G&M note, their functional specialization provides
“compelling evidence for problem-specific mechanisms.”
Note that the points G&M make above fit very snugly with
standard assumptions within the Chomsky version of the generative tradition. In
other words, the assumptions that generative linguists make concerning domain
specific computations and mechanisms (though not necessarily primitive
operations) simply reflect what is, or at least should be, the standard
assumption in the study of cognition once Associationism is dumped (may its
baneful influence soon disappear). If G&M are right, then there are no good
reasons from neuro-biology for thinking that the standard assumptions
concerning native domain specific structures for language are exotic or
untoward. They are neither. The problem
is not with these assumptions, but with the unholy alliance between some parts
of contemporary neuroscience and the A-models of learning and cognition that
neuro types have uncritically accepted.
If you read the whole G&M paper (some parts involve some
heavy lifting) and translate it into a linguistics framework, it is very hard
to avoid the conclusion that if G&M are correct (and, in case you’ve
missed it, IMO they are), then the Chomskyan conception of language, mind, and
brain is both anodyne and the only plausible game in cog-neuro town.
So what is a good IP model of learning?
I don't really understand where the boundary is meant to be drawn. I see that neural networks are BAD, and Q-learning is BAD, but what sorts of learning algorithms are good?
(Parenthetically, I think it is very strange indeed to start from navigation in ants and rats, and take this to be the right starting point for understanding language acquisition.
Navigation is presumably one of the most ancient and ubiquitous bits of cognition, and seems about as far away from language (recent, human-specific, discrete, learned behavior) as it is possible to get.)
Those that are firmly based in domain specific constraints. I am pretty sure that you won't like this, as we've been around this track before in discussing, for example, restrictions on binding. You don't like the story as it starts with the assumption that large parts of the binding theory are biologically pre-packaged. Thus, there is quite a bit of domain specific knowledge that is presupposed. How does this affect learning? Well, it changes the learning problem from learning the binding theory to learning what the morphological expression of an anaphor/pronoun is. In other words, the problem is to extract information from the data to determine which expressions are anaphors/pronouns (once this is figured out, all the other properties follow), and this is a much different problem than learning binding principles from the input.
Now, as I said, I am pretty sure you won't like this because you did not like it before. You will say that this just pushes back the problem, as assuming that there is domain specific knowledge does not explain anything. I and G&M disagree. It changes the learning problem and then raises the "evolution" problem: how did the domain specific information get there? This is an interesting question, and one that should be addressed eventually. But the one way it should not be addressed is by assuming that there is no domain specific knowledge. Why? Because this is false and, if G&M are right, both unproductive and biologically implausible. Moreover, from the little I know, so correct me if I am wrong, we know very little about the evolution of almost everything. So, do we have an evolutionary scenario with any plausibility of how navigation (an old system) evolved? Or say, the bee communication system? Or foraging behavior in corvids? I don't think we do. Does this mean that the assumption of domain specificity there is ill-advised because we have no good story for its evolution, and therefore it must be that it's all general learning? Nope. Wrong conclusion. Pari passu for language. There are various questions: what's biologically given (and this, G&M argue, is largely domain specific competences) and how did these arise (and here we are largely clueless). It is vital to keep these questions logically apart even as we try to answer both. It seems to me that you like to cram them together precisely in order to dump domain specific knowledge. I think that your reasoning here is seriously flawed, in roughly the way G&M indicate.
Below is the critical passage:
Delete"from the little I know, so correct me if I am wrong, we know very little about the evolution of almost everything. So, do we have an evolutionary scenrio with any plausibility of how navigation (an old system) evolved? Or say, the bee communication system? Or foraging behavior in corvids? I don't think we do. Does this mean that assumption of domain specificity there is ill advised because we have no good story for its evolution and therefore it must be that it's all general learning? Nope. Wrong conclusion. Pari passu for language."
I assume for argument's sake that what you say is true. You still have failed to give an argument for why we should assume that language IS like the bee communication system or foraging behaviour. I know Chomsky believes this and therefore you do. But this is not an argument a biologist would accept. So WHY are these vastly different systems supposed to be comparable? If we have no evolutionary account for either, we have to go by what we actually can investigate in living organisms. We have some good evidence for genetically determined DS in bees and ants. We have so far no evidence for genetically determined DS for human language - so based on what is your 'pari passu for language'?
The big difference is how old everything is.
Navigation has been evolving since animals started moving around -- so for what, around 500 million years? And human language has been around for 50-100,000 years. That is why, for me, there is a presumption that there hasn't been much domain specificity, while there is no such presumption for, say, bird flight.
And I thought this was now the orthodox view -- isn't that Darwin's problem? One of the motivations for the MP?
Just a moment ago you were arguing that all of these supposedly domain specific things in UG could be reduced to third factor principles because the theories were effective but not explanatory. Now you are arguing the opposite.
Ok, so you don't like going to the links that I put in. I understand, I don't either. So here's the recap: there are functionally distinct modules whose parts are pretty common across modules. That's the G&M view and mine. The MP trick is to show how to rebuild FL/UG from these parts. Thus, there is nothing incompatible between domain specificity and the MP program of showing how FL/UG might have evolved from (mostly) common "circuits" with the addition of very few (one, hopefully) novel bits. That's the way I see the program. We have actually discussed this in the past, so I doubt that this view of things will impress you more this time around. Oh well. It's good to have people pursuing different approaches. So, functional modularity, i.e. domain specificity, is compatible with MP. There is, so far as I can tell, no contradiction.
DeleteBut G & M are making a much stronger point, aren't they? They are claiming that domain general learning algorithms are biologically or neurally or computationally implausible or something.. and that does seem to be a different argument. Which has had some impact in the MP literature, even though it doesn't seem to be based on very strong arguments.
There is no reason to think that IP models are necessarily domain specific -- and so their arguments against A models are not arguments for domain specificity.
A supplementary: you say "(one, hopefully) novel bits" -- so I guess this is Merge, right? But here we are talking about learning mechanisms, not representational mechanisms. So what is the novel bit that allows languages to be learned? Is that domain specific or domain general?
As I understand them, they are claiming that there is functional specificity (brains are built to solve very specific kinds of cognitive problems) from similar "primitive operations." They function to extract very specific kinds of information relevant to the computation at hand. They think that the rejection of domain specificity is largely a hangover (in both senses) from earlier Associationist biases. They further argue that when one looks at bio-cognition in detail one sees tons of domain specificity. This fits perfectly well with an IP view of things, where the whole notion of information only makes sense given a domain of possibilities. At any rate, they provide detail for this. They then add that we are beginning to find dedicated low level circuitry for this at the level of brain organization wrt navigation (e.g. the discussion of place cells etc.) and that this organization is biologically provided. That's how I understand their claims.
So: are the arguments against A-models arguments for domain specificity? No. The argument is that A-models and their incarnation in terms of LTP have blinded us into thinking that the neurobiology dictates A-models. They show that this is false. They further show that IP models do not require similar assumptions and that IN FACT most of bio-cognition, at least that which we understand, is dedicated. My conclusion: there is no a priori reason for thinking that this is less true in the language domain than anywhere else, and that thinking otherwise is a regrettable residue of A-thinking. Does this mean that non-domain specificity is impossible? No. But it has no conceptual advantages either, and it looks, if G&M are right, to be on the wrong track in those areas of cognition that we have a handle on.
It's merge or something like it (I'm partial to label myself). As for your second question, I take their point to be that learning in the absence of a specification of the domain of learning is a mug's game. I largely agree with that. Indeed, though I don't have this at my fingertips (I'm on the road away from my resources), there is a cute quote from Chater, Griffiths, Tenenbaum, etc. that they cite that makes largely the same point: that the hard part is figuring out the hypothesis space; that's the real challenge. I read G&M as claiming that these will be highly articulated and very domain specific. It is consistent with this that all cognitively specific "spaces" are negotiated in the same ways (though maybe not). The point is that this will not be where the real action re learning is. Again, I am pretty sure that you do not think this is the case, and here we, you vs G&M and moi, would disagree.
In reply to Norbert's comment starting "As I understand them, they are claiming that there is functional specificity":
But there clearly is an a priori reason to think that language is different from navigation -- namely that language is very, very recent.
So I completely accept that ants may not have any domain general learning abilities: and if you look at ants you may correctly conclude that everything is domain specific. But that just does not bear on the question of whether there are domain general learning mechanisms in humans, where we have very strong reasons to think that there are -- namely our ability to learn in new domains like violin playing and baseball and calculus and making curry etc. etc.
Lack of embedded comments is, as Thomas mentioned, an inconvenience -- this is in response to Norbert's "It's merge or something like it".
DeleteYou say "I read G&M as claiming that these (Alex: the hypothesis spaces) will be highly articulated and very domain specific.".
But this is incompatible with your claim that the only domain specific representational bit is Merge. Or is the space simultaneously highly articulated, domain specific, and generated by just one mutation?
I don't see how this affects matters. First, animals can learn novel things, and they do. The question is whether in our core cognitive capacities there is domain specificity. Moreover, it seems entirely probable, at least to me, that learning in novel domains piggybacks on what we can do in our core cognitive domains. So, is language a core cognitive domain? We likely disagree on the answer to this question. If it is, as I assume (after all, we are very good at it, not unlike navigating animals at navigation), then it is reasonable to assume that its underlying system is domain specific. You repeat that this is impossible as there is not enough time for the domain specificity to have evolved. I reply that we have no idea whether this is true, for we have few (no?) models of the time it takes for a dedicated cognitive domain to evolve (again, unless you have something to point us to that addresses this). That leaves the question of how the arguments stack up in discussing the rather refined details of this capacity. I mentioned the binding theory, but there are others. When someone shows me how to derive its effects from domain general principles I'll be happy to discuss matters. Till then...
DeleteLast point:G&M contrast two kinds of learning. The A-model view is that it is adapting behavior to environmental circumstances. The IP model is learning is building representations of the experienced world. If learning is "representation centered," then having native restrictions on the representations dictates the learning problem.
Last point: sure, we learn a lot of diverse things. But an old and august explanation of this has centered our labile capacities on the fact that we can talk! And it is not clear, at least to me, that our capacities, putting these aside, are best seen in terms of general learning algorithms rather than in terms of something like Dehaene's recycling view of things, where core capacities are wedded to one another. On this conception, even where we generalize, we do so along lines made available by our dedicated core capacities.
I'll let you have the last word on this (unless you egregiously exploit my kindness (kidding!)). I know we don't agree now. But I have hope for you yet Alex C!
You are right, it is. I take UG to specify the class of possible Gs. I take merge plus other non-linguistically-parochial basic operations to define this class of Gs (e.g. the class of licit dependencies). This space, if we are lucky, is pretty constrained in that only these natively available options are realized. Now, and maybe this is where you are going, we can trade domain restriction for some evaluation metric. I have no views about this right now as I cannot figure out how to make them count in domains I know anything about. So, I buy the Aspects view of things: UG specifies the hypothesis space and/or the evaluation metric over it, and what Merge did, when combined with other cognitive operations, was allow the class of dependencies that GB roughly characterized to be the only possible/probable ones. So, merge is the magic ingredient (in NC's opinion; recall, I like label) that allowed for the domain specific characterizations of the structural restrictions on Gs that FL/UG specifies.
Norbert's conception of 'Domain Specificity' as 'a combination of mostly pre-existing facilities with perhaps one or two new ones (I think features also deserve consideration)' seems to me to be so different from the earlier monolithic one as to deserve a different name. It also deserves clear labelling as a perhaps speculative conclusion rather than a necessary assumption.
In the bad old days, when there really were people who believed that there was a Universal Learning Facility, DS did have a role in providing a license for linguists to look for strange facts about language that could provide evidence about what was on the child's cheat sheet for acquiring it. But nobody really believes in a ULF anymore, and we really do have a reasonable number of items to put on the cheat sheet (e.g., for Christina, it really does seem to be a universal that if a pronoun precedes and commands an NP with a lexical head, they are not coreferential; this is a part of Principle C that really works in all languages, afaik; other parts of the binding theory are trickier and subject to apparent typological variation).
Two questions and a comment:
1. Is it possible to post a link to G&M that does not require me to dish out $20?
2. The main argument seems to be: we have either A models [which are not domain specific, NDS] or IP models [which are domain specific, DS]. A models are bad, therefore a good model must be domain specific. This would of course only follow if A models were the only possible NDS models. How has this been established?
My comment: I do not think that misguided empiricism/behaviourism is the main motivation for proposing NDS models. The main motivation comes from evolution: tinkering with existing structures and recruiting such structures for new purposes is a lot less 'costly' [in terms of evolutionary book-keeping] than generating novel structures AND maintaining these for a single [DS] purpose. So even if it were true that Merge were the result of a single mutation, this would not explain why this novel structure did not get exapted for new purposes but remained DS over a fairly long period of time [compare how little time it took for lactose tolerance to 'evolve']. An earlier link by Norbert to [https://www.simonsfoundation.org/quanta/20130904-evolution-as-opportunist/] seems to suggest he accepts that evolution does not result in DS by default... And, needless to say, I share Alex's scepticism about the usefulness of analogies from ant or rat navigation...
Copy the article title from the $20 page into a browser - and you get it.
It seems to me that the real line between Good and Evil is between people who think that the hypothesis space has some structure that is relevant and can be investigated (even if they think that some implicit procedural characterization that they're working on is the best way to do this), and those who don't get this issue at all, or consider it senseless, completely irrelevant, or otherwise hopeless.
ReplyDeleteWith the first kind of person, you can apply data and thought to the issue of how best to describe the structure of the hypothesis space, whereas with the second, the only option is to have more alcohol and talk about something else.
I see no point in assuming that the hypothesis space for knowledge of language has a rich, innate, domain specific structure unless you have some techniques for finding out what it is, but then, given these, you don't have to assume anything, but just investigate various areas and find out (as Ewan said, a few days ago, in a reply, if I understood him correctly).
I read part of the G & M paper because I was interested in the arguments linking the debate about A/IP with the debate about domain-specificity;
the first being the classic PDP versus Turing machines, symbolic processing versus subsymbolic processing debate that was hashed over fairly thoroughly in the 80s and 90s, and the second being, for me, the more interesting question.
And G & M have some good arguments against the naive neural/associationist/behaviorist view, which I repudiate as well.
But unfortunately G & M don't actually provide any arguments to link these two arguments: they just assert it without reasons.
Unless I am missing something?
The relevant passage is the one that Norbert quotes:
"
Framing learning problems as computational problems leads to the postulation of domain-specific learning mechanisms (Chomsky 1975, Gallistel 1999) because no general-purpose computation could serve the demands of all types of learning.
"
But that is it. There is no further argument. If someone could explain this argument that would be great.
(And the claim is false, at least as far as I understand it. The general computation of the inner product in a Hilbert space can be the basis for a general learning algorithm like SVMs.)
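To make the inner-product point concrete, here is a minimal sketch (an editorial illustration added here, assuming numpy; it is not from the paper or the thread) of a kernel perceptron: the learner touches the data only through an inner product (a kernel), so the identical routine runs over any domain for which a kernel is supplied.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Inner product in the (implicit) Hilbert space induced by the RBF kernel.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_perceptron(X, y, kernel=rbf_kernel, epochs=20):
    # Dual perceptron: alpha[j] counts how often training point j corrected an error.
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            score = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n))
            if np.sign(score) != y[i]:  # misclassified (or undecided): update
                alpha[i] += 1.0
    return alpha

def predict(X_train, y_train, alpha, x, kernel=rbf_kernel):
    return np.sign(sum(alpha[j] * y_train[j] * kernel(X_train[j], x)
                       for j in range(len(X_train))))

# Toy usage: XOR, which no linear rule over the raw inputs can capture.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., 1., 1., -1.])
alpha = kernel_perceptron(X, y)
print([predict(X, y, alpha, x) for x in X])  # expected: [-1, 1, 1, -1]
```

The point of the sketch is only that the learning rule itself is domain neutral; whatever domain specificity there is lives in the choice of kernel/feature representation.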
Then, to make matters worse, one of the *domains* they consider is probabilistic learning! So maybe they are using the word "domain" in some completely different way from the way that Norbert and I do.
And then on p. 176 they completely botch Bayes' rule. Look at the formula they provide and try to make sense of it: they multiply when they should divide, or something, and the notation is completely wrong. I guess this was just a proofing fail, but it doesn't inspire confidence.
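For reference (an editorial addition; this is the textbook form, not the formula as printed in G&M), Bayes' rule is

$$ P(H \mid D) \;=\; \frac{P(D \mid H)\,P(H)}{P(D)}, $$

i.e. the likelihood times the prior, divided by the marginal probability of the data.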
I didn't find this paper helpful in clarifying the issues at all.
"So maybe they are using the word 'domain' in some completely different way from the way that Norbert and I do."
DeleteThe intended meaning of "domain" in this context has never been particularly clear to me. Suppose you have an algorithm for sorting lists. In a sense it is domain general because it really works for any kind of list: Top 10 lists, grocery list, the phone book, lists of lists, it just doesn't matter what the elements are. At the same time it is domain specific because, well, it only works for lists. So what is it? Specific? General?
The same thing is true for learning. What matters for learning is the structure of the hypothesis space and that this structure allows for sound inference steps from a finite sample. But what exactly the objects in the space represent isn't all that important. So in a sense learning algorithms are domain general since they can be used for a wide range of learning problems, but they are also domain specific because they only work for certain problem spaces.
I think you have to decide on a case by case basis.
One way of thinking about it that seems to have had some traction in the past is to think of learning as based on propositionally represented prior knowledge about the domain -- so then one can ask what those 'facts' are about. For example, suppose you know in advance that the syntactic categories of the language are v,V,D, ... then this is a fact about language and therefore specific to the domain. If you think in terms of Bayesian learning then the term prior implies prior knowledge and seems to lead to this way of thinking.
But I think you need to have a concrete proposal before you can evaluate whether it is domain specific. I think the better strategy is to figure out some good theories, and *then* argue about how domain specific they are.
But one of the key thresholds for me has always been whether the syntactic categories (which clearly are domain specific) are built in or learned, which is why I keep on asking whether they are universal or not.
I think a case is building up that at the top level they are closer to universal than 'diversity linguists' such as Martin Haspelmath and Bill Croft are currently claiming. There really always do seem to be things that one could reasonably call 'nouns' and 'verbs', and also adjectives, but these are problematic because they sometimes seem more like nouns (Greek), other times more like verbs (Indonesian), or split into two kinds going both ways (Japanese). 'Adverbs' are a complete nightmare because they seem to split into a maze of subcategories. There also seem to need to be subcategories, so a related question is whether these can be split indefinitely finely (what Maurice Gross 1979 claimed in his 'Failure of Generative Grammar' paper), or whether there's a limit where real grammatical category splits end and something else takes over.
And what, incidentally, is the difference between real 'Parts of Speech' and features such as grammatical gender, which do in a sense split the noun category, but show a lot more variation across languages than the top-level Part-of-Speech categories do?
"You are right, it is. I take UG to specify the class of possible Gs." [Norbert] Do you really want to define it as doing only that, rather than also possibly imposing soft biases (aka an evaluation metric).
In an earlier remark I conceded your point. So evaluation metrics are fine.
As I thought, but I'm being hyper-fussy about formulation here because of the big potential for misunderstanding and consequent misrepresentation.
And quite right too, given earlier discussions.