I just read a very interesting shortish paper by Dehaene and
associates (Dehaene, Meyniel, Wacongne, Wang and Pallier (DMWWP)) that appeared
in Neuron. I did not find an open
access link, but you can use this
one if you are university affiliated. I recommend it highly, not least because
Neuron is a very
fancy journal and GG gets very good press there. There is a rumor running
around that Cog Neuro types have dismissed the findings of GG as of little
interest or consequence to brain research. DMWWP puts paid to this and notes,
quite rightly, that the problem lies less with GG than with the current state
of brain science. This is a decidedly Gallistel-inspired theme: the cog
part of cog-neuro is in many domains (e.g. language) healthier and more
compelling than the neuro part, and it is time for the neuro types to pay
attention and try to find mechanisms adequate for dealing with the well-grounded
cog stuff that has been discovered, rather than assume it must be false
because the inadequate and primitive neuro models (i.e. neural
net/connectionist ones) have no way of dealing with it. The more places this
gets said, the greater the likelihood that CN types will pay attention. So this
is a very good piece for the likes of us (or at least me).
The goal of the paper is to get Cog-Neuro Science (CNS)
people to start treating the integration of behavioral, computational and neural
levels of description as the field's central concern. Here is the abstract:
A sequence of images, sounds, or words can be stored
at several levels of detail, from specific items and their timing to abstract
structure. We propose a taxonomy of five distinct cerebral mechanisms for
sequence coding: transitions and timing knowledge, chunking, ordinal knowledge,
algebraic patterns, and nested tree structures. In each case, we review the
available experimental paradigms and list the behavioral and neural signatures
of the systems involved. Tree structures require a specific recursive neural
code, as yet unidentified by electrophysiology, possibly unique to humans, and
which may explain the singularity of human language and cognition.
I found
the paper interesting in at least three ways.
First,
it focuses on mechanisms, not phenomena. So, the paper identifies five kinds of
basic operations that reasonably underlie a variety of mental phenomena and
takes the aim of CNS to be (i) finding where in the brain these operations are executed,
(ii) providing descriptions of circuits/computational operations that could
execute such operations, and (iii) investigating how these circuits might be/are
neurally realized.
Second,
it shows how phenomena can be and have been used to probe the structure of
these mechanisms. This is very well done for the first three kinds of
mechanisms: (i) approximate timing of one item relative to the preceding one,
(ii) chunking items into larger units, and (iii) the ordinal ranking of items.
Things get more speculative (in a good way, I might add) for the more
“abstract” operations: the coding of “algebraic” patterns and nested generated
structures.
Third,
it gives you a good sense of the kinds of things that CNS types want from
linguistics and why minimalism is such a good fit for these desires.
Let me
say a word about each.
The
review of the literature on coding time relations is a useful pedagogical case.
DMWWP reviews the kind of evidence used to show that organisms “maintain
internal representations of elapsed time” (3). It then looks for “a
characteristic signature” of this representation and the “killer” data that
supports the representational claim. It then reviews the various brain
locations that respond to these signature properties and the kind of
circuit that could code this kind of representation, arguing that “predictive
coding” (i.e. circuits that “form an internal model of input sequences”) is the
right one in that it alone accommodates the basic behavioral facts (4)
(basically mismatch negativity effects without an overt mismatch). Next, it
discusses a specific “spiking neuron model” of predictive coding (4) that
“requires a neurophysiological mechanism of “time stamp” neurons that are tuned
to specific temporal intervals,” which have,
in fact, been found in various parts of the brain. So, in this case we get the
full Monty: a task that implicates signature properties of the mechanism, that
demands certain kinds of computational circuits, realized by specific neuronal
models, realized in neurons of a particular kind, found in different parts of
the brain. It is not quite the Barn Owl (see here), but it is very very good.
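To make the time-stamp idea concrete, here is a toy sketch of my own (in Python; this is not DMWWP's spiking model, and the 0.5 s learned interval, the Gaussian tuning, and the tuning width are all my assumptions): units tuned to preferred elapsed intervals carry the prediction, and an omitted-but-expected stimulus yields a prediction error, i.e. a mismatch response without any overt mismatch.

```python
import numpy as np

# Hypothetical population of "time stamp" units, each tuned (Gaussian
# tuning, an assumption) to a preferred elapsed interval in seconds.
preferred = np.linspace(0.1, 1.0, 10)
sigma = 0.1  # tuning width (s); an assumption

def tuning(t):
    """Population activity of the time-stamp units at elapsed time t."""
    return np.exp(-0.5 * ((t - preferred) / sigma) ** 2)

# Learned expectation: suppose stimulus B reliably follows stimulus A after
# 0.5 s, so the readout weights match the pattern evoked at that interval.
w = tuning(0.5)

def prediction(t):
    """How strongly the internal model predicts an input at elapsed time t."""
    return float(w @ tuning(t)) / float(w @ w)

def omission_response(t, stimulus_present):
    """Prediction error: expectation unmet by input. This is the signature
    'mismatch without an overt mismatch' when the stimulus is omitted."""
    return max(prediction(t) - float(stimulus_present), 0.0)

for t in (0.25, 0.5, 0.75):
    print(t, round(omission_response(t, stimulus_present=False), 2))
# The error peaks at t = 0.5 s, the moment B was expected but never arrived.
```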
DMWWP does
this more or less again for chunking, though in this case “the precise neural
mechanisms of chunk formation remain unknown” (6). And then again for ordinal
representations. Here there are models for how this kind of information might
be neurally coded in terms of “conjunctive cells jointly sensitive to ordinal
information and stimulus identity” (8). These kinds of conjunctive neurons seem
to be all over the place, with potential application, DMWWP suggests, as
neuronal mechanisms for thematic saturation.
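For concreteness, here is a minimal sketch (again mine, not a model from the paper; the item inventory and one-hot coding are illustrative assumptions) of what a conjunctive rank-by-identity code looks like: each unit fires only for a particular (rank, item) pair, so sequences containing the same items in different orders get different codes.

```python
import numpy as np

ITEMS = ["A", "B", "C", "D"]  # hypothetical stimulus inventory
MAX_RANK = 4

def conjunctive_code(sequence):
    """Entry [r, i] is the activity of the unit conjointly tuned to serial
    rank r and item ITEMS[i]; order is carried by which rank-item
    conjunctions are active, not by the items alone."""
    code = np.zeros((MAX_RANK, len(ITEMS)))
    for rank, item in enumerate(sequence):
        code[rank, ITEMS.index(item)] = 1.0
    return code

# Same items, different orders -> different activity patterns, so a
# downstream readout can recover ordinal position.
print(conjunctive_code(["A", "B", "C", "D"]))
print(conjunctive_code(["B", "A", "D", "C"]))
```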
The last
two kinds of mechanisms, those that would be required to represent algebraic
patterns and hierarchical tree-like structures, are behaviorally very well-established
but currently pose very serious challenges on the neuro side. DMWWP observes
that humans, even very young ones, demonstrate amazing facility in tracking
such patterns. Monkeys also appear able to exploit similar abstract structures,
though DMWWP suggests that their algebraic representations are not quite like
ours (9). DMWWP further correctly notes that these sorts of patterns and the
neural mechanisms underlying them are of “great interest” as “language, music
and mathematics” are replete with such. So, it is clear that humans can deploy
algebraic patterns which “abstract away from the specific identity and timing of
the sequence patterns and to grasp their underlying pattern,” and maybe other
animals can too. However, to date there is “no accepted neural network
mechanism” to accomplish this, and it looks like “all current neural network
models seem too limited to account for abstract rule-extraction abilities” (9).
So, the problem for CNS is that it is absolutely clear that human (and maybe
monkey) brains have algebraic competence, though it is completely unclear how
to model this in wetware. Now, that is the right way to put matters!
This
last reiterates conclusions that Gallistel and Marcus have made in great detail
elsewhere. Algebraic knowledge requires the capacity to distinguish variables
from values of variables. This is easy to do in standard computer architectures
but is not at all trivial in connectionist/neural net frameworks (as Gallistel
has argued at length (e.g. see here)). Indeed, one of Gallistel’s main arguments against
such neural architectures is their inability to distinguish variables from
their values, and to store them separately and recall them as needed. Neural nets
don’t do this well (e.g. they cannot store a value and later retrieve it), and
that is the problem because we do and we do it a lot and easily. DMWWP
basically endorses this position.
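A toy example makes the point (my illustration, in the spirit of Marcus's ABB/ABA infant experiments rather than anything in DMWWP): an algebraic rule is stated over variables, so it is a one-liner in a symbolic architecture and generalizes immediately to items never seen in training.

```python
def matches_abb(seq):
    """True iff the three-item sequence instantiates X-Y-Y for ANY values
    of the variables X and Y (with X distinct from Y)."""
    x, y1, y2 = seq
    return x != y1 and y1 == y2

# The rule generalizes to novel syllables because it quantifies over
# variables, not stored values.
print(matches_abb(["ga", "ti", "ti"]))   # True: an ABB instance
print(matches_abb(["wo", "fe", "wo"]))   # False: ABA, not ABB
```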
The last
mechanism required is one sufficient to code the dependencies in a nested tree.[1] One of the nice things
about DMWWP is that it recognizes that linguistics has demonstrated that the
brain codes for these kinds of data structures. This is obvious to us, but the
position is not common in the CNS community and the fact that DMWWP is making
this case in Neuron is a big deal. As
in the case of algebraic patterns, there are no good models of how these kinds
of (unbounded) hierarchical dependencies might be neurally coded. The DMWWP
conclusion? The CNS community should start working on the problem. To repeat,
this is very different from the standard CNS reaction to these facts, which is
to dismiss the linguistic data because there are no known mechanisms for
dealing with it.
Before
ending I want to make a few observations.
First,
this kind of approach, looking for basic computational mechanisms that are
implicated in a variety of behaviors, fits well with the aims of the minimalist
program (MP). How so? Well, IMO, MP has an immediate theoretical goal: to
show that the standard kinds of dependencies characteristic of linguistic
competence are all different manifestations of the same underlying mechanism
(e.g. are all instances of Merge). Were it possible to unify the various
modules (binding, movement, control, selection, case, theta, etc.) as different
faces of the same Merge relation, and were we able to find the neural “merge”
circuit, then we would have found the neural basis for linguistic competence. So
if all grammatical relations are really just ones built out of merges, then
CNSers of language could look for these and thereby discover the neural basis
for syntax. In this sense, MP is the kind of theory that CNSers of language
should hope is correct. Find one circuit and you’ve solved the basic problem. DMWWP
clearly has bought into this hope.
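To show how little machinery the hoped-for circuit needs at the computational level, here is a bare-bones sketch of Merge as a single recursive structure-building operation (an idealization of the MP idea, certainly not code from DMWWP; the example sentence and the binary Tree type are my choices):

```python
from dataclasses import dataclass
from typing import Union

Node = Union["Tree", str]  # a node is a word or a previously built tree

@dataclass(frozen=True)
class Tree:
    left: Node
    right: Node

def merge(a: Node, b: Node) -> Tree:
    """The single structure-building operation: combine two syntactic
    objects (lexical items or prior merges) into a new, nested object."""
    return Tree(a, b)

# Re-applying the one operation to its own outputs yields unbounded
# hierarchical structure: [[the boy] [saw [the girl]]]
vp = merge("saw", merge("the", "girl"))
s = merge(merge("the", "boy"), vp)
print(s)
```

One operation, applied recursively, generates the nested trees; that is why finding one "merge circuit" would go such a long way.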
Second,
it suggests what GGers with cognitive ambitions should be looking for
theoretically. We should be trying to extract basic operations from our
grammatical analyses as these will be what CNSers will be interested in trying
to find. In other words, the interesting result from a CNS perspective is not a
specification of how a complicated set of interactions work, but isolating the
core mechanisms that are doing the interacting. And this implies, I believe,
trying to unify the various kinds of operations and modules and entities we
find (e.g. in a theory like GB) into a very small number of core operations (in
the best case just one). DMWWP’s program aims at this level of grain, as does
MP, and that is why they look like a good fit.
Third,
as any MPer knows, FL is not just
Merge. There are other operations. It is useful to consider how we might
analyze linguistic phenomena that are Merge recalcitrant in these terms.
Feature checking and algebraic structures seem made for each other. Maybe
memory limitations could undergird something like phases (see DMWWP discussion
of a Marcus suggestion on p. 11 that something like phases chunk large trees
into “overlapping but incompletely bound subtrees”). At any rate, getting
comfortable with the kinds of mental mechanisms extant in other parts of
cognition and perception might help linguists focus on the central MP question:
what basic operations are linguistically proprietary? One answer is: those
operations required in addition to
those that other animals have (e.g. time interval determination, ordinal
sequencing, chunking, etc.).
This is
a good paper, especially so because of where it appears (a leading brain
journal) and because it treats linguistic work as obviously relevant to the CNS of language. The project is basically
Marr’s, and unlike so much CNS work, it does not try to shoehorn cognition
(including language) into some predetermined conception of neural mechanism
which effectively pretends that what we have discovered over the last 60 years
does not exist.
[1]
DMWWP notes that the real
problem is dependencies in an unbounded
nested tree. It is not merely the hierarchy, but the unboundedness (i.e.
recursion) as well.
I like the way Dehaene et al. pose the questions of representation and computation, derived from linguistics and Minimalism, as challenges for neuroscience. Super job on that point. I do have some concerns about their characterization of the language network in the brain.
It's overwhelmingly clear that Broca's area / IFG cannot be critical or fundamental for sentence processing. If you destroy that area, people are pretty much fine with respect to basic sentence comprehension and acceptability judgments (Mohr et al., 1978; Linebarger et al., 1983). They do not cite any of this (old) literature, because it's obviously a massive red flag for their account. There may be ways to address this literature and preserve what they're saying, but they don't even try to raise the issue. This is a problematic oversight.