In the previous post (here),
I showed how Frankland and Greene (F&G) identifies a role-sensitive region of cortex
and sub-areas within that region that are differentially sensitive to the doer
and done-to roles. In other words, if correct, F&G offers a hypothesis
about where roles like doer and done-to get coded. Finding a region sensitive
to thematic parameters would be a useful contribution given our vast ignorance
concerning the brain bases of anything (see here, discussed here).
Let me repeat this loudly lest I not be heard: FINDING A REGION SENSITIVE TO THEMATIC PARAMETERS WOULD BE A USEFUL
CONTRIBUTION GIVEN OUR VAST IGNORANCE CONCERNING THE BRAIN BASES OF ANYTHING.
However, F&G claims to do a whole lot more than this. Here I want to
consider if it does do more. So the question for what follows: does F&G
explain how the brain codes thematic
information as it appears to claim to do?
No. Not really. The paper may
have identified a region that correlates with role information, but F&G’s
claim that it explains how brains code such information
seems to me quite overblown.[1] Here’s what I mean.
What would it mean to show how brains code such information?
F&G tells us. In the abstract, it takes its discovered empirical results to
support the following claim:
At a
high level, these regions may function like topographically defined data
registers, encoding the fluctuating values of abstract semantic variables. This
functional architecture, which in key respects resembles that of a classical
computer, may play a critical role in enabling humans to flexibly generate
complex thoughts.
What’s this mean? Those familiar
with earlier critiques of connectionism should recognize the allusions. People
like Fodor and Pylyshyn, Marcus, and Gallistel argued that brains had a Turing
rather than a connectionist architecture. They provided various arguments for
this, including observations about the systematicity of cognition (in
particular in language), which makes perfect sense if one assumes that brains
embody read/write memories, with variables and the valuation of variables being
key elements. Most of the arguments
provided were behavioral (though see Gallistel for more direct arguments that
brains cannot be connectionist either). F&G is clearly pointing to these
claims in the abstract above (indeed, Fodor and Pylyshyn, Marcus and Pinker are
noted in the bibliography in relation to this). So, F&G clearly intends its
results to be an argument in favor of Turing architectures and a challenge for
connectionist architectures. However, if this is the intent, I don’t see that
F&G’s argument adds anything to the earlier behavioral arguments. Why not?
F&G notes that its results
are consistent with Turing
architectures, but then so are most connectionist models so far as I can tell.
There is nothing in these models that prevents
the hidden layers (appropriately tuned) from isolating doer and done-to roles.
Indeed, this is regularly done in such models for other abstract categories.
So, if F&G intends to use its
results to argue for classical architectures, then it is unclear to me what it
has actually added to the arguments advanced by Fodor & Pylyshyn, Marcus or
Gallistel. Note, I have nothing against the conclusion that connectionist architectures
are bad neural models (less coyly: I am pretty confident that connectionist
architectures suck). What I don’t see is that F&G adds anything to the
previous arguments. I would go further (as you probably knew I would). The
concluding discussion section of F&G notes that there is a “class of models
that use matrix operations to combine spatially distributed representations
into conjunctive representations…that could potentially be augmented…[to]
encode conjunctive representations for distinct semantic roles” (11737). For
the uninitiated, this is connectionist speak. In other words, as F&G notes,
its results do not argue against a
connectionist conception in favor of a more classical Turing view. Or more
correctly, the F&G results do not add
anything to the earlier (completely compelling) arguments. So, if
F&G intends its “how” contribution to consist in an argument for a classical architecture and against a connectionist one, then, by
its own admission, it fails.[2]
What else could the “how” mean?
Another possible contrast is between the kinds of codes the brain uses to track
information; in particular, does the brain use a place code or a rate code to
track doers and done-tos? Let me expand a bit.
One line of thinking (that
F&G says its results endorse) exploits geography to code information:
“functional segregation corresponding to spatial segregation” and binding of
variables to values executed by bringing the two into spatial proximity. This
contrasts with another view wherein binding is signaled through temporal
proximity (synchronization) rather than spatial. F&G claims that its
results (my emphasis)
“…suggest
that such temporal correlations may be unnecessary in this case
because the bindings may instead be encoded through the
instantiation of distributed patterns of activity in spatially dissociable
patches of cortex devoted to representing distinct semantic variables”
(11736).
However, as the paper notes, and
the highlighted mealy-mouthed modals indicate, this conclusion is not particularly
well supported by its experiments. Or, more correctly, F&G’s tools
preclude a strong choice between the two.
As F&G notes, the hunt was conducted using fMRI, and because fMRI
has limited temporal resolution (on the order of 1000 ms), fMRI probes cannot
generally “see” rate codes. The best that F&G can conclude is that because it was able to localize roles in
geographically proximate yet distinct locales, this suggests that a place coding of roles might be right, though not to the exclusion of rate codes. The logic
is that place codes require segregated (proximate?) geography and this was
found. Hence the finding supports the claim that for role information the brain
uses a place code. But this conclusion does not follow. To establish it firmly
one needs the converse: if segregated regions then place code. But this is not
obviously true. Moreover, and here I am asking, do neuro people believe that
anytime they can localize functions in different (nearby) places, this is
evidence for place codes? Sounds wrong to me, but, hey, I don’t do this.[3]
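To make the resolution worry concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration (the spike times, the 5 ms coincidence window, the function names); it is not F&G's analysis, and fMRI measures a blood-flow signal rather than spike counts. The point is only that two pairs of units can look identical when activity is summed over 1000 ms bins while differing completely in their fine-grained timing.

```python
# Toy illustration only: hypothetical spike times in milliseconds over 2 seconds.
BIN_MS = 1000        # assumed coarse temporal resolution (the ~1000 ms figure above)
DURATION_MS = 2000   # assumed length of the recording window


def bin_counts(spike_times_ms, bin_ms=BIN_MS, duration_ms=DURATION_MS):
    """Count spikes falling in each coarse time bin."""
    counts = [0] * (duration_ms // bin_ms)
    for t in spike_times_ms:
        counts[t // bin_ms] += 1
    return counts


def coincidences(a, b, window_ms=5):
    """Count spikes in a that have a partner in b within window_ms."""
    return sum(1 for t in a if any(abs(t - u) <= window_ms for u in b))


# Unit A fires every 200 ms.
unit_a = list(range(100, 2000, 200))
# One hypothetical partner fires 2 ms after A (tightly synchronized) ...
synchronized_b = [t + 2 for t in unit_a]
# ... another fires 50 ms after A (same rate, no tight synchrony).
unsynchronized_b = [t + 50 for t in unit_a]

print(bin_counts(synchronized_b))              # [5, 5]
print(bin_counts(unsynchronized_b))            # [5, 5]
print(coincidences(unit_a, synchronized_b))    # 10
print(coincidences(unit_a, unsynchronized_b))  # 0
```

Both partners yield the same coarse counts, so a probe working at that grain cannot tell the synchronized case from the unsynchronized one. That is all the sketch shows; it says nothing about which code the brain actually uses.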
I should add that the second
experiment is the crucial one for this conclusion, and it is less robust than
the first as F&G notes. The bifurcation of lmSTC into doer and done-to
areas is quite subtle empirically and some of the participants in the UMD
discussion thought that the data here was quite brittle. Again, this is beyond
my pay grade.
F&G, then, really says very
little (if anything) about the how
question. In fact, it never really addresses it except tangentially. “Where?,”
not “how?”, is what F&G addresses.
Let me squawk about this for a moment.
IMO, neuro types often confuse
how does X work with where is X located. Why they think answering one answers
the other I do not know. I don’t object to the claim that knowing where things
are in the brain might be/is likely to be a good first step in
figuring out how the brain does what it does. But reading F&G (and this
paper is hardly unique) leads me to think that CNers can’t tell the difference
between where and how. And this is a
problem.
One consequence of the confusion
is that it denigrates the cognitive work that it presupposes. F&G relies on
an unanalyzed conception of thematic
roles. In fact, it relies on a truism: that sentences like John saw Mary do not mean the same as Mary saw John and that the difference has something to do with the
fact that what the first sentence says about John and Mary
effectively reverses what the second sentence says
about them. This is a truism, or as
close to one as might be imagined.
However, as any linguist knows, there are many different theories to
explain how this truism is true. Some
exploit theta roles, some grammatical roles, some the internal/external
distinction, some first vs second merge, some predicate argument structure with
1st and 2nd argument positions of a predicate, some Deep
Structures, some kernel sentences, etc. When a linguist asks how thematic information is represented,
s/he means how can we distinguish between these apparently different conceptions
all of which code/represent the observed
doer/done-to difference. F&G cannot tell us which of these is right,
nor does it intend to. This “how?” question is beyond the technical reach of
current neuro apparatus. That’s not a
criticism. Here is the criticism: by confusing where with how, F&G
continues the tradition of treating distinctions beyond the range of its probes
as non-questions, rather than as questions beyond the resolution of its
methods. The fact is that cognitive probes into the structure of brains are right now far more powerful than the
currently most fashionable technology in neuro-science. fMRI might generate
pretty pictures, but it's a pretty coarse technology. Right now, behavioral
methods generally allow us to probe brain structure in a far more refined way
than neuro methods do. That CN technology cannot usefully probe well motivated
behaviorally based claims is what we should expect, and is what we find.
A second feature of the
where/how confusion is that it leads one to abstract away from the most serious
question in the neuro-sciences. Call it Gallistel’s question: how do brains
embody mental constructs? For example, how
does wetware code for a variable or a value thereof? How do brains read and
write to memory, bind a variable, distinguish between types and tokens? Nobody knows. In fact, as Gallistel has
observed, most CNers don’t even understand that this is the “how?” question
that needs addressing (see here for discussion). The cognitive literature, including
that in linguistics, has shown that we need these notions. Much of current
neuroscience assumes that brain architectures that cannot do any of this
(indeed that apparently deny, if Gallistel is right, that brains ever do this)
are serviceable. This is partly abetted by the fact that current thinking fails
to distinguish where from how. F&G is another example of this wider
confusion.
I could go on, but I won’t.
F&G makes a contribution: it identifies one possible place where role
information in some sense (however it is represented and whether it is
specifically linguistic or not) might live. Given the current state of
neuroscience, this is not nothing. However, the paper’s rhetoric (BS really) is
way over the top. The introduction and conclusion motivate the investigation by
pointing to really big issues (in particular recursion and Turing
architecture). It purports to address these issues but in truth it can’t. The
results are neutral wrt them. In the process, F&G sows lots of confusion
and makes lots of simple errors, thereby making it hard to find the useful
kernel in the morass. This leads me to one final observation.
I have heard it argued that
without the overstatement and the BS the paper could never have been published.
This is sometimes said in apparent justification of the BS and hype. If so,
neuroscience is in really bad shape. Moreover, I am skeptical that the hype is
necessary, though I am sure that even if it is, it is odious to sling it
nonetheless. Let me vent.
First, I doubt that a more
measured presentation would have prevented publication. The result is not
trivial and could have been presented as relevant to finding where
linguistically/conceptually important concepts live in brain tissue.
Second, wanting to get published
is no excuse for BS. This is not show business. BS goes against the fundamental
values of the scientific enterprise and should not be tolerated, even if it
might be useful career-wise.[4] The big problem is that
such BS is fast becoming part of standard practice. And like all S it greases a slippery slope:
BS facilitates publication, we become more indulgent towards it and this will
serve to further BSify research and publication. There is no excuse for this,
or at least not one that should pass the smell test (and BS does smell).
Whatever F&G has told us about brains, it is mired in overstatement and
self-promotion. That’s the main reason many have reacted so strongly, and
rightly so.[5]
And that’s too bad because F&G does have something to tell us of interest.
[1]
Steve Pinker’s tweet highlights these F&G ambitions as well. It reads: “The
most important paper in cognitive neuroscience in many years: How does the
brain represent who did what to whom.” Note the “how.” I wonder if the tweet
would have had the same impact if we replaced ‘how’ with ‘where.’ I can’t tell,
though I think that the howish version sounds far more interesting. And this is
exactly the problem.
[2]
In the discussion section, F&G observes relations between its results and
some previous findings in the literature. An interesting one relates to deficit
studies showing that insult to the lmSTC results in “who did what to whom”
problems for stimuli presented aurally and visually. This suggests the
possibility, as F&G notes, that this area is not linguistically dedicated.
In other words, this area might be part of an “amodal language of thought.” If
this is so, it might be interesting to see if analogous areas in non-linguistically
endowed animals can similarly discriminate doers from done-tos. This might even have some interesting
linguistic significance concerning the theoretical utility of theta roles as
discussed here.
F&G leaves the linguistic status of lmSTC for future research. Hope it gets
done.
[3]
Also, how important is the proximity? Say that doers were found in one area and
done-tos were found several sulci away. Would this be a problem for place
codes? I don’t know. At any rate, the relation between being localizable and
being place coded strikes me as looser than F&G suggests. In fact, I could
imagine that even were rate codes employed to code some functional feature the
sources generating the relevant rates might nonetheless localize somewhat. I don’t know that this is so, but nothing
F&G says leads me to think that this is impossible or even false. So a
question to cognoscenti: is this inference from localizable to place code legit?
[4]
IMO, BS is the most corrosive feature of much current research. As Frankfurt
has argued, it might be even worse than lying for unlike the latter it has no
regard for truth whatsoever. Stan Dehaene was the editor for the paper and he
should really have removed this BS from the paper. He knows better.
[5]
BTW, F&G does not get its BS right either. See the box marked
“Significance” on the first page of the paper. It suggests that the problem of
theta roles is the same as the problem of recursion. This is false. The roles
that F&G addresses have nothing to do with Humboldt’s making infinite use
of finite means. Here we have a finite set of possible sentences templatically
specifiable wrt roles of two arguments. Recursion gives you sentences with many doers and many done-tos, in fact
unboundedly many. F&G has nothing to say about where the brain codes this.