I was going to do something grand in praise of the paper I
mentioned in an earlier post by Nai Ding, Lucia Melloni, Hang Zhang, Xiang Tian
and David Poeppel (DMZTP) in
Nature Neuroscience
(here).
However, wiser heads have beaten me to the punch (see the comments sections here).
Still, as Morris Halle once noted, we here discuss not the news but the truth,
and with a mitigated version of this dictum in mind, I want to throw in my 2
cents (which in Canada, where I am writing this now, would amount to exactly 0
cents, given the recent abandonment of the penny (all cash amounts are rounded to the
nearest 5 cents)). So here is my summary judgment (recall, I AM NO EXPERT IN THESE
MATTERS!!!). It is the best neurolinguistics paper I have ever read. IMO, it
goes one step beyond even the best neuro-ling papers in outlining a possible
(as in ‘potential’) mechanism for a linguistically relevant phenomenon. Let me
explain.
The standard good
neuroling paper takes linguistically motivated categories and tries to localize
them in brain geography. We saw an example of this in the Frankland and Greene
paper wrt “theta roles” (see here
and here)
and in the Pallier et al. paper for Merge (see here).
There are many other fine examples of this kind of work (see comment section here
for many other good references)[1].
However, at least to me, these papers generally don’t show (and don’t even
aim to show) how brains accomplish
some cognitive task but try to locate where in the brain it is being
discharged. DMZTP also plays the brain geography game, but aims for more. Let
me elaborate.
DMZTP accomplishes several things.
First, it uncovers brain indices of hierarchy building. How
does it do this? It isolates a brain measure of on-line sentence parsing, a
measure that “entrains” to (correlates with) linguistically relevant
hierarchy independently of prosodic and
statistical properties of the input. DMZTP assumes, as any sane person
would, that if brains entrain to G relevant categories during comprehension
then these brains contain knowledge of the relevant categories and structures.
In other words, one cannot use knowledge that one does not have (cannot entrain
to data structures that are not contained in the brain). So, the paper provides
evidence that brains can track
linguistically significant categories and rationally concludes that the brain does so whenever confronted with
linguistic input (i.e. not only in artificial experimental conditions required
to prove the claim, but reflexively does this whenever linguistic material is
presented to it).
Showing this is no easy matter. It requires controlling for
all other sorts of factors. The two prominent ones that DMZTP controls for are
prosodic features of speech and the statistical properties of sub-sentential
inputs. Now, there is little doubt that speech comprehension exploits both
prosodic and statistical factors in parsing incoming linguistic input. The majority
opinion in the cog-neuro of language is that such features are all that the
brain uses. Indeed, many assume that brains are structurally incompatible with
grammatical rules (you know, neural nets don’t do representations) that build
hierarchical structures of the kind that GGers have been developing over the
last 60 years. Of course, such skepticism is ridiculous. We have scads of
behavioral evidence that linguistic objects are hierarchically organized and
that speakers know this and use this on line.[2]
And if dualism is false (and neuro types love to rail against silly Cartesians
who don’t understand that there are no ghosts (at least in brains)), then this
immediately and immaculately implies that brains
code for such hierarchical dependencies as well.[3]
DMZTP recognizes this (and does not
interpret its results Frankland&Greenishly, i.e. as finally establishing some
weak-kneed, hare-brained linguist’s conjecture). If so, the relevant question
is not whether this is so, but how it is, and this resolves into a series of
other related questions: (i) What are the neural indices of brain sensitivity
to hierarchy? (ii) What parts of the brain generate these neural markers? (iii)
How is this hierarchical information coded in neural tissue? and (iv) How do
brains coordinate the various kinds of linguistic hierarchical information in
online activities? These are hard questions. How does DMZTP contribute to
answering them?
DMZTP shows that different brain frequencies track three
different linguistically relevant levels: syllables, phrases and sentences. In
particular, DMZTP shows
that cortical dynamics emerge at all timescales
required for the processing of different linguistic levels, including the
timescales corresponding to larger linguistic structures such as phrases and
sentences, and that the neural representation of each linguistic level
corresponds to timescales matching the timescales of the respective linguistic
level (1).
Not surprisingly, the relevant frequencies go from shorter
to longer. Moreover, the paper shows that the frequency responses can only be
accounted for by assuming that the brain
exploits “lexical, semantic and syntactic knowledge” and cannot be explained in terms of the
brain’s simply tracking prosodic or statistical information in the signal.
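For readers who want a feel for what "different frequencies track different levels" means in practice, here is a minimal toy sketch in Python. Everything in it is my own assumption for illustration: the three rates (1, 2 and 4 Hz), the names `RATES_HZ`, `simulate_response` and `amplitude_at`, and the noise level are hypothetical and are not taken from DMZTP. It simply builds a fake "response" containing one rhythm per level and shows that a Fourier analysis recovers a peak at each rate and not at a control frequency. It says nothing about where such peaks come from, which is the hard part that DMZTP's controls address.

```python
import cmath
import math
import random

# Assumed (hypothetical) presentation rates in Hz, chosen only for illustration.
RATES_HZ = {"sentence": 1.0, "phrase": 2.0, "syllable": 4.0}


def simulate_response(duration_s=10.0, fs=50, noise_sd=0.5, seed=0):
    """Toy 'response': one unit-amplitude sinusoid per level plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(duration_s * fs)
    return [
        sum(math.sin(2 * math.pi * f * k / fs) for f in RATES_HZ.values())
        + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def amplitude_at(signal, freq_hz, fs):
    """Amplitude of the Fourier component of `signal` at `freq_hz` (sampling rate `fs`)."""
    n = len(signal)
    coeff = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
        for k, x in enumerate(signal)
    )
    return 2 * abs(coeff) / n


fs = 50
signal = simulate_response(fs=fs)

# Tagged rates should show clear peaks; the 3 Hz control should stay near zero.
for level, rate in RATES_HZ.items():
    print(f"{level:9s} {rate:.0f} Hz  amplitude = {amplitude_at(signal, rate, fs):.2f}")
print(f"control   3 Hz  amplitude = {amplitude_at(signal, 3.0, fs):.2f}")
```

The design choice worth noting is that the analysis asks about specific, pre-chosen frequencies. That is what makes this style of measurement usable as an index of structure at a given level: if a level is not being tracked, there is no reason for a peak at its rate.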
The tracking is actually very sensitive. One of the nicest
features of DMZTP is that it shows how “cortical responses” change as phrasal
structure changes. Bigger sentences and phrases provide different (yet similar)
profiles to shorter ones (see figure 4). In other words, DMZTP identifies
neural correlates that track sentence and phrase structure size as well as
type.
Second, DMZTP identifies the brain areas that generate the
neural “entrainment” activity they identified. I am no expert in these matters,
but the method used seems different from what I have seen before in such
papers. They used “intracranial” electrodes (i.e. inside brains!) to localize the generators of the activity. Using
this technique (btw, don’t try this at home, you need hospitals with consenting
brain patients (epileptics in DMZTP’s case) who are ready to allow brain
invasions), DMZTP shows that the areas that generate the syllable, phrase and
sentence “waves” spatially dissociate.
Furthermore, they show that some areas of the brain that
respond to phrasal and sentential structure “showed no significant syllabic
rate response” (5). In the words of the authors:
In other words, there are cortical circuits
specifically encoding larger, abstract linguistic structures without responding
to syllabic-level acoustic features of speech. (5)
The invited conclusion (and I am more than willing to accept
the invitation) is that there are neural circuits tuned to tracking this kind
of abstract linguistic information. Note: This does not imply that these circuits are tuned exclusively to
tracking this kind of information. The linguistic specificity of these brain
circuits has not been established. Nor has it been established that these kinds
of brain circuits are unique to humans. However, as DMZTP clearly knows, this
is a good first (and necessary) step towards studying these questions in more
detail (see the DMZTP discussion section). This, IMO, is a very exciting
prospect.
The last important contribution of DMZTP lies in a
speculation. Here it is:
Concurrent neural tracking of hierarchical
linguistic structures provides a plausible functional mechanism for temporally
integrating smaller linguistic units into larger structures. In this form of
concurrent neural tracking, the neural representation of smaller linguistic
units is embedded at different phases of the neural activity tracking a higher
level structure. Thus, it provides a
possible mechanism to transform the hierarchical embedding of linguistic
structures into hierarchical embedding of neural dynamics, which may
facilitate information integration in time.
(5) [My emphasis, NH]
DMZTP relates this kind of brain wave embedding to
mechanisms proposed in other parts of cog-neuro to account for how brains
integrate top-down and bottom-up information and allow for the former to
predict properties of the latter. Here’s DMZTP:
For language processing, it is likely that
concurrent neural tracking of hierarchical linguistic structures provides
mechanisms to generate predictions on multiple linguistic levels and allow
interactions across linguistic levels….
Furthermore, coherent synchronization to the
correlated linguistic structures in different representational networks, for
example, syntactic, semantic and phonological, provides a way to integrate
multi-dimensional linguistic representations into a coherent language percept
just as temporal synchronization between cortical networks provides a possible
solution to the binding problem in sensory processing.
(5-6)
So, the DMZTP results are
theoretically suggestive and fit well with other current theoretical speculations
in the neural literature for addressing the binding problem and for providing a
mechanism that allows for different kinds of information to talk to one
another, and thereby influence online computation.
More particularly, the low
frequency responses to which sentences entrain are
… more distributed than high-gamma activity [which
entrains to syllables, NH], possibly reflecting the fact that the neural
representations of different levels of linguistic structures serve as inputs to
broad cortical areas. (5)
And
this is intriguing for it provides a plausible way for the brain to use high
level information to make useful predictions about the incoming input (i.e. a
mechanism for how the brain uses higher level information to make useful
top-down predictions).[4]
There
is one last really wonderful speculation; the oscillations DMZTP has identified
are “related to intrinsic, ongoing neural oscillations” (6). If they are, then
this would ground this speech processing system in some fundamental properties
of brain dynamics. In other words, and this is way over the top, (some of) the
system’s cog-neuro properties might reflect the most general features of brain
architecture and dynamics (“the timescales of larger linguistic structures fall
in the timescales, or temporal receptive windows that the relevant cortical
networks are sensitive to”). Wouldn’t that be amazing![5] Here is DMZTP again:
A long-lasting controversy concerns how the neural
responses to sensory stimuli are related to intrinsic, ongoing neural oscillations.
This question is heavily debated for the neural response entrained to the
syllabic rhythm of speech and can also be asked
for neural activity entrained to the time courses of larger linguistic
structures. Our experiment was not designed to answer this question; however,
we clearly found that cortical speech processing networks have the capacity to
generate activity on very long timescales corresponding to larger linguistic
structures, such as phrases and sentences. In other words, the timescales of
larger linguistic structures fall in the timescales, or temporal receptive
windows that the relevant cortical networks are sensitive to. Whether the
capacity of generating low-frequency activity during speech processing is the
same as the mechanisms generating low-frequency spontaneous neural oscillations
will need to be addressed in the future. (6)
Let
me end this encomium with two more points.
First,
a challenge: Norbert, why aren’t you critical of the hype that has been
associated with this paper, as you were of the PR surrounding the Frankland
& Greene (F&G) piece (see here and here)? The relevant text
for this question is the NYU press release (here). The reason is that,
so far as I can tell, the authors of DMZTP did not inflate their results the
way F&G did. Most importantly, they did not suggest that their work
vindicates Chomsky’s insights. So, in the paper, the authors note that their
work “underscore the undeniable existence of hierarchical structure building
operations in language comprehension” (5). These remarks then footnote standard
papers in linguistics. Note the adjective ‘undeniable.’
Moreover,
the press release is largely accurate. It describes DMZTP as “new support” for
the “decades old” Chomsky theory that we possess an “internal grammar.” It
rightly notes that “psychologists and neuroscientists predominantly reject this
viewpoint” and believe that linguistic knowledge is “based on both statistical
calculations between words and sound cue structures.” This, sadly, is the
received wisdom in the cog-neuro and psych world, and we know why (filthy
Empiricism!!!). So, the release does not misdescribe the state of play and does
not suggest that neuroscience has finally provided real evidence for a heretofore airy-fairy speculation. In fact, it
seems more or less accurate, hence no criticism from me. What is sad is the noted
state of play in psych and cog-neuro, and this IS sad, very very sad.
Second,
the paper provides evidence for a useful methodological point: that one can do
excellent brain science using G theory that is not at the cutting edge. The G
knowledge explored is of Syntactic
Structures (SS) vintage. No Minimalism here. And that’s fine. Minimalism
does not gainsay that sentences have the kinds of structures that SS
postulated. It suggests different generative mechanisms, but not ones that
result in wildly different structures. So, you out there in cog-neuro land:
it’s ok to use G properties that are not at the theoretical cutting edge. Of
course, there is nothing wrong with hunting for Merge (go ahead), but many
questions clearly do not need to exploit the latest theoretical insight. So no
more excuses regarding how ling theory is always changing and so is so hard to
use and is so complicated yada yada yada.
That’s
it. My 2 cents. Go read the paper. It is very good, very suggestive and, oddly
for a technical piece, very accessible. Also, please comment. Others may feel
less enthralled than I have been. Tell us why.
[1]
I would include some recent papers by Liina Pylkkänen on adjectival modification
in this group as well.
[2]
These are two different claims: it could be that the linguistic knowledge
exists but is not used online. However, we have excellent evidence for both the
existence of grammatical knowledge and its on-line usage. DMZTP provides yet more evidence that such
knowledge exists and is used online.
[3]
P.S. Most who rail against dualism really don’t seem to understand what the
doctrine is. But, for current purposes, this really does not matter.
[4]
Note, the paper does not claim to explain how
hierarchical information is coded in the brain. It might be that it is actually
coded in neural oscillations. But
DMZTP does not claim this. It claims that these oscillations reflect the
encoding (however that is done) and that they can be used to possibly convey
the relevant information. David Adger makes this point in the comments section
of the earlier post on the DMZTP paper. So far as I can tell, DMZTP commits no
hostages as to how the G information is coded in brains. It is, for example,
entirely consistent with the possibility that a Gallistel-like DNA coding of
this info is correct. All the paper does is note that these oscillations are
excellent indices of such structure, not that they are the neural bases of this knowledge.
[5]
Here’s a completely wild thought: imagine if we could relate phases to the
structure of these intrinsic oscillations? So the reason for the phases we have
is that they correspond to the size of the natural oscillations which subvene
language use. Now that would be something. Of course, at present there is zero reason
to believe anything like this. But then again, why exactly phases exist and are
the ones there are is theoretically ungrounded even within linguistics. That
suggests that wild speculation is apposite.