It’s not news that syntactic structure is perceptible in the
absence of meaningfulness. After all, even though the slithy toves did gyre and
gimble in the wabe, still and all colorless green ideas do sleep furiously. Not
news maybe, but still able to instruct, as I found out in getting ready for
this year's Baggett Lectures (see here). The invited speaker
this year is Stanislas
Dehaene and to get ready I’ve read (under the guidance of my own personal
Virgil, Ellen Lau, guide to the several nether circles of neuroscience) some
recent papers by him on how the brain does syntax. One (here)
was both short (a highly commendable property) and absorbing. Like many
neuroscience efforts, this one is collaborative, brought to you by the French
team of Pallier, Devauchelle and Dehaene (PDD). I found the PDD results (and
methods to the degree that I could be made to appreciate them) thought
provoking. Here’s why.
The paper starts from the conventional grammatical assumption
that “sentences are not mere strings of words but possess a hierarchical
structure with constituents nested inside each other” (2522).[1]
PDD’s task is to find out where, if anywhere, the brain tracks/builds this
hierarchy. PDD construct a very clever model that allows them to use fMRI
techniques to zero in on those regions sensitive to hierarchical structure.
Before proceeding, it’s worth noting that this takes quite a
bit of work. Part of what makes the paper fun is the little model that allows
PDD to index hierarchy to differential blood flows (the BOLD
(blood-oxygen-level-dependent) response, which is what an fMRI tracks). It
predicts a linear relationship between the BOLD response and phrasal size
(roughly indexed to length in words, a “useful approximation”), and they use this
relationship to probe ROIs (i.e. regions of interest (Damn I love these
acronyms)) that respond to this predicted relationship, using two different
kinds of linguistic probes. The first are 12-word strings partitioned into
constituents ranging from 1 to 12 words long (actually 12/6/4/3/2/1; e.g. 12: I
believe that you should accept the proposal of your new associate; 4: mayor of
the city / he hates this color / they read their names). The second are
Jabberwocky strings with the same structure (e.g. I tosieve that you should
begept the tropufal of your tew viroate). Here’s what they found:
1. They found brain regions that responded to these probes in the predicted linear manner: four in the STS region, and two in the left inferior frontal gyrus (IFGtri and IFGorb).
2. The regions responded differentially to the two kinds of linguistic probes. Thus, all of these regions responded to the first kind of probe (“normal prose”), while Jabberwocky only elicited responses in IFGorb (with some response at a lower statistical threshold in left posterior STS and IFGtri).
In sum, different brain regions light up exclusively to
phrasal syntax independent of content words. Thus, the brain seems to
distinguish contentful morphemes from more functional ones and it does so by
processing the relevant information in different
regions.
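The logic of PDD’s linear probe can be illustrated with a toy sketch (my own construction, not PDD’s actual model or data; the region names, amplitudes, and threshold below are all invented): simulate a mean BOLD amplitude for each constituent-size condition and ask which regions show a positive linear slope against constituent size.

```python
# Toy sketch of the linear-response logic (invented numbers, not PDD's data).
# Conditions: 12-word strings parsed into constituents of size 12/6/4/3/2/1.
# A region "tracks constituent structure" here if its simulated BOLD
# amplitude increases roughly linearly with constituent size.

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

constituent_sizes = [1, 2, 3, 4, 6, 12]

# Hypothetical mean BOLD amplitudes per condition: one region whose
# response grows with constituent size, and one flat control region.
regions = {
    "pSTS":    [0.10, 0.22, 0.31, 0.43, 0.65, 1.18],  # roughly linear in size
    "control": [0.50, 0.48, 0.52, 0.49, 0.51, 0.50],  # no size effect
}

for name, bold in regions.items():
    slope = fit_slope(constituent_sizes, bold)
    tracks = slope > 0.05  # arbitrary threshold for this toy demo
    print(f"{name}: slope={slope:.3f}, tracks structure: {tracks}")
```

The real analysis is of course far more involved (hemodynamic modeling, statistical thresholds across subjects), but the core move is the same: a predicted linear relationship is the yardstick that picks out the structure-sensitive regions.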
And this is interesting, I believe. Why?
Well, think of what PDD could have found. One possibility is
that all words are effectively the same, the only difference between content
words and functional vocabulary residing in their statistical frequency, the
closed class functional words being far more common than the open class contentful
vocab. Were this correct, then we should expect no regional segregation of the
two kinds of vocab, just, say, a bigger or smaller response based on the
lexical richness of the input. Thus, we might have expected all regions to respond equally to all of
the inputs, though the size of the response would have differed. But this is not what PDD found. What they found is
differential activity across various regions with one group of sites responding
exclusively to structural input even in the absence of meaningful content. This
sure smells a lot like the wetware analogue of the autonomy of syntax thesis
(yes, I was not surprised, and yes, I
was nonetheless chuffed). PDD (2526) make exactly this autonomy of syntax point
in noting that their results underline “the relative independence of syntax
from lexico-semantic features.”
Second, the results might
have implications for grammar lexicalization (GL) (an idea that Thomas has been
posting about recently). From what I can tell, GL sees grammatical dependencies
as the byproduct of restrictions coded as features on lexical terminals. Grammatical
dependencies on the GL view just are the sum total of lexical dependencies. If
this is correct, then a question arises: what does this kind of position lead
us to expect about grammatical structure in the absence of such lexical
information? I assume that Jabberwocky vocab is not part of our lexicon and so a grammar that exclusively builds
structure based on info coded in lexical terminals will not have access to (at
least some) grammatically relevant information in Jabberwocky input (e.g. how
to combine the and slithy toves in the absence of features
on the latter?). Does this mean that we should not be able to build syntactic
structure in the absence of the relevant terminals or that we will react
differently to input composed of “real” lexemes vs Jabberwocky vocab? Behaviorally,
we know that we can distinguish well- from ill-formed Jabberwocky. So we know
that the absence of a lot of lexical vocab does not impede the construction of
syntactic structure. PDD further show that neurally there are parts of the
brain that respond to syntactic structure regardless of the presence of featurally
marked terminals (and recall that this need not have been the case). A non-GL
view of grammar has no problems with this, as grammatical structure is not a by-product
of lexical features[2]
(at least not in general, only the
functional vocab is possibly
relevant).[3]
I cannot tell if this is a puzzle if one takes a fully GL view of things, but
it certainly indicates that parts of the brain seem tuned to structure without
much apparent lexical content. Indeed, it suggests (at least to me) that GL
might have things backwards: it’s not that grammatical structure arises from
lexical specifications but that lexical specifications are by-products of
enjoying certain syntactic relations. Formally,
this might be a distinction without a difference, but psychologically and
neurally, these two ways of viewing the ontogeny of structure look like they
might (note the weasel word here) have very different empirical consequences.
I am sure that I have over-interpreted the PDD results. The
structure they probe is very simple (just right branching phrase structure) and
the results rely on some rather radical simplifications. However, I am told that
this is the current state of the fMRI art and, even with these caveats, PDD
is interesting and thought provoking, and, who knows, maybe more than a little
relevant to how we should think about the etiology of grammar. It seems that brains,
or at least parts of brains, are sensitive to structure regardless of what that
structure contains. This is an old idea, one perfectly expected given a pretty
orthodox conception of the autonomy of syntax. It’s nice to see that this same
conception is leading to new and intriguing work investigating how brains go
about building structure.
[2]
Note that this does not entail that
grammatical information might not be lexicalized for tasks like parsing. It
only implies that grammatical dependencies are more primitive than lexical
ones. The former determine the latter, not vice versa.
[3]
I say ‘possibly’ as the functional vocab might just be the morphological
outward manifestations of grammatical dependencies rather than the elements
from which such dependencies are formed. There is no obvious reason for
thinking that structure is there because
of the functional vocab, though there is reason for assuming that functional
vocab and syntactic structure are correlated, sometimes strongly.
I don't see how these results are any more at odds with GL than with any other linguistic formalism. The task is to put nonce words into specific syntactic categories. GL only makes the set of categories more fine-grained, but it has no impact on the nature of the task. The basic algorithm: look at the functional words and their subcategorization frames, and assign categories to their complements accordingly. And this algorithm does not care whether you specified subcategorization via features or constraints (if anything, it's easier with the former).
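For concreteness, here is a toy sketch of that basic algorithm (my own construction, purely illustrative; the mini-lexicon and category labels are invented): functional words carry a frame specifying the category of their complement, and a nonce word inherits its category from the nearest preceding selector.

```python
# Toy sketch of category assignment via functional words' subcategorization
# frames (invented mini-lexicon, purely illustrative).

# Hypothetical functional lexicon: each entry names the category it
# assigns to its complement.
FUNCTIONAL = {
    "the": "N",     # determiner selects a nominal complement
    "your": "N",    # possessive determiner, likewise
    "should": "V",  # modal selects a verbal complement
    "that": "S",    # complementizer selects a clause
    "of": "N",      # preposition selects a nominal complement
}

def assign_categories(words):
    """Assign each unknown word the category selected by the nearest
    preceding functional word; each selector assigns once."""
    categories = {}
    selecting = None  # category currently waiting to be assigned
    for w in words:
        if w in FUNCTIONAL:
            selecting = FUNCTIONAL[w]
        elif selecting is not None:
            categories[w] = selecting  # nonce word inherits the category
            selecting = None
    return categories

# Jabberwocky input from the post, lowercased:
sent = "i tosieve that you should begept the tropufal of your tew viroate"
print(assign_categories(sent.split()))
```

Crude as it is, the sketch makes the commenter's point: nothing in this procedure depends on whether the frames are stated as features or as constraints, and nonce open-class items are categorized without any lexical entry of their own.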
What I would like to see tested:
1) What happens when functional words are replaced rather than open-class items? Since Merge uses exact feature matching (and functional words are small in number but very high frequency), this task should be just as easy as the first, if not easier. But it's probably harder. Maybe because Merge does not use exact matching, maybe because functional words still encode something like basic semantics ("x did y to z with u during v") and the human parser relies a lot on semantics for pruning down the search space.
2) Related to the semantics issue, are test participants sensitive to structural ambiguity? Do they entertain multiple structures for "I sixed the nark with a lorospok"?
I guess my worry was more general: do we really think that what's going on here is that we are assigning features on the fly that we then check? I am not sure that this is the right way to think of it. At any rate, if we can always assign features on the fly, then there is no sense in which these features are localized to LIs, or so it seems to me.
As for the second question: the material was very primitive. No ambiguities that I could see. Maybe the next set of experiments will go there. At least behaviorally, it strikes me that I find your J sentence ambiguous in the same way as a non-J sentence. Don't you?