Monday, October 14, 2013

Syntax in the brain

It’s not news that syntactic structure is perceptible in the absence of meaningfulness. After all, even though the slithy toves did gyre and gimble in the wabe, still and all colorless green ideas do sleep furiously. Not news maybe, but still able to instruct, as I found out in getting ready for this year’s Baggett Lectures (see here). The invited speaker this year is Stanislas Dehaene, and to get ready I’ve read (under the guidance of my own personal Virgil, Ellen Lau, guide to the several nether circles of neuroscience) some recent papers by him on how the brain does syntax. One (here) was both short (a highly commendable property) and absorbing. Like many neuroscience efforts, this one is collaborative, brought to you by the French team of Pallier, Devauchelle and Dehaene (PDD). I found the PDD results (and methods, to the degree that I could be made to appreciate them) thought provoking. Here’s why.

The paper starts from the conventional grammatical assumption that “sentences are not mere strings of words but possess a hierarchical structure with constituents nested inside each other” (2522).[1] PDD’s task is to find out where, if anywhere, the brain tracks/builds this hierarchy. PDD construct a very clever model that allows them to use fMRI techniques to zero in on those regions sensitive to hierarchical structure.

Before proceeding, it’s worth noting that this takes quite a bit of work. Part of what makes the paper fun is the little model that allows PDD to index hierarchy to differential blood flows (the BOLD (blood-oxygen-level-dependent) response, which is what an fMRI tracks). The model predicts a linear relationship between the BOLD response and phrasal size (roughly indexed by word length, a “useful approximation”), and they use this relationship to probe ROIs (i.e. regions of interest (Damn I love these acronyms)) that respond to this predicted relationship using two different kinds of linguistic probes. The first are strings of words containing constituents ranging from 1 to 12 words long (actually 12/6/4/3/2/1; e.g. 12: I believe that you should accept the proposal of your new associate; 4: mayor of the city, he hates this color, they read their names). The second are jabberwocky strings with the same structure (e.g. I tosieve that you should begept the tropufal of your tew viroate). Here’s what they found:
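To make the predictive logic concrete, here is a minimal sketch, with made-up numbers of my own (the slope and baseline are invented, and PDD's actual model is fit to real BOLD time courses): each probe is a fixed-length word string chopped into constituents of size k, and a constituent-building region should respond more as k grows.

```python
# Toy sketch of the paper's predictive logic (illustrative only; the
# baseline and slope are my own invented parameters, not PDD's estimates).
# Probes are fixed-length strings built from constituents of size k words;
# the model predicts a response that rises linearly with k.

CONSTITUENT_SIZES = [1, 2, 3, 4, 6, 12]  # words per constituent in PDD's probes

def predicted_response(k, baseline=0.5, slope=0.2):
    """Hypothetical linear predictor: activity rises with constituent size k."""
    return baseline + slope * k

predictions = [predicted_response(k) for k in CONSTITUENT_SIZES]
# The testable signature is just monotonic increase:
# bigger constituents, bigger response in any region that builds them.
```

A region counts as an ROI on this logic exactly when its measured BOLD profile across the six probe types matches this monotonic, roughly linear pattern.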

1.     They found brain regions that responded to these probes in the predicted linear manner: four in the STS region, and two in the left inferior frontal gyrus (IFGtri and IFGorb).
2.     The regions responded differentially to the two kinds of linguistic probes. All six regions responded to the first kind of probe (“normal prose”), while jabberwocky elicited responses only in IFGorb (with some response at a lower statistical threshold in left posterior STS and IFGtri).

In sum, some brain regions light up to phrasal syntax alone, independent of content words. The brain thus seems to distinguish contentful morphemes from more functional ones, and it does so by processing the relevant information in different regions.
And this is interesting, I believe. Why?

Well, think of what PDD could have found. One possibility is that all words are effectively the same, the only difference between content words and functional vocabulary residing in their statistical frequency, the closed class functional words being far more common than the open class contentful vocab. Were this correct, we should expect no regional segregation of the two kinds of vocab, just, say, a bigger or smaller response based on the lexical richness of the input. Thus, we might have expected all regions to respond equally to all of the inputs, though the size of the response would have differed. But this is not what PDD found. What they found is differential activity across various regions, with one group of sites responding exclusively to structural input even in the absence of meaningful content. This sure smells a lot like the wetware analogue of the autonomy of syntax thesis (yes, I was not surprised, and yes, I was nonetheless chuffed). PDD (2526) make exactly this autonomy of syntax point in noting that their results underline “the relative independence of syntax from lexico-semantic features.”

Second, the results might have implications for grammar lexicalization (GL) (an idea that Thomas has been posting about recently). From what I can tell, GL sees grammatical dependencies as the byproduct of restrictions coded as features on lexical terminals. Grammatical dependencies, on the GL view, just are the sum total of lexical dependencies. If this is correct, then a question arises: what does this kind of position lead us to expect about grammatical structure in the absence of such lexical information? I assume that Jabberwocky vocab is not part of our lexicon, and so a grammar that exclusively builds structure based on info coded in lexical terminals will not have access to (at least some) grammatically relevant information in Jabberwocky input (e.g. how to combine ‘the’ with ‘slithy toves’ in the absence of features on the latter?). Does this mean that we should not be able to build syntactic structure in the absence of the relevant terminals, or that we will react differently to input composed of “real” lexemes vs Jabberwocky vocab? Behaviorally, we know that we can distinguish well-formed from ill-formed Jabberwocky. So we know that the absence of a lot of lexical vocab does not impede the construction of syntactic structure. PDD further show that neurally there are parts of the brain that respond to syntactic structure regardless of the presence of featurally marked terminals (and recall that this need not have been the case). A non-GL view of grammar has no problems with this, as grammatical structure is not a by-product of lexical features[2] (at least not in general; only the functional vocab is possibly relevant).[3] I cannot tell if this is a puzzle if one takes a fully GL view of things, but it certainly indicates that parts of the brain seem tuned to structure without much apparent lexical content.
Indeed, it suggests (at least to me) that GL might have things backwards: it’s not that grammatical structure arises from lexical specifications but that lexical specifications are by-products of enjoying certain syntactic relations. Formally, this might be a distinction without a difference, but psychologically and neurally, these two ways of viewing the ontogeny of structure look like they might (note the weasel word here) have very different empirical consequences.

I am sure that I have over-interpreted the PDD results. The structure they probe is very simple (just right branching phrase structure) and the results rely on some rather radical simplifications. However, I am told that this is the current state of the fMRI art, and even with these caveats, PDD is interesting and thought provoking, and, who knows, maybe more than a little relevant to how we should think about the etiology of grammar. It seems that brains, or at least parts of brains, are sensitive to structure regardless of what that structure contains. This is an old idea, one perfectly expected given a pretty orthodox conception of the autonomy of syntax. It’s nice to see that this same conception is leading to new and intriguing work investigating how brains go about building structure.

[1] This truism, sadly, is not always acknowledged. For example, it is still possible for psychologists to find a publishing outlet of considerable repute for papers that deny this (see here). Fortunately, flat-earthism seems to be losing its allure in neuroscience.
[2] Note that this does not entail that grammatical information cannot be lexicalized for tasks like parsing. It only implies that grammatical dependencies are more primitive than lexical ones. The former determine the latter, not vice versa.
[3] I say ‘possibly’ as the functional vocab might just be the outward morphological manifestations of grammatical dependencies rather than the elements from which such dependencies are formed. There is no obvious reason for thinking that structure is there because of the functional vocab, though there is reason for assuming that functional vocab and syntactic structure are correlated, sometimes strongly.


  1. I don't see how these results are any more at odds with GL than with any other linguistic formalism. The task is to put nonce words into specific syntactic categories. GL only makes the set of categories more fine-grained, but it has no impact on the nature of the task. The basic algorithm: look at the functional words and their subcategorization frames, and assign categories to complements accordingly. And this algorithm does not care if you specified subcategorization via features or constraints (if anything, it's easier with the former).
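    For concreteness, that algorithm can be sketched as follows (the lexicon and the subcategorization frames here are toy assumptions of my own, not an actual GL fragment):

    ```python
    # Hedged sketch of the algorithm described above: known functional
    # words select the category of the word that follows them, so nonce
    # complements inherit a category from their selector. The frames
    # below are deliberately crude toys, not anyone's actual grammar.

    SUBCAT = {
        "the": "N",      # determiner selects a nominal
        "your": "N",
        "should": "V",   # modal selects a verbal complement
        "that": "S",     # complementizer selects a clause (simplified)
    }

    def assign_categories(tokens):
        """Assign categories to unknown words from the preceding functional word."""
        categories = {}
        for selector, word in zip(tokens, tokens[1:]):
            if selector in SUBCAT and word not in SUBCAT:
                categories[word] = SUBCAT[selector]
        return categories

    jabber = "you should begept the tropufal of your tew viroate".split()
    print(assign_categories(jabber))  # begept -> V, tropufal -> N, tew -> N
    ```

    Nothing in this sketch depends on whether the frames are stated as features or as constraints, which is the point above.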

    What I would like to see tested:
    1) What happens when functional words are replaced rather than open-class items? Since Merge uses exact feature matching (and functional words are small in number but very high frequency), this task should be just as easy as the first, if not easier. But it's probably harder. Maybe because Merge does not use exact matching, maybe because functional words still encode something like basic semantics ("x did y to z with u during v") and the human parser relies a lot on semantics for pruning down the search space.
    2) Related to the semantics issue, are test participants sensitive to structural ambiguity? Do they entertain multiple structures for "I sixed the nark with a lorospok"?

    1. I guess my worry was more general: do we really think that what's going on here is that we are assigning features on the fly that we are then checking? I am not sure that this is the right way to think of it. At any rate, if we can always assign features on the fly, then there is no sense in which these features are localized to LIs, or so it seems to me.

      As for the second question: the material was very primitive. No ambiguities that I could see. Maybe the next set of experiments will go there. At least behaviorally, it strikes me that I find your J sentence ambiguous in the same way as a non J sentence. Don't you?