Faculty of Language: Brains and syntax: part 2

Saturday, September 3, 2016

Brains and syntax: part 2

This is the second part of William Matchin's paper. Thanks again to William for putting this down on paper and provoking discussion.

One (very rough) sketch of a possible linking theory between a minimalist grammar and online sentence processing

I am going to try and sketch out what I think is a somewhat reasonable picture of the language faculty given the insights of syntactic theory, psycholinguistics, and cognitive neuroscience. My sketch here takes some inspiration from TAG-based psycholinguistic research (e.g., Demberg & Keller, 2008) and the TAG-based syntactic theory developed by Frank (2002) (thanks to Nick Huang for drawing this work to my attention).

Figure from Frank (2002). The dissociation between the inputs and operations of basic structure building and online processing/manipulation of treelets is clearly exemplified in the grammatical framework of Frank (2002).

The essential qualities of this picture of the language faculty are as follows. Minimalism is essentially a theory of the objects of language: the syntactic representations that people have. These objects are TAG treelets. TAG, in turn, is a theory of what people do with these objects during sentence processing. TAG-type operations (e.g., unification, substitution, adjunction, verification) may be identifiable with memory retrieval operations, which opens up a potentially domain-general cognitive basis for the online processing component of the language faculty and leaves Merge as the language-specific component. This proposal severs any inherent connection between Merge and online processing. Nothing in the proposal precludes the online implementation of Merge during sentence processing, but much of sentence processing might proceed without implementing Merge at all, relying instead on TAG operations over stored treelets.

I start with what I take to be the essential components of a Minimalist grammar – the lexicon and the computational system (i.e., Merge). Things work essentially as a Minimalist grammar says – you have some lexical atoms, Merge combines these elements (bottom-up) to build structures that are interpreted by the semantic and phonological systems, and there are some principles – some of them part of cognitive endowment, some of them “third factors” or general laws of nature or computation – that constrain the system (Chomsky, 1995; 2005).

The key difference that I propose is that complex derived structures can be stored in long-term memory. Currently, Minimalism states that the core feature of language, recursion, is the ability to treat derived objects as atoms. In other words, structures are treated as words, and as such are equally good inputs to Merge. However, the theory attributes the property of long-term storage only to atoms, and denies long-term storage to structures. Why not make structures fully equivalent to the atoms in their properties, including both Merge-ability AND long-term store-ability?

These stored structures or treelets can either be fully-elaborated structures with the leaves attached, or they might be more abstract nodes, allowing different lexical items to be inserted. It seems important from the psycholinguistic literature to have abstract structural nodes (e.g. NP, VP), so this theory would have to provide some means of taking a complex structure created by Merge and modifying it appropriately to eliminate the leaves (and perhaps many of the structural nodes) of the structure through some kind of deletion operation.
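To make the storage idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not a claim about how any grammar formalism or the brain implements this: the class names (`Atom`, `Phrase`, `Slot`), the explicit `label` argument to `merge`, and the `abstract` "leaf deletion" function are all hypothetical. It shows a derived object being reused as input to Merge, and both a fully elaborated structure and its leafless abstraction being placed in a long-term store.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A lexical atom: a word plus a category (the category set is an assumption)."""
    word: str
    cat: str


@dataclass(frozen=True)
class Slot:
    """An abstract node left behind once a leaf has been deleted."""
    cat: str


@dataclass(frozen=True)
class Phrase:
    """The output of Merge: two syntactic objects under a label."""
    label: str
    left: "SyntacticObject"
    right: "SyntacticObject"


SyntacticObject = Union[Atom, Slot, Phrase]


def merge(a: SyntacticObject, b: SyntacticObject, label: str) -> Phrase:
    """Combine two objects; the label is supplied by hand to keep the sketch simple."""
    return Phrase(label, a, b)


def abstract(obj: SyntacticObject) -> SyntacticObject:
    """Hypothetical deletion operation: strip the leaves, keep the category nodes."""
    if isinstance(obj, Atom):
        return Slot(obj.cat)
    if isinstance(obj, Phrase):
        return Phrase(obj.label, abstract(obj.left), abstract(obj.right))
    return obj


# Long-term memory holds structures as well as atoms (the proposal's key move).
long_term_memory = set()

np = merge(Atom("the", "D"), Atom("dog", "N"), label="NP")
vp = merge(Atom("chased", "V"), np, label="VP")  # a derived object is itself an input to Merge

long_term_memory.add(vp)            # fully elaborated treelet, leaves attached
long_term_memory.add(abstract(vp))  # abstract treelet, open for other lexical items
```

The frozen dataclasses make structures hashable, so a derived structure can sit in the same store as an atom, which is the sense in which structures are "store-able" here.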

Treelets are the point of interaction between the syntactic system (essentially a Minimalist grammar) and the memory system. It may be the top-down activation of memory retrieval operations that “save” structures as treelets. Memory operations do much of the work of sentence processing – retrieving structures and unifying/substituting them appropriately to efficiently parse sentences (see Demberg & Keller, 2008 for an illustration). Much of language acquisition amounts to refining the attention/retrieval operations, as well as the set of treelets (and the prominence/availability of such treelets) that the person has available to them.

I think that there are good reasons to think that the retrieval mechanisms and the stored structures/lexical items live in language cortex. Namely, retrieval operations live in the pars triangularis of Broca’s area and stored structures/lexical items live in posterior temporal lobe (somewhere around the superior temporal sulcus/middle temporal gyrus).

This approach pretty much combines the Minimalist generative grammar and the lexicalist/TAG approaches. Note also that retrieving a stored treelet includes the fact that the treelet was created through applications of Merge. So when you look at a structure that a person finally utters, it is both true that the syntactic derivation of this structure is generated bottom-up in accordance with the operations and principles of a minimalist grammar, AND that the person used it by retrieving a stored treelet. We can (hopefully) preserve both insights – bottom-up derivation with stored treelets that can be targeted by working memory operations.

One remaining issue is how treelets are combined and lexical items inserted into them – this could be a substitution or unification operation from TAG, but Merge itself might also work for some cases (suggesting some role for Merge in actual online processing).
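As a rough illustration of the substitution option, the following Python sketch fills the open nodes of a stored treelet with other treelets of matching category. This is a toy under stated assumptions, not TAG as formally defined and not the parser of Demberg & Keller (2008): the `Node` class, the leftmost-first filling order, and the helper names `substitute` and `yield_of` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """A tree node: lexical leaf (has a word), open slot (no word, no children), or internal node."""
    cat: str
    children: Tuple["Node", ...] = ()
    word: Optional[str] = None

    @property
    def is_open(self) -> bool:
        return not self.children and self.word is None


def substitute(tree: Node, filler: Node) -> Tuple[Node, bool]:
    """Fill the leftmost open node whose category matches the filler's root.

    Returns the new tree and a flag saying whether a substitution happened.
    """
    if tree.is_open:
        return (filler, True) if tree.cat == filler.cat else (tree, False)
    new_children = []
    done = False
    for child in tree.children:
        if not done:
            child, done = substitute(child, filler)
        new_children.append(child)
    return Node(tree.cat, tuple(new_children), tree.word), done


def yield_of(tree: Node) -> List[str]:
    """Read off the words left to right; open slots show up as [CAT]."""
    if tree.word is not None:
        return [tree.word]
    if tree.is_open:
        return ["[" + tree.cat + "]"]
    return [w for child in tree.children for w in yield_of(child)]


# A stored treelet anchored by 'chased', with two open NP slots.
treelet = Node("S", (Node("NP"), Node("VP", (Node("V", word="chased"), Node("NP")))))
the_dog = Node("NP", (Node("D", word="the"), Node("N", word="dog")))
the_cat = Node("NP", (Node("D", word="the"), Node("N", word="cat")))

step1, ok1 = substitute(treelet, the_dog)
step2, ok2 = substitute(step1, the_cat)
print(" ".join(yield_of(step2)))  # the dog chased the cat
```

Nothing here calls a Merge-like operation at parse time: the internal structure of each treelet is simply retrieved and plugged in, which is the division of labor the proposal has in mind.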

I think this proposal starts to provide potential insights into language acquisition. Say you’re a person walking around with this kind of system – you’ll want to start directing your attentional/working memory system to all these objects being generated by Merge and creating thoughts. You’ll also (implicitly) realize that other people are saying stuff that connects to your own system of thought, and you’ll start to align your set of stored structures and retrieval operations to match the patterns of what you’re seeing in the external world. This process is language acquisition, and it creates a convergence on the set of features, stored structures, and retrieval operations that are used within a language.

This addresses some of the central questions I posed earlier:

When processing a sentence, do I expect Merge to be active? Or not?

- Not necessarily, maybe minimally or not at all for most sentences.

What happens when people process things smaller than full sentences (like a little NP – “the dog”)? What is our theory of such situations?

- A little treelet corresponding to that sub-sentence structure is retrieved and interpreted.

Do derivations really proceed from the bottom up, or can they satisfactorily be switched to go top-down/left-right using something like Merge right (Phillips 1996)?

- Syntactic derivations are bottom-up in terms of Merge, but sentence processing occurs left-to-right roughly along the lines of TAG-based parsing frameworks (Demberg & Keller, 2008).

What happens mechanistically when people have to revise structure (e.g., after garden-pathing)?

- De-activate the current structure, retrieve new treelets/lexical items that fit better with what was presented. Lots of activity associated with processing lexical items/structures and memory retrievals, but there may not be an actual activation/implementation of Merge.

Are there only minimal lexical elements and Merge? Or are there stored complex objects, like “treelets”, constructions or phrase-structure rules?

- Yes, there are treelets, but we have an explanation for why there are treelets – they were created through applications of Merge at some point in the person’s life, but not necessarily online during sentence processing.

How does the syntactic system interact with working memory, a system that is critical for online sentence processing?

- The point of interaction between syntax and memory is the treelet. Somehow certain features encoded on treelets have to be available to the memory system.

Now that I have these answers, I can proceed to do my neuroimaging and neuropsychology experiments with testable predictions regarding how language is effected in the brain:

What’s the function of Broca’s area?

- Retrieval operations that are specialized to operate over syntactic representations.

- Which is why when you destroy Broca’s area you are still left with a bunch of treelets that can be activated in comprehension/production that you can use pretty effectively, although you have less strategic control over them.

- We expect patients with damage to Broca’s area to be able to basically comprehend sentences, but to have real trouble in cases requiring recovery/revision, long-distance dependencies, prediction, and perhaps second language acquisition.

What’s the function of posterior temporal areas?

- Lexical storage, including treelets.

- We expect activation for basic sentence processing, more activation for ambiguity/garden-path sentences when more structural templates are activated.

- We expect patients with posterior temporal damage to have some real problems with sentence comprehension/production.

Where are fundamental structure building operations in the brain, e.g. Merge?

- Merge is a subtle neurobiological property of some kind.

- It might be in the connections between cortical areas, perhaps involving subcortical structures, or some property of individual neurons, but regardless, there isn’t a “syntax area” to be found.

What are the ramifications of this proposal for the standard contrast of sentences > lists that is commonly used to probe sentence processing in the brain?

- This contrast will highlight all sorts of things, likely including the activation of treelets, memory retrieval operations, and semantic processing, but it might not be expected to drive activation for basic syntactic operations, i.e., Merge.

Here I have tried to preserve Merge as the defining and simple feature of language – it’s the thing that allows people to grow structures. The proposal also clearly separates Merge from the issue of “what happens during sentence processing”, and really highlights the core of language as something not directly tied to communication. Essentially, the theory of syntax becomes the theory of structures and dependencies, not of producing and understanding sentences. On this conception of language, there is this Merge machinery creating structures, perhaps new in evolution, that can be harnessed by an (evolutionarily older) attentional/memory system for the purposes of producing and comprehending sentences by storing treelets in long-term memory. Merge is clearly separate from this communication/memory system, and is an engine of thought. Learning a language then becomes a matter of refining, over time, the retrieval operations and the set of stored treelets so that they are optimized for communicating with others.

If this is a reasonable picture of the language faculty, thinking along these lines might start to help resolve some conundrums in the traditional domain of syntax. For example, there is often the intuition that syntactic islands are somehow related to processing difficulty (Kluender & Kutas 1993; Berwick & Weinberg, 1984), but there is good evidence that islands cannot be reduced to online processing difficulty or memory resource demands (Phillips, 2006; Sprouse et al., 2012). One approach might be to attribute islands to a processing constraint that somehow becomes grammaticalized (Berwick & Weinberg, 1984). The present framework provides a way of thinking about this issue, because the interaction between syntax and the online processing/memory system is specified. I have some more specific thoughts on this issue that might take the form of a future post.

At any rate, I would love any feedback on this type of proposal. Do we think this is a sensible idea of what the language faculty looks like? What are some serious objections to this kind of proposal? If this is on the right track, then I think we can start to make some more serious hypotheses about how language is implemented in the human brain beyond Broca’s area = Merge.

Many thanks to Nick Huang (particularly for pointing out relevant pieces of literature), Marta Ruda, Shota Momma, Gesoel Mendes, and of course Norbert Hornstein for reading this and giving me their thoughts. Thanks to Ellen Lau, Alexander Williams, Colin Phillips and Jeff Lidz for helpful discussion on these topics. Any failings are mine, not theirs.

References

Berwick, R. C., & Weinberg, A. S. (1983). The role of grammars in models of language use. Cognition, 13(1), 1-61.

Berwick, R. C., & Weinberg, A. S. (1984). The grammatical basis of linguistic performance. Cambridge, MA: MIT Press.

Bresnan, J. (2001). Lexical-Functional Syntax. Blackwell.

Chomsky, N. (2005). Three factors in language design. Linguistic inquiry, 36(1), 1-22.

Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT press.

Culicover, P. W., & Jackendoff, R. (2005). Simpler syntax. Oxford University Press.

Culicover, P. W., & Jackendoff, R. (2006). The simpler syntax hypothesis. Trends in cognitive sciences, 10(9), 413-418.

Demberg, V., & Keller, F. (2008, June). A psycholinguistically motivated version of TAG. In Proceedings of the 9th International Workshop on Tree Adjoining Grammars and Related Formalisms. Tübingen (pp. 25-32).

Embick, D., Marantz, A., Miyashita, Y., O'Neil, W., & Sakai, K. L. (2000). A syntactic specialization for Broca's area. Proceedings of the National Academy of Sciences, 97(11), 6150-6154.

Fedorenko, E., Behr, M. K., & Kanwisher, N. (2011). Functional specificity for high-level linguistic processing in the human brain. Proceedings of the National Academy of Sciences, 108(39), 16428-16433.

Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974). The psychology of language: An introduction to psycholinguistics and generative grammar.

Frank, R. (2002). Phrase Structure Composition and Syntactic Dependencies. Cambridge, MA: MIT Press.

Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and brain sciences, 23(01), 1-21.

Grodzinsky, Y., & Friederici, A. D. (2006). Neuroimaging of syntax and syntactic processing. Current opinion in neurobiology, 16(2), 240-246.

Grodzinsky, Y. (2006). A blueprint for a brain map of syntax. Broca’s region, 83-107.

Jackendoff, R. (2003). Précis of foundations of language: brain, meaning, grammar, evolution. Behavioral and Brain Sciences, 26(06), 651-665.

Joshi, A. K., & Schabes, Y. (1997). Tree-adjoining grammars. In Handbook of formal languages (pp. 69-123). Springer Berlin Heidelberg.

Kluender, R., & Kutas, M. (1993). Subjacency as a processing phenomenon. Language and cognitive processes, 8(4), 573-633.

Lewis, S., & Phillips, C. (2015). Aligning grammatical theories and language processing models. Journal of Psycholinguistic Research, 44(1), 27-46.

Lewis, R. L., & Vasishth, S. (2005). An activation‐based model of sentence processing as skilled memory retrieval. Cognitive science, 29(3), 375-419.

Lewis, R. L., Vasishth, S., & Van Dyke, J. A. (2006). Computational principles of working memory in sentence comprehension. Trends in cognitive sciences, 10(10), 447-454.

Linebarger, M. C., Schwartz, M. F., & Saffran, E. M. (1983). Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition, 13(3), 361-392.

Lukyanenko, C., Conroy, A., & Lidz, J. (2014). Is she patting Katie? Constraints on pronominal reference in 30-month-olds. Language Learning and Development, 10(4), 328-344.

Matchin, W., Sprouse, J., & Hickok, G. (2014). A structural distance effect for backward anaphora in Broca’s area: An fMRI study. Brain and language, 138, 1-11.

Miller, G. A., & Chomsky, N. (1963). Finitary models of language users.

Mohr, J. P., Pessin, M. S., Finkelstein, S., Funkenstein, H. H., Duncan, G. W., & Davis, K. R. (1978). Broca aphasia: Pathologic and clinical. Neurology, 28(4), 311-311.

Momma, 2016 (doctoral dissertation, University of Maryland, department of Linguistics)

Musso, M., Moro, A., Glauche, V., Rijntjes, M., Reichenbach, J., Büchel, C., & Weiller, C. (2003). Broca's area and the language instinct. Nature neuroscience, 6(7), 774-781.

Omaki, A., Lau, E. F., Davidson White, I., Dakan, M. L., Apple, A., & Phillips, C. (2015). Hyper-active gap filling. Frontiers in psychology, 6, 384.

Pallier, C., Devauchelle, A. D., & Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. Proceedings of the National Academy of Sciences, 108(6), 2522-2527.

Phillips, C. (1996). Order and structure (Doctoral dissertation, Massachusetts Institute of Technology).

Phillips, C. (2006). The real-time status of island phenomena. Language, 795-823.

Rogalsky, C., & Hickok, G. (2011). The role of Broca's area in sentence comprehension. Journal of Cognitive Neuroscience, 23(7), 1664-1680.

Santi, A., & Grodzinsky, Y. (2012). Broca's area and sentence comprehension: A relationship parasitic on dependency, displacement or predictability?. Neuropsychologia, 50(5), 821-832.

Santi, A., Friederici, A. D., Makuuchi, M., & Grodzinsky, Y. (2015). An fMRI Study Dissociating Distance Measures Computed by Broca’s Area in Movement Processing: Clause boundary vs Identity. Frontiers in psychology, 6, 654.

Sprouse, J. (2015). Three open questions in experimental syntax. Linguistics Vanguard, 1(1), 89-100.

Sprouse, J., Wagers, M., & Phillips, C. (2012). A test of the relation between working-memory capacity and syntactic island effects. Language, 88(1), 82-123.

Stowe, L. A., Haverkort, M., & Zwarts, F. (2005). Rethinking the neurological basis of language. Lingua, 115(7), 997-1042.

Stowe, L. A. (1986). Parsing WH-constructions: Evidence for on-line gap location. Language and cognitive processes, 1(3), 227-245.

Vosse, T., & Kempen, G. (2000). Syntactic structure assembly in human parsing: a computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75(2), 105-143.

Wilson, S. M., & Saygın, A. P. (2004). Grammaticality judgment in aphasia: Deficits are not specific to syntactic structures, aphasic syndromes, or lesion sites. Journal of Cognitive Neuroscience, 16(2), 238-252.

Zaccarella, E., & Friederici, A. D. (2015). Merge in the human brain: A sub-region based functional investigation in the left pars opercularis. Frontiers in psychology, 6.

17 comments:

Jeff Lidz, September 4, 2016 at 5:18 AM
You may want to check out B. Srinivas's work on supertagging.

Jon, September 5, 2016 at 12:29 PM
This picture of the world is extremely appealing to me. I always felt like Frank's (2002) Minimalism/TAG duality didn't get the attention it deserves, and using this duality to represent competence vs. performance seems, at the very least, an interesting path to go down.

Peter J, September 5, 2016 at 4:31 PM
Empirical grammatical evidence for the kind of stored structures you suggest comes from phrasal idioms (see, e.g., Marantz 1997); it strikes me that exploring the neurological signature of idioms versus syntactically equivalent compositional structures might be an interesting way of trying to confirm your hypothesis.

Also, I would be remiss as a Berkeleyite if I didn't point out that Construction Grammarians have spent the last 50 years arguing that stored structures associated with stored meanings exist. Of course, straight CG does not explain why these stored structures still follow normal syntactic constraints.

Tim Hunter, September 6, 2016 at 12:25 PM
I think it's easy to overestimate the degree to which TAG is more "treelet-based" than (say) minimalist syntax. As soon as you move to having things like subcategorization frames be part of a lexical item, then for many purposes you've got something equivalent to a TAG elementary tree. So for example if the verb 'give' comes with a subcategorization frame that says '[__ NP PP]' (and we understand this to mean that if you combine it with an NP and a PP then you get a VP which has those three constituents as daughters), then it's not that different from writing down a tree which has VP at its root, a V 'give' as one daughter, and daughter-less NP and PP nodes as its other two daughters. In a minimalist context we tend to think of the tree as only appearing once the required merge operations have taken place -- but since the presence of the subcategorization features means that those merge operations are required, i.e. there's no way to use 'give' in a sentence without those merge operations happening, this is almost just a difference in notation. (The merge operations "have to be part of the derivation" as soon as you decide that 'give' is a part of it.)

There are of course differences between TAGs and minimalism in general -- roughly, it's the difference between adding movement to the basic system of merge/substitution described above, or adding TAG-adjunction to it. And since TAG-adjunction is inherently a tree-based operation, it is natural for the system to work with "trees all the way down" (i.e. I guess you can do adjunctions before any substitutions). But the fact remains, I think, that a minimalist-style lexical item adorned with some collection of features dictating that this lexical item in effect "brings with it" certain combinatorial operations, is a "chunk" of approximately the same size as a TAG elementary tree.

The other fly in the ointment here is perhaps the fact that work in TAG (including Bob Frank's) often takes elementary tree size to be bounded by the notion of an *extended projection*, whereas minimalists write down encodings of the stuff that needs to accompany a head only within its (maximal) projection itself. So it might look like the treelet associated with 'give' goes up to the TP node in a TAG, whereas the treelet that is in effect associated with 'give' in minimalism only goes up to VP. But one still needs to state somewhere that, for example, Ts take VPs as their complements -- it's just that the TAG folks have bitten the bullet and put this information in the same place as all subcategorization information, whereas minimalists (if my impression is correct) are still umm-ing and err-ing a little bit on how to encode extended projections. So again I think appearances are deceiving.

Lastly, it seems relevant to just mention that Bob Frank's motivation for mixing TAG-style mechanics and minimalist-style mechanics in his 2002 book didn't have anything to do with access-via-chunking or predictive processing or anything like that -- it was purely motivated by the fact that minimalist-style mechanics seem to do a good job of dealing with the bounded, relatively local dependencies (approximately, things that were within a kernel sentence), whereas TAG-style mechanics seem to do a good job of dealing with the longer-distance dependencies (approximately, things that had involved generalized transformations right from the very early days). His neat observation was that minimalism's (independently motivated) move to all generalized transformations got rid of the hook on which the distinction between these two kinds of phenomena could be hung. His use of TAG is one way to try to bring back distinct mechanisms for working at the larger scale; phases are another.

Unknown, September 12, 2016 at 12:01 PM
Thanks to William Matchin for a fascinating article

I have a small concern regarding an example given of a potential application of a linguistic-neurological theory. It was mentioned that perhaps islands could be treated as a grammaticalized processing constraint. This seems to me to be stretching the domain of grammaticalization a little far. I may be quite wrong, but my understanding is that for something to be grammaticalized, it would need to be manifest in some way in the grammar of a language itself. Furthermore, it would need to be a notion (either syntactic or semantic) that would lend itself to interpretation. So plurality can be grammaticalized as 's' in English or as 'e/en' in German because plurality is a semantically interpretable notion. Or structural case is grammaticalized as a way of signaling displacement. What exactly would be the nature of the processing constraint if it were to be grammaticalized? Something would need to be refined about either the notion of processing constraint or that of grammaticalization.
just my two cents
thanks!