This is the second part of William Matchin's paper. Thanks again to William for putting this down on paper and provoking discussion.
One (very rough) sketch of a possible linking theory between a minimalist grammar and online sentence processing
I am going to try to sketch out what I think is a somewhat reasonable picture of the language faculty given the insights of syntactic theory, psycholinguistics, and cognitive neuroscience. My sketch here takes some inspiration from TAG-based psycholinguistic research (e.g., Demberg & Keller, 2008) and the TAG-based syntactic theory developed by Frank (2002) (thanks to Nick Huang for drawing this work to my attention).
Figure from Frank (2002). The dissociation between the inputs and operations of basic structure building and online processing/manipulation of treelets is clearly exemplified in the grammatical framework of Frank (2002).
The essential qualities of this picture of the language faculty are as follows. Minimalism is essentially a theory of the objects of language, the syntactic representations that people have. These objects are TAG treelets. TAG is a theory of what people do with these objects during sentence processing. TAG-type operations (e.g., unification, substitution, adjunction, verification) may be somehow identifiable with memory retrieval operations, opening up a potentially general cognitive basis for the online processing component of the language faculty, leaving the language-specific component to Merge. This proposal severs any inherent connection between Merge and online processing. Nothing in the proposal precludes the online implementation of Merge during sentence processing, but much of sentence processing might proceed without implementing Merge at all, relying instead on TAG operations over stored treelets.
I start with what I take to be the essential components of a Minimalist grammar – the lexicon and the computational system (i.e., Merge). Things work essentially as a Minimalist grammar says – you have some lexical atoms, Merge combines these elements (bottom-up) to build structures that are interpreted by the semantic and phonological systems, and there are some principles – some of them part of cognitive endowment, some of them “third factors” or general laws of nature or computation – that constrain the system (Chomsky, 1995; 2005).
The key difference that I propose is that complex derived structures can be stored in long-term memory. Currently, Minimalism states that the core feature of language, recursion, is the ability to treat derived objects as atoms. In other words, structures are treated as words, and as such are equally good inputs to Merge. However, the theory attributes the property of long-term storage only to atoms, and denies long-term storage to structures. Why not make structures fully equivalent to the atoms in their properties, including both Merge-ability AND long-term store-ability?
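A minimal Python sketch of this idea, under heavy simplifying assumptions: Merge is modeled as bare set formation (no labels, no linear order), and `long_term_memory` is a hypothetical store invented for illustration. The only point is that a derived object can be both an input to Merge and an entry in the store, exactly like an atom.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A lexical atom; the label stands in for its feature bundle."""
    label: str


@dataclass(frozen=True)
class Derived:
    """The output of Merge: an unordered set of syntactic objects."""
    parts: frozenset


SyntacticObject = Union[Atom, Derived]


def merge(a: SyntacticObject, b: SyntacticObject) -> Derived:
    """Binary Merge as set formation {a, b}; labeling is ignored here."""
    return Derived(frozenset({a, b}))


# Hypothetical long-term store (an assumption of this sketch).
long_term_memory: set = set()

the, dog, barked = Atom("the"), Atom("dog"), Atom("barked")
dp = merge(the, dog)          # a derived object
clause = merge(dp, barked)    # derived objects are valid inputs to Merge

long_term_memory.update({the, dog, barked})  # usual assumption: atoms are stored
long_term_memory.add(dp)                     # the proposal: derived objects too
```

Because both classes are immutable and hashable, nothing in the store needs to distinguish atoms from structures, which is the equivalence the paragraph above asks for.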
These stored structures or treelets can either be fully-elaborated structures with the leaves attached, or they might be more abstract nodes, allowing different lexical items to be inserted. It seems important from the psycholinguistic literature to have abstract structural nodes (e.g. NP, VP), so this theory would have to provide some means of taking a complex structure created by Merge and modifying it appropriately to eliminate the leaves (and perhaps many of the structural nodes) of the structure through some kind of deletion operation.
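One way such a deletion operation might look, as a sketch only: the `Node` class, the `open_slot` flag, and the `abstract` function are hypothetical names, and the tree here is an ordered, labeled tree rather than the bare output of Merge. The function strips each word from its preterminal, leaving a category node that different lexical items could later fill.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    label: str                          # a category ("NP") or, at a leaf, a word
    children: Tuple["Node", ...] = ()
    is_word: bool = False               # True for lexical leaves
    open_slot: bool = False             # True for an abstract node awaiting a word


def abstract(node: Node) -> Node:
    """Return a copy of the tree with lexical leaves deleted.

    A preterminal that dominates a single word becomes an open slot;
    all higher structure is kept as is.
    """
    if not node.children:
        return node
    if len(node.children) == 1 and node.children[0].is_word:
        return Node(node.label, open_slot=True)
    return Node(node.label, tuple(abstract(child) for child in node.children))


# "the dog" as a fully elaborated treelet with its leaves attached
the_dog = Node("NP", (
    Node("D", (Node("the", is_word=True),)),
    Node("N", (Node("dog", is_word=True),)),
))

# The abstract version: an NP with an open D slot and an open N slot
np_template = abstract(the_dog)
```

This version only removes leaves. Pruning structural nodes as well, as the paragraph above suggests might be needed, would require a further decision about which nodes to keep.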
Treelets are the point of interaction between the syntactic system (essentially a Minimalist grammar) and the memory system. It may be the top-down activation of memory retrieval operations that “save” structures as treelets. Memory operations do much of the work of sentence processing – retrieving structures and unifying/substituting them appropriately to efficiently parse sentences (see Demberg & Keller, 2008 for an illustration). Much of language acquisition amounts to refining the attention/retrieval operations as well as the set of treelets (and the prominence/availability of such treelets) that the person has available to them.
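To make the notion of prominence-weighted retrieval concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the feature names, the numeric `prominence` values, and the rule "pick the most prominent treelet that matches all cues" are not taken from any of the cited models.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class StoredTreelet:
    name: str
    features: FrozenSet[str]   # features visible to the memory system
    prominence: float          # hypothetical availability score


def retrieve(store: Iterable[StoredTreelet],
             cues: Set[str]) -> Optional[StoredTreelet]:
    """Return the most prominent treelet whose features include every cue."""
    matches = [t for t in store if cues <= t.features]
    return max(matches, key=lambda t: t.prominence, default=None)


store = [
    StoredTreelet("main-verb VP", frozenset({"VP", "past"}), prominence=0.9),
    StoredTreelet("reduced-relative VP", frozenset({"VP", "participle"}), prominence=0.2),
]

first_guess = retrieve(store, {"VP"})                # the more available treelet wins
after_revision = retrieve(store, {"VP", "participle"})  # more cues, different treelet
```

On this toy picture, refining retrieval during acquisition would amount to adjusting the cues and the prominence values, with no change to how the treelets were originally built.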
I think that there are good reasons to think that the retrieval mechanisms and the stored structures/lexical items live in language cortex. Namely, retrieval operations live in the pars triangularis of Broca’s area and stored structures/lexical items live in posterior temporal lobe (somewhere around the superior temporal sulcus/middle temporal gyrus).
This approach pretty much combines the Minimalist generative grammar and the lexicalist/TAG approaches. Note also that retrieving a stored treelet does not change the fact that the treelet was created through applications of Merge. So when you look at a structure that a person finally utters, it is both true that the structure was derived bottom-up in accordance with the operations and principles of a minimalist grammar, AND that the person produced it by retrieving a stored treelet. We can (hopefully) preserve both insights – bottom-up derivation with stored treelets that can be targeted by working memory operations.
One remaining issue is how treelets are combined and lexical items inserted into them – this could be a substitution or unification operation from TAG, but Merge itself might also work for some cases (suggesting some role for Merge in actual online processing).
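For the substitution option, a minimal sketch in the spirit of TAG: a treelet with an open substitution site (written NP↓ in TAG notation) is combined with a treelet whose root carries the same label. The `Node` class and the `substitute` and `words` helpers are hypothetical, and adjunction and unification are not modeled.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    label: str                          # a category or, at a leaf, a word
    children: Tuple["Node", ...] = ()
    open_slot: bool = False             # substitution site, e.g. NP↓


def substitute(tree: Node, filler: Node) -> Optional[Node]:
    """Fill the leftmost open slot whose label matches the filler's root.

    Returns the new tree, or None if no matching site exists.
    """
    if tree.open_slot:
        return filler if tree.label == filler.label else None
    for i, child in enumerate(tree.children):
        new_child = substitute(child, filler)
        if new_child is not None:
            kids = tree.children[:i] + (new_child,) + tree.children[i + 1:]
            return replace(tree, children=kids)
    return None


def words(tree: Node) -> List[str]:
    """Read the words off the tree from left to right, skipping open slots."""
    if not tree.children:
        return [] if tree.open_slot else [tree.label]
    return [w for child in tree.children for w in words(child)]


# Stored clause treelet with an open subject position: [S NP↓ [VP [V barked]]]
clause = Node("S", (
    Node("NP", open_slot=True),
    Node("VP", (Node("V", (Node("barked"),)),)),
))
subject = Node("NP", (Node("D", (Node("the"),)), Node("N", (Node("dog"),))))

sentence = substitute(clause, subject)
```

Note that nothing here calls Merge: two stored treelets are simply fitted together, which is the sense in which processing could proceed over stored structures alone.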
I think this proposal starts to provide potential insights into language acquisition. Say you’re a person walking around with this kind of system – you’ll want to start directing your attentional/working memory system to all these objects being generated by Merge and creating thoughts. You’ll also (implicitly) realize that other people are saying stuff that connects to your own system of thought, and you’ll start to align your set of stored structures and retrieval operations to match the patterns of what you’re seeing in the external world. This process is language acquisition, and it creates a convergence on the set of features, stored structures, and retrieval operations that are used within a language.
This addresses some of the central questions I posited earlier:
When processing a sentence, do I expect Merge to be active? Or not?
- Not necessarily, maybe minimally or not at all for most sentences.
What happens when people process things less than full sentences (like a little NP – “the dog”)? What is our theory of such situations?
- A little treelet corresponding to that sub-sentence structure is retrieved and interpreted.
Do derivations really proceed from the bottom up, or can they satisfactorily be switched to go top-down/left-right using something like Merge right (Phillips 1996)?
- Syntactic derivations are bottom-up in terms of Merge, but sentence processing occurs left-to-right roughly along the lines of TAG-based parsing frameworks (Demberg & Keller, 2008).
What happens mechanistically when people have to revise structure (e.g., after garden-pathing)?
- De-activate the current structure, retrieve new treelets/lexical items that fit better with what was presented. Lots of activity associated with processing lexical items/structures and memory retrievals, but there may not be an actual activation/implementation of Merge.
Are there only minimal lexical elements and Merge? Or are there stored complex objects, like “treelets”, constructions or phrase-structure rules?
- Yes, there are treelets, but we have an explanation for why there are treelets – they were created through applications of Merge at some point in the person’s life, but not necessarily online during sentence processing.
How does the syntactic system interact with working memory, a system that is critical for online sentence processing?
- The point of interaction between syntax and memory is the treelet. Somehow certain features encoded on treelets have to be available to the memory system.
Now that I have these answers, I can proceed to do my neuroimaging and neuropsychology experiments with testable predictions regarding how language is implemented in the brain:
What’s the function of Broca’s area?
- Retrieval operations that are specialized to operate over syntactic representations.
- Which is why when you destroy Broca’s area you are still left with a bunch of treelets that can be activated in comprehension/production that you can use pretty effectively, although you have less strategic control over them.
- We expect patients with damage to Broca’s area to be able to basically comprehend sentences, but really have trouble in cases requiring recovery/revision, long-distance dependencies, prediction, and perhaps second language acquisition.
What’s the function of posterior temporal areas?
- Lexical storage, including treelets.
- We expect activation for basic sentence processing, more activation for ambiguity/garden-path sentences when more structural templates are activated.
- We expect patients with posterior temporal damage to have some real problems with sentence comprehension/production.
Where are fundamental structure building operations in the brain, e.g. Merge?
- Merge is a subtle neurobiological property of some kind.
- It might be in the connections between cortical areas, perhaps involving subcortical structures, or some property of individual neurons, but regardless, there isn’t a “syntax area” to be found.
What are the ramifications of this proposal for the standard contrast of sentences > lists that is commonly used to probe sentence processing in the brain?
- This contrast will highlight all sorts of things, likely including the activation of treelets, memory retrieval operations, and semantic processing, but it might not be expected to drive activation for basic syntactic operations, i.e. Merge.
Here I have tried to preserve Merge as the defining and simple feature of language – it’s the thing that allows people to grow structures. The proposal also clearly separates Merge from the issue of “what happens during sentence processing”, and really highlights the core of language as something not directly tied to communication. Essentially, the theory of syntax becomes the theory of structures and dependencies, not of producing and understanding sentences. On this conception of language, there is this Merge machinery creating structures, perhaps new in evolution, that can be harnessed by an (evolutionarily older) attentional/memory system for the purposes of producing and comprehending sentences by storing treelets in long-term memory. Merge is clearly separate from this communication/memory system, and is an engine of thought. Learning a language then becomes a matter of refining, over time, the retrieval operations and the set of stored treelets so that they are optimized for communicating with others.
If this is a reasonable picture of the language faculty, thinking along these lines might start to help resolve some conundrums in the traditional domain of syntax. For example, there is often the intuition that syntactic islands are somehow related to processing difficulty (Kluender & Kutas, 1993; Berwick & Weinberg, 1984), but there is good evidence that islands cannot be reduced to online processing difficulty or memory resource demands (Phillips, 2006; Sprouse et al., 2012). One approach might be to attribute islands to a processing constraint that somehow becomes grammaticalized (Berwick & Weinberg, 1984). The present framework provides a way of thinking about this issue, because the interaction between syntax and the online processing/memory system is specified. I have some more specific thoughts on this issue that might take the form of a future post.
At any rate, I would love any feedback on this type of proposal. Do we think this is a sensible idea of what the language faculty looks like? What are some serious objections to this kind of proposal? If this is on the right track, then I think we can start to make some more serious hypotheses about how language is implemented in the human brain beyond Broca’s area = Merge.
Many thanks to Nick Huang (particularly for pointing out relevant pieces of literature), Marta Ruda, Shota Momma, Gesoel Mendes, and of course Norbert Hornstein for reading this and giving me their thoughts. Thanks to Ellen Lau, Alexander Williams, Colin Phillips and Jeff Lidz for helpful discussion on these topics. Any failings are mine, not theirs.
Berwick, R. C., & Weinberg, A. S. (1983). The role of grammars in models of language use. Cognition, 13(1), 1-61.
Berwick, R., and Weinberg, A.S. (1984). The grammatical basis of linguistic performance. Cambridge, MA: MIT Press.
Bresnan, J. (2001). Lexical-Functional Syntax. Blackwell.
Chomsky, N. (2005). Three factors in language design. Linguistic inquiry, 36(1), 1-22.
Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT press.
Culicover, P. W., & Jackendoff, R. (2005). Simpler syntax. Oxford University Press on Demand.
Culicover, P. W., & Jackendoff, R. (2006). The simpler syntax hypothesis. Trends in cognitive sciences, 10(9), 413-418.
Demberg, V., & Keller, F. (2008, June). A psycholinguistically motivated version of TAG. In Proceedings of the 9th International Workshop on Tree Adjoining Grammars and Related Formalisms. Tübingen (pp. 25-32).
Embick, D., Marantz, A., Miyashita, Y., O'Neil, W., & Sakai, K. L. (2000). A syntactic specialization for Broca's area. Proceedings of the National Academy of Sciences, 97(11), 6150-6154.
Fedorenko, E., Behr, M. K., & Kanwisher, N. (2011). Functional specificity for high-level linguistic processing in the human brain. Proceedings of the National Academy of Sciences, 108(39), 16428-16433.
Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974). The psychology of language: An introduction to psycholinguistics and generative grammar.
Frank, R. (2002). Phrase Structure Composition and Syntactic Dependencies. Cambridge, MA: MIT Press.
Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and brain sciences, 23(01), 1-21.
Grodzinsky, Y., & Friederici, A. D. (2006). Neuroimaging of syntax and syntactic processing. Current opinion in neurobiology, 16(2), 240-246.
Grodzinsky, Y. (2006). A blueprint for a brain map of syntax. Broca’s region, 83-107.
Jackendoff, R. (2003). Précis of foundations of language: brain, meaning, grammar, evolution. Behavioral and Brain Sciences, 26(06), 651-665.
Joshi, A. K., & Schabes, Y. (1997). Tree-adjoining grammars. In Handbook of formal languages (pp. 69-123). Springer Berlin Heidelberg.
Kluender, R., & Kutas, M. (1993). Subjacency as a processing phenomenon. Language and cognitive processes, 8(4), 573-633.
Lewis, S., & Phillips, C. (2015). Aligning grammatical theories and language processing models. Journal of Psycholinguistic Research, 44(1), 27-46.
Lewis, R. L., & Vasishth, S. (2005). An activation‐based model of sentence processing as skilled memory retrieval. Cognitive science, 29(3), 375-419.
Lewis, R. L., Vasishth, S., & Van Dyke, J. A. (2006). Computational principles of working memory in sentence comprehension. Trends in cognitive sciences, 10(10), 447-454.
Linebarger, M. C., Schwartz, M. F., & Saffran, E. M. (1983). Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition, 13(3), 361-392.
Lukyanenko, C., Conroy, A., & Lidz, J. (2014). Is she patting Katie? Constraints on pronominal reference in 30-month-olds. Language Learning and Development, 10(4), 328-344.
Matchin, W., Sprouse, J., & Hickok, G. (2014). A structural distance effect for backward anaphora in Broca’s area: An fMRI study. Brain and language, 138, 1-11.
Miller, G. A., & Chomsky, N. (1963). Finitary models of language users.
Mohr, J. P., Pessin, M. S., Finkelstein, S., Funkenstein, H. H., Duncan, G. W., & Davis, K. R. (1978). Broca aphasia: Pathologic and clinical. Neurology, 28(4), 311-311.
Momma, 2016 (doctoral dissertation, University of Maryland, department of Linguistics)
Musso, M., Moro, A., Glauche, V., Rijntjes, M., Reichenbach, J., Büchel, C., & Weiller, C. (2003). Broca's area and the language instinct. Nature neuroscience, 6(7), 774-781.
Omaki, A., Lau, E. F., Davidson White, I., Dakan, M. L., Apple, A., & Phillips, C. (2015). Hyper-active gap filling. Frontiers in psychology, 6, 384.
Pallier, C., Devauchelle, A. D., & Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. Proceedings of the National Academy of Sciences, 108(6), 2522-2527.
Phillips, C. (1996). Order and structure (Doctoral dissertation, Massachusetts Institute of Technology).
Phillips, C. (2006). The real-time status of island phenomena. Language, 795-823.
Rogalsky, C., & Hickok, G. (2011). The role of Broca's area in sentence comprehension. Journal of Cognitive Neuroscience, 23(7), 1664-1680.
Santi, A., & Grodzinsky, Y. (2012). Broca's area and sentence comprehension: A relationship parasitic on dependency, displacement or predictability?. Neuropsychologia, 50(5), 821-832.
Santi, A., Friederici, A. D., Makuuchi, M., & Grodzinsky, Y. (2015). An fMRI Study Dissociating Distance Measures Computed by Broca’s Area in Movement Processing: Clause boundary vs Identity. Frontiers in psychology, 6, 654.
Sprouse, J. (2015). Three open questions in experimental syntax. Linguistics Vanguard, 1(1), 89-100.
Sprouse, J., Wagers, M., & Phillips, C. (2012). A test of the relation between working-memory capacity and syntactic island effects. Language, 88(1), 82-123.
Stowe, L. A., Haverkort, M., & Zwarts, F. (2005). Rethinking the neurological basis of language. Lingua, 115(7), 997-1042.
Stowe, L. A. (1986). Parsing WH-constructions: Evidence for on-line gap location. Language and cognitive processes, 1(3), 227-245.
Vosse, T., & Kempen, G. (2000). Syntactic structure assembly in human parsing: a computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75(2), 105-143.
Wilson, S. M., & Saygın, A. P. (2004). Grammaticality judgment in aphasia: Deficits are not specific to syntactic structures, aphasic syndromes, or lesion sites. Journal of Cognitive Neuroscience, 16(2), 238-252.
Zaccarella, E., & Friederici, A. D. (2015). Merge in the human brain: A sub-region based functional investigation in the left pars opercularis. Frontiers in psychology, 6.