Thursday, November 8, 2018

Guest Post by William Matchin: Reflections on SinFonIJA 11, Part 1

Before posting this, let me encourage others to do what William is doing here. Send me stuff to post on FoL. The fact that I have become fat and lazy does not mean that others need go to seed as well. This is the first of two posts on the conference. 


I thought that I would write my thoughts about SinFonIJA 11 in Krakow, Poland, which just finished this past weekend. It was organized by three professors in the Dept. of English Studies at Jagiellonian University in Krakow: Marta Ruda, a former visiting scholar at UMD Linguistics, Mateusz Urban, and Ewa Willim, who was Howard Lasnik’s student and the recipient of the infamous manuscript ‘On the Nature of Proper Government’[1]. All three of them were gracious hosts and the conference was very well organized, informative, and fun. SinFonIJA is a regional[2] conference on formal linguistic analysis focusing on syntax and phonology, but as a neuroscientist, I felt quite welcome and many of the attendees expressed interest in my work. Kraków is a beautiful city and definitely worth visiting, to boot; if you ever visit, make sure to see the Wieliczka salt mine[3].

I suppose my sense of welcome was helped by the fact that the main theme of the conference was “Theoretical Linguistics within Cognitive science” – I was invited to chair a round table discussion on how linguistics is getting on with the other cognitive sciences these days. Linguistics was a founding member of the modern cognitive sciences during the cognitive revolution in the 50s and 60s – perhaps the founding member, with the work by Chomsky in Generative Grammar stimulating interest in deeper, abstract properties of the mind and articulating an alternative vision of language from the dominant behaviorist perspective. Marta was the key instigator of this theme – this was a frequent topic of discussion between us while we were both at the UMD Linguistics dept., which has a unique capacity to bridge the gaps between formal linguistic theory and other fields of cognitive science (e.g., acquisition, psycholinguistics, neuroscience). The invited keynote speakers comprising the round table addressed foundational questions underlying linguistic theory as well as the relation between formal linguistics and the cognitive sciences in their own talks. The main part of this post will reflect on this topic and the roundtable discussion, but before that I’d like to discuss Zheng Shen’s talk, which highlighted important issues regarding the methods in formal linguistics. Much of what I say here reiterates a previous post of mine on FoL[4].

Methods and data in formal linguistics

Lately there has been noise about the quality of data in formal linguistics, with some non-formal linguists calling for linguists to start acting more like psychologists and report p-values (because if you don’t have p-values, you don’t have good data, naturally). My impressions are that these concerns are greatly exaggerated and a non-sequitur. If anything, my feelings are that formal linguistics, at least of the generative grammar variety, is on a greater empirical footing than psycholinguistics and neurolinguistics. This is because linguistics rightly focuses on theoretical development, with data as a tool to sharpen theory, rather than a fixation on data itself. This is illustrated well by Shen’s talk.

Shen began by discussing his analysis of agreement in right node raising (RNR) and its empirical superiority over other accounts (Shen, 2018[5]). His account rested on a series of traditional informal acceptability judgments, consulting a small number of native speakers of English to derive the patterns motivating his analysis. Interestingly, other authors offered a competing account of agreement in RNR, which was not just an alternative analysis but included conflicting data patterns – the two papers disagreed on whether particular constructions were good or bad (Belk & Neeleman, 2018) (see the abstract submitted by Shen for details[6]). Shen then performed a series of carefully designed acceptability judgment experiments to sort out the source of the discrepancy, ultimately obtaining patterns of data from large groups of naïve participants that essentially agreed with his judgments rather than Belk & Neeleman’s.

Psychologists (particularly Ted Gibson & Ev Fedorenko) have been heavily critical of methods in formal linguistics of late, claiming that informal acceptability judgments are unreliable and misleading (Gibson & Fedorenko, 2010; 2013; their claim of weak quantitative standards in linguistics has been directly contradicted by the exhaustive research of Sprouse & Almeida, 2012; 2013, which illustrates a replication rate of 95-98% of informal judgments presented in a standard syntax textbook as well as a leading linguistics journal with naïve subjects in behavioral experiments[7],[8]). This disagreement about data with respect to RNR appears to support these attacks on formal linguistics by providing a concrete example.

This critique is invalid. First, the two sets of authors agreed on a large set of data, disagreeing on a small minority of data that happened to be crucial for the analysis. The two competing theoretical accounts highlighted the small discrepancy in data, leading to a proper focus on resolving the theoretical dispute via cleaning up the data point.

Second, Shen’s original judgments were vindicated. In other words, the behavioral experiments essentially replicated the original informal judgments. In fact, Shen noted several quite obvious issues with the use of naïve subjects, in that they may not be sensitive to making judgments under particular interpretations – that is, they may judge the string to be acceptable, but not under the crucial interpretation/structural analysis under consideration. It took a large amount of work (and I assume money) to handle these issues with multiple experiments to (in a nutshell) merely replicate informal judgments that were obtained far more rapidly and easily than the experiments. Essentially, no new data points were obtained – only replications. It is not clear why Shen and Belk & Neeleman disagreed on the data (potentially because of dialect differences, British vs. American English) – but certainly the problem was not with Shen’s informal judgments.

These two facts inform us that while large-scale experiments can be useful, they are not the drivers of research. Shen’s hard work provided replications in the context of two detailed, competing theoretical analyses. The experimental data were only acquired after the theoretical analyses were proposed, and those analyses were based on informal judgment data. If we take Gibson & Fedorenko’s (2010) demands for eschewing informal judgments entirely, then we would end up with disastrous consequences, namely slavishly collecting mass amounts of behavioral data, and spending inordinate amounts of time analyzing that data, all in the absence of theoretical development (which is one of the drivers of the un-replicability plague of much of social psychology). Theory should drive data collection, not the other way around.

With that said, the next section changes gears and discusses the special topic of the conference.

Theoretical linguistics within cognitive science: a crisis?

First, I will summarize my introduction to the round table and the main feelings driving what I and Cedric Boeckx perceive to be a crisis regarding the place of formal linguistics in the cognitive sciences – from my perspective, cognitive neuroscience specifically. As I pointed out in a previous blog post on Talking Brains[9], this crisis is well-illustrated by the fact that the Society for the Neurobiology of Language has never had a formal linguist, or even a psycholinguist, present as a keynote speaker in its 10 years of existence, despite many presentations by neuroscientists and experts on non-human animal communication systems.

I think there are many reasons for the disconnect – paramount among these a lack of appreciation for the goals and insights of linguistic theory, sociological factors such as a lack of people who are knowledgeable of both domains and the objectives of both sides, and likely many others. My main point was not to review all of the possible reasons. Rather, I thought it appropriate when discussing with linguists to communicate what is possible for linguists to do to rekindle the interaction among these fields (when I talk to cognitive neuroscientists, I do the opposite – discuss what they are missing from linguistics). I used my own history of attempting to bridge the gaps among fields, raising what I perceived to be a frustrating barrier - the competence/performance distinction. Consider this line from Aspects (Chomsky, 1965), the authoritative philosophical foundation of the generative grammar research enterprise:

“… by a generative grammar I mean simply a system of rules that in some explicit and well-defined way assigns structural descriptions to sentences”

The idea that language is a system of rules is powerful. In the context of the mentalistic theory of grammar, it embodies the rejection of behaviorism in favor of a more realistic as well as exciting view of human nature – that our minds are deep and, in many ways, independent of the environment, requiring careful and detailed study of the organism itself in all of its particularities rather than merely a focus on the external world. It calls for a study of the observer, the person, the machine inside of the person’s head that processes sentences rather than the sentences themselves. This idea is what sparked the cognitive revolution and the intensive connection between linguistics and the other cognitive sciences for decades, and led to so many important observations about human psychology.

For a clear example from one of the conference keynote speakers: the work Ianthi Tsimpli did on Christopher, the mentally impaired savant who apparently had intact (and in fact, augmented) ability to acquire the grammar of disparate languages[10], including British Sign Language[11], in the face of shocking deficits in other cognitive domains. Or my own field, which finds that the formal descriptions of language derived from formal linguistic theory, and generative grammar in particular – including syntactic structures with abstract layers of analysis and null elements, or sound expressions consisting of sets of phonological features that can be more or less shared among speech sounds – have quite salient impacts on patterns of neuroimaging data[12],[13].

However, it is one thing to illustrate that hypothesized representations from linguistic theory impact patterns of brain activity, and another to develop a model for how language is implemented in the brain. To do so requires making claims for how things actually work in real time. But then there is this:

“... a generative grammar is not a model for a speaker or a hearer ... When we say that a sentence has a certain derivation with respect to a particular generative grammar, we say nothing about how the speaker or hearer might proceed ... to construct such a derivation”.

The lack of investigation into how the competence model is used poses problems. It is one thing to observe that filler-gap dependencies – sentences with displaced elements involving the theoretical operation Movement (or internal merge, if you like) – induce increased activation in Broca’s area relative to control sentences (Ben-Shachar et al., 2003), but quite another to develop a map of cognitive processes on the brain. Most definitely it is not the case that Broca’s area “does” movement[14].

It is clearly the case that linguists would like to converge with neuroscience and use neuroscience data as much as possible. Chomsky often cites the work of Friederici (as well as Moro, Grodzinsky, and others). For instance, in Berwick & Chomsky’s recent book Why Only Us they have a central part of the book devoted to the brain bases of syntax, adopting Friederici’s theoretical framework for a neurobiological map of syntax and semantics in the brain. Much of my work has pointed out that Friederici’s work, while empirically quite exceptional and of high quality, makes quite errant claims about how linguistic operations are implemented in the brain.

Now, I think this issue can be worked on and improved upon. But how? The only path forward that I can see is by developing a model of linguistic performance – one that indicates how linguistic operations or other components of the theory are implemented during real-time sentence processing and language acquisition. In other words, adding temporal components to the theory, at least at an abstract level. This was my main point in introducing the round table – why not work on how exactly grammar relates to parsing and production, i.e. developing a performance model?

At the end of Ian Roberts’s talk, which quite nicely laid out the argument for strict bottom-up cyclicity at all levels of syntactic derivation, there was some discussion about whether the derivational exposition could be converted to a representational view that does not appeal to order (of course it can). Linguists are compelled by the competence/performance distinction to kill any potential thinking of linguistic operations occurring in time. This makes sense if one’s goal is to focus purely on competence. With respect to making connections to the other cognitive sciences, though, the instinct needs to be the reverse – to actually make claims about how the competence theory relates to performance.

Near the end of my talk I outlined three stances on how the competence grammar (e.g., various syntactic theories of a broadly generative type) relates to real-time processing (in this context, i.e. parsing):

1.    The grammar is a body of static knowledge accessed during acquisition, production, and comprehension (Lidz & Gagliardi, 2015). This represents what I take to be the standard generative grammar view – that there is a competence “thing” out there that somehow (in my view, quite mysteriously) mechanistically relates to performance. It’s one thing to adopt this perspective, but quite another to flesh out exactly how it works. I personally find this view to be problematic because I don’t think there are any other analogs or understandings for how such a system could be implemented in the brain and how it constrains acquisition and use of language (but I am open to ideas, and even better – detailed theories).

2.    The grammar is a “specification” of a parser (Berwick & Weinberg, 1984; Steedman, 2000). The idea is that there really is no grammar, but rather that the competence theory is a compact way of describing the structural outputs of the “real” theory of language, the performance models (parser/producer). If this is so, that’s quite interesting, because in my view it completely deprives the competence model of any causal reality, which completely removes its insight into any of the fundamental questions of linguistic theory, such as Plato’s problem – how language is acquired. I do not like this view.

3.    The grammar is a real-time processing device, either directly (Miller, 1962; Phillips, 1996) or indirectly (Fodor et al., 1974; Townsend & Bever, 2001) used during real-time processing and acquisition. I very much like this view. It says that the competence model is a thing that does stuff in real time. It has causal powers and one can straightforwardly understand how it works. While I don’t think that the models advocated for in these citations ultimately succeeded, I think they were spot on in their general approach and can be improved upon.

While I personally heavily favor option (3), I would love to see work that fleshes out any of the above while addressing (or leading the way to address) the core philosophical questions of linguistic theory, as discussed by Cedric Boeckx.

Part 2 of this post raises and addresses some of the comments by the keynote speakers on this topic.

[1]If you don’t know this story you would best hear about it from the original participants.
[2]The regional domain consists of the former Austro-Hungarian Empire. This divides the borders of current countries, so Krakow is in but Warsaw is out.
[3]Wieliczka is no average mine – it was in parts beautiful and educational. It is way more fun than it sounds.
[5]Doctoral dissertation.
[7]Gibson, E., & Fedorenko, E. (2010). Weak quantitative standards in linguistics research. Trends in Cognitive Sciences, 14(6), 233-234; Gibson, E., & Fedorenko, E. (2013). The need for quantitative methods in syntax and semantics research. Language and Cognitive Processes, 28(1-2), 88-124; Sprouse, J., & Almeida, D. (2012). Assessing the reliability of textbook data in syntax: Adger's Core Syntax. Journal of Linguistics, 48(3), 609-652; Sprouse, J., Schütze, C. T., & Almeida, D. (2013). A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010. Lingua, 134, 219-248.
[8]95-98% is probably an underestimate, because there are likely cases where subjects incorrectly report their judgments without properly making the judgment under particular interpretations, etc. However, even taking the 95-98% number at face value, what do we think the replication rate is in certain fields of social psychology? Are formal linguists really supposed to change their way of doing things to match a field that is notorious these days for lack of rigor?
[10]Smith, N. V., & Tsimpli, I. M. (1995). The mind of a savant: Language learning and modularity. Blackwell Publishing.
[11]Smith, N., Tsimpli, I., Morgan, G., & Woll, B. (2010). The signs of a savant: Language against the odds. Cambridge University Press.
[12]Brennan, J. R., Stabler, E. P., Van Wagenen, S. E., Luh, W. M., & Hale, J. T. (2016). Abstract linguistic structure correlates with temporal activity during naturalistic comprehension. Brain and Language, 157, 81-94.
[13]Okada, K., Matchin, W., & Hickok, G. (2018). Phonological Feature Repetition Suppression in the Left Inferior Frontal Gyrus. Journal of Cognitive Neuroscience, 1-9.
[14]For evidence on this point see the following papers. Wilson, S. M., & Saygın, A. P. (2004). Grammaticality judgment in aphasia: Deficits are not specific to syntactic structures, aphasic syndromes, or lesion sites. Journal of Cognitive Neuroscience, 16(2), 238-252. Matchin, W., Sprouse, J., & Hickok, G. (2014). A structural distance effect for backward anaphora in Broca’s area: An fMRI study. Brain and Language, 138, 1-11. Rogalsky, C., Almeida, D., Sprouse, J., & Hickok, G. (2015). Sentence processing selectivity in Broca's area: evident for structure but not syntactic movement. Language, Cognition and Neuroscience, 30(10), 1326-1338.


  1. At the risk of sounding like a broken record: I really don't think that option (1) for the relationship between competence and performance is all that mysterious. The question of what you need to supplement a grammar with to make a working theory of sentence comprehension depends, of course, on exactly what sort of grammar you start with, and so I think this question inherits an air of mystery from the fact that we often try to answer it without being clear about exactly what our competence grammar consists of. But given any particular concrete proposal about what competence grammars look like, there are answers to be had.

    I think the toy-example case study of how the performance question pans out under the simplifying assumption that competence takes the form of a context-free grammar is informative. I wrote about this in detail in section 3 here:
    (See in particular the first bullet point in section 3.4.)

    When you adopt more linguistically-realistic grammars, things get more complicated, but the issues are not qualitatively different. (There are problems to be solved, but no mysteries.) Stabler's work provides one concrete picture of what the competence grammar looks like, and given that starting point the question of what needs to be added (i.e. what sort of system might parse a sentence by accessing that static knowledge) can also be answered:
    (My usual advice to students is that understanding the CFG case is a very useful stepping stone towards understanding these. See also section 4 of my paper linked above.)

    The general point I want to make doesn't depend on us all buying into Stabler's formalism. It's just one worked-out theory of what the competence might look like, for which the parsing question is being tackled; the absence of other such models shouldn't be taken as a strike against option (1).

    1. "The question of what you need to supplement a grammar with to make a working theory of sentence comprehension depends, of course, on exactly what sort of grammar you start with, and so I think this question inherits an air of mystery from the fact that we often try to answer it without being clear about exactly what our competence grammar consists of"

      I think this is exactly the problem. Thanks for the specific references. It is most useful to have specific proposals, e.g. for minimalism, that illustrate the ramifications for the causal role of the grammar. And less technical papers, or at least a clear exposition of the model. When Chomsky introduces merge, or any other previous instantiation of a theory of generative grammar, I understand (at least roughly) how it works. When I ask about parsing implementation I get answers that are not of the same form as how the grammatical model is introduced, i.e. they do not seem to make a clear proposal for an integrated grammar/parser. That's the sort of thing I am looking for, and I think that's what would greatly help cognitive (neuro)scientists use linguistic theory to guide their work.

    2. In other words, what is needed is "this is an answer to your question", rather than "this is how one might try to go about answering your question".

    3. Following up on Tim's remark: it's not even clear to me that 1, 2, and 3 are distinct. For the sake of simplicity I'll illustrate this with CFGs, but as Tim said, the general argument also holds for Minimalism, GPSG, TAG, and CCG, among others.

      A parsing algorithm is the combination of

      1. a data structure (how do you store parses), and
      2. a control structure (how do you prioritize parses), and
      3. a parsing system.

      The first two don't have much to do with anything a linguist would consider a grammar, so let's put those aside.

      The parsing system is a collection of rules that tell you what can be inferred and/or conjectured based on the information available so far. The parsing system, in turn, is a specific instantiation of a parsing schema (careful, don't confuse parsing system and parsing schema).

      For instance, the parsing schema for a top-down CFG parser tells you that if you have an X from position i to position j in the string, then you can instead assume that you have some YZ from i to j. In the parsing system, this general rule is replaced by a collection of inferences like the following:

      - [i, VP, j] -> [i, V NP, j]
      - [i, VP, j] -> [i, AdvP VP, j]
      - [i, NP, j] -> [i, Det N, j]
      - [i, NP, j] -> [i, Det AP N, j]

      Since we already know the underlying parsing schema, we can factor it out of these inferences to get the following:

      - VP -> V NP
      - VP -> AdvP VP
      - NP -> Det N
      - NP -> Det AP N

      So there's your grammar. The difference between 1 and 2 seems to be whether you take the grammar as your starting point and combine it with the parsing schema to obtain the parsing system, or do what I did above and obtain the grammar by factorization of the parsing system with respect to the parsing schema.
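The two directions described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `GRAMMAR`, `instantiate`, `factor_out`, and `show` are hypothetical, and the top-down schema is reduced to the single expansion rule quoted in the comment (no scanning or completion steps).

```python
from typing import Dict, List, Tuple

# Toy grammar taken from the rules listed above: LHS -> list of right-hand sides.
Grammar = Dict[str, List[Tuple[str, ...]]]
# An item is (start position, category sequence, end position).
Item = Tuple[str, Tuple[str, ...], str]
Inference = Tuple[Item, Item]

GRAMMAR: Grammar = {
    "VP": [("V", "NP"), ("AdvP", "VP")],
    "NP": [("Det", "N"), ("Det", "AP", "N")],
}


def instantiate(grammar: Grammar) -> List[Inference]:
    """Schema + grammar -> parsing system.

    The (simplified) top-down schema says: an X from i to j may be
    replaced by any right-hand side of X from i to j.
    """
    return [
        (("i", (lhs,), "j"), ("i", rhs, "j"))
        for lhs, rhss in grammar.items()
        for rhs in rhss
    ]


def factor_out(system: List[Inference]) -> Grammar:
    """Parsing system - schema -> grammar: drop the positions again."""
    grammar: Grammar = {}
    for (_, (lhs,), _), (_, rhs, _) in system:
        grammar.setdefault(lhs, []).append(rhs)
    return grammar


def show(inference: Inference) -> str:
    """Render an inference in the bracket notation used above."""
    (i, lhs, j), (_, rhs, _) = inference
    return f"[{i}, {' '.join(lhs)}, {j}] -> [{i}, {' '.join(rhs)}, {j}]"


if __name__ == "__main__":
    system = instantiate(GRAMMAR)
    for inference in system:
        print(show(inference))
    # Factoring the schema back out recovers the grammar we started from.
    print(factor_out(system) == GRAMMAR)
```

Read this way, options 1 and 2 differ only in which function you regard as the starting point: `instantiate` goes from grammar to parsing system, `factor_out` goes back.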

      For what it's worth, we could also keep the parsing system and the grammar fixed and obtain the parsing schema through factorization. So that's option 4: the parser is a static specification of a dynamic processing system for the grammar.

      That leaves option 3. It, too, fits into the general picture above. The crucial link is intersection parsing, which allows you to reduce a grammar to exactly that fragment that only generates the input sentence. Parsing then reduces to generation with this restricted grammar, and a "parser" is just a specification of how the intersection step is carried out on the fly.
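One way to make the intersection idea concrete is a Bar-Hillel-style construction restricted to a single input string. The sketch below is an assumption-laden toy, not the commenter's own formulation: it assumes a grammar with only binary and lexical rules, and `BINARY`, `LEXICAL`, `intersect`, and `yields` are hypothetical names. The restricted grammar uses position-annotated categories such as `(0, "S", 5)`.

```python
# Toy grammar (assumed for illustration): binary rules A -> B C and lexical rules A -> word.
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
LEXICAL = [("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "saw"), ("V", "dog")]


def intersect(binary, lexical, words, start="S"):
    """Restrict the grammar to the fragment that generates exactly `words`."""
    n = len(words)
    rules = {}  # (i, A, k) -> list of right-hand sides
    # Lexical rules that match the input at position i.
    for i, w in enumerate(words):
        for cat, word in lexical:
            if word == w:
                rules.setdefault((i, cat, i + 1), []).append((w,))
    # Binary rules over spans whose daughters are already derivable,
    # shortest spans first.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for a, b, c in binary:
                    if (i, b, j) in rules and (j, c, k) in rules:
                        rules.setdefault((i, a, k), []).append(((i, b, j), (j, c, k)))
    # Keep only the categories reachable from the start category over the whole string.
    reachable, agenda = {}, [(0, start, n)]
    while agenda:
        item = agenda.pop()
        if item in reachable or item not in rules:
            continue
        reachable[item] = rules[item]
        for rhs in rules[item]:
            agenda.extend(sym for sym in rhs if isinstance(sym, tuple))
    return reachable


def yields(restricted, item):
    """Generate every word string derivable from `item` in the restricted grammar."""
    out = []
    for rhs in restricted[item]:
        if isinstance(rhs[0], str):  # lexical rule
            out.append([rhs[0]])
        else:  # binary rule: concatenate the yields of the two daughters
            left, right = rhs
            out.extend(l + r for l in yields(restricted, left) for r in yields(restricted, right))
    return out


if __name__ == "__main__":
    words = "the dog saw the cat".split()
    restricted = intersect(BINARY, LEXICAL, words)
    for item, rhss in sorted(restricted.items()):
        print(item, "->", rhss)
    print(yields(restricted, (0, "S", len(words))))
```

In this toy, the verb reading of "dog" is matched by a lexical rule but pruned from the result because nothing reachable from the start category uses it, and generating from the restricted grammar returns only the input sentence.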

      These are all different perspectives on the same object. They only seem different if one takes the view that they describe a cognitive procedure for how the parser is implemented algorithmically - what is static, what is compiled on the fly. But that strikes me as misguided, just like Minimalism's bottom-up derivations don't commit you to the assumption that sentences are built bottom-up. Computation, and by extension cognition, is too abstract an object to be fruitfully described this way.

      So I don't see a good reason to treat these three views as distinct, and more importantly I don't understand what one could gain from doing so. There is no empirical issue that seems to hinge on it. For the purpose of linking theoretical linguistics to neuroscience, it does not matter how you divide up the workload, what matters is that you understand what remains constant irrespective of how the workload is moved around.

    4. Of course I like the analysis of competence being a formal grammar of some type, and performance being a sound and complete parser of some type, but there are a couple of problems, and thinking about them may serve to separate the three proposals above.

      First is disambiguation: a parser returns a compact representation (parse forest or whatever) of all the possible parses of the input that are licensed by the grammar, whereas the processing system will return one parse, maybe the most likely parse under some probabilistic model. But any model of the performance systems has to be stuffed full of probabilities or word frequencies or something that does the same job if one has an ideological aversion to probabilities.

      Second, sound and complete parsers are going to be slow (i.e. worse than linear time), whereas we know that the performance system works in real time and isn't sound and complete, since grammaticality (being generated by the grammar) and acceptability (approximately, being processed without excessive weirdness) aren't the same thing.

    5. I agree that these two problems are crucial if we are building a cognitively plausible theory, but again I'm not sure I see how there is any difference among the three proposals.

      Let’s look at disambiguation. In the decomposition Thomas outlined, this would mostly be part of the control structure, no? Work linking the MG parser to processing effects has mostly followed the idealization of “parser as a perfect oracle”. So: no disambiguation needed. Obviously this is a huge idealization, pushed forward with a precise aim in mind (i.e. effects of sentence structure on memory).

      But we can discard the perfect oracle and add one that is biased by structural/lexical probabilities, or whatever other decision making mechanism we like, while keeping everything else constant in the definition of the parsing system (plus or minus a probability distribution on the MG rules, I guess).
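The separation between the parsing system and the decision mechanism can be illustrated with a toy. In the sketch below, `parses` enumerates every analysis the grammar licenses (the "forest"), and `most_probable` is one possible control choice layered on top without touching the rules. Everything here is an assumption for illustration: the names, the binary-rule restriction, and the rule weights, which are made-up numbers with lexical weights fixed at 1.0 rather than a properly normalized probabilistic grammar.

```python
# Hypothetical weighted rules: (LHS, (B, C)) -> weight. The numbers are invented.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("Det", "N")): 0.8,
    ("PP", ("P", "NP")): 1.0,
}
LEXICON = {
    "the": "Det", "a": "Det", "dog": "N", "man": "N",
    "telescope": "N", "saw": "V", "with": "P",
}


def parses(cat, words, i, k):
    """All (tree, weight) analyses of words[i:k] as category `cat`."""
    if k - i == 1:
        # Single word: succeed only if the lexicon assigns it this category.
        return [((cat, words[i]), 1.0)] if LEXICON.get(words[i]) == cat else []
    out = []
    for (lhs, (b, c)), p in RULES.items():
        if lhs != cat:
            continue
        for j in range(i + 1, k):  # every split point of the span
            for left_tree, left_w in parses(b, words, i, j):
                for right_tree, right_w in parses(c, words, j, k):
                    out.append(((cat, left_tree, right_tree), p * left_w * right_w))
    return out


def most_probable(forest):
    """One possible decision mechanism: keep the highest-weighted analysis."""
    return max(forest, key=lambda pair: pair[1])


if __name__ == "__main__":
    words = "the dog saw the man with a telescope".split()
    forest = parses("S", words, 0, len(words))
    print(len(forest), "analyses licensed by the grammar")
    tree, weight = most_probable(forest)
    print(weight, tree)
```

With these invented weights the sentence has two analyses and the selector prefers the one where the prepositional phrase attaches to the verb phrase. Swapping `most_probable` for a different chooser changes which analysis is returned while `RULES` and `parses` stay fixed.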

  2. Thanks for the post and discussion! I tried to pose a question along this line of thought to Chomsky last year, though not very successfully:

    Chesi & Moro (2015) have recently argued that competence and performance are actually interdependent. I would argue that there are essentially three possible scenarios in which the relation of grammar (G) and a parser as a performance system (P) could work out: (i) G could be independent of P, (ii) G could be accessed by P online during processing, or (iii) it could turn out that G is only implemented in wetware insofar as the totality of P’s mechanisms gives rise to a system behaving in a way that is captured by the description of G. What are your thoughts about this? And how would you describe the relation of linguistics to psychology and neuroscience?

    Chomsky: I don’t understand any of this. The study of competence can’t be isolated from psychology because it is part of psychology—unless we (perversely) define “psychology” to exclude internally-stored knowledge of language, arithmetic, etc. Psycholinguistics, for the past 50 years, has been closely integrated with the study of linguistic competence. How could it be otherwise? Same with neurolinguistics. Linguistic competence is represented in the brain (not the foot, not in outer space) and the same is true of performances that access this stored knowledge of language.
    Speaking personally, I’ve always regarded linguistics, at least the aspects that interest me, as part of psychology, hence ultimately biology. The relation of linguistics to psychology is similar to the relation of the theory of vision to psychology: part to whole. And insofar as we are concerned with what is happening in the brain, it’s integrated with neuroscience. In brief, I don’t see how any of these questions even arise except under delimitation of fields that seem quite arbitrary and have never made sense to me.

    Source: Biolinguistics

  3. Perhaps one problem with the 'psychological reality of grammars' issue is that, on the one hand, we can make rather convincing cases that many grammatical generalizations ought to be captured by the mental structures involved in language use (for example: each language has a system of rules that determine NP structure, which are independent of where the NP is found in the utterance, and of any inflectional features it might possess: if this were not the case, we would expect the apparent NP rules of languages to disintegrate over time, leading to different structures for preverbal and postverbal, or nominative, dative, accusative etc NPs, neither of which seem to happen). [I have the impression that many neuroscientists and psychologists are blind to this kind of fact]

    On the other hand, to go from a pile of significant generalizations to a formalism that expresses them and can be used for parsing and generation involves a rather large number of arbitrary decisions, so that it is quite reasonable to balk at the claim that the entire grammar is mentally represented. It might be better to say that the entire grammar is a probably pretty bad representation of some mental structures, containing a lot of guesswork and even random choices, but also depicting certain aspects of what the mental structures do.