Tuesday, January 31, 2017

A small addendum to the previous post on Syntactic Structures

Here’s a small addition to the previous post prompted by a discussion with Paul Pietroski. I am always going on about how focusing on recursion in general as the defining property of FL is misleading. The interesting feature of FL is not that it produces (or can produce) recursive Gs but that it produces the kinds of recursive Gs that it does. So the minimalist project is not to explain how recursion arises in humans but how a specific kind of recursion arises in the species. What kind? Well the kind we find in the Gs we find. What kind are these? Well not FSGs nor CFGs, or at least this is what Syntactic Structures (SS) argues.

Let me put this another way: GG has spent the last 60 years establishing that human Gs have a certain kind of recursive structure. In SS, Chomsky argued for a transformational grammar on the grounds that FSGs (which were recursive) were inherently too weak and that PSGs (also recursive) were empirically inadequate. Transformational Gs, SS argued, are the right fit.
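The contrast can be made concrete with a toy example of my own (not one from SS): the single recursive phrase-structure rule S → a S b, with base case S → a b, generates {aⁿbⁿ : n ≥ 1}, a language that is provably beyond any finite-state grammar because matching the a's and b's requires unbounded memory.

```python
# Toy illustration (my example, not SS's actual argument): the recursive
# rule S -> a S b (base case S -> a b) generates {a^n b^n : n >= 1},
# which no finite-state grammar can generate.

def anbn(n):
    """Derive a^n b^n by applying S -> a S b recursively, then S -> a b."""
    if n == 1:
        return "ab"                 # base rule: S -> a b
    return "a" + anbn(n - 1) + "b"  # recursive rule: S -> a S b

print([anbn(n) for n in range(1, 4)])  # ['ab', 'aabb', 'aaabbb']
```

The rule calling itself is recursion in the narrow, grammar-internal sense; SS's further point was that even this is not enough, since PSGs of exactly this sort still fall short empirically.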

So, when people claim that the minimalist problem is to explain the sources of recursion, or observe that there may be/is recursion in other parts of cognition and thereby claim to “falsify” the project, it seems to me that they are barking up a red herring (I love the smell of mixed metaphors in the morning!). From where I sit, the problem is explaining how an FL that delivers TGish recursive Gs arose, as this is the kind of FL that we have and the kinds of Gs that it delivers. SS makes clear that, even “in the earliest days of GG,” not all recursive Gs are created equal and that the FL and Gs of interest have specific properties. It’s the sources for this kind of recursion we want to explain. This is worth bearing in mind when issues of recursion (and its place in minimalist theory) make it to the spotlight.


  1. This is spot on. Recursion by itself is a very weak property in the sense that every G in the Chomsky hierarchy is recursive qua enumerative of a set (an L). But we know that not every such G is empirically adequate. It seems to me that a lot of the recursion 'debate' is due to a focus on HCF at the expense of anything else Chomsky has written, such as SS, and an ignorance of the basics of the relevant maths/logic. I look forward to Lobina's book on this topic.
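To make the "enumerative" sense of recursion concrete, here is a minimal sketch of my own (with made-up conventions: nonterminals are single uppercase letters) of a grammar as an enumerator of a set of strings:

```python
from collections import deque

def enumerate_language(rules, start="S", limit=5):
    """Breadth-first rewriting of sentential forms; collects terminal
    strings. In this sense any grammar in the hierarchy 'recursively
    enumerates' a language L. Nonterminals: single uppercase letters."""
    out, queue, seen = [], deque([start]), {start}
    while queue and len(out) < limit:
        form = queue.popleft()
        nonterminals = [c for c in form if c.isupper()]
        if not nonterminals:          # all terminal: a member of L
            out.append(form)
            continue
        i = form.index(nonterminals[0])
        for rhs in rules[nonterminals[0]]:  # rewrite leftmost nonterminal
            new = form[:i] + rhs + form[i + 1:]
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return out

# Even a finite-state grammar enumerates a set: S -> a S | b gives a*b.
print(enumerate_language({"S": ["aS", "b"]}, limit=3))  # ['b', 'ab', 'aab']
```

The comment's point stands: every class in the hierarchy is recursive in this sense, so the interesting question is which proper subclass human Gs occupy.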

  2. I think it is a terrible mistake to continue to use "recursion" to mean computable, when there are plenty of other words to use (like computable, for a start). I assume that Norbert is not talking about that sort of recursion in this post.

    (What would a non-recursive grammar look like for example?)

  3. Well, it might be misleading, but it is a traditional way of speaking, respects the Church-Turing thesis, and, I think, is precisely the way NC often uses the notion (e.g., see, among other places, the discussion of recursion in relation to an objection from Hintikka in Rules and Reps (pp. 123ff)). I don't see what is 'terrible' about it. Nothing of substance, anyway, is affected if 'computable' is reserved for the general property of determining an r.e. set, and 'recursion' is restricted to some subset of procedures with relevant properties. That way of proceeding, though, certainly distorts the history, and especially the method of SS, which basically tests various recursive devices to settle on a class of transformational grammars. As for what a non-recursive grammar would look like... Exactly! It was precisely Chomsky's achievement to see that a recursive device was required; there then arises the further question of what kind of such device will prove to be empirically adequate as the basis for a theory of linguistic competence. After all, back in the day, many thought that you didn't need a grammar at all - witness NC's colleagues at the time, such as Skinner and Quine. I'm sure there are many equally benighted people around nowadays.

  4. Could you give an example of a grammar which is not recursive? (In the relevant sense). I don't follow.

    I think there is a confusion about upper bounds versus lower bounds. Possibly only I am confused.

    1. Sorry, my fault. I agree, any grammar will be recursive. My point was that at the time of SS, and maybe even now, there is a position that says, in effect, there are no grammars underlying linguistic competence. So, first one establishes the very need for a computational/grammatical approach, and then asks what kind of grammar is empirically adequate.

    2. Excuse the typos- damned predictive text:)

    3. I completely agree with that; one of Chomsky's most important and fundamental contributions.

      But with recursion -- we are trying to account for an ability which humans have and which non-humans don't have. So it seems like recursion (in the computable sense) is not the right kind of property, because it is an upper bound.
      But recursion in the other sense (a rule calling itself) might be.

    4. I see, yes, talk of recursion being unique to the human line would be trivially false construing recursion as computable; as you say, as an upper bound, it includes lots of procedures hardly unique to us, let alone unique to language. The claim I took Norbert to be making via SS, which is one I endorse, is that we precisely shouldn't be content asking the general question about recursion/computation vis-a-vis language - it is settled - but the more particular question: what species of recursion is set by UG? I think the likelihood for confusion hereabouts can be avoided by, as indicated, pitching the question as one about the kind of recursion involved in language, rather than speaking as if, formally, recursion is just one kind of thing, excluding the upper bound construal. I do now see your point, though.

    5. Yes; that's the right question. I think we already know the answer though -- there is a pretty strong consensus about what the right sort of recursive grammars are, modulo some debate about copying in Yoruba etc... But generative grammarians seem to not be that interested in this question (or maybe in this answer).

    6. You mean the answer is 'mildly context-sensitive' (modulo the weirdness of Yoruba)? I'm not sure why you think GGs seem not to be interested in this answer. I take it, although this might be an illusory impression, that GGs agree that the right species of grammar lives in this logical area, but there remain empirical/conceptual/explanatory issues dividing Merge, TAG, CCG, etc. Or maybe I've misunderstood you.
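For readers following along, the copying pattern at issue can be illustrated with the copy language {ww : w a nonempty string}, the formal backbone of Yoruba-style reduplication: it is not context-free, but it is mildly context-sensitive (TAG, CCG, and MGs all handle it). This is my own illustrative sketch, not something from the thread:

```python
# The copy language {ww} is not context-free but is mildly context-
# sensitive; it is the formal pattern behind the copying constructions
# reported for Yoruba.

def is_copy(s):
    """True iff s consists of some nonempty string repeated twice."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and half > 0 and s[:half] == s[half:]

print(is_copy("abab"))  # True: w = 'ab', copied
print(is_copy("abba"))  # False: mirror image, which a CFG *can* handle
```

The contrast with the mirror-image (palindrome) pattern is the classic diagnostic: context-free grammars handle nested/mirrored dependencies, but crossing dependencies like copying demand more.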

    7. I was thinking of specific proposals within the MCS literature but that's the general space.

      Why do I think that GGers are not interested in this? I was reading the new Oxford Handbook of Universal Grammar, which has a wide range of illustrious GGers as authors, and which has almost no references to this literature; the two or three references that do occur are clearly othered. And that seems to be fairly representative, given my albeit rather biased reading patterns.

    8. OK, I see what you mean. I have no real answer, save to say that the kind of substantive debates/disagreements between people who agree on the general logical space would tend not to advert to that space, because it is agreed upon, albeit implicitly (modulo Yoruba). For example, one can disagree about whether there should be traces/copies/variables or not, but that needn't affect the issue of logical space. Obviously, I don't think anyone should doubt the real success of narrowing in on the relevant logical space, which should be trumpeted in a Handbook. I suppose it might just come down to explanatory priorities, but others on here should be able to speak for the field, as it were, in a way an interloper like me can't.

    9. I'd say it's a mixture of

      1) unawareness: in a world where fewer and fewer departments even have time to teach GB, let alone other widespread formalisms like HPSG or CCG (which would greatly improve their students' employability in industry), the early ideas of SS and the work that has built on them since then don't get a proper airing anymore.

      2) limited use: if you are trying to account for, say, split agreement patterns in Senaya, the MCS hypothesis does very little for you beyond pointing out that the data can be accounted for and that the account you're proposing isn't crazy from a computational perspective. Formal language theory can give rise to interesting universals, but for various reasons this has worked much better in phonology than syntax so far.

      3) high buy-in cost: I think it is easier for an outsider to get the overall gist of a linguistics paper than a formal language theory paper. The latter has more specialized notation, tons of alternative concepts sampled from all areas of mathematics (usually with no explanation or examples), whereas the former has a more focused set of core tools and is written in a more accessible style. So even the most well-meaning linguist will probably struggle with formal language theory papers. If there's no immediate payoff, why bother?

      4) methodological differences: formal language theory cares about complex patterns, whereas linguists are mostly intrigued by complicated patterns. A computationally complex pattern can be very simple (e.g. Chinese number names), and complicated patterns usually aren't particularly complex (e.g. verbal morphology in Navajo). A syntactic example would be very elaborate remnant movement analyses with dozens of movement steps, which wouldn't even make me flinch but would attract a lot of attention (positive and negative) from syntacticians.

      5) abstractness: the theorems of formal language theory are necessarily abstract in nature, that's their very strength. But understanding the relevance of abstract things is difficult, and there aren't a lot of people/workshops/papers that can make the implications apparent. The most profound result in the world won't impress many people if you --- and your whole community --- do a lousy job conveying its importance.

      6) equating generative capacity with E-language: E-language is the Beelzebub of syntax, with the ubiquitous assertion that E-language is completely irrelevant for I-language. So if weak/strong generative capacity pertain to E-language, nothing can be learned from them. I never understood this argument very well, but here's my utilitarian reply: The fact of the matter is that we have learned quite a bit about I-language from generative capacity, so if a philosophical argument implies that this shouldn't have been possible I'd say there's something wrong with the argument, not formal language theory.

      7) Selective perception: linguists do actually pay attention to formal language theory when it suits their purpose. When Peters & Ritchie showed that Transformational Grammar is an unrestricted formalism, that result was ultimately declared irrelevant (even though Peters & Ritchie's main insight wasn't even about generative capacity). When GPSG was shown to be too weak, GB-ers were pretty happy with that result. So linguists do pay attention, just don't expect them to care about the things you think they should care about.

      8) specificity: I have brought up this point many times on this blog --- linguists like specificity, down to what kind of feature system is used. Abstracting away from such questions is usually considered a detriment, not a virtue.

    10. Thanks Thomas. I don't quite understand point (6). I take the animus towards an E-language conception to be more or less the same (although not quite the same) as the animus towards thinking that weak generative capacity (or mere extensional adequacy) is sufficient for the assessment of a grammar (in effect, this was Quine's position). I take it that no-one has an animus towards the idea of strong generative capacity shedding light on I-language. What philosophical argument do you have in mind?

    11. The argument against E-language is stronger in that strong generative capacity results are also insufficient --- the goal is to describe grammar, not its extension, whether those are strings, trees, graphs, logical formulas, and so on. Since any notion of generative capacity talks about the latter, it is taken to be too weak to address linguistic issues. At least that's how I understand the argument. I don't quite get where the leap from "this is an abstraction of our object of study" to "this cannot tell us anything worthwhile" comes from. There's many people who could give you a more informed answer on this particular point. But from a practical point it's a moot issue because there are many things generative capacity tells you about I-language.

    12. Thanks. Yeah, I see the argument now, but I've never come across it. It seems a bit confused. I mean, you still have sgc even if there are no strings.

    13. This is really apropos Alex's comment. I tried, in my general Wiley Syntax piece, to have a whole section on mild context sensitivity, and why it was an important upper bound, and the reviewers almost unanimously told me to remove it. Their reasons were broadly of the `this is too abstract/complex/unfamiliar' type Thomas adverts to. None of them raised the E-language argument. Some of it survived in the unpublished version `syntax for cognitive scientists' on Lingbuzz, but the gatekeepers at Wiley Interdisciplinary Reviews (ironically) wouldn't have it.

    14. It is maybe a bit of a problem that the fully formalized and implemented theories with (afaik) the widest empirical coverage, LFG and HPSG, are not mildly context sensitive; eventually this needs to be dealt with somehow, perhaps by showing that the extra power is actually doing some work, or by constraining the theories.

    15. A fragment of HPSG has been translated to TAG, and finite-copying LFG is mildly context-sensitive. The question is whether the actual research papers can be fit into those restricted versions. That is a major undertaking, just like squeezing 99% of the Minimalist literature into the MG corset took a long time.

    16. I think that one important difference is that in these literatures (LFG and HPSG) the analyses are not at all written in a way that avoids the mechanisms which lead to high generative capacity. Whereas the minimalist literature was already more or less compatible with the SMC.

    17. Sure, but for MGs we also had to figure out things like how to deal with multi-dominance, arbitrary constraints (including transderivational ones), Late Merge, sidewards movement, and so on. In hindsight it all seems kind of obvious, but that's because the way we think about MGs has changed a lot since 1997.

      I don't know a lot of practical HPSG and LFG --- I'd be hard-pressed to quickly sketch a natural LFG account of, say, verb-second --- but in principle I could imagine that the situation is similar to SPE in phonology: the formalism is completely unrestricted and linguists seem to use all its power (arbitrary context specifications, deletion rules), but when you look very carefully, you notice that they implicitly obey an additional constraint that keeps the formalism within the bounds of a much weaker class.

    18. Thomas, Alex, and others have lamented the apparent lack of interest in formal grammars and speculated why that may be the case. I think a more productive question ought to be raised by the formalists themselves: do the formal results make an immediate impact on the day to day operations of linguistic analysis, and how do they resolve outstanding problems? A very similar set of issues arise at the intersection between formal learnability and language acquisition. There are many important and elegant results from the formal studies of learning and inference but in my experience, only those that have very direct implications for the empirical acquisition research generated interest, with the Subset Principle being a prime example. It helped when Berwick did some of the outreach himself.

    19. @Thomas: Stefan Müller told me once that he has a parser which runs blazingly fast on his grammar fragments. (And that this parser was correct for the fragments; i.e. he was not making use of invalid heuristics.) That would support your hope.

      @Charles: I would contend that the same question could (and should) be asked of theoretical linguists by field workers. The theoretical linguist's response is then mine to you. A little less glibly, theoretical linguists tend to identify ad hoc (or substantive) restrictions on grammars (C > T > v > V; etc). Formalists want to trade ad hoc restrictions in for explanatory (or formal) ones. This is what's going on in Thomas' discussion of features; he is wanting to identify logical underpinnings of the restrictions that linguists describe ad hocly. (He wants to figure out what is going on, and he is being told about one arbitrary way of describing it.)

      As for the 'immediate impact', the answer is of course no, just as theoretical results make no immediate impact on the day to day operations of field workers. However, theoretical syntacticians have sometimes worried about the following, the theoretical issues surrounding which have been solved by formalists.
      - ((transderivational) economy) constraints
      - (multiple/cyclic) spellout
      - (multiple/cyclic) LF-interpretation (i.e. direct compositionality)
      - phases
      - head movement
      - numerations/lexical (sub)arrays
      - the nature of structure

    20. @Charles: I don't think anybody is lamenting anything here. If you look at my original post, it's a pretty matter-of-fact list of why formalists don't enjoy more mind share among linguists. Your point is already included there as #2 and, indirectly, #8, so I do not disagree with you at all.

      That said, Greg raises a very important point: linguists themselves are not a monolithic block. Just think of Norbert's linguist vs. languist dichotomy. My #2 was mostly about languists, researchers whose primary interest is the correct analysis of empirical phenomena rather than the system that produces them. Greg has given a pretty good list of topics that linguists care about for which strong formal results exist, and many more could be added to it. But even languists can profit from formal language theory because it does produce new typological predictions that can be tested empirically.

      This has worked really well in computational phonology. For example, the subregular hierarchy gives you very fine-grained results about what kind of phonological processes you expect to see, explains why phonology doesn't have any majority-rules pattern, gives additional computational evidence that suprasegmental phonology can do more than segmental phonology, makes complexity predictions that were confirmed in artificial language learning experiments, and also gives you a rough idea of what the learning problem with these classes looks like. A ton of stuff has happened in that area over the last ten years, thanks to people like Jeff Heinz, Jane Chandlee, Adam Jardine, and many others.
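As an illustration of the flavor of this work (a minimal sketch with an invented constraint, not an analysis of any real language): a strictly 2-local (SL-2) phonotactic grammar, near the bottom of the subregular hierarchy, evaluates a word purely by scanning adjacent pairs of segments against a finite list of forbidden bigrams.

```python
# Minimal sketch of a strictly 2-local (SL-2) phonotactic grammar from
# the subregular hierarchy. The *nt ban is invented purely for
# illustration.

FORBIDDEN_BIGRAMS = {("n", "t")}  # hypothetical ban on nt clusters

def sl2_ok(word):
    """Accept iff no adjacent pair of segments is a forbidden bigram."""
    padded = "#" + word + "#"  # '#' marks word boundaries
    return all((x, y) not in FORBIDDEN_BIGRAMS
               for x, y in zip(padded, padded[1:]))

print(sl2_ok("pan"))    # True
print(sl2_ok("panta"))  # False: contains the banned cluster 'nt'
```

Because the memory window is fixed at two segments, grammars of this kind make sharp typological predictions; patterns that require globally counting segments, such as majority-rules harmony, are immediately ruled out.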

      Some recent examples that everybody can read without any computational background:

      - McMullin's thesis on computational phonotactics derives the typological differences between vowel harmony and consonant harmony from computational considerations.

      - Bjorkman & Dunbar study cyclic stress in Chamorro, point out a previously unnoticed typological gap, and explain it computationally.

      - I'll also shamelessly plug my 2016 paper with Alena Aksenova and Sedigheh Moradi on morphotactics and why we do not find things like unbounded circumfixation in morphology.

      - I'm also very, very confident that the computational session at CLS this year will have some great empirical work.

      For syntax things are not as tight on the empirical side because i) the formal language hierarchies are still too coarse once you move past the class of regular languages, and ii) tree languages are more interesting, but harder to study empirically. That said, empirical claims have originated from this work, e.g. restrictions on Principle B.

      Overall, this isn't a matter of whether formal language theory could ever be useful; it's purely a question of whether there's enough of a payoff given the time investment of learning the basics.
      In some areas there already is, in others there isn't. I am confident that we'll be able to greatly increase the payoff for everybody in the future.

    21. Oh boy, now that I've named a few examples I feel like I snubbed so many others that are worth mentioning: many of Tim Hunter's papers are a nice example of how computational considerations can inform your Minimalist analysis. Greg also has a few papers along these lines, e.g. on ATB movement as hypothetical reasoning. Hans-Martin Gartner also makes a great effort in his works to combine generative syntax with computational insights. I'd also say that many analyses that originate in formalisms like TAG with the goal of computational simplicity end up with very interesting empirical insights. There's really a lot of interesting stuff.

      On a completely different note, formal language theory also has an indirect advantage: since the results are more abstract and content-agnostic, they transfer more easily to non-linguistic problems. So you end up with TAG models of RNA, models from computational phonology being applied in robotics, and so on. Linguistics used to be a major exporter of ideas: Schleicher's Stammbaum theory inspired Darwin, transformations were a major inspiration for tree transductions (which are now at the heart of compiler design and machine translation), and linguistic research led to the development of the first parsers. Many fundamental theorems of computer science were proven by linguists (e.g. Yuki Kuroda's equivalence proof for CSL and linear bounded automata) or appeared in linguistics journals (Bar-Hillel et al's proof that CFLs are closed under regular intersection, which appeared in a phonetics(!) journal). Anything that helps linguists increase the market share of their ideas is a good thing in my book.