Comments on Faculty of Language: "Universals: a consideration of Everett's full argument"

One of the (many) things that puzzle me about this debate is why nobody seems to mention Tom Roeper's work on the acquisition of embedding vs. iteration (which he calls 'primary recursion'; but why don't we push the term 'recursion' to the side and call it iteration?). Linguistics is to a large extent a study of distinctions of one kind or another, and Roeper shows that this distinction is important for acquisition, in a way parallel to that in which Fred Karlsson shows that it is important for the structure of texts.

Doing that might be a useful rubble-clearing activity before producing a more widely understandable account of why Everett is probably wrong about UG (in at least some senses of the term), even if he's right about Piraha, which I think is still rather likely, since he has so much more experience with the language than those who say he's wrong.

-- Avery Andrews, 2016-09-20 16:57

I completely agree. Nor should we focus the discussion on whether Piraha does or doesn't have some property diagnostic of recursion. It DOESN'T MATTER. We need to stop answering the question "when did Chomsky stop beating his dog" and instead attack the presupposition behind it: the point is that anything Everett claims to have found regarding Piraha is simply irrelevant.
That's the big point, and it remains obscured.

-- Norbert, 2016-09-20 07:38

I continue to emphasize that we should not let Everett get away with incorrectly defining Merge as self-embedding.

-- JAL, 2016-09-20 06:32

I think it is rarely emphasized that these mid-level generalizations are actually incredibly useful for work on understudied languages.

As someone who does this work, I am always struck by how easily many of the basic predictions of GG can be tested and confirmed, and hence how interesting, even potentially Neptune-like, discoveries are made.

Here's an example I'm working on now: in Moro, a language spoken in Sudan that I've worked on for about ten years, you can't form a wh-question on a relative clause modifying a normal nominal argument. Interestingly, though, there is a construction similar to Romance pseudorelatives (e.g. "Ho visto Gianni che correva") in which what look like relative clauses are predicated of objects. Unlike relative clauses elsewhere, pseudorelatives are completely transparent to extraction. So here is an exception to a "universal" internal to a language, and one that has never been described before; but rather than falsifying universals about islands, it raises the important question of why the structure of Moro pseudorelatives is such that they allow extraction but relatives in subject position don't.
In other words, islands are a gravitation-like effect (in fact the parallel is particularly apropos because we aren't quite sure what causes them...), and their absence in a particular situation serves to reveal the presence, or absence, of something that is otherwise there...

-- Peter J, 2016-09-12 21:30

I'm not sure I follow everything that Tim says above, but isn't a large part of the problem with g's their slippery connection to observations, especially of language use as opposed to intuitions? (1a), however, points in the general direction of an implicational universal, namely: if a verb specifies the identity of the head of something inside one of its complements, it will fix the identity of all of the heads along the way. This becomes observational, and afaik true, to the extent that we can pin down what we mean by 'head'. So we have the idiom "X got up Y's nose", but no such idioms where the preposition is variable but 'nose' is not.

-- Avery Andrews, 2016-09-12 17:32
[Part 2 of 2]

At least to me, this is the kind of thing that demonstrates (a) how our theories are falsifiable, and (b) why they're not falsified by a language that doesn't embed sentences inside sentences. The idea of a "Chomsky universal" doesn't seem to be a helpful ingredient. Of course we can use the term to mean "an assumption that the learner comes equipped with", i.e. we'd say that mainstream theory takes (1a) above to be a Chomsky universal, but I don't think this is very helpful. For a start, it's a distraction (sorry) that (1a) says something about "every verb" -- holding across all verbs is not the sense of "universal" people have in mind. But what *is* the X in "A Chomsky universal is a property that holds of all Xs"? As Norbert frequently points out, it's not languages in any external, clearly observable sense. The best thing we could put there is probably "(mental) grammars", but I still think this is strange when you look closely at it. If the idea is that "P is a Chomsky universal" iff "for every mental grammar g: P(g)", then we end up just saying things like:

(2) a. P(g) = "The lexical entry of every verb in g encodes requirements on what its complement can be headed by"
(2) b. P'(g) = "Movement of an adjunct out of a relative clause is illicit in g"
(2) c. P''(g) = "An anaphor cannot c-command its antecedent in g"

The "in g" part doesn't seem to be doing anything here. It's just a roundabout way to turn a statement which, it seems to me, is better thought of as *part of the grammar* into a *statement about the grammar* -- just so that "grammars" can be the X in "A Chomsky universal is a property that holds of all Xs".

So, here's my suggestion/question: why don't we drop the "Chomsky universal" terminology, and just talk about what is and isn't specified by the initial state of FL?
Phrasing our ideas in terms of "a different sort of universal" seems to lead to unnecessary confusion. (Perhaps it's just a holdover from days when everyone was less clear on the distinction between the things of interest and Greenberg universals?)

-- Tim Hunter, 2016-09-12 12:45
[Part 1 of 2]

Although I agree with the gist of Norbert's post, I'm actually not sure that "Chomsky universals" are the best notion for clarifying the debate. To get to that point, let me start with another point that addresses the (non-)falsifiability issue.

Maybe it would be useful to point out that there are perfectly imaginable claims about UG that would be refuted if it were true that Piraha speakers could not embed a sentence inside another sentence -- just not claims that anyone has made, as far as I know. For example, by looking at boring old English (without leaving the air-conditioning!), we can write down a couple of hypotheses about the way verbs combine with other things:

(1) a. Every verb's lexical entry encodes requirements on what a phrase that combines with that verb can be headed by.
(1) b. The verb 'hit' wants a D as the head of its sister phrase, the verb 'object' wants the word 'to' as the head of its sister phrase, the verb 'know' wants a C as the head of its sister phrase, etc.

One can easily imagine hypothesizing that the particular collection of subcategorization frames specified in (1b) is provided by FL, i.e. the learner comes equipped with the assumption that there must be some verbs that combine with DPs, some that combine with CPs, etc. This seems to be a hypothesis that would be falsified by the finding that some language didn't have sentences analogous to "I know that John left". But that's not the hypothesis that syntacticians typically make about how to parcel out the information in (1) between innate and learned: instead, the idea is basically that (1a) is provided by FL.
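[Editorial aside: the (1a)/(1b) split can be made concrete with a small sketch, not from the thread itself. The class, the lexicon entries (taken from (1b)'s examples), and the toy phrases below are all invented for illustration; the point is just that a requirement in this format can mention only the head of the sister phrase, never anything deeper inside it.]

```python
# Illustrative sketch of (1a)-style subcategorization: each verb's lexical
# entry records only what its sister phrase must be HEADED by.

class Phrase:
    def __init__(self, head_category, children=()):
        self.head_category = head_category  # e.g. "D", "C", or a word like "to"
        self.children = children            # internal structure, deliberately ignored

# (1b): hypothetical lexicon entries (just the thread's examples).
LEXICON = {
    "hit": "D",      # 'hit' wants a D-headed sister (a DP)
    "object": "to",  # 'object' wants the word 'to' as the head
    "know": "C",     # 'know' wants a C-headed sister (a CP)
}

def can_combine(verb, phrase):
    """(1a): the check consults only the head of the sister phrase."""
    return LEXICON[verb] == phrase.head_category

dp = Phrase("D", children=(Phrase("N"),))
cp = Phrase("C", children=(Phrase("T"), Phrase("D")))

assert can_combine("hit", dp) and not can_combine("hit", cp)
assert can_combine("know", cp)
# Note: the imaginary anti-(1a) verb discussed below -- one demanding, say,
# a PP *inside* its DP sister -- cannot even be stated in this format,
# which is exactly the empirical bite of hypothesis (1a).
```

The design choice worth noticing is that `can_combine` never recurses into `children`: that restriction is the code-level analogue of (1a), and a language requiring sensitivity to internal structure would force a different format.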
(Of course there are other ways of parceling things out too; it doesn't have to be a split along (1a)/(1b) lines: for example, one can imagine that FL specifies (1a) *and* that there must be some verbs that combine with DPs, but that's all.)

And it's also easy to imagine evidence against the hypothesis that FL provides (only) (1a): this would be a language where some verb can only combine with DPs that have a PP complement inside them, or some verb can only combine with CPs that have an object DP inside them, or some verb can only combine with phrases that have at least four nodes inside them, or whatever. And this is a point that's made in every intro syntax class.

-- Tim Hunter, 2016-09-12 12:45

"there are only very dim ideas about what might be in UG, despite decades of vibrant research."

Here we disagree. I think that we have quite a good take on what belongs in UG. We have many rather decent mid-level generalizations that seem rather well grounded. I mentioned a number of them before: anaphors cannot c-command their antecedents, movement is barred out of islands, adjuncts are especially recalcitrant to movement out of islands, reconstruction into islands is forbidden, only specifics can extract out of a wh-island, SCO and WCO effects are rather robust. These are all good mid-level generalizations, and the exceptions to them have proven very fecund theoretically as well. So, we do not really agree about the state of the art. And I suspect I know the reason why: I see a generalization and then look to the details of the Gs of languages that appear to violate it, looking for a way that idiosyncrasies of that G might "save" the generalization. You see the surface form and conclude that it should not be saved.
My strategy is Leverrier's: look for a local fact about the phenomenon of interest, to understand how the principle is operative despite giving the appearance that it is not. You look at the appearance and conclude that the principle is false. So, we have good cases for Neptune-like discoveries (the subliminal island effects I wrote about are a recent great example), if only you are willing to look behind the surface appearances.

-- Norbert, 2016-09-12 06:54

I think that bit in quotation marks has been everybody's working assumption for a very long time ... without it, our understanding of most languages would be at or below the level of Aristotle's understanding of Greek. Of course you can't apply ideas derived from the study of one language to that of another without making ridiculous errors if you have no sense of when they're not working and need some adjustment. An important part of what Chomsky did was to give a more articulate conception than had previously been available of what it is for a grammatical framework not to fit a language properly (missing generalizations, producing unnecessarily long/complex grammars).

-- Avery Andrews, 2016-09-12 04:50

Indeed, but there's a presentational gap, which I know there are people working on, but I think it's urgent to get it filled in a way that people like Morten Christiansen, for example, can't ignore. I think small-corpus-to-big-corpus regularities are also something to look at, in addition to intuitions.
So, for example, in CHILDES English, doubly recursive possessives ("Donna's dog's name") seem to occur at a rate of between 4 and 7 per million words (there are some puzzling episodes which make it hard to assess more precisely than that, I think), while triply recursive ones don't occur at all; but if the corpus were bigger, they surely would, since children do say things like "Zack's mother's boyfriend's sister". Nobody will tell me that 7 per million words is too little for learning, but it's also interesting to note that adjectivally modified possessors ("where's the little boy's house") are much more common (mid-30s per million words in the (very few) CHILDES corpora I've looked at), so maybe they're the real trigger.

-- Avery Andrews, 2016-09-12 00:08

I don't think the term "surface" helps us understand the difference between Greenbergian and Chomskyan universals, but the following goes to the heart of the matter: "Modern comparative linguistics within GG assumes as a basic operating principle that you can learn a lot about language A by studying language B". If this assumption were correct, one would think that some successes along the lines of discovering Neptune would have been made; but in fact, there are only very dim ideas about what might be in UG, despite decades of vibrant research. So I think it makes more sense to study universals without making that assumption (i.e. by strictly separating language description from comparison), in the Greenbergian fashion, and to look for explanations primarily in the domain-general aspects of FL.
But in contrast to Everett and Tomasello, I don't see this as challenging the Chomskyan philosophy: we may eventually find Neptune, and if there's good independent evidence for it, I'll be happy to accept it.

-- Martin Haspelmath, 2016-09-11 22:46

Avery, there should be no conflict between research on corpora and research on systematic judgments. The systematic judgments are one of the outputs of the learning process (among many others), and the corpora are (approximations to) the input that leads learners to arrive at those systematic judgments. The frequent mismatches between the two are precisely what make this all so interesting.

There's a growing body of work in which people look closely at the relation between input corpora and systematic-but-not-obvious judgments, and I recommend it to you. Jeff and his collaborators spend a lot of their time doing exactly this nowadays, so I recommend you look at some of their recent output. Similarly, efforts like the parsed CHILDES corpus by Lisa Pearl and Jon Sprouse have made it more feasible to address these questions.

-- Colin Phillips, 2016-09-11 20:07

My little addition to this is that if CUs can't be converted into probabilistic implicational GUs, they will be of very limited interest to many people, including me. But of course they can be ...
From X-bar theory you can, for example, predict that if a member of a word class appears in some syntactic position, perhaps with a bit of extra stuff such as a determiner or a case-marker, then all the usual satellites of that class will be able to appear there also. This is usually correct, and I'm not aware of any way in which the currently popular sequence- or usage-based approaches can predict it.

But then there are the exceptions, such as prenominal possessors in German, and all adnominal possessors in Piraha, for which I think we need to be open to multiple possible forms of explanation. The avenues that occur to me now are:

a) classical Bayes/MDL (Goldsmith, Chater, Perfors & Clark)
b) better predictions (Ramscar, Dye et al.)
c) simpler processing (Yang)

(b) and (c) currently strike me as better bets, but they might be very difficult to distinguish from each other, since anything that yields better predictions of what's coming next will probably make processing simpler, and vice versa.
But in all three cases, there is some kind of tradeoff between the formal 'simplicity' of the grammar, as quantified by an evaluation metric over a notation, and something else of a more functional nature.

Lots of people have been doing heavy foundational work relevant to this for a long time, in various ways, too many to list (although Guglielmo Cinque sort of comes to mind ATM). But one point I'd like to make, in contrast to Jeff Lidz's recent posting, is that I think it's important that some of this work relate to things that can be observed in corpora, rather than only to intuitions about selected examples, because there is also a big population of people who just aren't impressed by the latter, but who might be brought around to paying more attention to the intuitions if at least some of them had a clear connection to corpus-based observations.

-- Avery Andrews, 2016-09-11 19:22
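[Editorial coda: per-million-words rates of the kind Avery cites for CHILDES possessives are easy to approximate in a few lines. This is a toy sketch under loud assumptions: the regex heuristic and the two-sentence "corpus" are invented here, and real work would use parsed corpora such as Pearl & Sprouse's, since bare "'s" matching conflates possessives with contracted "is".]

```python
import re

def possessive_depth_counts(text):
    """Count runs of N consecutive "X's" possessors, e.g.
    "Donna's dog's name" contains one run of depth 2.
    Crude heuristic: treats every "word's" token as a possessor."""
    counts = {}
    for match in re.finditer(r"(?:\b\w+'s\s+)+", text):
        depth = match.group(0).count("'s")
        counts[depth] = counts.get(depth, 0) + 1
    return counts

def per_million(count, total_words):
    """Convert a raw count to a rate per million words."""
    return count / total_words * 1_000_000

# Toy corpus echoing the thread's examples (whitespace-tokenized).
corpus = "Donna's dog's name is Rex . Zack's mother's boyfriend's sister came ."
words = corpus.split()

counts = possessive_depth_counts(corpus)
print(counts)  # {2: 1, 3: 1} -- one doubly and one triply recursive possessive
print(round(per_million(counts[2], len(words))))
```

With a real multi-million-word corpus the same two functions would reproduce figures comparable to Avery's 4-7 per million, which is the small-corpus-to-big-corpus extrapolation he has in mind.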