Doing research requires exercising judgment, and exercising judgment means making decisions. Among the most important of these is the decision about what work to follow and what to (more or less) ignore. Like all decisions, this one carries a certain risk, viz. ignoring work that one should have followed and following work that one should have ignored (the research analogue of type I and type II errors). However, unless you are a certain
distinguished MIT University Professor who seems to have the capacity (and
tenacity) to read everything, this is the kind of risk you have to run for the
simple reason that there are just so many hours in the day (and not that many if, e.g., you are a gym rat who loves novels and blog reading, like me). So how do
you manage your time? Well, you find your favorites and follow them closely, you develop a cadre of friends whose advice you follow, and you try to ensconce yourself in a community of diversely interesting people whom you respect, so that you can pick up the ambient knowledge that is exhaled. However, even with this, it is important to
ask, concerning what you read, what its value-added is. Does it bring interesting data to the discussion, well-grounded generalizations, novel techniques, new ideas, new questions? By
the end of the day (or maybe month), if I have no idea why I looked at something, what it brought to the table, then I reluctantly conclude that I could have spent my time (both research and pleasure time) more profitably elsewhere, AND, here’s the place for the policy statement, I note this and try to avoid this kind of work in the future. In other words, I narrow my mind (aiming for complete closure) so as to escape such time sinks.
Why do I mention this? Well, a recent paper, out on
LingBuzz, by Legate, Pesetsky and Yang (LPY) (here) vividly brought it to
mind. I should add, before proceeding, that the remarks below are entirely
mine. LPY cannot (or more accurately ‘should not,’ though some unfair types just might) be held responsible for the rant that follows. This said, here goes.
LPY is a reply to a recent paper in Language, Levinson (2013), on recursion. Their paper, IMO, is devastating. There’s
nothing left of the Levinson (2013) article. And when I say ‘nothing,’ I mean
‘nothing.’ In other words, if they are right, Levinson’s effort has zero value-added (negative, really, if you count the time lost in reading it and replying to it). This is the second time I have dipped into this pond (see here),
and I am perilously close to slamming my mind shut to any future work from this
direction. So before I do so, let me add a word or two about why I am thinking
of taking such action.
LPY present three criticisms of Levinson 2013. The first ends up saying that Levinson (2013)’s claims about the absence of recursion in various languages are empirically unfounded and that it consistently misreports the work of others. In other words, not only are the “facts” cited bogus, but even the reports on other people’s findings are untrustworthy. I confess to being surprised at this. As my
friends (and enemies) will tell you, I am not all that data sensitive much of
the time. I am a consumer of other people’s empirical work, which I then mangle
for my theoretical ends. As a result,
when I read descriptive papers I tend to take the reported data at face value
and ask what this might mean theoretically, were it true. Consequently, when I read papers by Levinson,
Evans, Everett a.o., people who trade on their empirical rectitude, I tend to
take their reports as largely accurate, the goal being to winnow the empirical
wheat from what I generally regard as theoretical/methodological chaff. What
LPY demonstrate is that I have been too naïve (Moi! Naïve!), for it appears that not only is the theoretical/methodological work of little utility, but even the descriptive claims must be taken with enough salt to scare the wits out of any mildly
competent cardiologist. So, as far as empirical utility goes, Levinson (2013)
joins Everett (2005) (see Nevins, Pesetsky and Rodrigues (NPR) for an
evisceration) as a paper best left off one’s Must-Read list.
The rest of LPY is no less unforgiving, and I recommend it to
you. But I want to make two more points before stopping.
First, LPY discuss an argument form that Levinson (2013) employs, one that I find of dubious value (though I have heard it made several times). The form is as follows: a corpus study notes that some construction occurs with a certain frequency.
This is then taken to imply something problematic about grammars that
generate (or don’t generate) these constructions. Here’s LPY’s version of this
argument form in Levinson (2013):
Corpus studies have shown that degree-2 center embedding "occurs vanishingly rarely in spoken language syntax", and degree-3 center embedding is hardly observed at all. These conclusions converge with the well-known psycholinguistic observation that "after degree 2 embedding, performance rapidly degrades to a point where degree 3 embeddings hardly occur".
Levinson concludes from this that natural language (NL)
grammars (at least some) do not allow for unbounded recursion (in other words,
that the idealization that NLs are effectively infinite should be
dropped). Here are my problems with this
form of argument.
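Before I get to the objections, it may help to make concrete what a corpus claim of this sort amounts to. Here is a minimal sketch, entirely my own toy illustration and not LPY's or Levinson's methodology, of how one might tally center-embedding degree in a parsed corpus. The clause labels and the bracketed example are Penn-Treebank-style assumptions:

```python
# Toy sketch: degree of center embedding in one parse tree.
# The CLAUSE labels and the example tree are illustrative assumptions.
from nltk import Tree  # pip install nltk

CLAUSE = {"S", "SBAR"}  # assumed Penn-Treebank-style clause labels

def clause_spans(tree, start=0, spans=None):
    """Collect the (start, end) word spans of every clause node."""
    if spans is None:
        spans = []
    pos = start
    for child in tree:
        if isinstance(child, Tree):
            if child.label() in CLAUSE:
                spans.append((pos, pos + len(child.leaves())))
            clause_spans(child, pos, spans)
            pos += len(child.leaves())
        else:
            pos += 1  # a bare word
    return spans

def ce_degree(tree):
    """A clause counts as center-embedded if some containing clause has
    overt material on BOTH sides of it; the degree is how many such
    containers stack up (clause spans from one tree are always nested)."""
    spans = [(0, len(tree.leaves()))] + clause_spans(tree)
    return max(sum(1 for (x, y) in spans if x < a and b < y)
               for (a, b) in spans)

# Degree-2 center embedding: "the rat the cat the dog chased ate died"
t = Tree.fromstring(
    "(S (NP (NP the rat) (SBAR (NP (NP the cat) (SBAR (NP the dog)"
    " (VP chased))) (VP ate))) (VP died))")
print(ce_degree(t))  # -> 2
```

Run over a treebank, counts like these are what stand behind "vanishingly rare." Note that right-branching embedding (John said that Mary thinks that Sam left) scores 0 here, since nothing flanks it on the right.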
First, what’s the relevance of corpus studies? Say we concede that speakers in the wild never
embed more than two clauses deep. Why is this relevant? It would be relevant if, when strapped to grammatometers, these native speakers flat-lined when presented with sentences
like John said that Mary thinks that Sam
believes that Fred left, or this is
the dog that chased the cat that ate the rat that swallowed the cheese that I
made. But they don’t! Sure they have
problems with these sentences; after all, long sentences are, well, long. But
they don’t go into tilt, like they generally do with word salad like What did you kiss many people who admire
or John seems that it was heard that
Frank left. If so, who cares whether these sentences occur in the
wild? Why should being in a corpus endow
an NL data point with more interest than one manufactured in the ling lab?
Let me be a touch more careful: if theory T says that such and such a sentence is ill-formed and one finds instances of such often enough
in the wild, then this is good prima
facie evidence against T. However, absence of such from a corpus tells us
exactly nothing. I would go further: as in all other scientific domains, manufactured data is often the most revealing.
Physics experiments are highly factitious, and what they create in the
lab is imperceptible in the wild. So too with chemistry, large parts of biology
and even psychophysics (think Julesz dot displays, Müller-Lyer illusions, or
Necker cubes). This does not make these
experiments questionable. All that counts is that the contrived phenomena be
stable and replicable. Pari passu, being absent from a corpus is no sign of anything. And being manufactured
has its own virtues, for example, being specially designed to address a
question at hand. As in suits, bespoke is often very elegant!
I should add that LPY question Levinson (2013)’s
assertion that three levels of embedding are “relatively rare,” noting that
this is a vacuous claim unless some baseline is provided (see their
discussion). At any rate, what I wish to reiterate is that the relevant issue
is not whether something is rare in a corpus but whether the data is stable,
and I see no reason to think that judgments concerning multiply embedded clauses manufactured by linguists are unstable, even if they don’t frequently
appear in corpora.
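For what it's worth, the baseline point can be put in back-of-the-envelope form. The numbers below are invented for illustration (they are not LPY's, or anyone's, corpus counts); the point is only that if degree-2 embedding is already rare, even a grammar that fully licenses degree-3 predicts a mere handful of tokens in a modest corpus, so their absence tells you nothing:

```python
# Illustrative arithmetic only -- the rates are made up, not corpus findings.
deg1, deg2 = 0.010, 0.0008      # assumed per-sentence rates of degree-1/-2 embedding
p_reuse = deg2 / deg1           # estimated chance an embedded clause embeds again
expected_deg3 = deg2 * p_reuse  # what a fully recursive grammar predicts: ~6.4e-05

corpus_size = 50_000
print(f"expected degree-3 tokens in {corpus_size} sentences: "
      f"{expected_deg3 * corpus_size:.1f}")  # ~3.2 -- absence would be unsurprising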
Second and final point: Chomsky long ago noted that the important distinction "is not the difference between finite and infinite, but the more elusive difference between too large and not too large" (LSLT:150). And it seems that it doesn't take much to make grammars that tolerate embedding worthwhile. As LPY note, a paper by Perfors, Tenenbaum and Regier (2006)
… found that the context-free grammar is favored
[over regular grammars, NH] even when one only considers very simple
child-directed English, where each utterance averages only 2.6 words, and no
utterance contains center embedding or remotely complex structures.
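To give a feel for how such a result can come about, here is a cartoon of the Bayesian logic, emphatically not Perfors et al.'s actual model; the rule sizes, probabilities, and "corpus" below are all invented for illustration:

```python
# Cartoon of the Bayesian comparison -- NOT Perfors et al.'s model.
# All sizes and probabilities below are invented for illustration.
import math

# Observed embedding depths in a toy "child-directed" sample: all shallow.
corpus = [0] * 40 + [1] * 8

def log_post_recursive(p=0.15, rule_symbols=6):
    """NP -> N | NP 's N: one reusable rule (short description, so a high
    prior); depths fall off geometrically with reuse probability p."""
    log_prior = -rule_symbols * math.log(2)  # description-length prior
    log_like = sum(math.log((1 - p) * p ** d) for d in corpus)
    return log_prior + log_like

def log_post_flat(templates=2, symbols_per_template=5):
    """One listed template per depth actually attested (uniform choice
    between them -- a simplification); the description grows with every
    new depth the corpus turns up."""
    log_prior = -templates * symbols_per_template * math.log(2)
    log_like = len(corpus) * math.log(1 / templates)
    return log_prior + log_like

print(log_post_recursive(), log_post_flat())  # approx. -27.1 vs -40.2
```

The recursive grammar wins without a single deeply embedded token in the sample: its shorter description buys it a higher prior, and it fits the shallow data nearly as well.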
It seems that representational compactness has its own very large rewards. If embedding be a consequence, this is not too high a price to pay (it may even bring in its train useful expressive rewards!). The punch line: the central questions in grammar have less to do with unbounded recursion than with projectability: how one generalizes from a sample to a much larger set. And it
is here that recursive rules have earned their keep. The assumption that NLs
are for all practical purposes infinite simply focuses attention on what kinds
of rule systems FL supports. The infinity assumption makes the conclusion that
the system is recursive trivial to infer. However, finite but large will also
suffice, for here too the projection problem will arise, bringing in its wake all the same problems generative grammarians have been working on since the mid-1950s.
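Here, for concreteness, is the projection point in a dozen lines; the grammar and vocabulary are invented for illustration (nothing here is from LPY): three rule schemas, each compatible with a tiny finite sample, immediately license sentence types the sample never contained.

```python
# Toy illustration of projection: finite rules, unbounded yield.
# The grammar and words are invented; nothing here is from LPY.
import random

rules = {
    "S":  [["NP", "VP"]],
    "NP": [["John"], ["Mary"], ["the", "cat"]],
    "VP": [["left"], ["said", "that", "S"]],  # the recursive step
}

def generate(symbol="S", budget=4):
    """Expand symbol at random; the budget only keeps the demo finite and
    is no part of the grammar itself."""
    if symbol not in rules:
        return [symbol]                    # terminal word
    options = rules[symbol]
    if budget <= 0:                        # out of budget: skip recursion
        options = [r for r in options if "S" not in r] or options
    return [w for sym in random.choice(options)
            for w in generate(sym, budget - 1)]

random.seed(4)
print(" ".join(generate()))
# e.g. "Mary said that John said that the cat left" -- a type that need
# never appear in the learner's sample for these rules to license it.
```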
I have a policy: those things not worth doing are not worth
doing well. It has an obvious corollary: those things not worth reading are not
worth reading carefully. Happily, there are some willing to do pro bono work so that the rest of us
don’t have to. Read LPY (and NPR) and draw your own conclusions. I’ve already
drawn mine.
I think there's a possible strengthening of LPY's position that they seem to me to miss: the extreme rarity of complex center embeddings in normal spoken production (and therefore, presumably, in the child's input during the first six years) makes it all the more significant when they are occasionally produced and not rejected.
We of course don't know much about the kinds of input that young upper class Classical Athenian kids like Plato got, but it presumably included some instances of the [Det [Det N]:GEN N] pattern that was often used for possession, e.g.
ho tou Philippou hippos
the the(G) Philip(G) horse
'Philip's horse'
That he later came out with the more complex:
ta te:s to:n pollo:n psu:che:s ommata
the the(G) the(G) many(G) soul(G) eyes
'the eyes of the soul of the multitude'
would seem to indicate that his UG (cheat sheet for language acquisition, whatever is on it and wherever it comes from) was biased in favor of a recursive analysis of the simpler structures that were very likely all that occurred in his input.
Plato is perhaps particularly interesting because, although Modern Greeks have some somewhat similar structures in certain genres, one could always say that they were just imitating the classical models when they produce them, but Plato was the classical model - center-embedded possessors don't occur in Homer, & even if they did, that would just push the problem of where they came from in the first place back to an earlier period.
That's a very good point -- the rarer the complex constructions are, the more there is to explain. It makes the learning problem harder, so the attempts over the years by various connectionists/empiricists/non-Chomskyans to say "oh but we never observe this exotic phenomenon in practice, so we don't need it in our theory" seem very confused. They should be arguing that such constructions are frequent and thus can be learned.
Nice point, Avery (and Alex). And rare constructions are indeed rare.
Yes, beautiful point! By the way, did anyone notice that Norbert used a construction in his posting that is surely as rare as they come: an instance of VP-ellipsis that fails to match its antecedent in voice, and is center-embedded to boot? I have in mind this passage - I've boldfaced the relevant bit:
ReplyDelete"I should add, before proceeding, that the remarks below are entirely mine. LPY cannot (or more accurately ‘should not,’ some unfair types just might) be held responsible for the rant that follows.
Now this kind of mismatch is presumably rare (though we need Charles to tell us whether it's rarer than predicted by the baselines for VP-ellipsis, active VP and passive VP). What is especially exciting about Norbert's version, which makes it very relevant to our discussion, is that it's not only an instance of voice-mismatch VP-ellipsis, it's also center-embedded and cataphoric. Now that does entail it's tough to parse, and this kind of construction must be vanishingly, vanishingly, vanishingly rare. I noticed it because it took me a second to figure out what Norbert meant, but I did get it in the end. As Jason Merchant showed in a beautiful paper (still unpublished?), rare and parsingly troubled though active-passive mismatches may be, their availability is governed by strict laws. For example, mismatches are possible with VP-ellipsis, but not with Sluicing (e.g. *Some policeman arrested Mary, but I don't know by who). Crucially, one family of theories about how voice works, but not another, can explain these laws. And nothing changes if we try center-embedded versions: Some policeman arrested -- I don't know who/*by who -- my friend from the LSA Institute. I would be willing to wager not only that Norbert-sentences were not in my input as a child, but that they have been missing from my life until yesterday. And yours too. Yet somehow we know that Norbert-sentences with VP-ellipsis are hard, but English, while no such examples can possibly be constructed with Sluicing.
Addendum: I just noticed that Norbert's passage does have a less remarkable parse in which there is no voice mismatch:
... or more accurately should not [be held responsible], some unfair types just might [be held responsible] ...
But surely that's not what Norbert meant. Being Norbert, he must have meant the more exotic, but real-world-plausible:
...or more accurately should not [be held responsible], some unfair types just might [hold them responsible] ...
A less exotic version of the kind of thing I was talking about might be found in the acquisition of English and, I believe, many other languages, where, in children's input, the complexity of subject NPs tends to be considerably less than that of object NPs (according to what I've read; I have no firsthand knowledge), yet the grammar comes out with no difference in the possibilities for complex structure.
This is a contrast with German possessives, where the prenominal ones do seem to be genuinely more restricted than the postnominal ones. Md. Greek might also be interesting, since prenominal possessives seem to tend to be simpler than the postnominal ones, in the genres where they occur at all, but there are probably no fixed rules.
Note that Evans & Levinson’s argument against constituency based on Virgil's verse
(1) Ultima Cumaei venit iam carminis aetas
last-Nom Cumaean-Gen comes now song-Gen age-Nom
'The last age of Cumaean song has come now'
(their examples 13 & 14) is another case of center-embeding. They refer to Matthews 1981 who says: “Such ordering would be unusual in other styles [in Latin]... But no rule excludes it.” (p. 255) And he adds that “all the [syntactic] relations are realized by inflections” here.
In Czech a word-by-word translation of (1) can be understood, but it's very, very, very marked (today even in poetry). However, the business of poets has always included breaking linguistic norms. Why couldn't the English have (if they wished) such extremely marked sentences in poetry as “The last of the Cumaean has come now song age”? Is there a rule that excludes it?
It's fair to say that that word order isn't intelligible in English. Md. Greek poets occasionally do things a bit like that (but in, as far as I've noticed, a very limited and stereotypical way), but they don't get translated that way into English by the translators.
An important difference between Greek and Latin is that Greek has definite articles with rules about their distribution, which give evidence about NP structure (there's a lot of literature about this for the modern language).
Including an MA thesis about discontinuous NPs in Greek and some other languages: http://www.linguistics.ucla.edu/general/MATheses/Ntelitheos_UCLA_MA_2004.pdf. The author also has a paper for which I can't find a date, but since the thesis isn't referred to in it, it's probably earlier.
It is a requirement of this analysis (which applies to things that have been written about Polish and Russian) that the first chunk of discontinuous material appear in a left-peripheral discourse position; I know of at least one Md Greek poetic example for which that doesn't seem to work, but this could well be a divergence from the usual grammatical rules.
Thank you (not only) for the reference.
Željko Bošković has been arguing for some time that there is a strong correlation between the ability of a language to scramble left-hand elements out of nominals (i.e. violate the Left Branch Condition) and the absence of overt articles in the language. The story starts with Slavic, where the languages with articles, Bulgarian and Macedonian (let's not start arguing whether they are separate languages or not, please), are also the languages within the group that are like English in utterly lacking the ability to construct Virgil-like sentences (Cumaean I would now like to sing a song) -- but he has lists of languages outside the group as well. And he has a theory about why this is so, one that explains a whole list of other properties that he thinks correlate. A few of these look pretty unlikely to me -- for example, absence of sequence of tense is supposed to correlate with absence of articles (despite the fact that Latin is the sequence-of-tense language from which we get the term itself) -- but others are intriguingly interesting.
I think the paper in which he first stated the correlations is this one. Željko's proposal does not care about linear order, but rather constituency, so he would not predict that "the first chunk of discontinuous material appears in a left-peripheral discourse position" as in the thesis cited by Avery, but would come close to this prediction insofar as linear order and constituency weakly correlate inside nominals (with open questions about suffixal articles as in South Slavic and other Balkan languages).
Thank you very much. Bošković's papers address exactly what I am interested in.