Faculty of Language: ‘I’ before ‘E’: unambiguity

Thursday, October 18, 2012

‘I’ before ‘E’: unambiguity

In my next post, I’ll discuss the I-language/E-language distinction that Chomsky introduced in Knowledge of Language. The history of this distinction is illuminating, and it helps explain what the ‘I-’ means. But for today, let me stipulate that an I-language is a generative procedure that connects articulations of some kind (say, sounds or gestures) with meanings of some kind. Let ‘E-language’ be a covering term for anything else—a set of word strings, a social practice, or whatever—that might be called a language.

That’s already enough to make it clear that many alleged “debates” about whether kids learn the languages they acquire (and whether such languages are transformational) aren’t really debates. One “side” argues that humans naturally acquire I-languages that connect articulations with meanings in accord with logically contingent constraints that are not learned. The other “side” shows how a certain kind of learner could acquire an E-language that is like a human I-language in respects other than the ones highlighted by the evidence.

To take a much discussed kind of example, the word-strings indicated with (1-3)

(1) The guest was fed waffles?

(2) The guest fed the parking meter?

(3) The guest who was fed waffles fed the parking meter?

can be used—with rising intonation—to ask yes/no questions, with the corresponding declaratives indicating affirmative answers. With regard to (1), the question can also be asked with (4). But (5) is not another way of asking question (3).

(4) Was the guest fed waffles?

(5) Was the guest who fed waffles fed the parking meter?

On the contrary, (5) is understood as the bizarre question indicated with (6).

(6) The guest who fed waffles was fed the parking meter?

Put another way, (5) is unambiguous: it has the meaning indicated with (6), and it fails to have the meaning indicated with (3). This “negative” fact is of interest. One can easily imagine a generative procedure that connects the pronunciation of (5) with both meanings, or just the meaning of (3). But (5) can only be understood as the bizarre question, even though (3) is the more likely question, given what we know about guests and waffles. Similarly, (7)

(7) Was the hiker who lost kept walking in circles?

is understood as indicated with (7a) and not (7b).

(7a) The hiker who lost was kept walking in circles?

(7b) The hiker who was lost kept walking in circles?

So in acquiring English, one acquires an I-language that connects articulations with meanings in a way that makes (5) and (7) unambiguous.
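The contrast between the constrained procedure and an imaginable, more permissive one can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `Clause` type, the two hand-built declaratives, and the function names `front_main_aux`, `front_any_was`, and `construals` are invented for illustration. The toy also ignores do-support, so (3) simply has no fronted form under the constrained procedure.

```python
from typing import Callable, Iterator, NamedTuple, Optional

Words = tuple[str, ...]


class Clause(NamedTuple):
    """Hypothetical toy structure: subject, main-clause auxiliary, predicate."""
    subject: Words          # may contain a relative clause
    aux: Optional[str]      # auxiliary of the MAIN clause, if any
    predicate: Words

    def words(self) -> Words:
        middle = (self.aux,) if self.aux else ()
        return self.subject + middle + self.predicate


# Hand-built analyses of the declaratives (3) and (6) from the post.
DECLARATIVES = {
    "(3)": Clause(tuple("the guest who was fed waffles".split()), None,
                  tuple("fed the parking meter".split())),
    "(6)": Clause(tuple("the guest who fed waffles".split()), "was",
                  tuple("fed the parking meter".split())),
}


def front_main_aux(clause: Clause) -> Iterator[Words]:
    """Constrained toy procedure: only the main-clause auxiliary may be fronted."""
    if clause.aux is not None:
        yield (clause.aux,) + clause.subject + clause.predicate


def front_any_was(clause: Clause) -> Iterator[Words]:
    """Permissive toy procedure: any 'was' in the word string may be fronted."""
    words = clause.words()
    for i, word in enumerate(words):
        if word == "was":
            yield ("was",) + words[:i] + words[i + 1:]


def construals(string: str, procedure: Callable[[Clause], Iterator[Words]]) -> list[str]:
    """Labels of the declaratives whose question form matches the string."""
    target = tuple(string.lower().rstrip("?").split())
    return sorted(label for label, clause in DECLARATIVES.items()
                  if target in set(procedure(clause)))


S5 = "Was the guest who fed waffles fed the parking meter?"
print(construals(S5, front_main_aux))  # ['(6)']
print(construals(S5, front_any_was))   # ['(3)', '(6)']
```

In this toy, the constrained procedure pairs (5) only with the meaning of (6), while the permissive one also pairs it with the meaning of (3). Nothing about the word string itself rules out the second procedure.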

Now perhaps kids somehow learn that their parents and peers use I-languages that connect articulations with meanings in this constrained way. I doubt it, for reasons that have been reviewed often. (I’ve done my time on such reviews.) But in principle, I can imagine replies of the following form: show how kids could start with a more permissive generative procedure, or with a strategy for acquiring I-languages that would support acquisition of I-languages in which (5) is ambiguous, and then use available experience to figure out that the “local” I-languages are more constrained. I have not, however, encountered such replies. What I have encountered (see the reviews just mentioned) are descriptions of machines that can learn to classify strings like (8) as defective, while classifying strings like (9-11) as undefective.

(8) Was the guest who hungry was tired?

(9) The guest was hungry?

(10) Was the guest hungry?

(11) Was the guest who was hungry tired?

Such machines can, in effect, learn to put an asterisk on (8) while leaving (9-11) unmarked.

But that’s beside the point. The phenomenon illustrated with (1-7) is not that kids acquire hard to learn procedures for putting asterisks on strings. The point is that kids acquire I-languages (procedures that connect articulations with meanings) that are constrained in certain ways. Of course, any biologically implemented procedure will be constrained in ways that are unlearned. But the interesting nativist claim—not rebutted by inventing learnable procedures for classifying strings as defective—is that particular constraints (e.g., those characterized in terms of constraints on displacement) are unlearned.
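A small Python sketch may help show why asterisk-placement leaves the question open. Everything in it is hypothetical: each "procedure" is just a hand-written table from word strings to sets of construals, and the sketch treats an asterisk as "no construal available", which is itself a simplification.

```python
# Hypothetical toy: each "procedure" is a table from word strings to the
# set of construals it pairs with that string. Labels like "(6)" refer to
# the numbered examples in the post; the tables are invented, not learned.
S5 = "was the guest who fed waffles fed the parking meter"
S8 = "was the guest who hungry was tired"
S10 = "was the guest hungry"
STRINGS = [S5, S8, S10]

CONSTRAINED = {S5: {"(6)"}, S10: {"(9)"}}
PERMISSIVE = {S5: {"(3)", "(6)"}, S10: {"(9)"}}


def gets_asterisk(procedure, string):
    """True if the procedure pairs the string with no construal at all."""
    return not procedure.get(string, set())


def same_asterisks(p, q):
    """Do the two procedures agree on which strings are defective?"""
    return all(gets_asterisk(p, s) == gets_asterisk(q, s) for s in STRINGS)


def same_pairings(p, q):
    """Do the two procedures pair each string with the same construals?"""
    return all(p.get(s, set()) == q.get(s, set()) for s in STRINGS)


print(same_asterisks(CONSTRAINED, PERMISSIVE))  # True
print(same_pairings(CONSTRAINED, PERMISSIVE))   # False
```

The two tables agree on every asterisk yet disagree about what (5) can mean, so a learner evaluated only on asterisks cannot distinguish them.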

Linguists, or their informants, might mark the oddness of (12) as shown below.

(12) *The guest who fed waffles was fed the parking meter.

But like (7a), (12) can be understood as an English sentence that expresses a crazy thought. Another day, I’ll talk about contrasts with (13) and (14).

(13) *Colorless green ideas sleep furiously.

(14) *I might been have there.

There are complications, and not only because acceptability differs from grammaticality. But whatever we say about asterisks, a child who acquires English acquires an I-language that connects the pronunciation of (12) with the corresponding meaning. Such a child also ends up knowing that (12) is a bizarre thing to say. But it’s bizarre because of what (12) means. Likewise for (5). And (5) wouldn’t be bizarre if it could have the meaning of (3).

(5) Was the guest who fed waffles fed the parking meter?

(3) The guest who was fed waffles fed the parking meter?

That raises the question of why kids don’t acquire I-languages that are more semantically permissive. Building machines that can learn to put an asterisk on (5) doesn’t address this question, much less suggest that kids learn that strings like (5) are unambiguous. One can try to build a machine that classifies (5) as a “generable but deviant” string and (14) as “ungenerable.” But the question remains: why does (5) have one meaning rather than two? Similar remarks apply to (15), which has the meaning of (16) and not (17).

(15) Can pigs that fly talk?

(16) Pigs that fly can talk?

(17) Pigs that can fly talk?

If we want to understand the human capacity to acquire I-languages—procedures that connect articulations with meanings in certain ways—then it’s hard to see the point of inventing machines that learn to classify strings as generable or not. There are, of course, other goals. But to have a debate about I-languages, both sides have to talk about them.

14 comments:

Alex Clark, October 30, 2012 at 12:49 PM
On your last argument, that you don't see the point of studying weak learning algorithms -- surely there are several good reasons, among which is that weak learning is more mathematically well defined, that it is easier, that any strong learner will be a fortiori a weak learner and so on.

We may want a strong learner, but given that this is a hard problem to say the least, it seems appropriate to start by solving some well-defined simpler problems; by breaking the large problem into several smaller ones that one can solve one by one. This seems entirely in line with standard scientific methodology; thoroughly Galilean in fact.

Paul Pietroski, November 2, 2012 at 6:00 PM
OK, lots of convergence now, which is great.

And let me say that I shouldn't have used 'transformational' in my previous comment...since I wasn't thinking explicitly about the Chomsky-hierarchy, and I wasn't assuming that expressions "move." (The copy theory of "movement" is fine by me.) And I agree that many debates about "movement" are terminological. I was just assuming that one way or another, the structure relevant to articulation will sometimes differ from the structure relevant to meaning, but that such "mismatches" (between PF and LF, in minimalist idiom) are evidently quite constrained. And I like Stabler's grammars, as you suspected. (I've only seen Sag's slides, and haven't had a chance to think through the predictions for constraints on quantifier scope.)

I'm not sure that your E1-equivalence is really a kind of *extensional* equivalence. But that probably doesn't matter. For my money, the really interesting question is the one you rightly raise in terms of your E2 notion: to what degree do kids in a community converge on a single I-language that is also used by their parents (peers/teachers/grandparents/etc).

Given creolization and historical cases of fast language change (and Rozz Thornton's work on kids diverging from parental I-languages but in ways that respect constraints respected by other "adult" I-languages), I have doubts about the usual idealizations according to which kids try to acquire a "target" grammar. It might be that kids just keep trying out humanly possible I-languages until they find one that seems to work well enough for communicative purposes, or until they get too old and inflexible to keep trying new ones. (My hunch is that studies of signed languages will yield more insights here than studies of spoken languages. But we'll see.) It would be great to know what proportion of humanly possible I-languages are such that: if you use them, your grandkids will end up acquiring them. Of course, the languages we see in stable crossgenerational use have this property. But that could be, at least in part, a historical fact rather than a manifestation of it being a biological norm for kids to converge to their parents' I-language.

But now focusing on the (many) cases where kids do acquire an I-language that is at least roughly E2-equivalent to the local I-language(s)...you rightly ask (in good Quine-Lewis fashion) why we should think that the acquirers converge to a single I-language, perhaps modulo small variations. In one sense, it will take the next several posts for me to say anything remotely satisfactory about this. (But I'll try...that's where I'm heading). In brief, I think one needs to find new sources of evidence beyond speakers' judgments about potential interpretations for strings. And in that sense, the classical data for linguistic theorizing--like all data--has its limits. But once there is agreement that we're trying to figure out which I-language(s) kids acquire when they acquire English--and that the (hard) task is to specify the procedures that kids use to connect articulations with meanings, and how these procedures are acquired--then I'm all in favor of using whatever tools we can use to answer your question about how much actual I-language variation there is, given a logically possible space of E2-equivalent I-languages. In my own toolkit, I tend to like a mix of minimalist reasoning--to try to identify the "basic operations" that the human language faculty seems to employ--and experimental methods that provide new sources of data that bear on parade cases of intensional equivalence (e.g., the many provably equivalent ways of specifying the extension of a quantificational word like 'most'). But I'm wide open to insights from any corners on how to tease apart hypothetical procedures that pair the same articulations with (all and only) the same meanings/construals.

Dennis O., November 8, 2012 at 8:09 AM
Paul -- glad to see your postings here, very interesting stuff.

I wholeheartedly agree that the "focus on asterisks" you bemoan has been detrimental to the field, probably going back to the wrong-headed simplification that what we're doing is building a machine that can separate the "good" from the "bad" sentences. What we're doing, of course, is building a machine that links sound and meaning in an empirically correct way, *including* all kinds of crazy meanings (and crazy sounds, say illicit ellipsis). Seems to me that this whole misunderstanding, and the overstated importance of "acceptability," goes back to the early days, where the generative procedure was taken to generate the good sentences but not the bad ones. This was clearly inspired by formal-language theory, but it commits the fallacy of illicitly equating acceptability and grammaticality. I think current "crash-proof syntax" models fall prey to the same fallacy.

(I didn't have time to read the discussion above, so apologies in case I'm repeating things that are mentioned there.)

Alex Clark, November 9, 2012 at 2:37 AM
I was thinking about this some more, and especially about the issue of assigning meanings to ill-formed sentences like "John seems sleeping".
I think there are two separate issues here: one is whether you consider a joint model of sound/meaning pairs, versus a model just of the sequences of sounds. The second is whether this model is 'categorical', defining a sharp boundary or boundaries between ill-formed and well-formed, versus a 'graded' model, such as a probabilistic one.
Clearly these are independent -- one can have a categorical joint model or a graded joint model or a categorical model of just strings (as in formal language theory) or a probabilistic model of strings.
But it seems like they are getting conflated here.