Faculty of Language: November 2014

Wednesday, November 26, 2014

POS POST PAST

This was prompted by various replies to Jeff's recent post (see below), which triggered many memories of déjà vu. But it got too long, in every relevant sense, for a single comment.

Rightly or wrongly, I think knowledge of language is an interesting example of knowledge acquired under the pressure of experience, but not acquired by generalizing from experience. Rightly or wrongly, I suspect that most knowledge is of this sort. That's one way of gesturing at what it is to be a Rationalist. So I'm curious how far the current wave of skepticism regarding POS-arguments goes, since such arguments are the lifeblood of Rationalism.

Once upon a time, somebody offered an argument that however people acquire knowledge of the Pythagorean Theorem, it isn't a matter of generalizing from observed instances of the theorem. This leads me to wonder, after reading some of the replies to Jeff: is the Platonic form of a POS argument also unpersuasive because (1) it is bound up in "meaty, theory-internal constructs," and (2) the input is impoverished only relative to "an (implicit) superficial encoding and learning algorithm"? If not, what makes the argument that Jeff offered relevantly different? The classic POS arguments in linguistics were based on observations regarding what certain strings of words cannot mean, raising the question of how the relevant constraints could be learned as opposed to tacitly assumed. What's so theory-internal about that?

Moreover, sensible rationalists never denied that thinkers can and often do represent certain "visible things"--drawn on the board, or in the sand--as right triangles, and hence as illustrations of theorems concerning right triangles. The point, I thought, was that any such way of "encoding the data" required a kind of abstraction that is tantamount to adopting axioms from which the theorems follow. If one uses experience of an actual drawing to activate and apply ideas of right angles formed by lines that have no width, then one is using experience in a remarkable way that makes it perverse to speak of "learning" the theorem by generalizing from experience. But of course, if one distinguishes the mind-independent "experienceables" from overtly representational encodings--I believe that Jeff usually stresses the input/intake contrast--then any experience-dependent knowledge acquisition can be described as the result of "generalizing" from encodings, given a suitably rich framework for encodings. Indeed, given a suitably rich framework, generalizing from a single case is possible. (It's worth remembering that we speak of both arithmetic induction and empirical induction. But if knowledge of linguistic constraints turns out to be more like knowledge acquired via arithmetic induction, that's hardly a point against Rationalists who use POS arguments to suggest that knowledge of linguistic constraints turns out to be more like knowledge acquired via arithmetic induction.)

With enough tenacity, I guess one can defend the idea that (pace Descartes) we learn from our encodings-of-experience that the world contains material things that endure through time and undergo change, and that (pace Leibniz) we generalize from observations of what is the case to conclusions about what might be or must be the case, and that (see Dyer and Dickinson, discussed by Gallistel and others) novice bees who were only allowed to forage a few times in late afternoons still generalized from their encodings-of-experience in a way that allowed them to communicate the location of food found on the first (and overcast) morning. Put another way, one can stipulate that all experience-dependent knowledge acquisition is learning, and then draw two consequences: (1) POS-arguments show that a lot of learning--and perhaps all learning--is very very unsuperficial, and (2) a huge part of the enterprise of studying knowledge of language and its acquisition consists in (a) repeatedly reminding ourselves just how unsuperficial this knowledge/acquisition is, and (b) using POS arguments to help discover the mental vocabulary in terms of which encodings of the relevant experience are formulated. But (1) and (2) seem like chapter one and verse of Aspects.

So as usual, I'm confused by the whole debate about POS arguments. Is the idea that with regard to human knowledge of language, but not knowledge of geometry (or bee-knowledge of solar ephemeris), there's supposed to be some residual plausibility to the idea that generalizations of the sort Jeff has pointed to (again) can be extracted from the regularities in experienceables without effectively coding the generalizations in terms of how the "data of experience" gets encoded? If so, is there any better form of the argument that would be accepted as persuasive; or is it that with regard to knowledge of linguistic generalizations, the prior probability of Empiricism (in some suitably nonsuperficial form) is so high that no argument can dislodge it?

Or is the skepticism about POS arguments more general, so that such arguments are equally dubious in nonlinguistic domains? If so, is there any better form of the argument (say regarding geometry, or the bees) that would be accepted as persuasive; or is it that with regard to knowledge of all generalizations, the prior probability of Empiricism (in some suitably nonsuperficial form) is so high that no argument can dislodge it?

Of course, nobody in their right mind cares about drawing a sharp line between Rationalism and Empiricism. But likewise, nobody in their right mind denies that there is at least one distinction worth drawing in this vicinity. Team-Plato, with Descartes pitching and Leibniz at shortstop, uses POS considerations to argue that (3) we encode experience and frame hypotheses in very interesting ways, and (4) much of what we know is due to how we encode experience/hypotheses, as opposed to specific experiences that "confirm" specific hypotheses. There is another team, more motley, whose roster includes Locke, Hume, Skinner, and Quine. They say that while (5) there are surely innate mechanisms that constrain the space of hypotheses available to human thinkers, (6) much of what we know is due to our having experiences that confirm specific hypotheses.

To be sure, (3-6) are compatible. Disagreements concern cases, and "how much" falls under (6). And I readily grant that there is ample room for (6) under the large tent of inquiry into knowledge of language; again, see chapter one of Aspects. Members of Team-Plato can agree that (6) has its place, against the background provided by (3) and (4); though many members of the team will insist on sharply distinguishing genuine cases of "inductive bias," in which one of two available hypotheses is antecedently treated as more likely, from cases that reflect knowledge of how the relevant vocabulary delimits the hypothesis space (as opposed to admitting a hypothesis but assigning a low or even zero prior probability). But my question here is whether there is any good reason for skepticism about the use of POS arguments in support of (3) and (4).

Absent a plausible proposal about how generalizations of the sort Jeff mentions are learned, why shouldn't we conclude that such generalizations fall under (4) rather than (6)?

Sidepoint: it's not like there is any good basis, empirical or conceptual, for thinking that most cases will fall under (6)--or that relegation to (4) should be a last resort. The history of these debates is littered with versions of the idea that Empiricism is somehow the default/simpler/preferable option, and that Rationalists have some special burden of proof that hasn't yet been met. But I've never met a plausible version of this idea. (End of sidepoint.)

I'm asking because this bears on the question of whether or not linguistics provides an interesting and currently tractable case study of more general issues about cognition. (That was the promise that led me into linguistics; but as a philosopher, I'm used to getting misled.)

If people think that POS arguments are generally OK in the cognitive sciences, but not in linguistics, that's one thing. If they think that POS arguments are generally suspect, that's another thing. And I can't tell which kind of skepticism Jeff's post was eliciting.

Sunday, November 23, 2014

There's no poverty of the stimulus? PISH.

Lately, I’ve been worried that many people (mostly psychologists, but also philosophers, computer scientists and even linguists) do not appreciate the argument from the poverty of the stimulus. They simply do not see what this argument is supposed to show. This difficulty leads to skepticism about the poverty of the stimulus and about generative syntax more generally, which, in turn, interferes with progress towards solving the problem of how children learn language.

An argument from the poverty of the stimulus is based on two observations: (a) there exists a generalization about the grammar of some language and (b) the learner’s experience does not provide sufficient data to support that generalization over a range of alternative generalizations. These two observations support the conclusion that something other than experience must be responsible for the true generalization that holds of the speakers of the language in question. This conclusion invites hypotheses. Typically, these hypotheses have come in the form of innate constraints on linguistic representations, though nothing stops a theorist from proposing alternative sources for the relevant generalization.

But a common response to this argument is that we just didn’t try hard enough. The generalization really is supported in the data, but we just didn’t see the data in the right way. If we only understood a little more about how learners build up their representations of the data, then we would see how the data really does contain the relevant generalization. So, armed with this little bit of skepticism, one can blithely assert that there is no poverty of the stimulus problem based only on the belief that if we linguists just worked a little harder, the problem would dissipate. But skepticism is neither a counter argument nor a counter proposal.

So, I’d like to issue the following challenge. I will show a poverty of the stimulus argument that is not too complicated. I will then show how I looked for the relevant data in the environment and conclude that it really wasn’t there. I will then invite all takers (from whatever field) to show that the correct generalization and none of the alternatives really is available in the data. If someone shows that the relevant data really was there, then I will concede that there was no poverty of the stimulus argument for that phenomenon. Indeed, I will celebrate, because that discovery will represent progress that all students of the human language faculty will recognize as such. Nobody requires that every fact of language derives from innate knowledge; learning which ones do and which ones don’t sounds like progress. And with that kind of progress, I’d be more than happy to repeat the exercise until we discover some general principles.

But, if the poverty of the stimulus is not overturned for this case, then we can take that failure as a recognition that the problem is real and that the way forward in studying the human language faculty is by asking about what property of the learner makes the environmental data evidentiary for building a grammar.

With that preamble out of the way, let’s begin. Consider the judgments in (1-2), which Leddon and Lidz (2006) show, with experimentally collected data, to be reliable in adult speakers of English:

(1) a. Norbert remembered that Ellen painted a picture of herself

b. * Norbert remembered that Ellen painted a picture of himself

c. Norbert remembered that Ellen was very proud of herself

d. * Norbert remembered that Ellen was very proud of himself

(2) a. Norbert remembered which picture of herself Ellen painted

b. Norbert remembered which picture of himself Ellen painted

c. Norbert remembered how proud of herself Ellen was

d. * Norbert remembered how proud of himself Ellen was

The facts in (1) illustrate a very simple generalization: a reflexive pronoun must take its antecedent in the domain of the closest subject. In all of (1a-d) only Ellen can be the antecedent of the reflexive. Let us assume (perhaps falsely) that this generalization is supported by the learner’s experience and that there is no poverty of the stimulus problem associated with it.

The facts in (2) do not obviously fit our generalization about reflexive pronouns. If we take “closest subject” to be the main clause subject, then we would expect only (b) and (d) to be grammatical. If we take “closest subject” to be the embedded subject, then we expect only (a) and (c) to be grammatical. And, if we take “closest subject” to be underspecified in these cases, then we expect all of (a-d) to be grammatical. So, something’s gotta give. What we need is for the “closest” subject to be Ellen in (c-d), but not (a-b). And, we need closest subject to be underspecified in (a-b) but not (c-d). We’ll get back to a way to do that in a moment.
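The mismatch can be spelled out mechanically. The following is a minimal sketch under a toy encoding: the names `EXAMPLES`, `OBSERVED`, `HYPOTHESES` and the helper functions are illustrative assumptions, not part of the Leddon and Lidz study. Each sentence in (2) is reduced to the antecedent that its reflexive's gender demands, and each uniform definition of "closest subject" is reduced to the set of antecedents it allows.

```python
# Toy encoding of (2a-d): the antecedent that the reflexive's gender picks out.
# All names here are illustrative assumptions, not part of the original analysis.
EXAMPLES = {
    "2a": "Ellen",    # which picture of herself Ellen painted
    "2b": "Norbert",  # which picture of himself Ellen painted
    "2c": "Ellen",    # how proud of herself Ellen was
    "2d": "Norbert",  # how proud of himself Ellen was
}

# Judgments reported in the text: only (2d) is starred.
OBSERVED = {"2a", "2b", "2c"}

# Three uniform ways of fixing "closest subject" across all of (2).
HYPOTHESES = {
    "main-clause subject": {"Norbert"},
    "embedded subject": {"Ellen"},
    "underspecified": {"Norbert", "Ellen"},
}


def predicted(allowed):
    """Sentences predicted acceptable if the antecedent must be in `allowed`."""
    return {label for label, antecedent in EXAMPLES.items() if antecedent in allowed}


def matching_hypotheses():
    """Names of the hypotheses whose predictions equal the observed judgments."""
    return [name for name, allowed in HYPOTHESES.items()
            if predicted(allowed) == OBSERVED]


for name, allowed in HYPOTHESES.items():
    print(name, sorted(predicted(allowed)))
print("matches:", matching_hypotheses())
```

No single uniform setting reproduces the observed pattern, which is the sense in which something has to give.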

But first we should see how these patterns relate to the poverty of the stimulus. Leddon and Lidz (2006) also showed that sentences like those in (2) are unattested in speech to children. While we didn’t do a search of every sentence that any child ever heard, we did examine 10,000 wh-questions in CHILDES and we didn’t find a single example of a wh-phrase containing a reflexive pronoun, a non-reflexive pronoun or a name. So, there really is no data to generalize from. Whatever we come to know about these sentences, it must be a generalization beyond the data of experience.

One might complain, fairly, that 10,000 wh-questions is not that many and that if we had looked at a bigger corpus we might have found some with the relevant properties. We did search Google for strings containing wh-phrases like those in (2) and the only hits we got were example sentences from linguistics papers. This gives us some confidence that our estimate of the experience of children is accurate.

If these estimates are correct, the data of experience appears to be compatible with many generalizations, varying in whether Norbert, Ellen or both are possible antecedents in the (a-b) cases, the (c-d) cases or both. With these possibilities, there are 8 possible patterns. But out of these eight, all English speakers acquire the same one. Something must be responsible for this uniformity. That is the extent of the argument. It doesn’t really have a conclusion, except that something must be responsible for the pattern. The argument is merely the identification of a mystery, inviting hypotheses that explain it.

Here’s a solution that is based on prior knowledge, due to Huang (1993). The first part of the solution is that we maintain our generalization about reflexives: reflexives must find their antecedent in the domain of the nearest subject. The second part capitalizes on the difference between (2a-b), in which the wh-phrase is an argument of the lower verb, and (2c-d), in which the wh-phrase is the lower predicate itself. In (2a-b), the domain of the nearest subject is underspecified. If we calculate it in terms of the “base position” of the wh-phrase, then the embedded subject is the nearest subject and so only Ellen can be the antecedent. If we calculate it in terms of the “surface position” of the wh-phrase, then the matrix subject is the nearest subject. For (2c-d), however, the closest subject is the same, independent of whether we interpret the wh-phrase in its "base" or "surface" position. This calculation of closest subject follows from the Predicate Internal Subject Hypothesis (PISH): The predicate carries information about its subject wherever it goes. Because of PISH, the wh-phrase [how proud of himself/herself] contains an unpronounced residue of the embedded subject and so is really represented as [how ~~Ellen~~ proud of himself/herself]. This residue (despite not being pronounced) counts as the nearest subject for the reflexive, no matter where the predicate occurs. Thus, the reflexive must be bound within that domain and Ellen is the only possible antecedent for that reflexive. So, as long as the learner knows the PISH, then the pattern of facts in (2) follows deductively. The learner requires no experience with sentences like (2) in order to reach the correct generalization.
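As a rough illustration of how the deduction runs, here is a minimal sketch. It is a toy model of the reasoning above, not an implementation of Huang's analysis; the function names, the two labels `"argument"` and `"predicate"`, and the `SENTENCES` table are assumptions introduced for the example.

```python
def nearest_subjects(wh_kind, matrix_subject, embedded_subject):
    """Subjects that can count as 'nearest' for a reflexive in a fronted wh-phrase.

    wh_kind is "argument" (which picture of X) or "predicate" (how proud of X).
    Hypothetical helper: a toy rendering of the reasoning in the text.
    """
    if wh_kind == "predicate":
        # PISH: the predicate carries an unpronounced copy of its own subject,
        # so the nearest subject is the embedded one in base or surface position.
        return {embedded_subject}
    if wh_kind == "argument":
        # Base position gives the embedded subject, surface position the matrix one.
        return {embedded_subject, matrix_subject}
    raise ValueError(f"unknown wh_kind: {wh_kind!r}")


# (label, kind of wh-phrase, antecedent demanded by the reflexive's gender)
SENTENCES = [
    ("2a", "argument", "Ellen"),
    ("2b", "argument", "Norbert"),
    ("2c", "predicate", "Ellen"),
    ("2d", "predicate", "Norbert"),
]


def acceptable(sentences, matrix_subject="Norbert", embedded_subject="Ellen"):
    """Labels of sentences whose reflexive finds a licit antecedent."""
    return {
        label
        for label, kind, antecedent in sentences
        if antecedent in nearest_subjects(kind, matrix_subject, embedded_subject)
    }


print(sorted(acceptable(SENTENCES)))
```

Under these assumptions the sketch rules out only (2d), matching the judgments in (2) without any exposure to sentences of that form.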

Now, this argument only says that the learner must know that the predicate carries information about its subject with it in the syntax prior to encountering sentences like (2). It doesn’t yet require that knowledge to be innate. So, the poverty of the stimulus problem posed by (2) shifts to the problem of determining whether subjects are generated predicate internally.

Our next question is whether we have independent support for PISH and whether the data that supports PISH can also lead to its acquisition. I can think of several important patterns of facts that argue in favor of PISH. The first (due, I believe, to Jim McCloskey) concerns the relative scope of negation and a universal quantifier in subject position. Consider the following sentences:

(3) a. Every horse didn’t jump over the fence

b. A fiat is not necessarily a reliable car

c. A fiat is necessarily not a reliable car

The important thing to notice about these sentences is that (3a) is ambiguous but that neither (3b) nor (3c) is. (3a) can be interpreted as making a strong claim that none of the horses jumped over the fence or a weaker claim that not all of them jumped. This ambiguity concerns the scope of negation. Does the negation apply to something that includes the universal or not? If it does, then we get the weak reading that not all horses jumped. If it does not, then we get the strong reading that none of them did.

How does this scope ambiguity arise? The case where the subject takes scope over negation is straightforward if we assume (uncontroversially) that scope can be read directly off of the hierarchical structure of the sentence. But what about the reading where negation takes wide scope? We can consider two possibilities. First, it might be that the negation can take the whole sentence in its scope even if it does not occur at the left edge of the sentence. But this possibility is shown to be false by the lack of ambiguity in (3c). If negation could simply take wide scope over the entire sentence independent of its syntactic position, then we would expect (3c) to be ambiguous, contrary to fact. (3c) just can’t mean what (3b) does. The second possibility is PISH: the structure of (3a) is really (4), with the struck-out copy of every horse representing the unpronounced residue of the subject-predicate relation:

(4) every horse didn’t ~~[every horse]~~ jump over the fence

Given that there are two positions for every horse in the representation, we can interpret negation as taking scope relative to either the higher one or the lower one.
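To see that the two construals really come apart, here is a small sketch that evaluates each in a toy scenario. The function names and the horse names are made up for illustration; the only assumption is the standard truth conditions for "every" and "not".

```python
def subject_over_negation(horses, jumped):
    """Higher copy interpreted: for every horse, it did not jump (none jumped)."""
    return all(h not in jumped for h in horses)


def negation_over_subject(horses, jumped):
    """Lower copy interpreted: it is not the case that every horse jumped."""
    return not all(h in jumped for h in horses)


# Hypothetical scenario: three horses, only one of which jumped.
horses = {"Bucephalus", "Trigger", "Silver"}
jumped = {"Trigger"}

print(subject_over_negation(horses, jumped))  # strong reading
print(negation_over_subject(horses, jumped))  # weak reading
```

In a scenario where some but not all horses jump, the strong reading is false and the weak reading is true, so the two positions for every horse yield truth-conditionally distinct interpretations of (3a).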

Is there evidence in speech to children concerning the ambiguity of (3a)? If there is, then that might count as evidence that they could use to learn PISH and hence solve the poverty of the stimulus problem associated with (2). Here we run into two difficulties. First, Gennari and MacDonald (2005) show that these sentences do not occur in speech to children (and are pretty rare in speech between adults). Second, when we present such sentences to preschoolers, they appear to be relatively deaf to their ambiguity. Julien Musolino and I have written extensively on this topic and the take-away message from those papers is (i) that children’s grammars can generate the wide-scope negation interpretation of sentences like (3a), but (ii) it takes a lot of either pragmatic or priming effort to get that interpretation to reveal itself. So, even if such sentences did occur in speech to children, their dominant interpretation from the children’s perspective is the one where the subject scopes over negation (even when that interpretation is not consistent with the context or the intentions of the speaker) and so this potential evidence is unlikely to be perceived as evidence of PISH. And if PISH is not learned from that, then we are left with a mystery of how it comes to be responsible for the pattern of facts in (2).

A second argument (due to Molly Diesing) in favor of PISH concerns the interpretation of bare plural subjects, like in (5):

(5) Linguists are available (to argue with)

This sentence is ambiguous between a generic and an existential reading of the bare plural subject. Under the generic reading, it is a general property of linguists (as a whole) that they are available. Under the existential reading, there are some linguists who are available at the moment.

Diesing observes that these two interpretations are associated with different syntactic positions in German. The generic interpretation requires the subject to be outside of the verb phrase. The existential interpretation requires it to be inside the verb phrase (providing evidence for the availability of the predicate-internal position crosslinguistically). So, Diesing argues that we can capture a cross-linguistic generalization about the interpretations of bare plural subjects by positing that the same mapping between position and interpretation occurs in English. The difference is that in English, the existential interpretation is associated with the unpronounced residue of the subject inside the predicate. This is not exactly evidence in favor of PISH, but PISH allows us to link the German and English facts together in a way that a PISH-less theory would not. So we could take it as evidence for PISH.

Now, this one is a bit trickier to think about when it comes to acquisition. Should learners take evidence of existential interpretations of bare plural subjects to be evidence of PISH? Maybe, if they already know something about how positions relate to interpretations. But in the end, the issue is moot because Sneed (2007) showed that in speech to children, bare plural subjects are uniformly used with the generic interpretation. How children come to know about the existential readings is itself a poverty of the stimulus argument (and one that could be solved by antecedent knowledge of PISH and the rules for mapping from syntactic position to semantic interpretation). So, if we think that the facts in (2) follow from PISH, then we still need a source for PISH in speech to children.

The final argument that I can think of in favor of PISH comes from Jane Grimshaw. She shows that it is possible to coordinate an active and a passive verb phrase:

(6) Norbert insulted some psychologists and was censured

The argument takes advantage of three independent generalizations. First, passives involve a relation between the surface subject and the object position of the passive verb, represented here by the invisible residue of Norbert:

(7) Norbert was censured ~~[Norbert]~~

Second, extraction from one conjunct in a coordinated structure is ungrammatical (Ross’s 1968 Coordinate Structure Constraint):

(8) * Who did Norbert criticize the book and Jeff insult

Third, extraction from a conjunct is possible as long as the extracted phrase is associated with both conjuncts (Across The Board extraction):

(9) Who did Norbert criticize and Jeff insult

So, if there were no predicate internal subject position in (6), then we would have the representation in (10):

(10) Norbert [_VP insulted some psychologists] and [_VP was censured ~~[Norbert]~~]

This representation violates the coordinate structure constraint and so the sentence is predicted to be ungrammatical, contrary to fact. However, if there is a predicate internal subject position, then the sentence can be represented as an across the board extraction:

(11) Norbert [_VP ~~[Norbert]~~ insulted some psychologists] and [_VP was censured ~~[Norbert]~~]

So, we can understand the grammaticality of (6) straightforwardly if it has the representation in (11), as required by PISH.
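The logic of (8)-(11) can be sketched as a simple check. This is a toy rendering under a strong simplifying assumption, namely that each conjunct is reduced to a single boolean saying whether it contains a gap (unpronounced copy) of the moved phrase; the function name is hypothetical.

```python
def extraction_is_licit(conjunct_has_gap):
    """Toy Coordinate Structure Constraint with the across-the-board exception.

    conjunct_has_gap holds one boolean per conjunct: True if that conjunct
    contains a gap of the moved phrase. Extraction is licit when no conjunct
    has a gap (nothing moved out) or every conjunct has one (across the board).
    """
    return all(conjunct_has_gap) or not any(conjunct_has_gap)


# (8)  * Who did Norbert criticize the book and Jeff insult __
print(extraction_is_licit([False, True]))
# (9)    Who did Norbert criticize __ and Jeff insult __
print(extraction_is_licit([True, True]))
# (10) without PISH: only the passive conjunct contains a copy of Norbert
print(extraction_is_licit([False, True]))
# (11) with PISH: both verb phrases contain a copy of Norbert
print(extraction_is_licit([True, True]))
```

On this encoding (10) patterns with the ungrammatical (8), while (11) patterns with the grammatical (9), which is the shape of the argument for a predicate-internal subject position.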

Do sentences like (6) occur in speech to children? I don’t know of any evidence about this, but I also don’t think it matters. It doesn’t matter because if the learner encountered (6), that datum would support either PISH or the conclusion that movement out of one conjunct in a coordinate structure is grammatical (i.e., that the coordinate structure constraint does not hold). If there is a way of determining that the learner should draw the PISH conclusion and not the other one, I don’t know what it is.

So, there’s a potential avenue for the stimulus-poverty-skeptic to show that the pattern in (2) follows from the data. First show that data like (6) occurs at a reasonable rate in speech to children, whatever reasonable means. Then show how the coordinate structure constraint can be acquired. Then build a model showing how putting (6) together with an already acquired coordinate structure constraint will lead to the postulation of PISH and not to the discarding of the coordinate structure constraint. And if that project succeeds, it will be party time; we will have made serious progress on solving a poverty of the stimulus problem.

But for the moment, the best solution on the table is the one in which PISH is innate. This solution is the best because it explains with a single mechanism the pattern of facts in (2), the ambiguity of (3), the interpretive properties of (5) and its German counterparts, and the grammaticality of (6). And it explains how each of these can be acquired in the absence of direct positive evidence. Once learners figure out what subjects and predicates look like in their language, these empirical properties will follow deductively because the learners will have been forced by their innate endowment to build PISH-compatible representations.

One final note. I am confident that no stimulus-poverty-skeptics will change their views on the basis of this post (if any of them even see it). And it is not my intention to get them to. Rather, I am offering an invitation to work on the kinds of problems that poverty of the stimulus arguments raise. It is highly likely that the analyses I have presented are incorrect and that scientists with different explanatory tastes would follow different routes to a solution. But we will all have a lot more fun if we engage at least some of the same kinds of problems and do not deny that there are problems to solve. The charge that we haven’t looked hard enough to find out how the data really is evidentiary is hereby dismissed. But if there are stimulus-poverty-skeptics who want to disagree about something real, linguists are available.

Jeff Lidz, November 23, 2014.

Saturday, November 22, 2014

Brains, minds and modules

I recently heard a talk by Elissa Newport, which reviewed some current thinking on brains and language. It led me to a question. Let me back into it.

Brains, it seems, are quite plastic. What this means is that if a part of the brain usually tasked with some function is incapable of so performing (e.g. it has been insulted in some way (or even removed (quite an insult!))), other parts of the brain can pitch in to substitute. If the insult is early enough (e.g. in very young kids) the brain substitutes for the lost brain parts are so good that nary a behavioral deficit is visible, at least to the untrained eye (and maybe even to the experts). So, brains are plastic. The limiting case of plasticity is the equipotentiality thesis (ET): the idea that any part of the brain can substitute functionally for any other part.

Now one can imagine how, were ET true, one might be led to conclude that minds are also equipotential in the sense that the computations across cognitive domains must be the same. In other words, it strikes me that ET as a brain thesis leads to General Learning theory as a psychological thesis. Why? Well, on the assumption that minds are brains when viewed computationally, there would be little surprise were any part of the brain able to compute what any other part could if in fact brains only did one kind of computation.

Conversely, if brains are not fully labile (e.g. if they had a more modular design) then this would suggest that the brain carries out many distinct kinds of computation and that some parts cannot do what other parts do. In other words, the reason that they are not fully labile is that they differ computationally. I mention this because it appears that current wisdom is that brains are not completely equipotential. So, though it is true that brains are not entirely fixed in function, it also seems to be the case that there are limits to this. Elissa reported, for example, that language function, which is generally lateralized in the left, migrates to the analogous place in the right hemisphere if the left side is non-functioning. In other words it moves to the same place but on the other side. It cannot, it seems, move just anywhere (say to the hippocampus (by far my favorite word in neuroscience! It makes me think of a university version of this)). That the language slack left by the left hemisphere is taken up by the same place in the right hemisphere makes sense if these areas are specialized and this makes sense if they execute their own unique brand of computations.

Or at least that’s the way it seems to me at first blush. So here’s my question: is that the way it seems to you too? Can we argue from brain “modularity” against general learning? Or, put another way: does a mind that uses general learning imply a brain that is equipotential, and does a brain that is not fully plastic argue in favor of mental modules? I feel that these notions are linked, but I have not been able to convince myself that the link is tight.

Let me add one more point: even modularists could concede that different parts of the mind use partially overlapping cognitive operations. What is denied is that it is all the same all the way down. But let’s forget these niceties for now. What I want to know concerns the pure case: does brain modularity imply mental modularity and vice versa? Or are the two conceptions entirely unrelated?

Friday, November 21, 2014

The 2nd Hilbert Question: Barbara Citko on Diagnostics for Multidominance

What are the Diagnostics of a Multidominant Structure?

Multidominant structures (doubly rooted structures of the kind given in (1)) have been invoked as a solution to a number of both empirical and theoretical puzzles.

(1)    XP      YP
      /  \    /  \
     X    ZP      Y
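In graph terms, a structure like (1) is one in which a single node has more than one mother. The following minimal sketch encodes (1) as a list of dominance edges and finds the shared node and the roots; the edge list and helper names are assumptions made for illustration. Note that this only reads multidominance off a structure that is already given. It is not an empirical diagnostic of the kind the question asks for.

```python
from collections import defaultdict

# Dominance edges (mother, daughter) for the structure in (1).
EDGES = [("XP", "X"), ("XP", "ZP"), ("YP", "ZP"), ("YP", "Y")]


def mothers(edges):
    """Map each daughter node to the set of nodes immediately dominating it."""
    result = defaultdict(set)
    for mother, daughter in edges:
        result[daughter].add(mother)
    return result


def shared_nodes(edges):
    """Nodes with more than one mother, i.e. multiply dominated nodes."""
    return sorted(node for node, ms in mothers(edges).items() if len(ms) > 1)


def roots(edges):
    """Nodes that dominate something but are not dominated by anything."""
    daughters = {daughter for _, daughter in edges}
    return sorted({mother for mother, _ in edges} - daughters)


print(shared_nodes(EDGES))
print(roots(EDGES))
```

Here ZP is the only multiply dominated node, and the structure has two roots, XP and YP, which is what "doubly rooted" amounts to in this encoding.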

The idea that such structures exist spans several decades and frameworks, going back at least to the seventies and the work of Sampson (1975) on raising and control and Williams (1978) on Across-the-Board wh-questions. Since then, many different mechanisms have been proposed to generate such structures, including (but not limited to): factorization of Williams (1978), Parallel Merge of Citko (2005), behindance-Merge of De Vries (2005), grafting of Van Riemsdijk (2000, 2006a,b), banyan trees of Svenonius (2005), sharing of Gračanin-Yuksek (2007), union of phrase markers of Goodall (1987), node contraction of Chen-Main (2006) and tree linking of Gärtner (2002). Interestingly, while the issue of linearization and interpretation of multidominant structures has received a fair amount of attention in the literature, the very fundamental issue of how to diagnose a multidominant structure has not. We have reliable ways to diagnose A versus A-bar positions, heads versus phrases, specifiers versus complements, covert versus overt movement; what still seems to be lacking is an adequate diagnostic (or set of diagnostics) of multidominance, something akin to crossover as a diagnostic of A-bar dependencies.

Explication

The landscape of multidominance is quite diverse and includes both coordinate and non-coordinate structures, as evidenced by the far from complete list of constructions (coordinate ones in (2) and non-coordinate ones in (3)) that have been analyzed in a multidominant fashion. For the purpose of the question of how to diagnose multidominance, it is immaterial whether a multidominant analysis is the correct one for all of them or just a subset thereof; all that matters is that the grammar allows such structures. Simply put, if they exist, we need to know how to find them.

(2) a. Across-the-board wh-questions (Williams 1978, Goodall 1987, Citko 2005, 2011, De Vries 2009, among others)

b. Right Node Raising (Citko 2011, McCawley 1982, Goodall 1987, Wilder 1999, De Vries 2009, Kluck 2009, Johnson 2007, among many, many others)

c. gapping (and determiner sharing) (Kasai 2007, Citko 2006, 2011, 2012)

d. Questions with coordinated wh-pronouns (Gračanin-Yuksek 2007, Citko 2013, Citko and Gračanin-Yuksek 2013, among others)

(3) a. Serial verb constructions (Hiraiwa and Bodomo 2008)

b. Free relatives (Haider 1988, Citko 2000, Van Riemsdijk 1998, 2000, 2006)

c. Parasitic Gaps (Kasai 2007)

d. Amalgams (De Vries 2013)

e. Parentheticals (McCawley 1982, De Vries 2005)

f. Appositives (McCawley 1982, Heringa 2009)

g. Comparatives (Moltmann 1992)

h. Discontinuous idioms (Svenonius 2005)

i. movement in general (Chomsky 2004, Gärtner 2002, among others)

A natural way to proceed in the search for a reliable multidominant diagnostic (or set of diagnostics) is to ask what property (or set of properties) characterizes these constructions to the exclusion of others. Let us thus look at some that have at various times been associated with multidominance. Intuitively, coordination might seem like a plausible candidate. After all, the two conjuncts in a coordinate structure are parallel, so perhaps multidominance is a way to capture this parallelism. However, coordination clearly cannot be the diagnostic, for the simple reason that non-coordinate multidominant structures also exist, namely the ones given in (3). Likewise, ellipsis cannot be the right diagnostic, in spite of the intuitive appeal of the idea that perhaps what (some) cases of ellipsis involve is the non-pronunciation of one occurrence of the multiply dominated element. While movement is sometimes analyzed in a multidominant fashion, not all of the constructions in (2-3) involve movement. Similarly, while parentheticals of various types (appositives, amalgams) have been claimed to involve multidominance, there exist enough multidominant yet non-parenthetical constructions to cast doubt on a one-to-one correlation between the two.

What the multidominant constructions listed in (2-3) seem to have in common is the idea (or intuition) that a single element has to simultaneously fulfill the requirements imposed on it by the two elements between which it is shared. In other words, it has to match them in some relevant sense. If so, could matching be the diagnostic we are after? In order to answer this question, we need to further ask what kind of matching multidominant structures require, and what kinds of mismatches they tolerate (if they tolerate mismatches at all). If we limit our attention to constructions in which a nominal element is shared between two nodes, the question becomes whether this shared nominal has to match both in morphological case, Abstract case, thematic role, or (relative) thematic prominence. Logically speaking, mismatches could be due to syncretism effects, proximity effects (or the reverse, anti-proximity effects) or hierarchy effects. The ameliorating effects of case syncretism have been documented pretty well in the relevant literature. However, it is not the case that mismatches due to factors other than syncretism are never tolerated. Citko (2011), for example, points out that in Polish, ATB wh-questions tolerate mismatches only with syncretic forms, whereas Right Node Raising tolerates mismatches that suggest proximity is at issue, as shown by the contrast between the ATB question in (4a) and the RNR construction in (4b). In both of them, the verbs inside the two conjuncts impose different case requirements on their objects: the verb lubić ‘like’ requires accusative case whereas the verb ufać ‘trust’ requires dative case. Furthermore, in both of them the object of these two verbs (the wh-pronoun in (4a) and the noun phrase in (4b)) is shared between the two clauses. Interestingly, in the ATB case, neither the accusative nor the dative form of the shared (fronted) wh-pronoun yields a grammatical result, whereas in the RNR case, the dative form of the shared object is possible.

(4) a. *Kogo/*komu Jan lubi a Maria ufa? [Polish]

who.acc/who.dat Jan likes and Maria trusts

‘Who does Jan like and Maria trust?’ ATB question

b. Jan lubi a Maria ufa tej koleżance/*tę koleżankę z pracy.

Jan likes and Maria trusts this.dat friend.dat/this.acc friend.acc from work

‘Jan likes and Maria trusts this friend from work.’ RNR
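The two matching regimes at issue in (4) can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the toy lexicon lists only the case values glossed in (4a-b) rather than full paradigms, and the function names and the "last requirement is the closest verb" convention are hypothetical, not part of Citko's (2011) analysis.

```python
# Case values per form, restricted to what the glosses in (4a-b) show.
# This toy lexicon is an assumption for illustration, not a full paradigm.
FORM_CASES = {
    "kogo": {"acc"},
    "komu": {"dat"},
    "tej koleżance": {"dat"},
    "tę koleżankę": {"acc"},
}

def strict_match(form, required_cases):
    """True if the form can realize every case the sharing verbs require.
    Only a syncretic form (one listed with several cases) could pass
    when the requirements differ."""
    return set(required_cases) <= FORM_CASES[form]

def proximity_match(form, required_cases):
    """True if the form satisfies the linearly closest verb, assumed here
    to be the last requirement in the list (the RNR configuration)."""
    return required_cases[-1] in FORM_CASES[form]

# lubić 'like' requires accusative; ufać 'trust' requires dative.
requirements = ["acc", "dat"]

for form in ("kogo", "komu"):
    print(form, strict_match(form, requirements))
for form in ("tej koleżance", "tę koleżankę"):
    print(form, proximity_match(form, requirements))
```

Under these assumptions, strict matching rules out both wh-forms in (4a), while proximity matching admits only the dative form in (4b), mirroring the judgments reported above.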

The fact that not all multidominant constructions are subject to the same kind of case matching requirement casts doubt on the correlation between multidominance and matching, and, consequently, on matching as a multidominance diagnostic. Thus, the search for a reliable diagnostic of a multidominant structure continues.

References

Chen-Main, Joan. 2006. On the Generation and Linearization of Multi-Dominance Structures. Ph.D. dissertation, Johns Hopkins University.

Chomsky, Noam. 2004. ‘Beyond Explanatory Adequacy.’ In Structures and Beyond: The Cartography of Syntactic Structures, ed. by A. Belletti, 104-131. Oxford: Oxford University Press.

Citko, Barbara. 2000. Parallel Merge and the Syntax of Free Relatives. Ph.D. thesis. Stony Brook University.

Citko, Barbara. 2005. ‘On the Nature of Merge: External Merge, Internal Merge, and Parallel Merge.’ Linguistic Inquiry 36: 475–497.

Citko, Barbara. 2011. Symmetry in Syntax: Merge, Move and Labels. Cambridge: Cambridge University Press.

Citko, Barbara. 2012. ‘A Parallel Merge Solution to the Merchant/Johnson Paradox,’ In Ways of Structure Building, ed. by M. Uribe-Etxebarria and V. Valmala. Oxford: Oxford University Press.

Citko, Barbara. 2013. ‘The Puzzles of Wh-Questions with Coordinated Wh-Pronouns,’ In Principles of Linearization, ed. by T. Biberauer and I. Roberts. Berlin: Mouton de Gruyter.

Citko, Barbara and Martina Gračanin-Yuksek. 2013. ‘Towards a New Typology of Wh-Questions with Coordinated Wh-Pronouns.’ Journal of Linguistics 49: 1–32.

Gärtner, Hans-Martin. 2002. Generalized Transformations and Beyond: Reflections on Minimalist Syntax. Berlin: Akademie Verlag.

Goodall, Grant. 1987. Parallel Structures in Syntax. Cambridge: Cambridge University Press.

Gračanin-Yuksek, Martina. 2007. About Sharing. PhD thesis, MIT.

Haider, Hubert. 1988. ‘Matching Projections.’ In Constituent Structure: Papers from the 1987 GLOW Conference, ed. by A. Cardinaletti, G. Cinque and G. Giusti, 101-123. Dordrecht: Foris.

Heringa, Herman. 2009. A Multidominance Approach to Appositional Constructions. Ms., University of Groningen.

Hiraiwa, Ken & Adams Bodomo. 2008. ‘Object-Sharing as Symmetric Sharing: Predicate-Clefting and Serial Verbs in Dagaare.’ Natural Language and Linguistic Theory 26: 795–832.

Johnson, Kyle. 2007. ‘LCA+Alignment=RNR,’ Handout of a talk presented at the Workshop on Coordination, Subordination and Ellipsis, Tübingen, June 2007.

Kasai, Hironobu. 2007. Multidominance in Syntax. PhD thesis, Harvard University.

Kluck, Marlies. 2009. ‘Good Neighbours or Far Friends. Matching and Proximity Effects in Dutch Right Node Raising.’ Groninger Arbeiten zur Germanistischen Linguistik 48.

McCawley, James. 1982. ‘Parentheticals and Discontinuous Constituent Structure.’ Linguistic Inquiry 13: 91–106.

Moltmann, Friederike. 1992. Coordination and Comparatives. Ph.D. thesis. MIT.

Riemsdijk, Henk van. 1998. ‘Trees and scions - science and trees,’ manuscript, Fest-Web-Page for Noam Chomsky.

Riemsdijk, Henk van. 2000. ‘Free relatives inside out: transparent free relatives as grafts.’ Proceedings of the 8th Annual Conference of the Polish Association for the Study of English.

Riemsdijk, Henk van. 2006a. ‘Free Relatives.’ In Everaert and van Riemsdijk (eds.), pp. 338–382.

Riemsdijk, Henk van. 2006b. ‘Grafts Follow from Merge.’ In Frascarelli (ed.), pp. 17–44.

Svenonius, Peter. 2005. ‘Extending the Extension Condition to Discontinuous Idioms.’ Linguistic Variation Yearbook 5: 227-263.

Sampson, Geoffrey. 1975. ‘The Single Mother Condition.’ Journal of Linguistics 11: 1-11.

Vries, Mark de. 2005. ‘Coordination and Syntactic Hierarchy.’ Studia Linguistica 59: 83–105.

Vries, Mark de. 2009. ‘On Multidominance and Linearization.’ Biolinguistics 3: 344–403.

Vries, Mark de. 2013. ‘Unconventional Mergers.’ In Ways of Structure Building, ed. by Myriam Uribe-Etxebarria & Vidal Valmala. Oxford: Oxford University Press.

Wilder, Chris. 1999. ‘Right Node Raising and the LCA.’ WCCFL 18 Proceedings, 586-598.

Williams, Edwin. 1978. ‘Across-the-board Rule Application.’ Linguistic Inquiry 9: 31–43.
