Monday, August 12, 2013

Explaining Camels

There are certain kinds of explanations that, when available, are particularly satisfying.  What makes them so is that they not only explain the facts in front of you, but do so in ways that make the facts inevitable.  How do they do this? Well, one way is by rendering the non-extant alternatives not merely false, but inconceivable.  A joke/riddle that I like to tell my students displays the quality I have in mind.

One physicist/mathematician asks another: why are there 1-humped camels and 2-humped camels, but no N-humped camels, N>2? Answer: Because camels are convex or concave; no other models available.

I love this answer. It’s perfect. How so? By shifting the relevant predicates (from natural numbers to simple curves) the range of possible camels is reduced to two, both of which are attested!  Concave/convex exhaust the space of options.  And once you think of things in this way, it is clear why 1 and 2 are the only possible values for N.

Let me put this another way: one gets a truly satisfying explanation if one can embed it in concepts that obviate further why questions. How are they obviated? By exhausting the range of possibilities. Why are there no 3-humped camels? Because 3-humped camels are neither concave nor convex and these are the only shapes camels can come in.

The joke/riddle has a second useful attribute. It displays what I take to be a central aim of theoretical research: to redescribe the conceivable alternatives in such a way as to restrict the range of available alternatives to what one actually sees. The aim of theory is not merely to cover the data, but to explain why the data falls in the restricted range it does, and this requires carefully observing what doesn’t happen (what Paul Pietroski calls ‘negative’ facts (e.g. here)).

So, do we have any of these kinds of explanations within syntax? I think we do, or at least there have been attempts to provide such. Let me illustrate.

One current example is Chomsky’s proposed account for why grammatical operations are structure dependent. This is in Problems of Projection (which I would link to, but it is behind a Lingua paywall with an exorbitant price so I suggest that you just get a copy from someplace else). Here’s what we want to explain: given that rules that move T-to-C (as in Y/N questions in English) target the “highest” Ts and not the linearly closest Ts (i.e. leftmost), why must they target these Ts, (i.e. why can’t they target the linearly most proximate Ts)?

The answer that Chomsky gives is that grammatical operations cannot use notions like linear proximity, because linguistic objects are not linearly specified until Spell Out, i.e. the final rule of the syntax. So why can’t grammatical operations be linearly dependent (i.e. non-structure dependent)? Because the syntactic manipulanda (i.e. phrase markers) contain no linear (i.e. left-right order) information. Thus, if grammatical rules manipulate phrase markers and these don’t contain linear information then there is no way to state linearly-dependent rules over these objects.  In other words, why do such rules not appear to exist? Because they can’t be specified for the objects over which the grammatical rules operate, and that’s why linear dependent syntax rules don’t exist.

Or, put positively: why are all syntactic rules structure dependent? Because that’s the only way they can be. In other words, once the impossible options are eliminated all that’s left coincides with what we find.  In this way, the actual is explained via the possible and explanatory oomph is attained. Indeed, I suspect (believe!) that the best way to explain anything is by showing how the plausible alternatives are actually conceptually impossible when thought about in the right way.

Here’s another minimalist example. It comes from current conceptions of Spell Out and how they’ve been used to account for phase impenetrability (i.e. the prohibition against forming dependencies across a phase head).  Here’s the question: why are phases impenetrable? Answer: because Spell Out “sends” phase head complements to the interfaces thereby removing their contents from the purview of the syntax/computational system. In effect, dependencies across phase heads are not possible because complements of phase heads (and hence their contents) are not syntactically “there” to be related.

Note the similarity to the first account: just as linear information is not there to be exploited and hence only structure dependent operations are stateable, so too phase complement information is not there and so it cannot be exploited. In both cases, the “reason” the condition holds is that it cannot fail to hold. There really is only one option when properly conceptualized.

Here’s another example from an earlier era: one of the most interesting arguments in favor of dispensing with constructions as grammatical primitives came from considering a conundrum relating to examples like (1c).

(1)  a. John is likely to have kissed Mary
     b. John was seen/believed by Mary
     c. John is seen/believed to have kissed Mary

The puzzle is the following in the context of a construction-based conception of grammatical operations (i.e. a view of FL in which the basic operations are construction based rules like Passive, Raising etc).[1] In (1a), John moves from the post verbal position to the subject position via Raising. In (1b) the operation that moves John to the top is Passivization. Question: what is the rule that moves John in (1c)?  Is this Passive or Raising?  There is no determinate answer.

Eliminating constructions gives a simple answer to the otherwise unanswerable question: it’s neither, as these kinds of rules don’t exist. ‘Move alpha’ is the sole transformation and it applies in producing both Raising and Passive constructions. Of course, if this is the only (movement) transformation, then the question ‘Is it Raising or Passive?’ dissolves. It’s move alpha and only move alpha. As the earlier question (Raising or Passive?) had no good answer, a conception of grammar where the question dissolves has its charms.

One last example, this one from an undergrad research thesis by Noah Smith (of CMU fame; yes he was once a joint ling/CS student). He wrote when Jason Merchant’s work on ellipsis was first emerging and he asked the following question: Given that Merchant has shown that ellipsis is deletion and not interpretation, why can’t it be the latter?  He rightly (in my view) surmised that this could not be a data driven fact as the relevant data for determining this was very subtle. For Jason it amounted to some case and preposition stranding correlations in sluiced and non-sluiced constructions. On the reasonable assumption that these fall outside the PLD, the fact that ellipsis was deletion could not have been a data driven outcome. Say this is correct. Noah asked why it had to be correct; why ellipsis had to be deletion and could not be interpretation a la Edwin Williams (i.e. ellipsis amounted to filling in the contents of null phrase markers with null terminals at LF).[2] Noah’s answer? Bare Phrase Structure (BPS). BPS replaces the earlier combo of phrase structure + lexical insertion rules. This has the effect of eliminating the distinction between the content and position of a lexical item. As such, he argued, the structures that the interpretive theory of ellipsis presupposed (phrases with no lexical terminals) were conceptually unavailable and this leaves the deletion analysis as the only viable option. So why is ellipsis deletion rather than interpretation? Because the interpretive theories required structures that BPS rendered impossible. I confess to always liking this story.

One caveat before concluding: I am not here claiming that the explanations proposed above are correct. I have some questions regarding the Spell Out explanation of phase impenetrability (the PIC), for example, and there are empirical challenges to Merchant’s evidence in favor of deletion analyses of ellipsis. However, the kinds of proposals mentioned above are interesting and important for, if correct, they explain (rather than describe) what we find. And explanation is (or should be) what scientific inquiry aims for.

To end: one of the aims of theoretical work is to find ways of framing questions in such a way that all and only the conceptually possible answers are actualized.  This requires finding a vocabulary that not only accommodates/describes the actual, but renders the non-attested impossible, i.e. unstateable. This makes the consideration of systematic absences (viz. negative data) central to the theoretician’s task. Explanation lies with the dogs that don’t bark, the things that though logically possible, don’t occur. Theories explain what happens in terms of what can happen. This means keeping one’s eyes firmly focused on what we don’t find, the actual simply being the residue once the impossible has been pared away.

[1] This is not the place to go into this, but note that rejecting constructions as grammatical primitives does not imply that constructions might not be derived objects of possible psycholinguistic interest.  I discuss this a bit (here) in the last chapter.
[2] There was an interesting and animated debate about ellipsis between Edwin Williams and Ivan Sag, the former defending an interpretive conception (trees with null terminals filled in at LF) and the latter a deletion analysis (similar to Merchant’s contemporary approach).


  1. The price for purchasing 'Problems of Projection' is $19.95. That is, if your library doesn't have a subscription to Lingua. Most libraries do, and most likely the article is automatically accessible from your laptop computer at the office. Elsevier also makes Lingua accessible for free in various developing countries through their Hinari programs. Admittedly, Elsevier makes money from us. But $19.95, while unusual, is hardly exorbitant. What did you pay for your cappuccinos today?

    1. I can let others decide on how high the price is. I've registered my vote.

  2. This comment has been removed by the author.

  3. This comment has been removed by the author.

  4. Why is it more of an "explanation" to say that a camel's back can be convex or concave than to say that it can have one or two humps? It's certainly no more parsimonious.

  5. Because concave and convex exhaust the options. 1,2 do not. It is precisely finding ways to frame issues so that the extant exhausts the possible, and no more, that subserves explanation.

  6. Why assume one can be +/- concave? At any rate, there are always llamas!

    The point, as you no doubt know, is that the concave/vex distinction exhausts the possibilities. It is A or B and these are +/- values of one another. That's nice, and that's what makes for a nice explanation. The way the question is posed is that there are integral values possible, hence it is reasonable to ask why not 3, 4, etc. The reframing pre-empts this question and leaves the only possible answers as also the right ones. Explanation, as I see it, looks to find just these kinds of predicates, ones that frame questions in the right way.

  7. 1 or 2 humps exhaust the possibilities, in the same way as convex or concave, since you're positing two possibilities in both cases, rather than somehow deriving those possibilities. Each "explanation" is as stipulative as the other, until you can explain *why* humps are either convex or concave. As David hints at, [-concave] = [+convex] only if you stipulate that concave and convex are the only possible values (i.e., it could also be flat, wavy, and so on), but then we're back where we started. We might as well stipulate that 1 or 2 humps are the only possible values, and have the convex/concave facts fall out of that.

    Sorry if this comes across as obtuse (I get the feeling it does from the annoyance your reply exudes), it's not meant to be :)

    1. Sorry if I sounded short, unintentional. I guess I have been presupposing that different concepts cut up the conceptual space differently. Theory then aims to find these and get the ones that divide the space "naturally," meaning into cells that exhaust the conceptual possibilities. If one is asking about curves, convex/concave do this, flat lines being those that are neither convex nor concave. If one thinks of the space as defined by 'concavity', i.e. +/- concave, then I guess one can think of flat camels and convex camels as both -concave or flat camels and concave camels as +concave. That would be fine with me. Then one might expect 0-humped camels as well. The general point, I hope, still holds: the space is exhausted by these three options, unlike what happens when one conceives of it in terms of numerically distinguished humps. We can, of course, stipulate that N=0, 1, or 2 and no more, but it is just this stipulation that fails to provide explanatory traction. How? By inviting the obvious question: why 0-2 but no more? Pre-empting the question lends the other conceptualization explanatory heft. Indeed one might say that it explains why 0-2 were the only correct values for N and in this way it explains what was otherwise stipulated.

      Hope this helps.

    2. Thanks for the longer reply. I understand the point you're making, I just don't think the example works as well as you'd like. My point about -convex is that (unless you stipulate that it means +concave) it could also mean "wavy" (i.e., any number of humps), so it's not the case that either +/-convex or +/-concave (but not both) predict 1 or 2 humps, but no more.

      Anyway, your point is taken.

    3. I think wavy would be both +convex and -convex. I was assuming +/- values on the same feature were not ok. Point taken.

    4. >>>flat lines being those that are neither convex or concave

      If you are using the mathematical definitions, which one would assume given how the discussion was introduced, then note that flat lines are convex sets. So humpless camels are to be expected without any other statement beyond concave/convex.

    5. Aren't both concave and convex curves convex sets? If so doesn't that suggest that this is not a useful definition? That said, I take the point. I was thinking of convexity in terms of minima and maxima of curves. A convex curve having a minimum value, concave having a maximum. This way of thinking of things leaves straight lines out, I believe. But all of this, interesting as it is, and I am not being facetious, might be a bit of overkill. The joke/riddle was useful to indicate the main point: that which predicates you use is important in defining the range of alternatives and this is an important, nay critical, part of the theoretical game.

    6. >>>Aren't both concave and convex curves convex sets?

      In general, convex/concave functions are not convex sets, i.e. it is not generally the case that for any two points of f, any point on the line segment joining those two points is also in f.

      I have no issue with the function interpretation (it's a joke after all), but the same point applies since any linear function is both a convex and a concave function; hence, flat-backed camels would still be predicted. It seems from your last reply to me that you really had in mind **strictly** convex/concave functions SCX/SCV. If that is what you meant, then yes, of course, flat-backed camels would be excluded by definition.

      The linear functions/flat backed camel case is just one example of the general problem with the claimed superiority of SCX/SCV over {1,2} as argued by Andy Farmer. Even under the SCX/SCV theory, the objection originally raised by Andy remains: Is SCX/SCV really superior to the {1,2} theory? In the SCX/SCV case, the relevant space was stipulated by you to be just the set S={SCX,SCV}, which does not exhaust the 'naturally' available space F of, say, all continuous functions (including e.g., linear functions and "wavy" functions). In the {1,2} case, you assumed without argument that the relevant space is N. But this stacks the deck against the {1,2} theory; the {1,2} theorist could just as well stipulate that the relevant space is B={1,2}. Put differently, the relevant correspondence would seem to be between F and N, or between S and B, but not between S and N.

      From this perspective, it looks like S={SCX,SCV} is nothing more than a cute relabelling of B={1,2}. To make your case would require a principled argument, which as far as I could see, you did not provide (and which cannot be legitimately obtained from the joke :-).

      OK this surely really is overkill since I doubt anyone would disagree with your general point. It's been an entertaining discussion.

  8. A little bit more overkill :) From my rusty memory, convex sets are those objects in which every point can be joined to another point by a straight line that never strays outside the object. In other words, convex sets are path-connected. A straight line trivially fits that definition. Concave objects don't, since they have a hollow through which you can run a straight line between two points.

    1. Someone could argue that both 1-humped and 2-humped camels, when very thirsty, may become totally flat. Of course, in such a case, the 2-humped one becomes convex so it seems that being concave is not inherent to him, that he can switch easily from convex over to concave and vice versa (however uneasy he may feel when convex). But, as a matter of fact, this metamorphosis is quite uninteresting from the theoretical point of view for it is concerned solely with the camel's performance. Competence-wise, he's, no doubt, inherently concave.

      Now, what about some more linguistic issues?