There are certain kinds of explanations which, when available, are particularly satisfying. What makes them so is that they not only explain the facts in front of you, but do so in ways that make the facts inevitable. How do they do this? Well, one way is by rendering the non-extant alternatives not merely false, but inconceivable. A joke/riddle that I like to tell my students displays the quality I have in mind.
One physicist/mathematician asks another: why are there 1-humped camels and 2-humped camels, but no N-humped camels, N>2? Answer: Because camels are convex or concave; no other models available.
I love this answer. It’s perfect. How so? By shifting the relevant predicates (from natural numbers to simple curves), the range of possible camels is reduced to two, both of which are attested! Concave/convex exhaust the space of options. And once you think of things in this way, it is clear why 1 and 2 are the only possible values for N.
Let me put this another way: one gets a truly satisfying explanation if one can embed it in concepts that obviate further why questions. How are they obviated? By exhausting the range of possibilities. Why are there no 3-humped camels? Because 3-humped camels are neither concave nor convex and these are the only shapes camels can come in.
The joke/riddle has a second useful attribute. It displays what I take to be a central aim of theoretical research: to redescribe the conceivable alternatives in such a way as to restrict the range of available alternatives to what one actually sees. The aim of theory is not merely to cover the data, but to explain why the data fall in the restricted range they do, and this requires carefully observing what doesn’t happen (what Paul Pietroski calls ‘negative’ facts (e.g. here)).
So, do we have any of these kinds of explanations within syntax? I think we do, or at least there have been attempts to provide such. Let me illustrate.
One current example is Chomsky’s proposed account of why grammatical operations are structure dependent. This is in Problems of Projection (which I would link to, but it is behind a Lingua paywall with an exorbitant price, so I suggest that you just get a copy from someplace else). Here’s what we want to explain: given that rules that move T-to-C (as in Y/N questions in English) target the “highest” Ts and not the linearly closest Ts (i.e. leftmost), why must they target these Ts (i.e. why can’t they target the linearly most proximate Ts)?
The answer that Chomsky gives is that grammatical operations cannot use notions like linear proximity, because linguistic objects are not linearly specified until Spell Out, i.e. the final rule of the syntax. So why can’t grammatical operations be linearly dependent (i.e. non-structure dependent)? Because the syntactic manipulanda (i.e. phrase markers) contain no linear (i.e. left-right order) information. Thus, if grammatical rules manipulate phrase markers and these don’t contain linear information, then there is no way to state linearly dependent rules over these objects. In other words, why do such rules not appear to exist? Because they cannot be stated over the objects that grammatical rules operate on.
Or, put positively: why are all syntactic rules structure dependent? Because that’s the only way they can be. In other words, once the impossible options are eliminated all that’s left coincides with what we find. In this way, the actual is explained via the possible and explanatory oomph is attained. Indeed, I suspect (believe!) that the best way to explain anything is by showing how the plausible alternatives are actually conceptually impossible when thought about in the right way.
Here’s another minimalist example. It comes from current conceptions of Spell Out and how they’ve been used to account for phase impenetrability (i.e. the prohibition against forming dependencies across a phase head). Here’s the question: why are phases impenetrable? Answer: because Spell Out “sends” phase head complements to the interfaces thereby removing their contents from the purview of the syntax/computational system. In effect, dependencies across phase heads are not possible because complements of phase heads (and hence their contents) are not syntactically “there” to be related.
Note the similarity to the first account: just as linear information is not there to be exploited and hence only structure dependent operations are stateable, so too phase complement information is not there and so it cannot be exploited. In both cases, the “reason” the condition holds is that it cannot fail to hold. There really is only one option when properly conceptualized.
Here’s another example from an earlier era: one of the most interesting arguments in favor of dispensing with constructions as grammatical primitives came from considering a conundrum relating to examples like (1c).
(1) a. John is likely to have kissed Mary
b. John was seen/believed by Mary
c. John is seen/believed to have kissed Mary
The puzzle is the following in the context of a construction-based conception of grammatical operations (i.e. a view of FL in which the basic operations are construction-based rules like Passive, Raising, etc.). In (1a), John moves from the post-verbal position to the subject position via Raising. In (1b) the operation that moves John to the top is Passivization. Question: what is the rule that moves John in (1c)? Is this Passive or Raising? There is no determinate answer.
Eliminating constructions gives a simple answer to the otherwise unanswerable question: it’s neither, as these kinds of rules don’t exist. ‘Move alpha’ is the sole transformation, and it applies in producing both Raising and Passive constructions. Of course, if this is the only (movement) transformation, then the question of whether it is Raising or Passive dissolves. It’s move alpha and only move alpha. As the earlier question (Raising or Passive?) had no good answer, a conception of grammar where the question dissolves has its charms.
One last example, this one from an undergrad research thesis by Noah Smith (of CMU fame; yes, he was once a joint ling/CS student). He wrote it when Jason Merchant’s work on ellipsis was first emerging, and he asked the following question: Given that Merchant has shown that ellipsis is deletion and not interpretation, why can’t it be the latter? He rightly (in my view) surmised that this could not be a data-driven fact, as the relevant data for determining this was very subtle. For Jason it amounted to some case and preposition-stranding correlations in sluiced and non-sluiced constructions. On the reasonable assumption that these fall outside the PLD, the fact that ellipsis was deletion could not have been a data-driven outcome. Say this is correct. Noah asked why it had to be correct; why ellipsis had to be deletion and could not be interpretation à la Edwin Williams (i.e. ellipsis amounting to filling in the contents of null phrase markers with null terminals at LF). Noah’s answer? Bare Phrase Structure (BPS). BPS replaces the earlier combo of phrase structure rules + lexical insertion. This has the effect of eliminating the distinction between the content and position of a lexical item. As such, he argued, the structures that the interpretive theory of ellipsis presupposed (phrases with no lexical terminals) were conceptually unavailable, and this leaves the deletion analysis as the only viable option. So why is ellipsis deletion rather than interpretation? Because the interpretive theories required structures that BPS rendered impossible. I confess to always liking this story.
One caveat before concluding: I am not claiming that the explanations above are correct. I have some questions regarding the Spell Out explanation of the PIC, for example, and there are empirical challenges to Merchant’s evidence in favor of deletion analyses of ellipsis. However, the kinds of proposals mentioned above are interesting and important for, if correct, they explain (rather than describe) what we find. And explanation is (or should be) what scientific inquiry aims for.
To end: one of the aims of theoretical work is to find ways of framing questions in such a way that all and only the conceptually possible answers are actualized. This requires finding a vocabulary that not only accommodates/describes the actual, but renders the non-attested impossible, i.e. unstateable. This makes the consideration of systematic absences (viz. negative data) central to the theoretician’s task. Explanation lies with the dogs that don’t bark, the things that though logically possible, don’t occur. Theories explain what happens in terms of what can happen. This means keeping one’s eyes firmly focused on what we don’t find, the actual simply being the residue once the impossible has been pared away.
There was an interesting and animated debate about ellipsis between Edwin Williams and Ivan Sag, the former defending an interpretive conception (trees with null terminals filled in at LF) vs. the latter’s deletion analysis (similar to Merchant’s contemporary approach).