Thursday, June 16, 2016

Filters and bare output conditions

I penned this post about a week ago but futzed around with it till today. Of course, this means that much of what I have to say has already been said better and more succinctly by several commentators to Omer’s last post (here, you will note similarities to claims made by Dan Milway, David Adger and Omer). So if you want a verbose rehearsal of some of the issues they touched on, read ahead.

GB syntax has several moving parts. One important feature is that it is a generate-and-filter syntax (rather than a crash-proof theory): one in which rules apply “freely” (they need meet no structural conditions as in, for example, the Standard Theory) and some of the outputs generated by freely applying these rules are “filtered out” at some later level. The Case Filter is the poster child illustration of this logic. Rules of (abstract) case assignment are free, but if a nominal fails to receive a case, the Case Filter kills the derivation (in modern parlance, the derivation crashes) at a later level. In GB, there is an intimate connection between rules applying freely and filters that dispose of the over-generated grammatical detritus.
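For readers who like to see the logic spelled out, here is a minimal sketch of generate-and-filter, assuming a toy grammar in which the only free rule is case assignment. The names (CASES, generate, case_filter, converging) and the two-case inventory are my illustrative assumptions, not anything GB itself specifies.

```python
from itertools import product

# Hypothetical toy inventory; None means the free rule did not assign a case.
CASES = (None, "NOM", "ACC")


def generate(nominals):
    """Free application: each nominal may receive any case, or none at all."""
    for assignment in product(CASES, repeat=len(nominals)):
        yield dict(zip(nominals, assignment))


def case_filter(candidate):
    """Toy Case Filter: reject any output containing a caseless nominal."""
    return all(case is not None for case in candidate.values())


def converging(nominals):
    """Over-generate first, then let the filter dispose of the detritus."""
    return [c for c in generate(nominals) if case_filter(c)]
```

Nothing in generate knows about the filter, and nothing in case_filter knows how its inputs were built. That division of labor is the point of the architecture.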

Flash forward to the minimalist program (MP). What happens to these filters? Well, as with any aspect of G and FL/UG, the question that arises is whether these features are linguistically proprietary or reflexes of (i) efficient computation or (ii) properties of the interpretive interfaces. The latter are called “Bare Output Conditions” (BOC) and the most common approach to GBish filters within MP is to reinterpret them as BOCs.

In MP, features mediate the conversion from syntactic filters to BOCs. The features come in two flavors: the inherently interpretable and the inherently un-interpretable. Case (on DPs or T/v) or agreement features (on T or C) are understood as being “un-interpretable” at the CI interface. If convergent derivations are those that produce syntactic objects that are interpretable at both interfaces (or at least at CI, the important interface for today’s thoroughly modern minimalists) then derivations that reach the interface with un-interpretable features result in non-convergence (aka crash). Gs describe how to license such features. So filters are cashed in for BOCs by assuming that the syntactic features GB regulated are un-interpretable “time bombs” (Omer’s term) which derail Full Interpretation at the interfaces.

There is prima facie reason for doubting that these time bombs exist. After all, if Gs have them then derivations should never converge. So either such features cannot exist OR G derivations must be able to defuse them in some way. As you all know, checking un-interpretable features serves to neuter their derivation crashing powers, and a great deal of G commerce in many MP theories exists to pacify the features that would otherwise cause derivational trouble. Indeed, a good deal of research in current syntax involves deploying and checking these features to further various empirical ends.
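The time-bomb logic can be sketched in a few lines, assuming (hypothetically) that a syntactic object is just a list of features and that checking is a flag flipped in the course of the derivation. Feature, check and converges are names invented for illustration; this is a toy, not a claim about how any particular MP theory implements checking.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    interpretable: bool
    checked: bool = False  # set once the feature has been licensed


def check(features, name):
    """Defuse every un-interpretable feature called `name`."""
    for f in features:
        if f.name == name and not f.interpretable:
            f.checked = True


def converges(features):
    """Toy Full Interpretation: no unchecked un-interpretable feature survives."""
    return all(f.interpretable or f.checked for f in features)
```

On this picture a DP carrying an unchecked case feature fails converges until check has applied to it, which is all that “crash” amounts to in the sketch.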

Though MPers don’t discuss this much, there is something decidedly odd about a “perfect” or “optimally designed” theory that enshrines at its core toxic features. Why would a perfect craftsman have ordered those? In early MP it was argued that such features were required to “explain” movement/displacement, it too being considered an “imperfection.”[1] However, in current MP, movement is the byproduct of the simplest/best possible theory of Merge so displacement cannot be an imperfection. This then re-raises the question of why we have un-interpretable features at all. So far as I can tell, there is nothing conceptually amiss with a theory in which all operations are driven by the need to link expressions in licit interpretable relationships (e.g. getting anaphors linked to antecedents, getting a DP in the scope of a Topic or Focus marker). The main problem with this view is empirical: case and agreement features exist and there is no obvious interpretive utility to them. To my knowledge we currently have no good theoretical story addressing why Gs contain un-interpretable features. But, to repeat, I fail to see how there is anything well-designed about putting un-interpretable features into Gs only to then chase them around in an effort to license them.

As MP progressed, the +/- interpretable distinction came to be supplemented with another: the +/- valued distinction. To my recollection, this latter distinction was intended to replace the +/- interpretable distinction, but, like so much in syntax, the older distinction remained.[2] Today, we have four cells at our disposal and every one of them has been filled (i.e. someone in some paper has used them).[3]

So, +/- interpretable and +/- valued are the MP way of translating filters into MP acceptable objects. It is part of the effort to make filters less linguistically proprietary by tracing their effects to non-linguistic properties of the interfaces. Did this effort succeed?

This is the topic of considerable current debate. Omer (my wonderful colleague) has been arguing that filters and BOCs just won’t work (here). He has lots of data aimed at showing that probes whose un-interpretable features are not cashiered do not necessarily lead to unacceptability. On the basis of this he urges a return to a much earlier conception of grammar, of the Syntactic Structures/Aspects variety, wherein rules apply to effect structural changes (SC) when their structural descriptions (SD) are met. These rules can be obligatory. Importantly, obligatory rules whose SDs never occur do not crash derivations in virtue of not applying. They just fail to apply and there is no grammatical consequence of this failure. If we see rules of agreement as obligatory rules and their feature specifications as SDs and understand them to be saying “if there is a match for this feature then match” then we can cover lots of empirical ground as regards agreement without filters (and so without BOCs).
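A minimal sketch of the obligatory-rule alternative, assuming goals are modeled as dictionaries from feature names to values and that the probe takes the first match in search order. Both assumptions, and the name agree, are mine for illustration; they are not Omer’s formalism.

```python
def agree(probe_feature, goals):
    """Obligatory rule in SD/SC format.

    SD: some goal in the search domain bears `probe_feature`.
    SC: the probe takes on that goal's value.
    If the SD is never met, the rule simply does not apply:
    the result is None and nothing crashes.
    """
    for goal in goals:  # assumed search order: first match wins
        if probe_feature in goal:
            return goal[probe_feature]
    return None
```

The contrast with the filter-based sketch of convergence is that there is no later check on the probe here. A probe that finds nothing just ends up unvalued, which is the “if there is a match for this feature then match” reading.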

Furthermore, if this works, then we remove the need to understand the recondite issues of interpretability as applied to these kinds of features. Agreement becomes a fact about the nature of G rules and their formats (a return of SDs and SCs) rather than the structure of interfaces and their properties. Given that we currently know a lot more about the computational system than we do about the CI interface (in fact, IMO, we know next to nothing about the properties of CI), this seems like a reasonable move. It even has some empirical benefits as Omer shows.

There is an interesting conceptual feature of this line of attack. The move to filters was part of a larger project for simplifying Movement Transformations (MT).[4] GB simplified them by removing SDs and SCs from their formal specifications.[5] Filters mitigated the resulting over-generation. So filters were the theoretical price paid for simplifying MTs to that svelte favorite Move alpha. The hope was that these filters were universal[6] and so did not need to be acquired (i.e. part of UG).[7] Omer’s work shows that this logic was quite correct. The price of eliminating filters is complicating rules by adding SDs and SCs back in (albeit in altered form).

One last point: I have a feeling that filters are making a comeback. Early MP theories where Greed was a big deal were effectively theories where the computational procedures carried the bulk of the explanatory load. But nowadays there seems to be a move to optional rules and with them filters will likely be proposed again (e.g. see Chomsky on labels). We should recall that in an MP setting filters are BOCs (or we should hope they are). And this places an obligation on those proposing them to give them some kind of BOCish interpretation (hence Chomsky’s insistence that labels are necessary for CI interpretation). And these are not always easy to provide. For example, it is easy to understand minimality effects as by-products of the computational system (e.g. minimal search, minimal computation), but there are arguments that minimality is actually an output condition (i.e. a filter) that applies late (e.g. at Spell Out). Ok, that would seem to make it a BOC. But what kind of BOC is that? Why for CI reasons would minimality hold? I am not saying it doesn’t. But if it is applied to outputs then we need a story, at least if we care about the goals of MP.

[1] The idea was that relating toxic features with movement reduced two imperfections (and so MP puzzles) to one.
[2] The replacement was motivated, I believe, on the grounds that nobody quite knew what interpretability consisted in. It thus became a catch-all diacritic rather than a way of explaining away filters as BOCs. Note: were features only valued on Spell Out to AP this problem might have been finessed, at least for morphologically overt features. Overt features are interpretable at the AP interface even if semantically without value and hence troublesome to CI. However, valuation in the course of the derivation results in features of dubious value at CI. If full interpretation governs CI (i.e. every feature must be interpreted) then valued features need to be interpretable and we are back where we started, but with another layer of apparatus.
[3] Here’s my curmudgeon self: and this is progress?! Your call.
[4] This is extensively discussed in Chomsky’s “Conditions on Rules.” A great, currently largely unread, paper.
[5] There were good learnability motivations for this simplification as well. Reintroducing SDs and SCs will require that we revisit these learnability concerns. All things being equal, the more complex a rule, the harder it is to acquire. As Omer’s work demonstrates, there is quite a lot of variation in agreement/case patterns and so lots for the LAD to figure out.
[6] This is why they were abstract, btw. The learning problem was how to map abstract features onto morphologically overt ones, not how to “acquire” the abstract features. From what I can tell, Distributed Morphology buys into this picture; the problem of morphology being how to realize these abstracta concretely. This conception is not universally endorsed (e.g. see Omer’s stuff).
[7] Of course encumbering UG creates a minimalist problem hence the reinterpretation in terms of BOCs. Omer’s argument is that neither the GB filters strategy nor the MP BOC reinterpretation works well empirically.
