Wednesday, January 23, 2013

Minimalism and On Wh Movement


The Generative enterprise as Chomsky envisioned it has been based on five separate but related questions:

1.     What does a given native speaker know about his language? Or, what do particular Gs look like?
2.     What makes it possible for native speakers to acquire Gs? Or, what does UG look like?
3.     How did UG arise in the species? Or, what must be added to non-linguistic cognition to get UGs?
4.     How do native speakers use Gs in performance? Or, how are Gs used to produce and comprehend sentences in real time?
5.     How do brains code for G and UG?

These questions, though separate, are clearly interconnected. For example, it’s very hard, if not impossible, to consider the properties of UG without knowing how particular Gs are put together. It’s pointless to engage in questions about how UG could have arisen in the species without knowing anything about UG. It’s hard to study how people use their Gs in the absence of descriptions of the Gs that are used. And, last, it’s hard to study how brains embody G and UG without knowing anything about brains, Gs or UGs. All of this should be evident. What is less obvious, but still true, is that even when we know a non-trivial amount about the things we are trying to relate, relating them may be really tough. There are many reasons for this. Let’s consider two.

In the last several posts I considered the relation between (2) and (3) above.  I noted that we have a pretty good description of what UG does, e.g. it regulates movement and construal dependencies, identifies the kinds of phrase structures that are linguistically admissible and where expressions that are displaced can be interpreted both phonetically and semantically. GB is a pretty good effective theory of these relations.  However, it is a problematic fundamental theory. Why? Because it’s hard to see how something with the special purpose predicates and intricate internal structure of GB could have arisen in the species. It’s just too different from what we find in other domains of human cognition and much too elaborately structured. Consequently, it’s entirely opaque how something this apparently complex and cognitively idiosyncratic could have arisen in the species, especially in the (apparently) short time available. This motivates a re-think of GB’s depiction of UG. The Minimalist Program is a research program aimed at reducing GB’s parochialism and intricacy by reanalyzing heretofore language specific operations in more general cognitive terms (e.g. Merge) and unifying conditions on grammatical processes in more generic computational terms (e.g. cyclicity as monotonicity/no tampering). As I’ve suggested in other posts, this strategy recapitulates Chomsky’s in ‘On Wh Movement’ (OWM) and I have suggested that OWM provides a unification strategy worth emulating. What specifically did Chomsky do in OWM?

First, he adopted Ross’s theory as an effective account. Thus, he accepted that the lay of the land described by Ross was roughly correct. Why ‘roughly’? Because, though Chomsky adopted Ross’s descriptions of strong islands, he tweaked the data in light of the details of the unifying Subjacency account. In particular, the relevant island data included Wh islands (pace Ross’s theory, which treated Wh-islands as porous) and was generalized to cover subjects in general (not only sentential subjects as in Ross). Thus, though OWM largely adopted Ross’s description of the data, it modified it as well.

Second, Chomsky unified the various islands under the theory of movement by unifying the movement constructions under a ‘Move alpha’ rubric. Whereas in Ross rules were effectively constructions, Chomsky distilled out a movement component common to all the cases subject to the Subjacency Condition (i.e. ‘move alpha’) and proposed that all and only the move alpha operation was subject to the locality requirements that eventuate in island effects when violated.[1] Thus, some constructions were treated as composites: one part move alpha, one part a specific head with its own particular grammatical contributions (‘criteria’ in Rizzi’s sense). And all that mattered in computing locality was the movement part. In sum, what makes a construction subject to islands in OWM is that it is composed of a move alpha part. None of its other properties matter.[2]

Chomsky adopted a very strong version of this claim: as noted, all and only move alpha is subject to subjacency restrictions. Arguing for this constitutes the better part of OWM. The most interesting empirical component was reducing various kinds of deletion operations investigated by Bresnan and Grimshaw to constructions mediated by movement, most particularly their analysis of comparatives as deletion operations.[3] At any rate, sitting where we are today, it looks like Chomsky’s reanalysis has won the day and that islands are now taken as virtual diagnostics of a movement dependency.

The third leg of the analysis involved the Subjacency Condition itself. This involved several proposals: one concerning the inventory of bounding nodes (which nodes counted in computing distance), one concerning which nodes had “escape hatches,” aka comp(lementizer)s, and one concerning the fine structure of these comps (in particular how many slots they contained). In elaborating these in OWM Chomsky noted that movement via escape hatches effectively allowed move alpha to finesse the Specified Subject Condition (SSC) and the Propositional Island Constraint (PIC). These were two proposed universals that were retained in revised form in GB, though in later work move alpha was not thought to be subject to them.[4] At any rate, it is interesting to see what Chomsky took to be the computational implications of Subjacency Theory:

… the island constraints can be explained in terms of general and quite reasonable computational properties of formal grammar (i.e. subjacency, a property of cyclic rules that states, in effect, that transformational rules have a restricted domain of potential application; SSC, which states that only the most prominent phrase in an embedded structure is accessible to rules relating it to phrases outside; PIC, which stipulates that clauses are islands subject to the language specific escape hatch..). If this conclusion can be sustained, it will be a significant result, since such conditions as CNPC and the independent wh-island constraint seem very curious and difficult to explain on other grounds. [my emphasis] (p. 89; OWM).

Note what we have here: an attempt to motivate the particular witnessed properties of the theory of bounding on more general computational grounds. This should sound very familiar to minimalist ears: restricted computational domains, prominent targets of operations (think labels in place of subjects), etc. As we know, theories embedding subjacency were subsequently built into interesting proposals with non-trivial consequences for efficient parsing (e.g. Marcus, Berwick and Weinberg).

The upshot? OWM provides a good model for minimalist unification ambitions. Extract out a common core operation in seemingly disparate “constructions” and propose general conditions on the applications of this common rule to unify the indicated dependencies. Minimalism has already taken steps in this direction. For example, case theory was unified with movement theory in early minimalism (e.g. Chomsky 1993) by having case discharged in the specifier positions of relevant heads. Phrase structure theory has been unified with movement theory by treating both as products of Merge (E-merge and I-merge being instances of the same operation as applied to different inputs). More ambitiously (stupidly?) still, some (e.g. yours truly, Boeckx, Nunes, Idsardi, Lidz, Kayne, Zwart, Polinsky, Potsdam) have proposed treating construal rules like Control, reflexivization, and pronominal binding as species of movement, hence unifying all of these operations with Merge.
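For readers who like to see such things spelled out, here is a toy sketch of the point that E-merge and I-merge are one operation applied to different inputs. It is only an illustration under assumptions of my own: syntactic objects are modeled as nested unordered sets, lexical items as plain strings, and the helper names (`merge`, `contains`, `merge_kind`) are hypothetical, not anyone's official formalization.

```python
# Toy model (illustrative assumptions only): a syntactic object is either a
# lexical item (a string) or a two-membered unordered set built by merge.

def merge(a, b):
    """The single structure-building operation: form the set {a, b}."""
    return frozenset({a, b})

def contains(obj, part):
    """True if `part` is a term of `obj`: obj itself or anything inside it."""
    if obj == part:
        return True
    if isinstance(obj, frozenset):
        return any(contains(member, part) for member in obj)
    return False

def merge_kind(a, b):
    """Classify the inputs: 'internal' if one is a proper term of the other."""
    if a != b and (contains(a, b) or contains(b, a)):
        return "internal"
    return "external"

# E-merge: the two inputs are independent objects.
vp = merge("see", "what")
tp = merge("you", vp)

# I-merge ("movement"): one input is already inside the other.
# The same merge function is called; only the inputs differ.
cp = merge("what", tp)
```

Note that `tp` is left untouched when `cp` is built; it simply shows up as a member of the new object, which is one way of picturing the no tampering idea mentioned earlier. Nothing here models labels, features, or locality.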

These last mentioned proposals require abandoning two key features of GB’s view of movement: (i) that movement into thematic positions is barred and (ii) that all movement results in phonetic gaps at launch sites.[5] (i) is a plausible consequence of removing D-structure as a “level” and (ii) constitutes a return to earlier versions of Generative Grammar in which transformations included the addition of some designated lexical material (e.g. there, reflexives, bound pronouns, etc.). It is very unclear at this moment whether this whole range of unifications is possible.[6] I personally find the data uncovered and the arguments made to be highly suggestive and the empirical hurdles to be less daunting than they appear. However, this is a personal judgment, not one shared by the field as a whole (sadly). That said, I would argue that extending the OWM reasoning to other parts of the grammar to unify the modules is the right thing to try if one hopes to address Darwin’s problem in (3) above. Why? Because if this kind of unification succeeded, grammars with the phenomenological properties of GB’s version of UG would follow from the addition of Merge to the cognitive repertoire of our apish ancestors. In other words, we would have isolated a single change sufficient to allow the development of an FL with GB’s observable properties. Such a theory could be considered fundamental.

Interestingly, such a theory would not only plausibly provide an answer to Darwin’s problem (in (3)), it would also provide one for Broca’s Problem (in (5)). Indeed, this post was intended to address this point (as the title indicates), but as I’ve rambled on long enough, let me delay addressing how Minimalism relates to question (5) until the next post.


[1] Please note: This is a little bit of Whig history and it’s not really fair to Ross. Ross suggested that all island sensitive constructions involved a specific rule of chopping, an operation that deleted the resumptive pronoun that was part of the constructions of interest. Interestingly, the intuition behind this part of Ross’s analysis has enjoyed something of a revival in recent work (see Merchant and Lasnik in particular), which treats island effects as PF phenomena rather than restrictions on derivational operations as Chomsky originally proposed.
[2] This is a fact worth savoring. A priori there is nothing odd about having the specific construction determine locality rather than the created move alpha dependency.  There is nothing contradictory in assuming, e.g. that Focus and question formation would obey islands but that Topicalizations, comparatives and relativizations would not.  But this does not seem to be the way things panned out.
[3] The other really interesting consequence of Subjacency Theory was the implication that all movement was successive cyclic. This prediction was confirmed in work by Kayne and Pollock, Sportiche, and Torrego a.o. In my opinion, this is still one of the nicest “predictions” any theory in syntax has ever made.
[4] The reason is that in OWM A’-traces were treated as anaphoric elements. This was revised in later work in light of some empirical findings due to Lasnik.
[5] To be honest, this is my reconstruction of their results. Kayne and Zwart, for example, retain the theta criterion by assuming that there is a lot more doubling than meets the eye.
[6] Indeed, I believe (actually I think I know this, but let me be coy) that Chomsky is very skeptical about the unification of construal with movement (with the possible exception of reflexivization).  This may account for why binding, when discussed, is suggested to be a CI interface operation fed by the grammar rather than a direct product of the grammar as in all earlier generative accounts.  
