Faculty of Language: The Merge Conspiracy [Part 1]

Sunday, October 6, 2013

The Merge Conspiracy [Part 1]

A while ago, Norbert wrote several interesting posts regarding the Specifier Island Constraint (SPIC) that caused me to chime in with some technical observations. The basic upshot was that SPIC cannot reliably block extraction from a specifier because we can use Merge rather than Move to displace constituents, in which case SPIC does not apply. Since the comments section isn't the ideal place for discussing such technical matters in an accessible form, Norbert was so kind to endow me with the divine power of publishing my remarks directly on his blog. So here's the first entry of a three part epic, the story of the power of Merge and how it can be used for both good and evil (rated PG-13 for some technical details; but hey, we're all grown-ups here).

Subcategorization in Minimalism

Syntactic frameworks are a diverse bunch, but they all involve some mechanism to capture subcategorization. Words cannot be combined freely like appetizers at a Lebanese restaurant. You can say both He likes to bake and He likes baking, but the semantically equivalent enjoy is more restricted --- He enjoys baking is fine, whereas He enjoys to bake is ungrammatical (in many dialects of English). Combining enjoy with to bake is the linguistic equivalent of soaking your crêpe in your soup, the two just don't go together. Thanks to the wonders of scholarly diversity, how subcategorization is encoded on a technical level differs between theories. I am just going to discuss the Minimalist view of things here, but rest assured, all the issues we will encounter in later posts also arise in CCG, GPSG, LFG, and HPSG, among others.

One of the most explicit accounts of selection in Minimalist syntax can be found in David Adger's textbook Core Syntax, and a similar implementation is used in Ed Stabler's formalization of Minimalist syntax known as Minimalist grammars. First, every word has a syntactic category feature. Verbs have V, nouns have N, complementizers have C, and so on. Second, if a word has, say, 3 arguments of category X, Y, and Z, where it first combines with X, then with Y, and finally with Z, then the word has corresponding selector features Arg1:X, Arg2:Y, Arg3:Z. All Arg features of a word must be discharged by merging it with arguments with a matching category feature in the correct order.

The split between like and enjoy can now be handled as follows. First one posits two distinct feature specifications for like. In both cases its category feature is V and its second argument must be a DP. The specification for the first argument determines whether like combines with the TP to bake or the DP baking. In contrast to like, enjoy has only one entry, which requires both arguments to be DPs. The relevant feature specifications for all the words in the examples above are as follows:

like[Cat:V, Arg1:T, Arg2:D]
like[Cat:V, Arg1:D, Arg2:D]
enjoy[Cat:V, Arg1:D, Arg2:D]
to bake[Cat:T]
baking[Cat:D]
he [Cat:D]

We can now combine these words via Merge to create bigger expressions as long as the relevant features match. If we want to build He likes to bake, we first merge like[Cat:V, Arg1:TP, Arg2:DP] and to bake[Cat:TP]. This is licit because the category feature of to bake matches the value of the Arg1 feature on like. And of course we can then merge like to bake with he because the latter is of category D and the head of like to bake has the matching Arg2 feature value. The expression He enjoys to bake, on the other hand, cannot be built because enjoy always has D as the value for Arg1. So merging to bake with enjoy is like trying to hammer a square peg into a round hole.

Note that likes to bake by itself is not a well-formed expression because the Arg2-feature of likes still needs to be checked via Merge. Similarly, he he likes to bake is ungrammatical because likes in he likes to bake has no further Arg-features that would license a third application of Merge.

Local Dependencies via Subcategorization

The feature-driven system captures the essential properties of subcategorization:

Arguments must have a specific category.
A phrase cannot be an argument of more than one head.
Heads combine with a fixed number of arguments.

But every mechanism that correctly handles these subcategorization dependencies can also handle other local dependencies: Suppose that we want to ensure that our grammar generates He likes him rather than he likes he. In Minimalism this is handled by the operation Agree, but our simple Merge mechanism can get the job done, too. All we have to do is to split the category D into S for subjects and O for objects.

like[Cat:V, Arg1:T, Arg2:S]
like[Cat:V, Arg1:O, Arg2:S]
enjoy[Cat:V, Arg1:O, Arg2:S]
to bake[Cat:T]
baking[Cat:S]
baking[Cat:O]
he [Cat:S]
him [Cat:O]

Now he can only occur as the second argument of the verb, ruling out he likes he.

This kind of feature refinement can also be used to enforce gender agreement between an object reflexive and the subject. First we introduce himself with category feature O(masc) and then we ensure via the subcategorization properties of like that the subject is he whenever the object is himself.

like[Cat:V, Arg1:T, Arg2:S]
like[Cat:V, Arg1:O, Arg2:S]
like[Cat:V, Arg1:O(masc), Arg2:S(masc)]
enjoy[Cat:V, Arg1:O, Arg2:S]
enjoy[Cat:V, Arg1:O(masc), Arg2:S(masc)]
to bake[Cat:T]
baking[Cat:S]
baking[Cat:O]
he [Cat:S]
he [Cat:S(masc)]
him [Cat:O]
himself [Cat:O(masc)]

Just looking at the lexical entries required to regulate local case and gender agreement via Merge, we can already tell why feature refinement is not an option usually entertained by linguists. The size of the lexicon has doubled, and the generalizations underlying agreement are obscured by lumping them together with subcategorization requirements into a baroque system of category features.

But elegance is not a factor in determining what our proposals are in principle capable of. If a theory can account for phenomenon X in a roundabout way, it can account for phenomenon X. Sure, as scientists we favor the account that seems less crude to our refined palate. But the point here is not whether there are more elegant accounts for agreement, it is that a system designed to capture just subcategorization requirements actually is capable of regulating a lot more than just that. Right now this looks like a technical curiosity at best because we have only seen a few innocent toy examples --- nothing complicated like long-distance dependencies or anything related to movement. But believe me, long-distance dependencies are child's play compared to some of the things Merge is capable of. Tune in on Wednesday for Part 2, where the digestive waste product hits the rotary air perturber.

Postscript: Just for the record, putting crêpes in your soup is perfectly fine if the former are sliced and the latter is Austrian beef broth or oxtail soup.

7 comments:

AveryAndrewsOctober 6, 2013 at 7:47 PM
Along with the doubling of the size of the lexicon comes the problem that the introduction of S and O categories comes the false typological expectation that languages should be able to differ substantially in the internal structures of DPs in those two positions, which doesn't seem to happen (modulo case, and some minor things involving determiners in some languages such as Greek, which I expect can be 'explained away'). So this possibility needs to be headed off at the pass pronto.

My LFG-2008 conference paper contains an attempt to manage some aspects of this problem in a somewhat innovative LFG+glue semantics framework, but it's got substantial problems.
ReplyDelete
Replies
ewanOctober 7, 2013 at 4:40 AM
I'm curious about your previous comment ruling out the "restrict the features" approach to this:

"the feature coding result not only tells us that constraints can be represented as features, it also tells us that features represent constraints, the two are interchangeable from a technical perspective. So whatever set of features I start out with, I can reduce it to a fixed one by adding constraints to the grammar."

"from a technical perspective" is key here. It's generally understood that the formalism does more than simply delimit the class of possible mappings that grammars can specify. Rather, the notation does some work; in the classical theory this was limited to pointing to an evaluation measure, but in say Berwick and Weinberg the "transparency" of the online processing mechanism wrt the specification of the grammar was also raised. In other words, the notational specification of the grammar stands in some homomorphic relation with something more than just the function it computes, in an ideal world. So that's just to say that, well, one _could_ still in principle divide the world into feature-based restrictions and constraint-driven restrictions, given independent evidence and agreed upon criteria for which restriction should go where.

I wonder (since I don't know) if there is still any conceivable MG lever we could pull to have a feature/non-featural restriction division continue to be meaningful, not necessarily in what the grammars can do, but perhaps in what the grammars "mean" - just to see how far we could push these things. Such questions should no longer be antithetical to the "formal view" (I sometimes get the impression that they are although they never should have been) given the new Stabler parsing work.
ReplyDelete
Replies

Add comment

Faculty of Language

Comments

Sunday, October 6, 2013

The Merge Conspiracy [Part 1]

Subcategorization in Minimalism

Local Dependencies via Subcategorization

7 comments:

Contributors