From Phrase Structure Trees to Derivation Trees

From our earlier explorations you should already know how MGs work: lexical items have features, and those features trigger the structure-building operations Merge and Move. In addition, the Shortest Move Constraint blocks all configurations where two lexical items could both move to the same landing site.
Since the entire structure-building process is controlled by the features on the lexical items, an MG is defined by specifying a lexicon, e.g. the one below:
- John :: D- top-
- the :: N+ D- nom-
- girl :: N-
- likes :: D+ D+ V-
- e :: V+ nom+ T-
- e :: T+ top+ C-
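To make the idea of "an MG is its lexicon" concrete, here is a minimal Python sketch that stores the entries above as data. The names `Feature`, `LexicalItem`, `parse_item`, and `LEXICON` are hypothetical choices of mine, not part of any MG toolkit; the only thing taken from the post is the list of entries and the `::` notation.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    name: str       # e.g. "D", "nom", "top"
    positive: bool  # True for f+, False for f-


class LexicalItem(NamedTuple):
    phon: str                      # pronounced form; "e" is an empty head
    features: Tuple[Feature, ...]  # checked from left to right


def parse_item(entry: str) -> LexicalItem:
    """Parse an entry such as 'likes :: D+ D+ V-' (hypothetical helper)."""
    phon, feats = entry.split("::")
    return LexicalItem(
        phon.strip(),
        tuple(Feature(f[:-1], f.endswith("+")) for f in feats.split()),
    )


# The six entries of the example lexicon, copied from the list above.
LEXICON: List[LexicalItem] = [
    parse_item(entry)
    for entry in (
        "John :: D- top-",
        "the :: N+ D- nom-",
        "girl :: N-",
        "likes :: D+ D+ V-",
        "e :: V+ nom+ T-",
        "e :: T+ top+ C-",
    )
]

for item in LEXICON:
    shown = [f.name + ("+" if f.positive else "-") for f in item.features]
    print(item.phon, shown)
```

Nothing else is needed to specify the grammar, since Merge and Move are fixed once and for all.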
| Step | Operation | Result |
|---|---|---|
| 1 | Merge | [VP likes::D+ V- John::top-] |
| 2 | Merge | [DP the::D- nom- girl] [VP likes::D+ V- John::top-] |
| 3 | Merge | [VP [DP the::nom- girl] [V' likes::V- John::top-] ] |
| 4 | Merge | [TP e::nom+ T- [VP [DP the::nom- girl] [V' likes John::top-] ] ] |
| 5 | Move | [TP [DP the girl] [T' e::T- [VP t [V' likes John::top-] ] ] ] |
| 6 | Merge | [CP e::top+ C- [TP [DP the girl] [T' e [VP t [V' likes John::top-] ] ] ] ] |
| 7 | Move | [CP John [C' e::C- [TP [DP the girl] [T' e [VP t [V' likes t] ] ] ] ] ] |
The two formats differ only in how they keep track of movement. With traces, a moved element leaves behind a special marker to indicate that something has moved out of this position.[1] The multi-dominance representation reconceptualizes movement as the addition of dominance branches, so no actual displacement occurs in syntax; the moving subtree is simply present at multiple positions at the same time.
Let's ignore the trace-based representation and focus on the currently more fashionable multi-dominance format. Anyone who likes to save ink (be it for the purposes of time management, cost reduction, or moral concerns about squid milking) will notice after a while that the multi-dominance depiction encodes a lot of information in more than one way.
First, the extra branches indicating movement aren't really necessary for MGs. The Shortest Move Constraint blocks all cases where more than one lexical item can check a given movement feature, which renders Move deterministic. Whenever Move takes place, there is no ambiguity as to what is moving where to check which feature as long as we know the feature specification of every lexical item. Albeit grayed out, the features are still visible in the picture, so the extra branches are indeed redundant. Be gone, superfluous branches!
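The determinism claim can be spelled out in a few lines of Python. This is only a sketch under my own assumptions: `find_mover` is a hypothetical helper, and the dictionaries record, for each lexical item already in the structure (the attracting head itself is left out), the features it has not checked yet.

```python
from typing import Dict, List


def find_mover(licensor: str, unchecked: Dict[str, List[str]]) -> str:
    """Return the unique item whose next unchecked feature matches `licensor`.

    `licensor` is a positive movement feature such as "nom+".
    """
    wanted = licensor[:-1] + "-"
    candidates = [
        item for item, feats in unchecked.items() if feats and feats[0] == wanted
    ]
    if len(candidates) != 1:
        # Zero candidates: nothing can move. Two or more: the configuration
        # is exactly what the Shortest Move Constraint rules out.
        raise ValueError(f"expected exactly one mover for {licensor}, got {candidates}")
    return candidates[0]


# Unchecked features right before step 5 and step 7 of the derivation table.
before_step_5 = {"the": ["nom-"], "girl": [], "likes": [], "John": ["top-"]}
before_step_7 = {"the": [], "girl": [], "likes": [], "John": ["top-"]}

print(find_mover("nom+", before_step_5))  # the
print(find_mover("top+", before_step_7))  # John
```

Since the feature specifications alone single out the mover, drawing a branch to it adds no information.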
But writing down all those labels also seems like a waste of time, doesn't it? The labels keep track of which head projects, but in MGs it is always the case that if an operation checks the features f+ and f-, it is the head carrying f+ that projects. So let's also ditch all 7 interior labels; another tendinitis hazard taken care of.
Now that looks a lot snappier (and it is also easier for me to typeset). But hold on a second. We removed 7 interior labels, and the entire structure was built in 7 steps. Not only that, the nodes with only one daughter were created by erasing branches indicating Move. Very suspicious, my Spider sense is tingling.
Now if we count from bottom to top, the lower unary branching node is the fifth without a label, and the higher one is the seventh. And in the derivation table above, Move takes place at the fifth and seventh step. All other nodes without labels have two daughters because we did not have to remove any Move branches; these are projected nodes that were created by Merge rather than Move. So what we have here is actually a tree representation of the table above: the leaves are lexical items, binary branching nodes indicate Merge, and unary branching nodes indicate Move.
We can make this fully explicit by labeling the nodes accordingly, giving us the derivation tree for "John the girl likes". And just for good measure let's also print all features in black; graying them out just makes it harder on the eyes.
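The labeling step is mechanical, as the following Python sketch shows. The encoding is my own assumption: a leaf is a string, a Merge node is a pair, and a Move node is a one-element tuple. The names `e_T` and `e_C` are hypothetical tags for the two empty heads, and `operations` is a hypothetical helper.

```python
from typing import List, Union

Tree = Union[str, tuple]  # leaf = lexical item, tuple = interior node


def operations(node: Tree) -> List[str]:
    """List the operations of a derivation tree in bottom-up order."""
    if isinstance(node, str):
        return []  # lexical items are leaves and trigger nothing by themselves
    if len(node) not in (1, 2):
        raise ValueError("interior nodes must have one or two daughters")
    ops: List[str] = []
    for child in node:
        ops.extend(operations(child))
    # The arity of the node alone tells us which operation applied.
    ops.append("Move" if len(node) == 1 else "Merge")
    return ops


# Derivation tree for "John the girl likes"; siblings are ordered so that
# the traversal follows the row order of the derivation table.
vp = (("likes", "John"), ("the", "girl"))
tp = (("e_T", vp),)   # Merge with T, then Move
cp = (("e_C", tp),)   # Merge with C, then Move

steps = operations(cp)
print(steps)
# ['Merge', 'Merge', 'Merge', 'Merge', 'Move', 'Merge', 'Move']
print([i for i, op in enumerate(steps, start=1) if op == "Move"])
# [5, 7]
```

The order of the first two steps is arbitrary, since neither depends on the other; everything after that is forced by the shape of the tree.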
Linguistic Properties of Derivation Trees

Derivation trees display a number of properties that Minimalists demand of syntactic structures:
- No labels
For the last ten years,[2] there has been work on the question of whether syntactic structures require projected labels, and if so, what they should look like. Derivation trees show that labels can be done away with. The interior nodes of a derivation tree only represent applications of Merge and Move. The former takes place at binary branching nodes, the latter at unary branching ones. So the arity of a node is enough to deduce the operation, and no labels are needed.
- No linear order
For the Move nodes, linear order is a non-issue since they only have one daughter. Merge nodes have two daughters, each of which functions as one argument of Merge. The way Merge is defined, it does not matter which argument comes first, as Merge(A,B) = Merge(B,A). That's because the output of Merge is determined by the polarity of the features on (the heads of) A and B. The argument with the positive polarity feature projects, and it precedes the other argument in the output structure iff it is a single-node tree (a lexical item that hasn't selected anything). From this perspective Merge is completely symmetric, so the linear order of siblings in a derivation tree contributes nothing.
- No Tampering Condition
Merging two trees should not change the structural specification of these trees except for the fact that they are now siblings in some bigger tree. This condition is also satisfied by derivation trees. The only things that could conceivably be altered are the locations of moving subtrees or the feature specifications of the lexical items. Neither makes any sense in derivation trees, which are a record of the timing of the structure-building operations and the trees they take as input. Removing features from a lexical item is unwanted in this case because we want to know what the item looked like before it was fed to Merge, not afterwards; the latter we can easily compute ourselves. Similarly, movers remain in situ because they are also arguments to Merge, so if they were displaced the derivation tree would no longer encode the fact that the movers are merged into the structure before they undergo movement at some later point.
- Extension condition
Chomsky's earliest Minimalist writings already require that trees can only be extended at the root. Among other things, this rules out countercyclic operations and head movement, which both insert material at lower positions in the tree.[3] The Extension Condition holds of derivation trees by virtue of the interpretation we give them. They keep track of the order in which operations apply, and since by assumption our grammar can only move forward in time, new nodes can only be added at the top of the derivation tree. The next step cannot precede the current step; if it did, it would be the previous step. So by virtue of derivations proceeding in a natural order, derivation trees necessarily satisfy the Extension Condition.
Comparing Derivation Trees and Phrase Structure Trees

It is crucial to keep in mind that the properties above hold of derivation trees, but not necessarily of the MG phrase structure trees they represent. Just think of what you have to do in order to turn a derivation into a multi-dominance tree.
- Add movement branches
This might already be a violation of the No Tampering Condition, depending on how strictly you interpret it.
- Linearly order siblings
Violates the ban against linearly ordered structures.
- Replace the operation names on interior nodes with projected labels
Violates label-freeness, and may violate the No Tampering Condition (e.g. changing a VP label to V').
- Remove/gray out all features (except the category feature of the highest head)
Violates the No Tampering Condition.
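To see how much work this translation involves, here is a rough Python sketch that evaluates a derivation tree bottom-up. It is a simplification under my own assumptions: it computes only the pronounced string rather than a multi-dominance tree, it keeps unfinished movers in a separate list, and it enforces the Shortest Move Constraint only at the point where Move applies. All function and variable names (`lex`, `merge`, `move`, `evaluate`, and so on) are hypothetical.

```python
from typing import List, Tuple, Union

Chain = Tuple[str, Tuple[str, ...]]     # (string, unchecked features)
Expr = Tuple[Chain, List[Chain], bool]  # (head chain, movers, is_lexical)
Tree = Union[str, tuple]


def lex(entry: str) -> Expr:
    """Turn an entry such as 'e :: V+ nom+ T-' into a lexical expression."""
    phon, feats = entry.split("::")
    phon = phon.strip()
    return (("" if phon == "e" else phon, tuple(feats.split())), [], True)


def join(*parts: str) -> str:
    return " ".join(p for p in parts if p)


def merge(a: Expr, b: Expr) -> Expr:
    if a[0][1][0].endswith("-"):
        a, b = b, a  # symmetric: the positive feature decides who projects
    (s, fs), movers_a, lexical = a
    (t, ft), movers_b, _ = b
    if not (fs[0].endswith("+") and ft[0] == fs[0][:-1] + "-"):
        raise ValueError(f"cannot merge {fs[0]} with {ft[0]}")
    movers = movers_a + movers_b
    if ft[1:]:  # the argument still has features, so it will move later
        return ((s, fs[1:]), movers + [(t, ft[1:])], False)
    # A lexical head precedes its argument, a complex one follows it.
    return ((join(s, t) if lexical else join(t, s), fs[1:]), movers, False)


def move(e: Expr) -> Expr:
    (s, fs), movers, _ = e
    wanted = fs[0][:-1] + "-"
    hits = [m for m in movers if m[1][0] == wanted]
    if len(hits) != 1:
        raise ValueError(f"need exactly one mover for {fs[0]}")
    (t, ft) = hits[0]
    rest = [m for m in movers if m is not hits[0]]
    if ft[1:]:  # not the final landing site yet
        return ((s, fs[1:]), rest + [(t, ft[1:])], False)
    return ((join(t, s), fs[1:]), rest, False)


def evaluate(tree: Tree) -> Expr:
    if isinstance(tree, str):
        return lex(tree)
    args = [evaluate(child) for child in tree]
    if len(args) == 1:
        return move(args[0])
    if len(args) == 2:
        return merge(args[0], args[1])
    raise ValueError("interior nodes must have one or two daughters")


vp = (("likes :: D+ D+ V-", "John :: D- top-"), ("the :: N+ D- nom-", "girl :: N-"))
tp = (("e :: V+ nom+ T-", vp),)
cp = (("e :: T+ top+ C-", tp),)

result = evaluate(cp)
print(result[0])  # ('John the girl likes', ('C-',))
```

The derivation tree itself is never modified along the way, and swapping the two daughters of any Merge node leaves the result unchanged. Order and feature deletion only show up in the computed output.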
At this point this pill might still be hard to swallow for some of you. Derivations as the primary syntactic structure rather than phrase structure trees? Doesn't that require a major shift in how we think about things? Phrase-structural notions like c-command, for example, do not work as expected over derivation trees, which suggests that something is lost in translation after all. Well what a coincidence, that will be exactly the topic of my next post.
[1] Traces aren't indexed in MGs, so you can only tell that something has moved from a given position, but not what. The reason for this simplification is rather technical and (as you will realize by the end of the post) not particularly relevant for our purposes.
[2] To my knowledge, this line of research was started in Collins, Chris (2002): Eliminating Labels. In Samuel D. Epstein and T. Daniel Seely [eds.] Derivation and Explanation in the Minimalist Program, 42--64.
[3] In The Minimalist Program, Chomsky nonetheless uses head movement. If I remember correctly, this hinges on very technical assumptions about what it means to extend the root of a tree (some split between labels and sublabels and how they can be targeted by operations).