Faculty of Language: Simultaneous rule application: Help!!!

Thursday, December 12, 2013

Simultaneous rule application: Help!!!

Lately I have been thinking of something and have gotten stuck. Very stuck. This post is a request for help. Here’s the problem. It relates to some current minimalist technology and how it relates to the bigger framework assumptions of the enterprise. Here’s what I don’t quite get: what’s it means to say that rules apply “all at once” at Spell Out. Let me elaborate.

A recent minimalist innovation is the proposal that congeries of operations apply simultaneously at Spell Out (SO). The idea of operations applying all at once is not in and of itself problematic, for it is easy to imagine that many rules can apply “in parallel.” However, when rules so apply, they are informationally encapsulated in the sense that the output of rule A does not condition the application of rule B. ‘Condition’ here means neither feeds nor bleeds its application. When rules do feed and bleed one another, then the idea that they all apply “all at once” is hard (at least for me) to understand, for if the application of B logically requires information about the output of A then how could they apply “in parallel.” But if they are not applying “in parallel” what exactly does it mean to say that the rules apply “all at once”?

One answer to this question is that I am getting entangled in a preconception, namely my confusion is the consequence of a “derivational” mindset (DM). The DM picture treats derivations like proofs, each line licensed by some rule applying to the preceding lines.[1] The “all at once” idea is rejecting this picture and is suggesting in its place a more model theoretic idiom in which sets of constraints together vet a given object for well-formedness. An object is well formed not if derivable from rules sequentially applied, but no matter how constructed it meets all the relevant constraints. This should be familiar to those with GBish or OTish educations, for GB and OT are very much “freely generate and filter” kinds of models, the filters being the relevant constraints.[2] If this is correct, then the current suggestion about simultaneous rule application at SO is more accurately understood as a proposal to dump the derivational conception of grammar characteristic of earlier minimalism in favor of a constraint based approach of the GB variety.

Note, that to get something like this to work, we would need some way of constructing the objects that the constraints inspect. In GB this was the province of Phrase Structure Rules and ‘Move alpha.’ These two kinds of rules applied freely and generated the structures and dependencies that filters like Principle A/B/C, ECP, Subjacency, etc. vetted. In an MP setting, it is harder to see how this gets done, at least to me. Recall that in an MP setting, there is only one “rule,” (i.e. Merge). So, I assume that it would generate the relevant structures and these would be subsequently vetted. In effect the operations of the computational system (i.e. Merge, Agree and anything else, e.g. Feature Transfer, Probing, ???) would apply freely and then the result would be vetted for adequacy. What would this consist in? Well, I assume checking the resultant structures for Minimality, Extension, Inclusiveness, etc. The problem, then, would be to translate these principles, which are easy enough to picture when thought of derivationally, into constraints on freely generated structures. I confess that I am not sure how to do this. Consider the Extension condition. How is one to state this as a well-formedness condition on derived structures rather than on the operations that determine how the structures are derived? Ditto on steroids for Derivational Economy (aka: Merge over Move) or the idea that shorter derivations trump longer ones, or determining what constitutes a chain (which are the copies that form a chain?). Are there straightforward ways of coding these as output conditions in freely generated objects? If so, what are they?

There is another subsidiary more conceptual concern. In early Minimalism output conditions (aka filters) were understood as Bare Output Conditions (BOCs). BOCs were legibility conditions that interfaces, particularly CI, imposed on linguistic products. Now, BOCs were not intended to be linguistic, though they imposed conditions on linguistic objects. This means that whatever filter one proposes needs to have a BOC kind of interpretation. This was always my problem with, for example, the Minimal Link Condition (MLC). Do we really think that chains are CI objects and that “thoughts” impose locality conditions on their interacting parts? Maybe, but, I’m dubious. I can see minimality arising naturally as a computational fact about how derivations proceed. I find it harder to see it as a reflection of how thoughts are constructed. However, whatever one thinks of the MLC, understanding Economy or Extension or Phase Impenetrability or Inclusiveness as BOCs seems, at least to me, more challenging still.

Things get even hairier, I think, when one considers that range of operations supposed to happen “all at once.” So, for example, If the features of T are inherited from C (as currently assumed) and I-merge is conditioned by Agree, then this suggests that DPs move to Spec T conditional on C having merged with T. But any such movement must violate Extension. The idea seems to be that this is not a problem if all the indicated operations apply simultaneously. But how is this accomplished? How can I-merge be conditioned (fed) by features that are only available under operations that require that a certain structure exists (i.e. C and “TP” have E-merged) but whose existence would preclude Merging the DP (doing so would violate Extension). One answer: screw Extension. Is this what is being suggested? If not, what?

So, I throw myself on the mercy of those who have a better grasp of the current technology. What is involved in doing operations “all at once”? Are we dumping derivations and returning to a generate-and-filter model? What do we do with apparent bleeding and feeding relations and the dependencies that exploit these notions. Which principles are we to retain, and which dispense with? Extension? Economy? Minimality? How to the rules/operations work? Sample examples of “derivations” would be nice to see. If anyone knows the answer to all or any of these questions, please let me know.

[1] A strong version of this is that it is only the immediately preceding line can influence what happens “next.”

[2] OT’s filters are ranked, whereas GB filters were not. However, I don’t believe that this difference makes a difference for my problem.

20 comments:

Tim HunterDecember 12, 2013 at 12:27 PM
"... checking the resultant structures for Minimality, Extension, Inclusiveness, etc. The problem, then, would be to translate these principles, which are easy enough to picture when thought of derivationally, into constraints on freely generated structures. I confess that I am not sure how to do this. Consider the Extension condition. How is one to state this as a well-formedness condition on derived structures rather than on the operations that determine how the structures are derived? Ditto on steroids for Derivational Economy (aka: Merge over Move) ..."

I think there are rarely in-principle problems with re-encoding derivational constraints as representational constraints. To implement the Extension condition, for example, I think all you need to say is that moved phrases must c-command their traces. (This is just the reverse of the old point about "explaining c-command via extension".) And generally, I think you could check satisfaction of Merge-over-Move by reasoning backwards and asking: "Given that the top of this chain is sitting in position X [i.e. given that something moved into position X, in derivational terms], is there anything with a base position higher in the structure than X [i.e. is there anything that was first-merged higher in the structure] of the sort that could have gone into position X?" If so, that's a violation of Merge-over-Move. Some refinements necessary if you have subarrays and so on.

And even if standard derived representations don't specify all the information you would need to "go back in time" and see how the representation was constructed, you can always enrich the representations to include all the historical information you need. In the limit, if you do this and then also eliminate all the information you *don't* need, then you end up with something that would be sensibly called a derivation tree (distinct from a derived tree). From one point of view, including traces in our derived trees is just a way of "looking back in time" in this sense. One order-of-operations issue that I think is not encoded in what are usually thought of as derived representations is the distinction between cyclic adjunction and counter-cyclic adjunction, which makes all the difference in the famous Freidin/Lebeaux antireconstruction effects: presumably there are two derivations there that "build the same structure", only one of which is "valid". But you can always just say that counter-cyclic adjunction puts a little mark somewhere in the output representation, and bingo, everything can be stated representationally again.

So the bigger question for me is why we might be interested to bother trying to translate between derivational and representational statements of these constraints ... what is the empirical motivation one way or the other?
ReplyDelete
Replies
Charles YangDecember 13, 2013 at 8:38 AM
I think this is doable but very hairy (as one might expect).

I’m thinking about the literature, and computational work, on parallelizing derivational rules of phonology, with the use of finite state transducers. You take a pair of strings (of, say, segments), which basically corresponds to the underlying and surface representations, and you feed it through all FSTs altogether, moving from left to right one *pair* of letters at a time. Each FST roughly corresponds to a phonological rule. For instance, suppose you have a rule: a->b/_c. You write the FST such that if it sees the pair (a:b), i.e., an “a” in the UR and a “b” in the SR, you move to a state in the FST such that the next pair you see must be (c:c); otherwise the FST crashes and the string pair is rejected. so ac:bc goes through but not ad:bd.

In practice, it’s very hard to write these FSTs because any derivational interaction among rules will have to be worked out, by hand, to translate into the parallel executions of FSTs. It can be a lot of fun implementing relatively simple set of data but I think any large scale coverage of the morpho-phonology of a language will be very difficult. It is theoretically possible to take a sequence of derivational rules and compile it into parallel FSTs. The Xerox PARC folks have a tool like that, but the resulting FSTs are very large and not very interpretable: you lose the transparency (and fun) of seeing how the FSTs scan through a pair of strings one character at a time, where you can which FST is messing things up.

For morphophonological analysis, the pairs of UR and SR are generated by, essentially, enumerating all pairs of possible character combinations and then checking them by some type of search heuristic (e.g., breath- or depth-first search). As Barton, Berwick and Ristad point out in their classic study, this could lead to an exponential blowup, especially when dealing with long distance correspondence such as harmony, but in practice as discussed in the 1980s, I don’t think the problem is that severe.

To do this for syntax, we need to do a few obvious things. First, pairs of strings should be pairs of (sub)trees, as one might go from top to down traversing the structure. Second, the generation problem would have to be solved: complexity is an issue—the alphabet of phonology is usually small—but thankfully, the size of the trees would be bound by the phase. Third, one would have to work out the interactions among the constraints, formulated derivationally, and translate them into parallel checks. Maybe the syntax based MT folks have worked out something along these lines?

In Sandiway Fong’s implementation of a GB parser, in theory a parallel system, one (obviously) does not do wild generate and test. A Generalized Left to Right parser is used to (over) generates quite a lot possible trees, which are then fed through the constraints (Case Filter, Theta, etc.). The application of the constraints is not parallel either but ordered to improves efficiency (no point in checking Principle A on a structure that already violates Case). I think Sandiway has been working on Minimalist parsers and he’s probably best placed to talk about this.
ReplyDelete
Replies
ewanDecember 15, 2013 at 2:20 PM
I am thinking out loud here.

"So, for example, If the features of T are inherited from C (as currently assumed) and I-merge is conditioned by Agree, then this suggests that DPs move to Spec T conditional on C having merged with T. But any such movement must violate Extension. The idea seems to be that this is not a problem if all the indicated operations apply simultaneously."

Simultaneous application can't be if there is a condition governing when operation X is licit that is conditioned on the output of operation Y. The only way around this is assume that the output of Y is actually recoverable somewhere in the input to X. Here the first round of trouble is that there is an operation Y that merges C and an operation X that moves DP. X must be fed by Y, but instead you need to say that X is actually triggered by the input to Y, I guess, a C in the workspace which must of necessity merge with TP given whatever else is present in the computation before all this simultaneous application happens. Does that sound right? (I know it sounds crazy, but that's not the question.)

So now the next question is whether this violates extension. Suppose the other thing that was in the input to the computation was TP, and extension says, extend the maximal dominating XP. If extension is interpreted as a simultaneous-application type condition (i.e. only the input counts) then no, extension is not violated. The output, yes, has a further dominating XP (namely CP), but the output is not visible to the SA-extension condition. That has got to be the idea.
ReplyDelete
Replies
ewanDecember 16, 2013 at 12:35 PM
This comment has been removed by the author.
ReplyDelete
Replies

Add comment

Faculty of Language

Comments

Thursday, December 12, 2013

Simultaneous rule application: Help!!!

20 comments:

Contributors