Thursday, June 18, 2015

ROOTS in New York, June 29-July 3: What Do I Want To Learn?

(This Post Also Appears on my own blog with the title Anticipation: Roots)

The recent meeting of syntacticians in Athens has whetted my appetite for big gatherings with lots of extremely intelligent linguists thinking about the same topic, because it was so much fun.

At the same time, it has also raised the bar for what I think we should hope to accomplish with such big workshops. I have become more focused and critical about what the field should be doing within its ranks as well as with respect to communication with the external sphere(s).

The workshop on Roots that I am about to attend (the fourth such), to be held in New York from June 29th to July 3rd, is organized by Alec Marantz and the team at NYU, and offers a glittering array of participants (see the preliminary program here: http://wp.nyu.edu/roots4/wp-content/uploads/sites/1403/2015/02/roots4_program.pdf).

Not all the participants share a Distributed Morphology (DM)-like view of `roots', but all are broadly engaged in the same kinds of research questions and share a generative approach to language. The programme also includes a public forum panel discussion to present and discuss ideas in a form that should be more accessible to the interested general public. So Roots will be an experiment in having the internal conversation as well as the external conversation.

One of the things I tend to like to do is fret about the worst-case scenario. That way I cannot be disappointed. What do I think is at stake here, and what is there to fret over in advance, you ask? Morphosyntax is in great shape, right?

Are we going to communicate about the real questions, or will everyone talk about their own way of looking at things and simply talk past one another?
Or will we bicker about small implementational issues, such as whether roots should be acategorial or not? Should there be a rich generative lexicon or not? Are these in fact, as I suspect, matters of implementation, or are they substantive matters that make actually different predictions? I need a mathematical linguist to help me out here. But my impression is that you can take any phenomenon that one linguist flaunts as evidence that their framework is best and, with a little motivation, creativity and tweaking here and there, give an analysis in the other framework's terms as well. Because in the end these analyses are still higher-level descriptions: the result may look a little different, but you can still always describe the facts.

DM in particular equips itself with an impressive arsenal of tricks and magicks to get the job done. We have syntactic operations, of course, because DM prides itself on being `syntax all the way down'. But in fact we also have a host of purely morphological operations to get things in shape for spellout (fission, fusion, impoverishment, lowering, what have you), which are not normal actions of syntax and sit purely in the morphological component. Insertion comes next, regulated by competition and the elsewhere principle, where the effects of local selectional frames can be felt (contextual allomorphy and subcategorization frames for functional context). After spellout, notice that you still get a chance to fix some stuff that hasn't come out right so far, namely by using `phonological' readjustment rules, which don't exist anywhere else in the language's natural phonology. And this is all before the actual phonology begins. So sandwiched in between independently understood syntactic processes and independently understood phonological processes, there's a whole host of operations whose shape and inherent nature look quite unique. And there are lots of them. So by my reckoning, DM has a separate morphological generative component which is different from the syntactic one. With lots of tools in it.

But I don't really want to go down that road, because one woman's Ugly is another woman's Perfectly Reasonable, and I'm not going to win that battle. I suspect that these frameworks are intertranslatable, and that we do not have, even in principle, the evidence from within purely syntactic theorising to choose between them.

However, there might be deep differences when it comes to deciding which operations are within the narrow computation and which are properties of the transducer that maps between the computation and the other modules of the mind/brain. So it's the substantive question of what that division of labour is, rather than the actual toolbox, that I would like to make progress on.

To be concrete, here are some mid-level questions that could come up at the ROOTS meeting.

Mid-Level Questions.
A. Should generative aspects of meaning be represented in the syntax or the lexicon? (DM says syntax)
B. What syntactic information is borne by roots? (DM says none)
C. Should there be late insertion or should lexical items drive projection? (DM says late insertion)

Going down a level, if one accepts a general DM architecture, one needs to ask a whole host of important lower level questions to achieve a proper degree of explicitness:

Low-Level Questions
DM1: What features can syntactic structures bear as the triggers for insertion?
DM2: What is the relationship between functional items and features? If it is not one-to-one, can we put constraints on the number of `flavours` these functional heads can come in?
DM3: What morphological processes manipulate structure prior to insertion, and can any features be added at this stage?
DM4: How is competition regulated?
DM5: What phonological readjustment rules can apply after insertion?

There is some hope that there will be a discussion of the issues represented by A, B and C above. But the meeting may end up concentrating on DM1-5.

Now, my hunch is that in the end even A vs. B vs. C are all NON-ISSUES. Therefore, we should not waste time and rhetoric trying to convince each other to switch `sides'. Having said that, there is good reason to want to be able to walk around a problem and see it from different frameworks' perspectives, so we don't want homogeneity either. And we do not want an enforced shared vocabulary and set of assumptions. This is because a particular way of framing a general space of linguistic inquiry lends itself to noticing different issues or problems, and to seeing different kinds of solutions. I will argue in my own contribution to this workshop on Day 1 that analyses which adopt the principle of acategorial roots as axiomatic prejudge and obscure certain real and important issues that are urgent for us to solve. So I think A, B and C need an airing.

If we end up wallowing in DM1-5 the whole time, I am going to go to sleep. And this is not because I don't appreciate explicitness and algorithmic discipline (which Gereon Mueller was imploring us to get more serious about at the Athens meeting), because I do. I think it is vital to work through the system, especially to detect when one has smuggled in unarticulated assumptions, and to make sure the analysis actually delivers and generates the output it claims to generate. The problem is that I have different answers to B than the DM framework does, so when it comes to the nitty-gritty of DM2, 3 and 5 in particular, I often find it frustratingly hard to convert the questions into ones that transcend the implementation. But ok, it's not all about me.

But here is some stuff that I would actually like to figure out, where I think the question transcends frameworks, although it requires a generative perspective. 

A Higher Level Question I Care About
Question Z. If there is a narrow syntactic computation that manipulates syntactic primes and has a regular relationship to the generation of meaning, what aspects of meaning are strictly a matter of syntactic form, and what aspects of meaning are filled in by more general cognitive processes and representations?

Another way of asking this question is in terms of minimalist theorizing. FLN must generate complex syntactic representations and semantic skeletons that underwrite the productivity of meaning construction in human language. What parts of what we traditionally consider the `meaning of a verb' are contributed by (i) the narrow syntactic computation itself, (ii) the transducer from FLN to the domain of concepts, and (iii) the conceptual flesh and fluff on the other side of the interface that the verb is conventionally associated with?

Certain aspects of the computational system for a particular language must surely be universal, but perhaps only its rather abstract properties, such as hierarchical structuring and the relationship between embedding and semantic composition. It remains an open question whether the labels of the syntactic primes are universal or language-specific, or a combination of the two (as in Wiltschko's recent proposals). This makes the question concerning the division of labour between the skeleton and the flesh of verbal meaning also a question about the locus of variation. But it also makes the question potentially much more difficult to answer. To answer it we need evidence from many languages, and we need diagnostics for which types of meaning we put on which side of the divide. In this discussion, narrow language-particular computation does not equate to universal; I think it is important to acknowledge that. So we need to make a distinction between negotiable meaning vs. non-negotiable meaning and be able to apply it more generally. (The DM version of this question would be: what meanings go into the roots and the encyclopedia, as opposed to the meaning that comes from the functional heads themselves?)

There is an important further question lurking in the background to all of this, which is how the mechanisms of storage and computation are configured in the brain, and what the role of the actual lexical item is in that complex architecture. I think we know enough about the underlying patterns of verbal meaning and verbal morphology to start trying to talk to the folks who have done experiments on priming and the timing of lexical access, both in isolation and integrated in sentence processing. I would have loved to see some interdisciplinary talks at this workshop, but it doesn't look like it from the programme.

Still, I am going to be happy if we can start comparing notes and coming up with a consensus on what we can say at this stage about higher level question Z. (If you remember the old Dr Seuss story, Little Cat Z was the one with VOOM, the one who cleaned up the mess).


When it comes to the division of labour between the knowledge store that is represented by knowing the lexical items of one's language, and the computational system that puts lexical items together, I am not sure we know if we are even asking the question in the right way. What do we know of the psycholinguistics of lexical access and deployment that would bear on our theories? I would like to get more up to date on that. Because the minimalist agenda and the constructivist rhetoric essentially force us to ask the higher-level question Z, and we are going to need some help from the psycholinguists to answer it. But that perhaps will be a topic for a different workshop.

18 comments:

  1. Regarding intertranslatability, it is important to keep in mind that without restrictions on how a given proposal may be modified, it is of course always possible to translate A into B. That's why it is so important that proposals come with clear claims that are stated independently of the technical apparatus, as the latter can easily be "perverted" by a creative mind.

    For instance, the distinction between privative and binary feature systems is often considered reflective of whether feature specifications are symmetric; positive and negative features can undergo the same operations in a binary feature system, but not in a privative one because there are only positive features. Except that one can of course treat +ATR and -ATR as privative features and still have exactly the same power as in a binary system. The usual objection is that this violates the spirit of privative feature systems, but that's incorrect. What is really going on is that picking a privative feature system is not sufficient to enforce the desired generalization. So instead one should clearly state the desired generalization and argue for its empirical adequacy. The encoding isn't particularly important, but the generalizations are.
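    This intertranslatability of feature encodings can be sketched concretely. In the minimal sketch below (the two-vowel inventory and the helper functions are my own, purely illustrative), treating +ATR and -ATR as two independent unary features recovers exactly the natural classes of the binary system:

```python
# Illustrative sketch with a hypothetical two-vowel inventory: a binary
# system assigns each segment a +/- value for ATR; the "perverted"
# privative encoding simply lists "+ATR" and "-ATR" as two separate
# unary features, and thereby has exactly the same expressive power.

binary = {
    "i": {"ATR": "+"},   # tense vowel
    "I": {"ATR": "-"},   # lax vowel
}

# One privative (unary) feature per binary value.
privative = {seg: {val + feat for feat, val in feats.items()}
             for seg, feats in binary.items()}

def class_binary(feat, val):
    """Natural class: all segments with the given binary value."""
    return {s for s, fs in binary.items() if fs.get(feat) == val}

def class_privative(unary):
    """Natural class: all segments bearing the given unary feature."""
    return {s for s, fs in privative.items() if unary in fs}

# Every class the binary system can state, the privative one can too:
assert class_binary("ATR", "+") == class_privative("+ATR") == {"i"}
assert class_binary("ATR", "-") == class_privative("-ATR") == {"I"}
```

    The point is not that this encoding is legitimate, but that nothing in the formalism itself blocks it; only an independently stated generalization does.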

    GPSG is a good example of this principle. It is rather simple to modify the formalism so that it can generate languages that are not context-free, and it is also simple to change the rule format so that it no longer implies Exhaustive Constant Partial Ordering (ECPO). But context-freeness and ECPO are the two biggest claims of GPSG, and Gazdar et al. considered them the empirical core of the formalism (and both can be formulated as properties of languages without invoking any notion of grammar). Context-freeness has been disproven, and thus GPSG went the way of the dodo despite being salvageable on a technical level --- its empirical heart had stopped beating. [Interestingly, the ECPO still seems to be in the running as a possible language universal.]

    So what I hope to get out of ROOTS is a better understanding of what the big claims of DM are. What are (im)possible morphological paradigms, what are (im)possible interactions at the interface of morphology with syntax and/or phonology, how contentious are those claims within the community? And how do those ideas line up with my own view of morphology?

    1. Yes, this would be good. I would like to get this kind of understanding out of DM too. I think the slogan `syntax all the way down' cannot be one of the big claims, despite previous rhetorical emphasis. (They have their own rich set of morphology-specific operations.) Hopefully DM-ers will be able to tell us what their equivalent `empirical beating heart' is. Right now, I can't think of very many things they can't do. One possible exception is the phenomenon of Poser-blocking, and allomorphy that requires access to information that spans more than one head. This is one case where, while they could make some technical changes to allow it, it seems to go against the spirit of the system, and they strongly deny the existence of such phenomena.

  2. So here's a specific question about morphology that I've been puzzling over for a while. If I remember correctly, somebody at Athens asked why morphology exists at all. My question is the exact opposite: why don't we have more morphology, and why does so little of morphology serve the purpose of simplifying syntax?

    As is well understood nowadays, the main challenge for parsing is non-determinism. In a deterministic formalism, parsing takes only linear time. Now if we look at MGs, their string languages are of course non-deterministic, since natural languages are non-deterministic. But the string yield of their derivation trees is deterministic. Non-determinism is due to two factors: i) movement obscures the head-argument relation, and ii) the parser needs to guess the features of each lexical item. Note that the first point is much less of an issue if one already knows the feature make-up of each lexical item.

    Keeping all of this in mind, I'm at a loss as to why morphology is perfectly happy to put nouns in 8 different gender classes but stubbornly refuses to spell out each lexical item with its full feature specification. Why isn't John likes Mary pronounced as John[D-,nom-] likes[D+,D+,V-] Mary[D-]?

    Here are a few arguments I can think of:

    1) The processing cost would be too high. That's unlikely, though, since the system wouldn't be all that different from a very rich agglutinating language.

    2) Context disambiguates most sentences, and consequently there is not enough of a payoff to merit the extra effort. But why doesn't the same argument force all languages to be isolating? There must be a qualitative difference between purely syntactic features and what gets marked in morphology. The latter apparently profits from redundancy, the former doesn't.

    3) Syntactic features don't exist. That seems to miss the point since morphological features may not exist, either, yet we do see their instantiations.

    4) Some syntactic features would be instantiated on empty heads and thus remain unpronounced. This might have a significant impact on the efficacy of the system, but it also raises the question why empty heads are ever allowed to be empty. Just one more parsing headache.

    5) The features are spelled out. Wh-movers have specific surface forms, topicalized constituents often have a specific marker, and case serves in distinguishing categories. The fact that no language has an agglutinative system with each feature corresponding to a specific morpheme is due to statistical improbability. That's a nice story for the basic movement types, but it doesn't work at all for the kinds of movements that are triggered by "anonymous" features, e.g. DP-internal movement in Cinque's word order account. Hmm, a bug or a feature?
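    The determinism point above can be sketched concretely. In the toy below (the lexicon, the "-X = has category X / +X = selects an X" notation, and the parser are my own illustrative assumptions, not any particular MG implementation), once each word is spelled out with its full feature specification, every parsing step is forced and the parser never has to guess:

```python
# Toy sketch: words spelled out with their full feature specification.
# Hypothetical notation: "-X" = is of category X, "+X" = selects an X.
# With features visible, each reduce step is determined (no backtracking).
LEX = {
    "John[D-,nom-]":   ("John",  ["-D"]),
    "likes[D+,D+,V-]": ("likes", ["+D", "+D", "-V"]),
    "Mary[D-]":        ("Mary",  ["-D"]),
}

def parse(tokens):
    """Deterministic shift-reduce over fully annotated tokens.
    (Constituency in the toy output is purely illustrative.)"""
    stack = []  # items: (tree, remaining features)
    for tok in tokens:
        stack.append(LEX[tok])
        while len(stack) >= 2:
            (t1, f1), (t2, f2) = stack[-2], stack[-1]
            if f1 and f2 and f2[0].startswith("-") and f1[0] == "+" + f2[0][1:]:
                stack[-2:] = [((t1, t2), f1[1:])]   # earlier word is the head
            elif f1 and f2 and f1[0].startswith("-") and f2[0] == "+" + f1[0][1:]:
                stack[-2:] = [((t2, t1), f2[1:])]   # later word is the head
            else:
                break
    assert len(stack) == 1, "parse failed"
    return stack[0]

tree, feats = parse(["John[D-,nom-]", "likes[D+,D+,V-]", "Mary[D-]"])
assert feats == ["-V"]   # a single V remains: the derivation was forced
```

    Nothing here hinges on the particular feature set; the only claim illustrated is that lexical ambiguity, not combination, is where the guessing lives.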

    1. Lots of languages have something like John[D-,nom-] ("dependent marking") and lots of languages have something like "likes[D+,D+,V-]" ("head marking"), and some have both. So I'm kind of partial to (5). As to why more languages don't more transparently have the full specifications, as an addendum to (5), perhaps phonological and phonetic processes (in the mouths of "lazy" speakers) make morphological endings more opaque over time; listeners are working on inferring the underlying features, and because they are adept at storing portmanteaux and listing contextually specified allomorphs, do not have a high motivation to enforce a one-to-one mapping between morphemes and features. Plus (2), of course.

    2. Agree with Peter that 5 is the answer I'd go for, but Thomas's point that we don't find morphological features triggering roll-up movement is important, and, as I hope I argued quite oomphily in A Syntax of Substance, that is because our theoretical systems shouldn't allow roll-up movement. My motivation for this was that roll-up movement doesn't feed semantics, but whatever is needed to trigger it in standard theories also doesn't feed morphophonology. So our theoretical system should rule out such derivations, as the one I presented does. I think one wants to say the same for head movement.

    3. If I understand correctly, David, you're espousing the methodological dictum of "don't assume a morphosyntactic feature that is not expressed in the morphophonology of at least one language." I like that a lot; I would hasten to add that this would probably rule out – among many other things – the syntactically realized (but phonologically covert) domain restrictors that many semanticists are happy to posit orbiting the D area of quantificational noun phrases. I'm happy with that, though I suspect some of our semanticist friends wouldn't be.

      I myself have argued for a slightly stronger dictum, at least with respect to phi-features, namely "don't assume that (phi-)features are there unless you have at least some overt morphophonological covariance involving that very head in that very language." Maybe too strong, but it does some nice work in predicting the intra- and cross-linguistic distribution of PCC effects.

      But I think head movement does not, in fact, fall within the purview of these dicta. The features that trigger it are often morphophonologically visible (think finiteness in the languages where verb movement covaries with finiteness). Moreover, while I'm happy to believe that roll-up movement never feeds semantics, head movement does (see Lechner, Roberts, i.a.).

    4. That's where I'm going, Omer, indeed, methodologically, and I'd be ok with the stronger version too. But on head movement, I think the Lechner/Roberts/Benedicto arguments are weak as arguments that you need *movement* of heads to feed semantics, and the features that correlate with height of spell-out that you mention are always used for something else (that is, they are not `spell out here' features; they do other kinds of syntactic work). So I'd like to maintain and extend the critique to head movement. In fact, I'll probably discuss some of these issues in my Roots talk.

    5. Awesome – I'll be there (as an academic tourist), and I look forward to hearing your talk.

      I'm currently working on a proposal (not remotely ready for primetime, yet) that takes head movement to be the canonical form of syntactic movement, with all phrasal movement being what the system does when it can't quite muster head movement! Maybe we'll get a chance to rattle our intellectual sabers at some point outside of official workshop hours :-)

    6. Well in my book I don't have any functional heads, so that would be difficult for me to buy! Let's talk over manhattans in Manhattan.

    7. I think we need to distinguish features that are active in the syntactic computation and feed the interpretation from diacritics for linearization. (And I am also in favour of Omer's stronger dictum.)

      Linearization is the biggest issue for the interface with phonology, because each language needs to map in a deterministic way from a complex hierarchical structure involving copies (or multidominance) to a string. Putting head movements and roll-up movements in the syn-sem computation proper is an architectural misstep, in my opinion. They should be replaced by mechanisms that map structures to strings.

    8. exactly (though I don't think strings are the right model for phonetic representations/actions). But more, you need a theory that ensures that it's not possible to put roll-up or head movement in the syntax.

    9. Just out of curiosity, isn't a theory with extension regulating all movement, plus the idea that movement is conditioned by Agree, a theory that bars head movement? Indeed, even one that has labels and reduces A-over-A to some version of minimality (as, e.g., I did in my 2009 thingy) has this effect. So doing this for head movement is not that tough. The question, as I see it, is how good the data is for its existence, and here you and Omer seem to disagree.

    10. @David: So if you don't want head movement in the syntax, it seems that you are committed to remnant movement for non-context-free phenomena, in particular crossing dependencies. Unless you vastly change the locality restrictions on phrasal movement, that is.

      But then it is kind of weird that roll-up movement should be blocked, since it's just a special case of remnant movement. That's also borne out formally, in the sense that remnant movement can't be emulated by base merge, whereas roll-up movement can (another instance of c-selection being much more powerful than one would expect).

    11. @Norbert yes, I think the evidence for head movement feeding semantics is weak. I also think there are basic problems even in deriving head movement with classical Merge (since you effectively need 3 arguments for Merge: the whole tree, its head, and the moving term), so it's ruled out for the same reasons that Chomsky, for example, takes sideways movement to be out. But, as those of you who have struggled through the somewhat impenetrable first 3 chapters of my LI book might know, I develop a system with no heads to move.
      @Thomas. Remnant movement is fine for me, if we take it to be movement of a specifier which has been extracted from. But my system bars movement of part of a functional complement line to a position higher in that complement line in general, so roll-up movement, which requires that, is out. (You could, in my system, in principle anyway, rollup part of an EP past its top, but actually not in practice, because the top of an EP will always end up being a specifier of a different, higher one, so it would be extraction from a spec, and not rollup). So rollup isn't a special case of remnant (you can roll without having extracted), at least the way I mean rollup, but my system rules out rollup derivations, yes. Have a look at the book and see what you think.

    12. The thing I don't like about David's system is that there is no semantic or functional unity to the specifier position. Specifier-ness is driven by word order facts. I think that is getting things the wrong way round. You're going to need way too much arbitrary complexity in the transducer on the other interface. I think we keep forgetting syn-sem!

    13. I agree that there's a tension here. But there is something about the lack of rightward movement to spec positions that seems to be true if we're to capture U20-type facts, so there's some link between specifierhood and linearity. And I don't think my system introduces more complexity on the syn-sem side than other systems do. Not all specifiers are semantically interpreted the same (a wh-operator is not interpreted the same as an undergoer, EPP positions are not interpreted the same way as argument positions, etc.), so we need the semantics to be sensitive to distinctions between specifiers. In my system, arguments of events are specifiers of functional categories whose semantics is eventive, but nominals don't have event variables, so they can't avail themselves of specifiers in the same way. That distinction is not arbitrary; it's a theoretical claim about the syn-sem transduction.

  3. I like this question very much!

    What do you think of 6) Syntactic features are only manipulated in narrow syntax? That is to say, Spell-Out takes the tree, linearizes it using features, checks that it can insert lexical items at the correct positions using features, but then forgets about the features altogether.

    1. That is a story of what is going on, but not why it is going on. And even for that I don't find the story all that convincing since:

      1) Syntax can't just flatten the output into a string of words: prosody is computed over structures, and many phonological processes are sensitive to prosodic boundaries.
      2) There is no advantage to forgetting features. The cost of memorizing a few features is dwarfed by the memory footprint of the whole tree.
      3) Many features still need to be present at the interfaces or are part of the overt realization, e.g. person. So they are not forgotten.
