Comments on Faculty of Language: "What's Next in Minimalist Grammars?"

Alex Clark (2013-10-22 02:51):
One more thing -- I think well-nestedness will turn out to be really very important: for parsing, certainly, but also for things like binarising derivation structures and so on. It doesn't show up as an interesting property when we think of weak learning, but I think it will be crucial for strong learning.
So I guess I need to look more closely at the arguments for going from TAG to MCTAG.

Alex Clark (2013-10-22 02:40):
@Avery -- yes, I am thinking exactly of arguments based on the fact that we have to derive the right semantics for a sentence like "I saw a great exhibition on Saturday by Sarah Lucas" from its syntactic structure, and getting the right interpretation looks like it will require some non-projective dependency structure, regardless of whether it is an adjunct or not.

@Thomas: the constraints I am thinking of are things like the Finite Context Property (FCP), which says roughly that every syntactic category can be defined using a finite number of contextual features; for the mathematical details see e.g. my recent paper in Machine Learning with Ryo (yay, another crossing dependency), doi 10.1007/s10994-013-5403-2.
I hesitate to say that these will apply to all learning algorithms, because there may be whole other families of learning algorithms that we don't know about yet, but they seem to affect all of the ones we know about so far.

Avery Andrews (2013-10-21 15:32):
@Alex: "I think there are non-CF dependencies in *English*" ... but in the example given, "I saw a great exhibition on Saturday by Sarah Lucas", the final PP is optional, and PPs not associated with the subject can appear freely in this position ("by Sarah Lucas" -> "in Boston"), so I don't see how any generative capacity argument can be made unless connecting form to meaning is introduced into the picture -- presumably some sort of strong generative capacity, but exactly how?

With respect to which, it might be appropriate to point out that languages tend to be relatively relaxed about adding extra information about participants in a discontinuous way as long as the semantics stays intersectional, so that in various Slavic languages and in Greek it's OK to split intersective adjectives off from their nouns in certain constructions, but not various other kinds such as modals (Ntelitheos 2004 and various other sources):

  to prasino forese i Maria fustani
  the green wore the Mary dress  ("Mary wore the green dress")

  *ton ipotithemeno idha dholofono
  the alleged I-saw murderer  ("I saw the alleged murderer")  (pp. 58-59)

I suspect that generative capacity arguments will have to be connected to semantics in some way that makes sense to descriptively oriented linguists (as the power to update DRSs or some roughly similar form of knowledge representation, perhaps?)
in order to make much of an impression on them.

Anonymous (2013-10-21 10:08):
A big-picture remark: when you said that MGs should go down a few levels, I expected you to bring up well-nested and/or monadic branching MCFGs, which really affect the quality of Move in MGs. The restriction to a specific dimension, on the other hand, is just an upper bound on the total number of licensee features and wouldn't really change all that much about the formalism. So if all natural languages are indeed 2-PMCFLs, that wouldn't be too much of a curve ball for MGs.

Oh, and I completely forgot to ask: "the sorts of universals that we get out of learnability seem of a completely different type to the sorts of universals that linguists are interested in". Do you have some examples? And to what extent are those universals independent of the targeted language class (i.e. do all, say, MAT-learnable classes share some interesting universals)?

Anonymous (2013-10-21 09:49):
I find the idea of a strong generative capacity argument against 2-MCFLs somewhat paradoxical. Can we rephrase it as "are there MG analyses that use more than one licensee feature (and would thus be translated into an n-MCFG for some n > 2)"?

Such MGs are unavoidable if you adopt even the most basic assumptions, such as that subjects move from a lower position to SpecTP, that auxiliaries in English undergo affix hopping, or that all languages are underlyingly SVO and Det-Num-A-Noun, with other orders derived by movement.
The last point is particularly interesting because it makes the right typological predictions about DP-internal word order.

But I guess those ideas are too theory-driven for what you have in mind, so let's add the restriction that we only consider clear-cut cases where something occurs in a position different from the one it usually occupies in the "surface structure" (so we ignore all instances of Move that are posited for cross-linguistic reasons).

Those cases do exist, too. You already mentioned multiple wh-movement, and there's also scrambling and possibly multiple topicalization (I have my doubts about the latter). Any case where at least two co-arguments undergo movement, e.g. topicalization and heavy NP shift, will also need at least two features. Note that here it doesn't matter whether the process is unbounded: if you want to treat both via movement, you need at least two distinct licensee features to avoid SMC violations.

I would also think that any of the examples used to motivate the switch from TAG to set-local MCTAG should qualify (possibly even the ones for tree-local MCTAG, if you care mostly about strong generative capacity).

Alex Clark (2013-10-21 06:44):
A convincing strong generative capacity argument would be fine, as long as it isn't too theory-internal -- I think the weak/strong line isn't in quite the right place anyway.
Multiple wh-movement, maybe?

Anonymous (2013-10-21 03:50):
Do you want a weak generative capacity argument that some natural
language is a (P)MCFL but not a 2-(P)MCFL? I have to admit that I can't think of any established results right now. Even Chinese number names and Groenink's examples of cross-serial dependencies across coordinations in Dutch can be done with 2-PMCFGs. But for non-copying MCFGs they would require dimension greater than 2.
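To make the talk of "dimension" above concrete, here is a toy illustration (not from the thread): the classic non-context-free pattern {a^n b^m c^n d^m} is a 2-MCFL, because a dimension-2 nonterminal can carry a *pair* of strings whose components are later interleaved. The names P, Q, S below are illustrative, with each nonterminal modelled as a function returning its string tuple.

```python
# Toy dimension-2 MCFG for { a^n b^m c^n d^m : n, m >= 0 }.
# Rules (in MCFG notation):
#   P(ax, cy) :- P(x, y)      P(e, e)        -- P derives the pair (a^n, c^n)
#   Q(bx, dy) :- Q(x, y)      Q(e, e)        -- Q derives the pair (b^m, d^m)
#   S(x1 y1 x2 y2) :- P(x1, x2), Q(y1, y2)   -- components interleave: the
#                                            -- crossing a CFG cannot express.

def P(n):
    """Dimension-2 nonterminal deriving the tuple (a^n, c^n)."""
    return ("a" * n, "c" * n)

def Q(m):
    """Dimension-2 nonterminal deriving the tuple (b^m, d^m)."""
    return ("b" * m, "d" * m)

def S(n, m):
    """Start symbol: interleave the two pairs as x1 y1 x2 y2."""
    x1, x2 = P(n)
    y1, y2 = Q(m)
    return x1 + y1 + x2 + y2

print(S(2, 3))  # aabbbccddd
```

Going beyond dimension 2 would mean a nonterminal carrying three or more string components, which is exactly what the question above asks for evidence of in natural language.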
Alex Clark (2013-10-21 01:11):
So maybe I am not thinking clearly here.

I think there are non-CF dependencies in *English* (what the parsing guys call non-projective dependencies), but there aren't an unbounded number of them. So extraposition (is that the right term?) examples like

  "I saw a great exhibition on Saturday by Sarah Lucas"

have crossing dependencies, no? exhibition---by crosses saw---on Saturday.

I guess I think of that as a non-CF dependency.

What are the syntactic arguments for having a dimension greater than 2 in MCFG terms, though?

Anonymous (2013-10-20 23:49):
Is that actually true? Ever since Shieber's proof that Swiss German is at least a TAL, there has been little interest in showing that other languages, too, are at least TALs. People are usually hunting for something bigger, i.e. MCFLs and PMCFLs. It's also the case that disproving weak context-freeness is extremely difficult in languages with impoverished morphology such as English, simply because the underlying dependencies aren't reflected in the string. But I would wager that if you look closely enough at languages with rich morphology, you will find non-CF dependencies. It's just that nobody is interested in doing that.

Alex Clark (2013-10-20 23:23):
One extreme is to say that the formalism is very large -- say, all PTIME languages, using something like RCGs, which just involves some very innocuous functional argument from parsing -- and have every interesting constraint arise from the learning algorithm.
Or just from functional constraints on parsing (since parsing MCFGs gets harder the higher the dimension is).

But why are nearly all languages weakly context-free?

Anonymous (2013-10-20 22:54):
Well, now that I think about it, I never considered the case where something can be ruled out in an elegant way both via the formalism and via the learner, at the expense of making one of the two slightly more complicated. I'm not sure which route I would prefer in that case.

Anonymous (2013-10-20 22:53):
But isn't the point of bringing learnability and parsing into the picture that the formalism can be allowed to overgenerate in at least some respects? So moving MGs down a few notches should be a last-resort strategy.

Anonymous (2013-10-20 22:45):
Computationally, LFG (and HPSG) have been shown to be much more powerful than MGs due to the underlying feature unification mechanism. But these proofs just formalize the system as it is defined, rather than how it is used in practice, so I'm not sure what to make of them. For example, there is a translation procedure from HPSG to TAG (http://dl.acm.org/citation.cfm?id=981671) that seems to work fairly well for most grammars.
So if HPSG and LFG could actually be understood as special types of TAGs, then they could also be studied from the MG perspective in a rather straightforward manner.

Alex Clark (2013-10-20 07:41):
I am on the optimistic side too -- but there are some incompatibilities between MGs and the learnability research. In particular, the sorts of universals that we get out of learnability seem of a completely different type to the sorts of universals that linguists are interested in. I don't know if that is necessarily a roadblock to further integration.

The other problem I worry about is the MG = MCFG equivalence, because this is a very, very large class. The arguments for non-context-freeness only justify a very small move outside of CFGs. And other, non-minimalist grammars tend to use a much smaller (more minimal?) class -- namely the TAG = CCG = LIG class. So I know that the TAG guys have moved a bit further up the hierarchy over the years, but it would be nice if you MG guys moved down a bit too. MCFGs are a really big class.

Avery Andrews (2013-10-19 00:19):
I wonder how much of these developments could be carried over into LFG, which has trees, features, structure sharing and, with glue semantics, can also be endowed with feature interpretation (although nobody but me tries to develop this in any way, as far as I am aware).
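As a small addendum to the non-projectivity discussion earlier in the thread: the claim that exhibition---by crosses saw---on Saturday in "I saw a great exhibition on Saturday by Sarah Lucas" can be checked mechanically. A minimal sketch (the word indices and head choices below are illustrative, not a committed dependency analysis):

```python
def crossing(a, b):
    """Two dependency arcs cross (are non-projective with respect to each
    other) iff one arc starts strictly inside the other and ends strictly
    outside it: i < k < j < l after normalising endpoint order."""
    (i, j), (k, l) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    return i < k < j < l

# Word indices for "I saw a great exhibition on Saturday by Sarah Lucas":
# I=0 saw=1 a=2 great=3 exhibition=4 on=5 Saturday=6 by=7 Sarah=8 Lucas=9
saw_on = (1, 5)          # saw ---> on (Saturday)
exhibition_by = (4, 7)   # exhibition ---> by (Sarah Lucas)

print(crossing(saw_on, exhibition_by))  # True: the two arcs cross
print(crossing((1, 7), (2, 5)))         # False: nested arcs don't cross
```

A tree containing such a crossing pair cannot be drawn without arc intersections over the string, which is the sense in which the dependency structure is non-projective.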