In the previous post, I mentioned that there is a general
consensus that UG has roughly the features described in GB. In the comments,
Alex quotes Cedric Boeckx as follows and asks if Cedric is “a climate change
denier.”
I think that minimalist guidelines suggest an architecture of grammar that is more plausible, biologically speaking, than a fully specified, highly specific UG
– especially considering the very little time nature had to evolve this
remarkable ability that defines our species. If syntax is at the heart of what had
to evolve de novo, syntactic parameters would have to have been part of this
very late evolutionary addition. Although I confess that our intuitions
pertaining to what could have evolved very rapidly are not as robust as one
would like, I think that Darwin’s Problem (the logical problem of language
evolution) becomes very hard to approach if a GB-style architecture is assumed.
The answer is no, he is not (but thanks for asking). I’ll explain why, but this will involve rehearsing material I’ve touched upon elsewhere, so if you feel you
already know the answer please feel free to go off and do something more
worthwhile.
My friends in physics (remember, I am a card-carrying hyper-envier) make a distinction
between effective and fundamental theories.
Effective theories are those that are phenomenologically pretty
accurate. They are also the explananda for fundamental theories. Using this terminology, GB is an effective
theory, and minimalism aspires to develop a fundamental theory to explain GB
“phenomena.” Now, ‘phenomena’ is a technical term and I am using it in the
sense articulated in Bogen and Woodward (here). Phenomena are well-grounded
significant generalizations that form the real data for theoretical
explanation. Phenomena are often also referred to as ‘effects.’ Examples in
physics include the Gas Laws, the Bernoulli effect, black body radiation,
Doppler effects, the photoelectric effect etc.
In linguistics these include island effects, principle A, B and C
effects, weak and strong crossover effects, the PRO theorem, Superiority
effects etc. GB theory can be seen as a fairly elaborate compendium of these. Thus,
the various modules within GB elaborate a series of well-massaged
generalizations that are largely accurate phenomenological descriptions of UG. I
have at times termed these ‘Laws of Grammar’ (said plangently, you can sound
serious, grown-up and self-important) to suggest that those with minimalist
aspirations should take these as targets of explanation. Thus, in the requisite sense, GB (and its
cousins described in the last post) can serve as an effective theory, one whose
generalizations a minimalist account, a fundamental theory, should aim to
explain.
I hope it is clear how this all relates to the Cedric quote above, but if not, here’s the relevance. Cedric rightly observes that if one is interested in evolutionary accounts then GB cannot be the fundamental theory of linguistic competence. It just appears too complex: all that internal
modularity (case and theta and control and movement and phrase structure), all
those different kinds of locality conditions (binding domains and
subjacency/phase and minimality and phrasal domains of a head and government),
all those different primitives (case assigners, case receivers, theta markers,
arguments, anaphors, bound pronouns, r-expressions, antecedents etc., etc.,
etc.). Add to this that this thing
popped out in such a short time and there really seems no hope for a
semi-reasonable (even just-so) story.
So, GB cannot be fundamental.
BTW, I am pretty sure that I have interpreted Cedric correctly here, for
we have discussed this a lot over the last five to ten years on a pretty
regular basis.
Given the distinction between GB as effective theory and MP as aspiring to a fundamental theory, how should a thoroughly modern minimalist proceed? Well, as
I mentioned before (here) one model is Chomsky’s unification of Ross’s islands
via subjacency. What Chomsky did was (i) treat Ross’s descriptions as effective and (ii) propose how to derive these on empirically, theoretically and computationally more natural grounds. Go back and carefully read ‘On Wh-Movement’ and you’ll see how these various strands combine in his (to my taste buds) rather beautiful account. Taking this
as a model, a minimalist theory should aspire to the same kind of unification.
However, this time it will be a lot harder, for two main reasons.
First, the things MP aspires to unify have been thought to be fundamentally different from “the earliest days of generative grammar” (two points and a bonus question to anyone who identifies the source of this quote). Unifying movement, binding and
control goes against the distinction between movement and construal that has
been a fundamental part of every generative approach to grammar since Aspects (and before, actually), as has
been the distinction between phrase structure and movement. However, much
minimalist work over the last 20 years can be seen as chipping away at the
differences. Chomsky’s 1993 unification of case as a species of movement or
Probe-Goal licensing (PGL), the assimilation of control to a species of
movement (moi) or PGL (Landau), reflexive licensing as a species of movement
(Idsardi and Lidz, moi) or PGL (Reuland), the collapsing of phrase structure
and movement as species of E/I merge, the reduction of Superiority effects to
movement via minimality. All of these are steps in reducing the internal
modularity of GB and erasing the distinctions between the various kinds of
relationships described so well in GB. This unification, if it can be pulled
off (and showing that it might be has been, IMO, the distinctive contribution
of MP), would do for GB what Chomsky did for islands and the resultant theory
would have a decent claim to being fundamental.
The second hurdle will be articulating some notion of computational complexity that makes sense. In ‘On Wh-Movement,’ Chomsky tried to suggest some computational advantages of certain kinds of locality considerations. Whatever his success, the problem of finding
reasonable third factor features with implications for linguistic coding is far
more daunting, as I’ve discussed in other posts. The right notion, I have
suggested elsewhere, will reflect the actual design features of the systems that FL interacts with and that use it. Sadly, we know relatively little about
interface properties (especially CI) and we know relatively little about how FL
would fit in with other cognitive modules. We know a bit more about the systems
that use FL and there have been some non-trivial results concerning what kinds
of considerations matter. As I have discussed this in other posts, I will not
burden you with a rehash (see here and here). Consequently, whatever is
proposed is very speculative, though speculation is to be encouraged, for the
problem is interesting and theoretically significant. This said, it will be very hard and we should
appreciate that.
So, is Cedric a denier? Nope. He accepts the “laws of grammar” as articulated in GB
as more or less phenomenologically correct. Is his strategy rational? Yup. The
aim should be to unify these diverse laws in terms of more fundamental
constructs and principles. Are people who quote Cedric to “épater les Norberts”
doing the same thing? Not if they are UG deniers and not if their work does not
aim to explain the phenomena/effects that GB describes. These individuals are akin to
climate change deniers, for their work has all the virtues of any research that
abstracts away from the central facts of the matter.
Norbert: could I trouble you to provide a full list IN PRECISE FORM of 'laws of grammar' as articulated in GB? Also, could you specify the difference between 'laws of grammar' and laws of grammar? Normally, the latter
are taken to be propositions with truth values, and in fact, the value true.
What about 'laws of grammar'? For extra credit, you might provide
a reference to a work where each 'law' is characterized and presumably some justification is given for its 'law'-like status.
Paul M. Postal
Paul M. Postal, what ARE you on about? Not content to fill the previously useful LingBuzz with weekly spam papers, now you inflict CAPITAL LETTERS on us too. For SHAME (in precise form).
What is suddenly wrong with capital letters? They have been acceptable on this blog until quite recently. I cite Norbert, January 8 blog:
"Before saying a bit more, let me shout out very loudly that I AM NOT CONDONING MALPRACTICE AND DISHONESTY."
But if you have an answer to Paul's questions that would be most welcome.
Linguitude...whatever I am on about, I am willing to put my own name
on it, more than once even, by error. As for spam, sorry the concept
eludes you. Spam is unwanted material sent to people who have not
requested or ordered it. From Lingbuzz, to get a posted paper, one has
to download it, that is, in effect, order it. These points aside, I
thought your comment was an excellent non sequitur.
Hi Paul. Sorry, out the whole day and only saw this now.
I feel that the question you asked was intended to be rhetorical and that you don't think much of the idea of laws. I was thinking, of course, of something like the A-over-A principle (just kidding!!). No, seriously: precise form? I am not sure what you intend. I am happy with the formulations of GB as in Haegeman's many introductions. They are more than satisfactory for current purposes. Indeed my friends in CS departments seemed to have no trouble formalizing these yet more precisely, but to what end? So these are good enough.
I introduced the novel terminology 'laws of grammar', hence used quotes to indicate that. I intend them to be taken as laws of grammar. However, as this is not normal nomenclature, at least in linguistics, the quotes seemed condign. You probably read them as scare quotes. Cute, but not my intention.
Truth: yes, they are roughly true, accurate to a good first approximation. Indeed some of them, e.g. crossover phenomena, were, if I recall correctly, first described by you (nice work btw). These and local anaphoric licensing and principle C effects (an anaphor cannot c-command its antecedent) and islands and control etc. are all reasonably interpreted as "laws" circumscribing what is linguistically possible, e.g. no reflexives without a local c-commanding antecedent, islands cannot separate antecedents from the traces they bind, and so on. These are laws in that they limn the possible, rather than just describe the actual.
Justification? To my mind the data that are cited, e.g. by Haegeman, are pretty convincing. There are some puzzles (e.g. reflexives within picture NPs) but by and large they fit the main data, about as well as the ideal gas law fits real gases (a good analogy in my view).
Does this answer your question? I doubt it, but why not start here.
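[For the formally inclined, here is one minimal way a 'law' like the reflexive one above can be stated precisely enough to compute. This is a toy sketch only: the tree encoding, the labels, and the clausemate notion of locality are simplifications for illustration, not Fong's or anyone else's published formalization.]

```python
# Toy sketch of "no reflexives without a local c-commanding antecedent"
# (Principle A as informally stated above). Everything here -- the node
# labels, the clausemate binding domain -- is a simplification.

from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    reflexive: bool = False      # toy lexical annotation
    referential: bool = False    # an eligible antecedent?
    parent: object = None

def link(root):
    # record parent pointers so we can walk upward
    for child in root.children:
        child.parent = root
        link(child)
    return root

def dominates(a, b):
    return a is b or any(dominates(c, b) for c in a.children)

def c_commands(a, b):
    # a c-commands b iff neither dominates the other and a's first
    # branching ancestor dominates b (the textbook definition)
    if dominates(a, b) or dominates(b, a):
        return False
    p = a.parent
    while p is not None and len(p.children) < 2:
        p = p.parent
    return p is not None and dominates(p, b)

def local_clause(n):
    # smallest clause ('S') containing n: our toy binding domain
    p = n.parent
    while p is not None and p.label != "S":
        p = p.parent
    return p

def all_nodes(t):
    yield t
    for c in t.children:
        yield from all_nodes(c)

def principle_A_ok(reflexive):
    domain = local_clause(reflexive)
    return any(n.referential and c_commands(n, reflexive)
               for n in all_nodes(domain))

# "Harriet described herself" vs. "*Herself described Harriet"
herself1 = Node("NP", reflexive=True)
good = link(Node("S", [Node("NP", referential=True),
                       Node("VP", [Node("V"), herself1])]))
print(principle_A_ok(herself1))   # True: the subject locally c-commands it

herself2 = Node("NP", reflexive=True)
bad = link(Node("S", [herself2,
                      Node("VP", [Node("V"), Node("NP", referential=True)])]))
print(principle_A_ok(herself2))   # False: nothing c-commands the reflexive
```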
So I think the root of our disagreement is in the final sentence of your post. What is the central fact of the matter? What is the central phenomenon that linguistics should explain?
I am in a minority here because I think the fundamental empirical problem of linguistics is to account for language acquisition, and not to account for "island effects, principle A, B and C effects, weak and strong crossover effects, the PRO theorem, Superiority effects etc."
So from my point of view, GB (as reconstructed by you without parameters) is not phenomenologically correct because it does not account for the principal phenomenon to be explained (or, I guess, using traditional terminology, because it does not attain explanatory adequacy).
From *my* point of view, a theory of grammar without a learning theory is abstracting away from the central fact of the matter, and is fundamentally inadequate as a theory of language.
There is a quote by Keller and Asudeh that puts this very well: "A generative grammar is empirically inadequate (and some would say theoretically uninteresting)
unless it is provably learnable. Of course, it is not necessary to provide such a
proof for every theoretical grammar postulated. Rather, any generative linguistic framework
must have an associated learning theory which states how grammars couched in this framework
can be learned."
I accept that this is a minority view but, for instance, Chomsky in 1973 says "The fundamental empirical problem of linguistics
is to explain how a person can acquire knowledge of language". I think he was right then; I don't know if he still holds that view.
Dear Alex,
Since Norbert so kindly made sure everyone knows I am not a climate-change denier, let me try to make sure everyone knows he cares a lot about language acquisition (in fact, I think he cares more about it than most).
You are not in a minority, Alex. The central problem is still language acquisition (in fact, some of us have been trying hard to relate Darwin's problem and Plato's problem), but I think it's wrong to say that the whole point is "to account for language acquisition, and not to account for island effects, principle A, B and C effects, weak and strong crossover effects, the PRO theorem, Superiority effects". You can't care about one without caring about the other. One cares about island effects, Binding principles, etc. not because they are intrinsically interesting (okay, maybe some people do, but if so, I feel sorry for them). One cares about these things because these principles make language acquisition possible.
In an attempt to keep this reply short, let me quote a passage from a little-known paper by Andrew Nevins that frames the issue better than most:
"Many critics of domain-general learning theories argue that domain- general architectures, such as neural networks, secretly “build in” a lot of innate structure, so that their simulations which appear to be learning complicated linguistic phenomena are doing so with a headstart. I want to argue that these critics are focusing on the partly-full nature of the glass, but that what is not there is even more important. What allows connectionist networks to succeed, when they do, is not what they have been built to bring to the task, but rather what they are specifically built not to bring to the task. It is for this reason that every connectionist network that has ever been programmed into a computer to learn, say, how the past tense is formed in English, through statistical tendencies, has already been built with no predicates or functions that can count the number of syllables in the input, with no representation of prosodic stress, and with no subroutines that can determine whether the word is palindromic. If connectionist networks kept statistics about every linguistic property inherent in the data, they would never be able to make any generalizations.
The function of what is called Universal Grammar, then, is not really to provide a grammar, but rather to provide a set of constraints on what can and can’t be a possible grammar."
(taken from Nevins, "Phonological ambiguity: What UG can and can’t do to help the reduplication learner," MIT Working Papers in Linguistics 48: Plato’s Problems: Papers on Language Acquisition, 113-126.)
If you don't care about island effects, etc. you can't be claiming to care about language acquisition, because you'd be ignoring the conditions that make learning possible in the first place. (The principles guide the child: Don't do this, don't do that. That's why most of them have a negative format; cf. Chomsky 1973, which you mention: "No rule can relate X and Y ...")
This said, I think that we would also like to know how exactly island principles, binding, etc. are biologically implemented. Here, I think, GB does not provide the right format (that's the point of the passage you kindly quoted from my work). Do 'minimalist' versions of these laws provide a better format? I think so, but I won't be trying to defend it here. Nor will I be trying to defend that minimalist versions of these laws ought to please many 'non-Chomskyan' linguists, because some of them converge with what those guys have been saying for a while. Topic for another post, maybe.
--Cedric Boeckx
Thanks Cedric, that is helpful. I find myself generally in agreement with your UG from below strategy.
But ...
I think you are conflating the problem and the solution.
If the problem is language acquisition, then yes, one solution is to build things like Principle B into the genome. That solves two problems, the acquisition problem, and the problem of why Principle B occurs in all languages. (assuming for the moment that Principle B correctly describes the observational facts and is universal).
But it creates a new problem: Darwin's problem. How did this get into the genome? But yes, this makes learning possible.
But one certainly can be interested in the problem of language acquisition without thinking that this is the right solution or the only solution; indeed that is my position.
From my perspective, Principle B is part of the problem, not part of the solution. How can we account for the acquisition of this non-obvious property? My answer -- I don't know.
If there is a debate about what the fundamental problem is; say whether it is A or B, then this affects the appropriate research strategy. If we have model 1 which has a partial answer to A but no answer to B, and model 2 which has a partial answer to B but no answer to A, then if you think the fundamental problem is A then you would prefer model 1 and so on.
So given a choice between a model that has a plausible learning theory but that fails to account for the acquisition of Principle B, and a model which has Principle B built in, but has no learning theory, then I prefer the former. But Norbert *clearly* prefers the latter. And this makes me question how sincere Norbert (and you and Chomsky) are about learning/acquisition being the fundamental problem.
If you have a theory that completely fails to account for some problem A, and yet you are so sure that your theory is true that you call your opponents climate change deniers, then surely you can't think that problem A is *the fundamental problem*. You could only be so sure of your theory if you felt that problem A was peripheral and secondary.
Alternatively, for example, Yoshinaka has shown that one can learn MCFGs, which are grammars with structurally sensitive movement, equivalent to Minimalist Grammars,
(see a recent paper by Ed Stabler http://www.linguistics.ucla.edu/people/stabler/StablerEK12.pdf for some interesting discussion).
So this model does not account for Principle B -- but it has the bones of a plausible learning theory.
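[For readers who have not met MCFGs, the key move is that a nonterminal derives a tuple of strings rather than a single string, which is what buys the crossing, movement-like dependencies. Below is a minimal sketch of the idea: the textbook a^n b^n c^n d^n grammar written as plain functions. It illustrates the formalism only, not Yoshinaka's learning algorithm.]

```python
# A 2-MCFG in miniature: the nonterminal A derives *pairs* of strings.
# Rules: A(eps, eps);  A(a x1 b, c x2 d) <- A(x1, x2);  S(x1 x2) <- A(x1, x2)
# This generates {a^n b^n c^n d^n}, which is beyond context-free power.
# Illustration of the formalism only, not Yoshinaka's learner.

def A(n):
    if n == 0:
        return ("", "")
    x1, x2 = A(n - 1)
    return ("a" + x1 + "b", "c" + x2 + "d")

def S(n):
    x1, x2 = A(n)
    return x1 + x2

print([S(n) for n in range(4)])
# ['', 'abcd', 'aabbccdd', 'aaabbbcccddd']
```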
So the *methodological* point is: here we have what is in my view a significant step towards solving what I think is the fundamental problem, and I hope that one can develop this towards explanations of the *secondary phenomena*, like island effects. Maybe something like the Pearl/Sprouse approach, maybe a Sag/Hofmeister reduction to processing -- I don't know; it's not that I don't care, it's just not fundamental.
So there is a choice between Stabler type MGs plus a Yoshinaka type learner,
and Norbert's GB with no learning theory at all.
And which you choose depends what problem you think is more fundamental.
(Putting on my learning theorist hat, your Nevins quote is a fair comment on the limitations of late 80s neural networks; I am not a big fan of neural networks either but a *lot* has changed since then. In particular we now understand that one can generalise with a huge number (indeed with an *infinite* number) of features if one controls what is called the 'capacity' of the learning machine. So that argument is based on a technical assumption which seems to be false in general, even though it may apply to some learning algorithms).
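[To make the capacity point concrete, here is a minimal sketch; it assumes scikit-learn, and the dataset and the particular values of C are arbitrary choices for illustration. An RBF-kernel SVM implicitly works with infinitely many features, yet it generalizes because the regularization parameter C bounds the machine's effective capacity.]

```python
# Sketch of generalizing with infinitely many features via capacity
# control (assumes scikit-learn). The RBF kernel corresponds to an
# infinite-dimensional feature space; what governs generalization is
# the capacity knob C, not the raw feature count.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in (0.01, 1.0, 100.0):   # low C = tighter capacity bound
    clf = SVC(kernel="rbf", C=C).fit(X_tr, y_tr)
    print(f"C={C}: test accuracy {clf.score(X_te, y_te):.3f}")
```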
Norbert here with just a brief comment: It seems to me that you are agreeing that something with the effect of Principle B accurately describes acquired grammars, at least to a first approximation. Good, we agree. Now what you don't like is postulating Principle B as innate. Given minimalist scruples, I don't like it either, but I believe that this means that I need to find a way to replicate its effects in some more acceptable minimalist way in order to have an adequate account. Now, in this case, I have actually worked on some alternatives and think it may be possible to have the effects of principle B without principle B in a system where all dependencies are products of Merge (E-merge for local dependencies and I-merge for non-local ones). If this can be done, then we can have B effects without an explicit statement of B.
Note that this just restates the strategy of treating GB as an effective theory. These are targets of explanation by a fundamental theory. If we can all agree that GB is effective in this sense, then we can all agree that the next step is to find a better story that gets the effects of B (and the rest of the binding theory) without explicit statements of these principles.
Last point: at any given time we are involved in multiple projects. It is perfectly reasonable to say that at present the story I am telling has nothing deep to say about X. That's par for the course. We of course all pray for messianic times when all will be revealed but... However, it is another thing to say that how to explain these phenomena is not part of the project or that our project can ignore these forever or that someone who takes these problems seriously is wrong. Nope. That's not playing fair.
One last point: Norbert has a learning theory. He even outlined what a GB learner would have to learn. What you don't like is that the hypothesis space is very highly structured and that all that needs learning is what is a pronoun and what an anaphor. This I suspect can be learned by a pretty stupid Bayesian learner. As all else is "innate", that solves the learning of binding phenomena problem. You rightly observe that this raises Darwin's Problem. Yes, that's why I would like an alternative. But it's a bit unfair to say that there is no learning theory. It's just that the learning theory, given a rich theory of UG, might be pretty trivial.
Thanks Alex for the reply. Quick reactions:
1. If your answer to the problem is "I don't know", you won't be surprised to hear that I don't buy it.
2. I am not confusing the problem and the solution. In fact, your reply suggests I am not: "If the problem is language acquisition, then yes, one solution is to build things like Principle B into the genome. That solves ... the acquisition problem"
3. The quoted passage attributes to me something that I did not say. I am referring here to the "genome" part. One should not confuse nativism and geno-centrism. Non-genomic nativism is possible, and that's where some of my money is (I think some of Chomsky's money is there too, if I read him correctly about "third factors")
4. I grant you that we have to worry about Darwin's problem (or, more generally, the biological implementation of FL). And it's tough, and I think you and I agree that whatever the solution to that is, it won't have a GB format. Could it have a minimal(ist) format? That's in fact where everyone's money is these days, Chomskyans and non-Chomskyans alike. Could the solution to that problem amount to lots of generic properties (shared across species and cognitive domains)? I very much think so, but as you know tinkering in biology could give rise to interesting interactions of old parts in new configurations, giving rise to emergent phenomena. Could these emergent things do the work GB principles did? I think so (although here I may be in the minority). This is what I had in mind when I wrote the last paragraph of my earlier reply.
5. Regarding this part of your reply: "If there is a debate about what the fundamental problem is; say whether it is A or B, then this affects the appropriate research strategy." You allude to "a model that has a plausible learning theory"; which model is that? (remember point 1, you can't tell me "I don't know") And (if such a model exists), are you sure it's not compatible with minimal(ist) positions in my point 4? The plausible learning theory will have to have some priors (biases, etc.). Could these be what minimalists are working towards? I once heard Eric Reuland say that GB laws are too language specific/modular to be true [he is right], but he also said that GB laws are too good to be false. Norbert likes to insist on this (I see he already posted something to this effect).
6. Your point "this makes me question how sincere Norbert (and you and Chomsky) are about learning/acquisition being the fundamental problem": you are right in a sense. I'd say (I think Noam and Norbert would agree) that "learning/acquisition" is not the right way to formulate the fundamental problem, "development/growth" is. I think Paul (Pietroski) posted on this theme, so I won't expand on it here.
Cedric:
1) Yes, I understand. But all theories have only partial coverage and only explain a subset of the phenomena in question (e.g. dark matter in modern cosmology). Does GB have an adequate account of Suffixaufnahme for example? No, but that is not a reason to reject it completely. Rather we compare partial and incomplete theories on how well they account for what we think are the fundamental
phenomena. Now for me, acquisition is fundamental and Principle B is not. Indeed for me Suffixaufnahme is more fundamental than Principle B (long story, but see below).
2) Agreed: the problem is language acquisition and not to be confused with solutions of which there are *potentially* many, even if you only like one.
3) Non-genomic nativism and 3rd-factor principles -- these *really* need some fleshing out to get beyond hand-waving. I am receptive to these ideas in principle and if you have some pointers to a technical literature on this then I am all ears.
4) Absolutely -- one of the reasons I am on this blog and making a nuisance of myself is because I like the rhetoric of MP about minimising UG. I have doubts about how this is cashed out though in detail.
5) I had in mind the sort of general learning theory I sketch in
e.g. my invited paper at ALT 2010 -- "Towards general algorithms for grammatical inference",
which leads to specific algorithms like that described in "Beyond semilinearity: Distributional learning of parallel multiple context-free grammars" with Yoshinaka in ICGI 2012 (which can learn suffix stacking).
Both available online.
So to be clear these are again partial and incomplete, and not the whole story -- they need other components (a stochastic component,
a component that turns them into strong learners and so on), but these are sketched in other papers, some under preparation, but I can send you drafts if you want.
And personally I think that it is, as you conjecture, highly compatible with minimalist principles, but it has one major sociological problem -- the techniques derive from American structuralist linguistics and so it is absolute anathema for generative linguists of a certain generation.
6) You are correct that we discussed this earlier -- I thought everyone was ok with acquisition as a neutral general term.
Bear in mind that what I am talking about includes learning the lexicon and morphology as well as syntax, so may be much broader than what Chomsky and you are talking about. Even Chomsky accepts that the word "learning" is appropriate for lexical acquisition (e.g. intro to LGB).
So again I accept in principle the possibility of 'brute-causal' models of language acquisition -- e.g. Paul's nice example of a butterfly wing colour being triggered by temperature. But again that is just a logical possibility, not a proposal, and I thought triggering had largely been abandoned (along with parameters). And in the absence of some details, I don't think the terminology matters much.
Norbert:
Because of length this comment requires n posts.
Norbert:
It was good of you to take the trouble to respond at some length to my obviously
outsider comments. A few points:
(1) on the minor point of “laws” vs. laws. When someone takes the trouble to burden their text with quotes around term X, I take it there is a reason, and in particular, the intention to distinguish “X” from X, usually involving some kind of hedging. But you say not and “laws” are just laws. Fine.
(2) Then you conclude with no basis that I ‘don’t think much of the idea of laws’. Of
course I do. You continue by pointing to unspecified friends in unspecified CS departments who have no trouble formalizing GB principles ‘more precisely’ in
unspecified work. The following ‘but to what end’ expresses a certain disdain, does it not, for the importance of precision.
On that point, apparently your enormous admiration for Chomsky’s work nonetheless
leaves you unimpressed with the following declaration:
“Precisely constructed models for linguistic structure can play an important role, both
negative and positive, in the process of discovery itself. By pushing a precise but
inadequate formulation to an unacceptable conclusion, we can often expose the exact
source of this inadequacy and, consequently, gain a deeper understanding of the
linguistic data. More positively, a formalized theory may automatically provide
solutions for many problems other than those for which it was explicitly designed.
Obscure and intuition-bound notions can neither lead to absurd conclusions nor provide
new and correct ones, and hence they fail to be useful in two important respects. I
think that some of those linguists who have questioned the value of precise and
technical development of linguistic theory have failed to recognize the productive
potential in the method of rigorously stating a proposed theory and applying it strictly
to linguistic material with no attempt to avoid unacceptable conclusions by ad hoc
adjustments or loose formulation.” Syntactic Structures, page 5.
Curious that. While my admiration for the author is currently a bit less than yours, I have nonetheless from the beginning found the content of this quote to be entirely correct and enormously important. Hence I never take the request for precision to be frivolous for the reasons he gave.
(3) Still, we haven’t gotten to substance or any laws. In this area, I find your prose
maddeningly vague and allusive. While you bothered to produce a 314-word response, instead of writing down some law(s) you refer the reader to unspecified works of Haegeman. I think I used to have one but gave it away, and have no access to any currently, so this is not terribly helpful.
(4) Then you get to something with some substance.
“Indeed some of them, e.g. crossover phenomena, were, if I recall correctly, first described by you (nice work btw). These and local anaphoric licensing and principle C effects (an anaphor cannot c-command its antecedent) and islands and control etc. are all reasonably interpreted as "laws" circumscribing what is linguistically possible, e.g. no reflexives without a local c-commanding antecedent, islands cannot separate antecedents from the traces they bind, and so on.”
Hmm, I thought that a reference to Haegeman would have sufficed, but I guess not. As for CS formalizations, you no doubt know Fong's early implementations of GB. This is what I was referring to.
We agree on our admiration for a certain author. However, though I value precision, I actually also believe that how precisely something needs to be stated is a function of why you are interested in it. As continuing this sort of discussion will not enlighten, I will move on to your second post.
This passage is intended to provide the content of your claim that there are GB laws of grammar. Let’s go over them:
a. some of them, crossover phenomena,
Comment: phenomena are not laws. So this is a null response. Moreover, the persistent claim that Principle C explains the strong crossover phenomena is the subject of an entire chapter of my 2004 book, Skeptical Linguistic Essays. This argues that the claim is untenable. Never responded to by anyone as far as I know.
b. local anaphoric licensing. This is explicated as: “e.g. no reflexives without a local c-commanding antecedent.”
Comment: I take the latter phrase to be the law. It is far from clear since from the beginning, no characterization of ‘anaphor’ was given independent of associated principles like this. I ignore that. What I would say, in fact have said in an article with Haj Ross, is that the claim is false. ‘Inverse Reflexives’ in the 2009 festschrift for Terry Langendoen: Time and Again, John Benjamins, Amsterdam. Also never responded to as far as I know. In it we describe French, Albanian and Greek simple
clauses which arguably violate your formulation. These cases reveal inter alia a
generalization. Roughly, the claim is not valid in general when there is a ‘derived’ subject (as in e.g. passives) which is reflexive with the antecedent in some nonsubject position. Interestingly, one can see this reflected even in English. The pair:
(1) *Herself was described by Harriet to Arthur.
(2) *Herself described Harriet to Arthur.
work just the way your formulation claims they should. But consider:
(3) It was herself that was described by Harriet to Arthur.
(4) *It was herself that described Harriet to Arthur.
One sees the same effect manifested in the non-English simple clauses I referenced.
When one fills in the traces that the views under discussion posit, one will see that your principle claims that (3) is like (4), when it manifestly is not.
c. Principle C effects: an anaphor cannot c-command its antecedent.
This comes reasonably close to having a law like character.
Alas, it also crashes against (3).
A last point on reflexives, etc. You mention ‘a few puzzles’, citing the well-worn picture noun cases. The subtle implication is that this pretty much covers it. Cases
like (3), etc. aside, this is entirely wrong. There are massive numbers of puzzles in many languages...for instance, the whole now large literature on ‘long distance
reflexives’ consists of such, these being cases which the principle you state in effect says don’t exist. I might add that although it is often suggested that English lacks
long distance reflexives, it in fact has a variety of them beyond picture noun cases:
a sample:
(5) Winston claimed that himself, ordinary people could never understand.
(6) It was herself that Mary claimed Tod did not understand.
(7) My book compared no book other than itself to your book.
(8) Claudine treated you as inferior to herself.
(9) That author claimed there would always be himself for you to count on.
(10) No woman believed that anyone but herself deserved the position.
Many such cases are discussed in an article by me entitled ‘Remarks on Long-Distance Anaphora in English’, in the hardly field-centric journal Style, Volume 40, 2006. It
was part of an odd festschrift for Haj Ross.
d. islands cannot separate antecedents from the traces they bind,
There is obviously something right here...but this ignores issues of weak/selective islands and all their associated problems. One point is that the particular
formulation has a GB flavor but of course the basic idea goes back to Haj Ross’s
thesis and had nothing to do with GB.
Sorry for the megaverbosity but at least I didn’t use too many CAPS.
It seems that we agree that Principle C has a rough law-like character (but for (3), to which I return anon). That, I take it, means that it is empirically adequate over a pretty wide domain of cases and there are some apparent problems. This, so far as I know, is often the case with laws of nature; again, think the Gas Laws, the Germ Theory of disease (i.e. germs cause disease), Newton's laws etc. Most laws that are not fundamental have exceptions. The question is always whether this vitiates matters or is to be tolerated as an anomaly, noted and we move on. For my current concerns, I can tolerate an anomaly or two, though I would love to see them solved. You know, 10% empty, 90% full. So, it looks like we agree more or less on principle C (which, btw, has the standard Evans counterexamples as well, which require rethinking of what antecedence is).
OK, the anaphor binding cases. Again, we agree over a pretty large domain. The cases you bring up are interesting, less the ones in (5)-(10) than the ones in (1)-(4). The former interest me less because I am not sure I believe that they are "true" anaphors. For example, for me, these are not in complementary distribution with pronouns, something that I take to be a diagnostic for "true" reflexives. As these seem fine to my ear with pronouns replacing reflexives, I will set them aside.
OK, the first four cases. First, with focus on 'herself' in (1) I get the same judgments as I get for (3). They are not as bad as (2) and (4), though without the focus on the reflexives, they are not terribly good either. Have you done a careful evaluation of the judgment? I agree with the contrast, but how good is (3)? I ask because neither one of us is famous for the quality of his judgments. At any rate, say that it is good. Yes, it would be a puzzle for principle A as standardly stated, hence worth thinking through carefully. However, here's one thing one should not do: throw out principle A because of this, for then what do you do with the standard cases, cases where the data are quite a bit crisper than here? Where does this leave me? I agree it would be nice to refine principle A so that it accommodated (3) (and (1) with focus). That said, the refinement will leave the core cases the same, so I conclude that A is roughly correct.
Islands? OK, let's credit Ross (which I believe I did by calling them Ross's islands). All GB did is reanalyze these in terms of subjacency. You probably don't think this an advance, I do, but there are lots of problems with the account even for someone like me. Selective islands (mainly wh- and neg-islands) are interesting and, as you know, at least for wh-islands, even the standard theory of subjacency needs a separate assumption to bring these in line with the strong islands. So there is no truly unified treatment. At any rate, I am happy with Ross's description which, again, I take to be a pretty accurate depiction of a law of grammar.
I might add that I am glad that you think that these should be treated as laws. A point of terminological agreement.
Interesting discussion, I have 2 questions for Norbert:
1. It seems you and Paul disagree about what qualifies as a 'law of grammar'. Would it be fair to say that you have the kind of laws in mind Cedric Boeckx [2009] called Galilean [and contrasted with Aristotelean]? Or something even less definite - we could call them Darwinian based on the paraphrase of his species definition: a species is whatever a competent naturalist wants it to be?
I ask because it is not entirely clear to me from this passage:
Most laws that are not fundamental have exceptions. The question is always whether this vitiates matters or is to be tolerated as an anomaly, noted and we move on.
Correct me if I am wrong but it seems to suggest that there are at least 2 different kinds of laws: fundamental laws and non-fundamental laws. It seems the laws you currently are interested in all have exceptions, while those that are fundamental do not have exceptions? If this is the case, are there any fundamental laws of grammar? And if so can you provide examples? For me just names will do.
2. You note that you and Paul have different judgments about some of the cases Paul raises. This reminds me of the earlier discussion about evolutionary issues. Assuming you agree that both you and Paul are highly competent native speakers of English, how can we explain the difference in your judgments? Is there a difference between your respective I-languages that somehow [never mind details here] is manifested in the genome? Or are those merely performance differences and you both share the same I-language in spite of the apparent differences in judgement about important grammatical issues?
I have nothing sophisticated in mind, just basic philo of science 100 stuff. There are laws at many levels. Fundamental laws will be exceptionless and will explain why we get the apparent exceptions we find in the non-fundamental laws. One of the nice indications that we are getting somewhere is when an exception to an apparent law gets explained at the more fundamental level. To my mind, the laws as outlined by GB are effective, not fundamental. I don't like exceptions, but at this level I would not be surprised to find some. As I said, this happens in the real sciences (i.e. physics) regularly and the world does not come to an end. In current linguistics, I think that principles A and B are not fundamental. There are many anomalies. Hopefully as we understand things better and get better accounts (some minimalist ones come to my mind as I write) some of these anomalies will be explained in a principled fashion.
So 2 different kinds of laws? In a sense: there are laws of a fundamental theory and those that are approximately true in an effective theory. We hope that the approximations are explained as we get more and more fundamental.
As per 2: Alex Drummond is likely correct. No two particular Gs are the same. They are the product of many factors. So all UGs might be the same without any two Gs being identical. As for judgments, they are influenced by a huge number of factors (see Jon Sprouse on this, a.o.) and so even with similar Gs we may not get identical judgments. This is no surprise. It's true for hearts and kidneys too. So, different judgments, no biggie, I don't think.
Thanks for this. Just to make sure I understand you correctly. You write:
"One of the nice indications that we are getting somewhere is when an exception to an apparent law gets explained at the more fundamental level."
In this case the apparent law turns into an effective law?
"As I said, this happens in the real sciences (i.e. physics) regularly and the world does not come to an end."
Does this mean linguistics is not a 'real science'? What kind of science is linguistics then?
"There are many anomalies. Hopefully as we understand things better and get better accounts (some minimalist ones come to my mind as I write) some of these anomalies will be explained in a principled fashion."
You say here that there are many anomalies and that hopefully at one point some of these will be explained. Let's hope so indeed, though I would feel better being given a concrete example in which anomalies actually have been explained in a principled fashion. But what about the other anomalies that are still not accounted for at that later point? If there is always something that is left unexplained by your laws of grammar how do I know they are the right ones vs. some other possible laws?
I also note that you have not answered my question for any currently known fundamental laws. Does this mean there are none [currently known]?
Lastly [I am citing here from Chomsky's "Lectures on Government and Binding"]:
"In many cases that have been carefully studied in recent work, it is a near certainty that fundamental properties of the attained grammars are radically underdetermined by evidence available to the language learner and must therefore be attributed to UG itself" [p.3].
This was published in 1986 but "is based on lectures [Chomsky] gave at the GLOW conference...in April 1979". [p. vii]. One would assume the 'recent work' has been completed 35 years ago. Given that Chomsky also speaks of a "rapidly developing field" [ibid.] surely in those 35 years we must have learned enough about the fundamental properties that must be attributed to UG itself to be able to name at least some of them and give a principled account.
You say "We hope that the approximations are explained as we get more and more fundamental." but according to Chomsky we had already in 1979 near certainty about fundamental properties of UG and the field has been rapidly developing since. So why so hesitant when I ask for the names of a few fundamental laws?
I can see how I confused you, sorry. Effective laws may not be perfect but they are likely pretty good. They may have anomalies. These, hopefully, will be resolved when the effective laws are accounted for in a more fundamental theory. In the "final theory" (Weinberg's nomenclature) all laws will be perfect. However, a mark of one theory being more fundamental than another is that the anomalies of the latter are explained by the former.
Linguistics is a science. However, I sometimes, tongue firmly in cheek, distinguish between the "real sciences" (aka physics and parts of chemistry and molecular biology) and everything else. Why? Because they have made serious discoveries of real depth. I also think that linguistics has made some discoveries of real depth, but as I get a sore neck when I pat myself on the back, I don't often fess up to this. At any rate, take 'real science' to mean roughly physics.
Fundamental laws in linguistics? I have a few candidates: e.g. structure dependence: all syntactic processes are structure dependent. Another that smells plausible to me is a version of relativized minimality (a dependency can't relate X and Y across a Z with the same features). I have further proposals of my own but I am pretty sure that these would not be widely shared. If there is a fundamental theory out there, it looks to me like it will look minimalist. The problem is that there are relatively few minimalist theories out there. One of the things I am less in sympathy with than I used to be is the idea that minimalism is a program, not a theory. True so far as it goes. But if the program is to be fecund it had better generate some theories. I have a couple of proposed dogs in this hunt that I think do pretty well. But this judgment is very controversial (and likely self-serving). So, best to say that, at present, we have very few well-grounded fundamental laws, which is precisely why we need to start looking for more.
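[That minimality candidate, at least, is easy to state precisely. A toy sketch: the linear notion of intervention and the feature labels here are simplifications for illustration, not a worked-out proposal.]

```python
# Toy rendering of relativized minimality as stated above: a dependency
# between X and Y is blocked by an intervening Z bearing the same
# feature. Positions, features, and the (linear) notion of intervention
# are deliberate simplifications.

def minimality_ok(positions, x, y, feature):
    """positions: a list of feature sets, indexed left to right."""
    lo, hi = sorted((x, y))
    return not any(feature in positions[i] for i in range(lo + 1, hi))

# wh-island flavour: a wh element sits between the matrix wh position
# and the embedded wh element, blocking the dependency.
sentence = [{"wh"}, set(), {"wh"}, set(), {"wh"}]
print(minimality_ok(sentence, 0, 4, "wh"))  # False: position 2 intervenes
print(minimality_ok(sentence, 2, 4, "wh"))  # True: no wh intervener
```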
Last point: physicists have been looking for a fundamental theory since it was discovered that relativity and quantum mechanics were incompatible at very short distances. They have been looking for over 50 years, if not longer. Some questions are hard. The ones in linguistics are very hard, for we know so little about the interacting mental modules. However, we should aim high even when we understand that the problems will be hard to resolve. I think we have found out a lot since LGB, but I believe that we are still quite far from figuring out what the fundamental laws are.
A truly last point: I find the attitude you convey in the last two paragraphs of your note odd. Why would you think that a field that is all of 60 years old should have reached the point of knowing its fundamental properties? Would you use the same standards for biology or even physics? Remember, linguistics is a very young science and standards should be appropriate to the age of the enterprise and the difficulty of the problems. C, you often ask too much and therefore miss what's been accomplished. A modicum of modesty and a little charity in judgement would allow you to appreciate just how much has been discovered.
Thanks for the clarifications. I am glad to learn that linguistics IS a real science, just maybe a bit younger than physics. But you sell yourself short as "a field that is all of 60 years old". In one of my favourite works [Cartesian Linguistics] Chomsky writes that his own work is in important ways a “rediscovery of much that was well understood in [the Cartesian] period” (Chomsky, 1966, p. 1) and he continues to stress his indebtedness to these roots by saying that “this return to classical concerns has led to a rediscovery of much that was well understood in...the period of Cartesian Linguistics” (Chomsky, 2009, p. 57).
Further, it was you, not me, who suggested the parallels to physics:
"My friends in physics ... make a distinction between effective and fundamental theories. Effective theories are those that are phenomenologically pretty accurate. They are also the explananda for fundamental theories. Using this terminology, GB is an effective theory, and minimalism aspires to develop a fundamental theory to explain GB “phenomena.”
Now, given that you are right that physics has a much longer history than generative grammar, maybe it would be better to compare the latter to, say, genetics, which had a similarly major breakthrough to Chomsky's SS in April 1953, when James Watson and Francis Crick published the paper that presented the structure of the DNA helix - which gave rise to modern molecular biology. So maybe you can situate the discoveries of generative grammar in the context of those in molecular biology?
And one last question. Towards the end of the Pisa lectures Chomsky remarks:
"Evidently this is only a sketch of a true formalization. Further elaboration is necessary for FULL PRECISION, and it is necessary to ensure that other properties of the system are so narrowly and specifically constrained that they do indeed entail the consequences outlined. But the basic properties of the system we have been developing are incorporated into this account and IT IS FAIRLY CLEAR how to proceed to fill the gaps" {Chomsky, 1986, p. 335, my emphasis]
So as in the quote Paul gave, Chomsky emphasizes the need for precision and he also states that it is fairly clear how to fill gaps that existed in 1986. I am sure these gaps have been filled by now and you are in a position to share a precise account?
LGB was a precise fiction of the content of the Pisa lectures. Pretty decent versions are in Haegeman's various intro texts. There is a formal version good enough for CS reasons in Sandiway Fong's work, formal enough to run the programs and get the right results for a large range of cases. Frankly, making the various proposals precise enough for this has not been a big problem, ever: e.g. Marcus did this for parts of the theory of movement including subjacency, as did Berwick for various aspects of grammar, as has Stabler. The theories are not hard to make precise. However, there are classes of puzzles, like some Paul noted, that persist, and the problem is not making things precise.
Comparisons to modern post-W&C biology? I don't know tons about the latter, but my understanding is that we are still pretty much in the dark about the high-level details of the code. We know the very basics, but not how information, say about a given trait, e.g. height, body mass, gait style, is actually coded in the genes. Similarly in linguistics, we know a lot about the basics of phrase structure, the basic operations, some features of what is likely innate, the class of grammatically significant dependencies, but a lot is still very obscure. We don't know much at all about how brains code language, but that's true in virtually every domain of cognition, not just language. Indeed we don't know how object-centered descriptions are coded.
In both language and genetics we are pretty much unclear about how specific organisms use the DNA instructions/UG to grow. These are questions at the frontier. We know what's used, but details of how are obscure.
I know that you asked your question very tongue in cheek. Good. I've always thought the analogy to modern biology would be a good one. Gallistel has farmed this intensively. I suggest you look at what he has to say, especially about the relation of the mental and brain sciences.
Damn autocorrect: 'Precisification' became 'precise fiction.' Hmm, come to think of it, maybe this was divine intervention.
Actually I was not joking at all and it's not clear why you would think so. I had expected to hear something about an accomplishment comparable to say the completion of the human genome project in 2003. Obviously in any working field we do not expect final, curtain-drawing kinds of results but we do expect measurable progress. 50 years ago using genetic evidence to solve crimes would have been impossible, now it is routine. I can use genetic tools to trace my ancestry etc. etc. Surely some results from work in generative grammar must have found applications on a similar scale by now - can you just give an example or two?
Regarding the precision issue: you are of course right that if persisting puzzles are already detectable at a fairly coarse level of analysis we do not need more precision. What we need is to find a way to either eliminate the puzzles or modify our theory to accommodate them. So is there any [even if currently still tentative] suggestion how to go about eliminating the puzzles you mention?
Oh, technological applications? There are a few, though not many. There is recent interest in using grammar more effectively in automatic translation. Oddly, even Jelinek was partially converted to its utility. There are also low-level grammar checkers in your word program. But none of these use much in the way of generative grammar. So, if that's the benchmark, I know of very little of interest. Interestingly, however, it seems the same was true of celestial mechanics after Newton. The big problem was finding longitude. They tried everything they could using celestial mechanics. Turned out what you needed was good portable clocks. I also hear that general relativity has exactly one application to its name, the GPS system.
So, I have nothing to offer on that front, sadly. It would be good for linguistics if that were not so. But right now it is, so far as I know.
As for eliminating the anomalies, yes there is some work on some. The stuff on logophoric reflexives has resolved some of the problems, in my view. If this is a category then some of the apparent principle A violations are eliminated. Some of the problems on island, especially stuff on apparent strong island violations in Swedish have been reanalyzed by Dave Kush as small clause complements. There has been tons of work on weak islands and some by Szabolcsi is very useful in making the cut Paul hinted at. These would not be standard island violations at all. So yes there has been work resolving some of these problems.
Oh, one last point. There is an equivalent of the human genome project (intellectual equivalent, not monetary), which is the cartography project in Europe. Rizzi, Cinque etc. have done a lot of filigree comparative work on the grammars of a lot of languages. And I am told that Cinque's work suggests something like a universal base hypothesis: phrases nest in the same way cross-linguistically, i.e. the functional hierarchy is the same. If correct this is very interesting and is genome-like in its breadth. This stuff is easily accessible and a lot of syntax is being done on this sort of project.
Regarding (2), not all variation in I-languages has a genetic source. The environment has an effect too. That's why people acquire different languages depending on the linguistic environment they grow up in. P and N didn't grow up in the same linguistic environment, so it would not be surprising if there were differences between their I-languages. That being said, it's also entirely possible that their I-languages are the same in the relevant respect, and that some additional "performance" factor is responsible for their differing acceptability judgments. To my mind, Poverty of the Stimulus considerations make this second option the more plausible one, but that's just a hunch.
Thanks for this. I worry a bit that if you are right we have no way of finding out if there actually IS a common I-language [as I always understood Chomsky to claim] or not. In other words the I-language theory becomes unfalsifiable.
But my main concern at the moment was really whether there are any fundamental laws of grammar - I guess we have to wait for Norbert to enlighten us on that issue.
Chomsky has never claimed that every adult has the same I-language. That would be inconsistent with (e.g.) the fact that some people speak English whereas others speak French, Japanese, etc. He assumes that every person's I-language starts out in the same initial state. This initial state is modified as the child develops, partly on the basis of linguistic experience. The hypothesis that the initial state is the same (or nearly the same) for everyone is not really central. It's just a useful and very plausible idealization.
Let's start by noting that we belong to a certain species, and we can assume uniformity, idealizing away from variation. This species has what we can call an initial state, that is, a state prior to experience, fixed for the species called S₀. We discover through investigation that in particular cognitive domains, for example in the domain of language...the individual goes through a series of states and reaches what is in effect a steady state [which] in the case of language is invariably attained about the time of puberty. We can then ask ourselves what is the nature of the steady state attained and what must have been the character of the initial state for that steady state to be attained, given the nature of the existing experience. (From “Language and Learning: The Debate Between Jean Piaget and Noam Chomsky.”)
The ‘steady states’ can of course vary between individuals, since their properties partly derive from the experiences of the individuals in question.
I was only talking about Paul and Norbert, so what you say about French and Japanese seems utterly irrelevant. If their steady states are sufficiently different that even after careful consideration they cannot agree on the examples given by Paul, it would be interesting indeed to find out what underwrites this difference. But then it appears that 'experience' must have a very substantial impact on said steady state: a lot more than 'triggering' suggests, or than is compatible with knowledge obtained "without training or relevant evidence" [Chomsky, Knowledge of Language, 1986].
"I was only talking about Paul and Norbert, so what you say about French and Japanese seems utterly irrelevant."
It's the same thing on a smaller scale. People who grow up in French/Japanese-speaking communities have quite different linguistic experiences, so they end up with quite different I-languages. Paul and Norbert had slightly different linguistic experiences, so they ended up with slightly different I-languages. That may or may not be what is responsible for their differing judgments in the case at hand. (The other options are, roughly, genetic differences, developmental differences, or ‘performance’ factors.)
I'm not quite sure what you're getting at in the last sentence of your post. Chomsky's claim is that some of our linguistic knowledge is obtained “without training or relevant evidence”, not that all of it is. Did Paul and Norbert learn to use reflexives differently on the basis of training or relevant evidence? I doubt it, but if they did, that does nothing to undermine Chomsky's take on language acquisition.
What you're talking about is semantics rather than syntax, isn't it?
Actually, we were talking about syntax. And what puzzles me is that we are talking about examples that Norbert suggested as 'laws of grammar', so I would have expected them to be at the core of grammar, not at the periphery. But when it is possible, as Alex says, that "Paul and Norbert had slightly different linguistic experiences, so they ended up with slightly different I-languages" concerning something as important as laws of grammar, then I really worry about what is left [so to speak] 'in' the LAD. Alex says, "Chomsky's claim is that some of our linguistic knowledge is obtained 'without training or relevant evidence', not that all of it is." Fine, but if not even the stuff Norbert considers important enough to call 'laws of grammar' is covered under 'some knowledge', then what is?
There are no laws of motion because, given different initial conditions, bodies move in different ways? As a practical matter, we can assume that most I-language is the same. We do this by assuming an ideal speaker-hearer and trying to fathom its I-language. We appreciate that there will be some variation, but it is usually not enough to derail matters, and we concentrate on what overlaps. If we are interested in the differences, we must do some extra work, much as in other domains of biology (the heart vs. hearts). This does not threaten to bring down the enterprise any more here than elsewhere.
Syntax or semantics? Please consider this:
(1) She'll do it herself.
(2a) She'll do it for herself.
(2b) She'll do it to herself.
In my pidgin English, the “herself”s in (1) and (2) appear to have different meanings. No doubt the “herself” in (2) is reflexive, but what about the one in (1)? The meaning of “herself” in Paul Postal's (3) is the same as in my (1).
Actually, Paul Postal's (3) is of the same kind as your (2a) and (2b). Your (1) is called an 'emphatic reflexive' and is of a different kind than the others. But I am not much of a syntactician, so I had better leave it to Norbert to explain the details.
Yes, it would have been loverly to have it explained by Norbert.
You’re right that the meanings of “herself” in (1) and (3) differ:
(1) Harriet will do it herself. <--- Harriet will do it without the help of others.
(2) Harriet will do it for herself. <--- Harriet will do it for Harriet.
(3) It was herself that was described by Harriet. <--- It was Harriet that was described by Harriet.
Yet I can’t help feeling that those in (2) and (3) differ, too. Even if they do not (in which case you’re completely right), there is still a difference in their syntactic properties: “herself” in (3) is nominative, while that in (2) is accusative. Can a nominative phrase be an anaphor? Norbert, please help.
There is, I believe, a focus effect in Paul's examples. That's why I added that I thought his simple passive case improved a lot with focus on the reflexive. This may be significant, for focused reflexives may pattern more like pronouns than like real reflexives. English reflexives apparently spring historically from a focus-marked pronoun. At any rate, that's what I was pointing to. However, this needs more analysis, much more.
The cases you describe do indeed differ. There is a use of the reflexive that means, roughly, 'alone' or 'by oneself'. These are also interesting and have locality restrictions. But these may be reducible to locality restrictions on adverbial modification quite generally.
Thank you. I should have got it from your Jan 19 comment at 11:28 AM. It’s the pragmatics, stupid! (It’s myself who I’m addressing with that.)
Well, Christina, they seem not to respond to things like this simply because these are not at the top of their priority list.
Sorry, I don't know what 'they' and 'their' refer to.
The Ms (M stands for minimalist, or mainstream, whichever you prefer). I’ve tried to figure out what separates them from the rest of the field: why constructionists consider HPSG, LFG, etc. to be CxGs, while Ms take them to be GGs. For example, why Jackendoff considers himself a generativist while Goldberg sees him as a constructionist (in fact he is both; as a matter of fact, everyone seems to be a constructionist to some extent). Of course, I was trying to learn a bit about the subject matter, which in fact took me most of the time, for I’m too slow to follow, and if I do get something I forget it. I still haven’t managed to get through the latest version of your Potpourri, to give an example. And there’s already another post out here!
Well, things should not be presented in so complicated [or obscure] a way that you have to struggle to understand what the issues are [as opposed to understanding every last detail of cutting-edge work]. Not every field is like quantum physics, and really great ideas can usually be communicated to most who are interested. Look at Chomsky's Syntactic Structures or his review of Skinner: you can easily see what his main points are and why they are important. That was brilliant work. Now compare The Science of Language: take his answer to McGilvray's question about his intellectual contributions and try to figure out WHAT he takes his contributions to be. He may think he made really great contributions, but his answer does not reveal what they are. And this can be said for most of the book, a work that is aimed at the general public...
I haven't read the book yet. I really want to read your critique first. I did read Pullum's, but it didn't help.
Pullum had a very limited amount of space to say something about the book. It is virtually impossible to say a lot of helpful things in such a setting.
I think Norbert's conception of the difference between GB and MP is a good way to think about it, but I would like to add that some people think the big pile of somewhat organized data produced by GB, HPSG, LFG, etc. needs a great deal more 'curation', possibly (much) more than it needs attempts to explain it.
Many of the empirical generalizations are questionable: the fixed subject constraint seems pretty well demolished (the ruins are surveyed by Asudeh in his 2009 paper), Principle B has overt apparent counterexamples in many languages, and the typology of bound pronouns is very complex. Many of the supposedly required principles got installed at a time when people greatly underestimated what learning could do, at least in principle. There are big architectural problems with many of the formalisms, and gaping holes in our conception of what they ought to cover: GB/MP still doesn't seem to have any sensible analysis of 'case-stacking', as first described to the world by Dench and Evans 1988 and discussed in various publications since then (the phenomenon seems to me to fatally crash Baker's 2008 ideas about how concord works; Erich Round's thesis from 2009 is the latest, biggest, and best piece of work on it afaik). And nobody seems inclined to think seriously about how the knowledge behind linguistic variation is represented by speakers and used in production (the focus in computational linguistics is on getting the best parse for utterances in a corpus, not on producing output with the kinds of statistics that the actual corpora have).