Monday, May 12, 2014

Against open-mindedness; a reply to Mark

Mark writes in comments to this recent post in reply to the following remark of mine:

"Generative Grammar (and Postal IS a generative grammarian, indeed the discoverer of cross over phenomena) has made non trivial discoveries about how grammars are structured. [...] So, when asked about a result, there are tons to choose from, and we should make the other side confront these."

I find the talk about "we" and "other side" sociologically interesting, but ultimately unappealing. As Alex Clark notes, there's a couple different senses in which one might define the sides here; that's a first problem. Also, for me as someone from Europe, it feels a bit, how shall I put it, old-fashioned to think in terms of "sides". I think younger generations of scientists are way more heterodox and more attuned to the necessary plurality of the language sciences than you may realize, or than may be apparent from some of the regular commenters on this blog. This goes both ways: new generations are aware of important advances of the past decades, but also of the affordances of new data, methods, and theories for pushing linguistics forward.

This is why I quite like Charles Yang's and David Adger's recent work (just as I like Morten Christiansen's and Simon Kirby's work): it is interdisciplinary, uses multiple methods, tries to interface with broader issues and tries to contribute to a cumulative science of language. I don't think all the talk of "sides" and worries about people not acknowledging the value of certain discoveries contributes to that (hell, even Evans & Levinson acknowledge that broadly generativist approaches have made important contributions to our understanding of how grammars are structured).

For the same reason, I think the Hauser et al review ultimately misses the mark: it notes that under a very limited conception of language, the evolution of one narrowly defined property may still be mysterious. That's okay (who could be against isolating specific properties for detailed investigation?), but the slippage from language-as-discrete-infinity to language-in-all-its-aspects is jarring and misleading. "The mystery of the evolution of discrete infinity" might have been a more accurate title. Although clearly that would draw a smaller readership.

I would like to take issue with Mark's point here: in the current intellectual atmosphere there are sides and one chooses, whether one thinks one does or not. Moreover, IMO, there is a right side and a wrong one, and even if one refrains from joining one "team" or another it is useful to know what the disagreement is about and not rush headlong into a kind of mindless ecumenism. As Fodor's grandmother would tell you, open-mindedness leads to brain drafts, which can lead to severe head colds and even brain death. What's wrong with being open-minded? If you are tolerant of the wrong things, it makes it harder to see what the issues are and what the right research questions are. So, the issue is not whether to be open-minded but which things to be open-minded about and what things to simply ignore (Bayesians should like this position). Sadly, deciding takes judgment. But, hey, this is why we are paid the big bucks. So, for the record, here is what I take my side to be.

In the white trunks sit the Generative Grammarians (GGs). Chomsky has done a pretty good job of outlining the research program GGs have been interested in (for the record: I think that Chomsky has done a fantastic job of outlining the big questions and explaining why they are the ones to concentrate on. I have had more than a few disagreements about the details, though not about the overall architecture of the resulting accounts). It ranges from understanding the structures of particular natural language (NL) Gs, to understanding the common properties of Gs (UG), to understanding which of these properties are linguistically specific (MP). This program now stretches over about 60 years. In that time GGs have made many, many discoveries about the structure of particular NLs (e.g. Irish has complementizers that signal whether a wh has moved from its domain, English lowers affixes onto verbs while French raises verbs to affixes, Chinese leaves its whs in place while English moves them to the front of the clause, etc.), about the structure of UG (e.g. there are structural restrictions on movement operations (e.g. Subjacency, ECP), anaphoric expressions come in various types subject to different structural licensing conditions (e.g. Binding Theory), phrases are headed (X'-theory)) and, I believe (though this is less secure than the other two claims), about the linguistic specificity of UG operations and principles (e.g. Labeling operations are linguistically specific, feature checking is not). At any rate, GG has developed a body of doctrine with interesting (and elaborate) theoretical structure and rather extensive empirical support. This body of doctrine constitutes GG's current empirical/scientific legacy. I would have thought (hoped really, I am too old to think this: how old are you Mark?) that this last point is not controversial. I use this body of doctrine to identify members of "my team." They are the people who take these remarks as anodyne. They accept the simple evident fact that GGs have made multiple discoveries about NLs and their grammatical organization and are interested in understanding these more deeply. In other words, if you want to play with me, you need to understand these results and build on them.

I hope it is clear that this is not an exotic demand. I assume that most people in physics, say, are not that interested in talking to people who still believe in perpetual motion machines, or who think the earth is flat, or who think the hunt for phlogiston a great adventure, or who deny the periodic table, or who reject atomism, or who deny the special theory of relativity, etc. Science advances conservatively, by building on results that earlier work has established and moving forward from these results. Science does this by taking the previous results as (more or less) accurate and looking for ways to deepen them. Deepening means deriving these as special cases of a more encompassing account. To wit: Einstein took Newton as more or less right. Indeed, his theory of gravitation derives Newtonian mechanics as a special case. Had his theory not done so, we would have known that it was wrong. Why? Because of the overwhelming amount of evidence that Newtonian mechanics more or less accurately described what we saw around us. Now, Generativists have not built as impressive a scientific edifice as Newton's. But it's not chopped liver either. So, if you want to address me and mine, then the price of admission is some understanding of what GGs have found (see here for an earlier version of this position). My team thinks that GG has made significant discoveries about the structure of NL and UG. We are relatively open concerning the explanations for these results, but we are not open to the supposition that we have discovered nothing of significance (see here for a critique of a paper that thinks the contrary). Moreover, because my team believes this, when you address US you need to concern yourself with these results. You cannot act as if these do not exist or as if we know nothing about the structures of NLs.

Now, I hope that all of this sounds reasonable, even if written vigorously. The price of entry to my debates is a modest one: know something about what modern Generative Grammar has discovered. Much to my surprise, much of the opposition to GG research seems not to have the faintest idea what it is about. So, though this demand is very modest, it is apparently beyond the capacity of many of those arguing the other side. Morten Christiansen has written drivel about language because he has refused to engage with what we know, and have known for a long time, about it (see here for a quick review of some stuff). Though I like Alex C (he seems a gentleman, though I don't think I have ever met him face to face), it is pretty clear to me that he really does not think that Generative Grammar has discovered anything useful about NLs except maybe that they are mildly context sensitive. Islands, Binding Theory, ECP effects: these are barely visible on his radar, and when asked how he would handle these effects in his approaches to learning, we get lots of discussion about the larger learning problem but virtually nothing about these particular cases that linguists have been working on and establishing lo (yes: lo!!) these many years. I take this to mean that he does not prize these results, or thinks them orthogonal to the problem of language learning he is working on. That's fine. We need not all be working on the same issues. But I do not see why I should worry too much about his results until they address my concerns. Moreover, I see every reason to ignore this work until it makes contact with what I believe to be the facts of the matter. The same goes for the evolutionary discussions of the emergence of language. The work acts as if Generative Grammar never existed. (As Alex C recently helpfully noted in the comments section (here), the work seems to have a very "simple" idea of language: strings of length 2. Simple? Simple? Very gracious of you Alex.) Indeed, rather than simply concede that evo work has had nothing to say about the kinds of questions Generativists have posed, they berate us for being obsessed with recursion. Imagine the analogue in another domain: someone asks how bee communication evolved and people are told to stop obsessing about the dance: who cares how bees dance, we are interested in their communication!!

Generativists have discovered a lot about the structure of NLs (at least as much as von Frisch discovered about bee dancing) and if you want to explain how NLs arose in the species you must address these details or you are not talking about NLs. Not addressing these central properties (e.g. recursion) only makes sense if you believe that Generativists have really discovered nothing of significance. Well, excuse me if I don’t open my mind to consider this option. And, frankly, shame on you if you do.

People can work on whatever they want. But, if you are interested in NL, then the place to begin is with what Generative research over the last 60 years has discovered. It's not the last word (these don't exist in the sciences), but it is a very good first, second, and third word. Be as ecumenical as you want about tools and techniques (I am). But never ever be open-minded about whether what we've discovered are central features of NLs. That's the modus operandi of flat earthers and climate science deniers. Sadly, many of those who work on language are similar in spirit (though there are a lot more of them when it comes to language than there are climate science deniers) in that they refuse to take seriously (or are just ignorant of) what Generativists have discovered.

Last point: Generativists are a legendarily argumentative lot. There is much disagreement among them (us). However, this does not mean that there has been no agreement on what GG has discovered. Ivan Sag and I did not agree on many things, but we could (and did) agree on quite a lot concerning what Generativists had discovered. My side lives under a pretty big tent.  We (or at least I) are very catholic in our methods. We just refuse to disown our results. 


Mark, I suggest you close your mind a little bit more. Bipartisanship sounds good (though for an excellent novelistic critique, read here), but it’s really a deadly cast of mind to have, especially when one of the two sides basically refuses to acknowledge how much we have learned over 60 years of research.

120 comments:

  1. Thanks Norbert, this is very enlightening. Just want to note, for the record, that qualifications like "mindless ecumenism" and "being tolerant of the wrong things" are of course not what I had in mind when I described younger generations of linguists being more heterodox and attuned to the necessary plurality of the language sciences. Mindless ecumenism is something no scientist worth their salt strives for — here at least we are (unsurprisingly) in agreement.

    Your final paragraph sums up very nicely exactly what can be the problem of thinking in terms of "sides". You say, "it's really a deadly cast of mind to have, especially when one of the two sides basically refuses to acknowledge how much we have learned over 60 years of research" (my emphasis). Note that an important part of the discussion on the other post circled around the perceived narrowness of the Hauser et al review of work on language evolution; e.g. one of the problems raised at Replicated Typo is that the review can be seen as refusing to acknowledge how much we have learned over the past decades. Since you don't specify which side you mean, a naive reader might think that was your target. It wasn't of course; the target was the "other side". Taking sides soon devolves into trench warfare — you don't see the enemy, you just shoot from a firmly entrenched position. It's not only unappealing but also unproductive.

    Language is a complex adaptive system with biological, cognitive and cultural foundations. It's smack dab in the middle of the classic cognitive science hexagon. Scientists of language therefore cannot afford to be an inward-looking bunch accusing other teams of either narrowmindedness or mindless ecumenism. Like any cumulative science, the science of language should be mindful of past accomplishments yet sceptical of old mantras and ingrained beliefs, and always on the lookout for new ways of (re)formulating and attacking central problems. Theoretical commitments are important, but since scientists are professional sceptics, there should always be room for a healthy dose of heterodoxy and an open mind to the affordances of new data, methods, and theories.

    Replies
    1. Mark and Norbert, I am puzzled by your posts here and the many others on the previous post regarding our paper. The puzzle comes from the conclusion that so many have drawn that the paper defines language narrowly, focused only on discrete infinity, and thus that the entire interest in language evolution is about the evolution of discrete infinity. But this isn't what we said. Here is a direct quote from the first paragraph of the section entitled "The Language Phenotype": "As we and many other language scientists see it, the core competence for language is a biological capacity shared by all humans and distinguished by the central feature of discrete infinity—the capacity for unbounded composition of various linguistic objects into complex structures. These structures are generated by a recursive procedure that mediates the mapping between speech- or sign-based forms and meanings, including semantics of words and sentences and how they are situated and interpreted in discourse. This approach distinguishes the biological capacity for language from its many possible functions, such as communication or internal thought." So, yes, the recursive procedure is key, but so too are the mappings! Each section then goes on to talk about evidence that bears on both the recursive procedures as well as the representations that arise from the mappings. What we conclude is that there is no evidence bearing on either the recursive procedures OR the representations. Interestingly, though many in this blog speak to the progress, why hasn't anyone cited any evidence on these topics? Lastly, though I realize that Norbert likes the original Lewontin better, and feels that he said it all then (who can challenge Norbert's good taste), I disagree. Lewontin focused on cognition more generally, and I actually think he is dead wrong! There has been a lot of progress on the evolution of other cognitive systems, mathematical representations and computations being one! Further, language stands out in the cognitive arena because of all the press and the conferences and the firm beliefs that progress has been spectacular. It is for these reasons, and others, that we wrote the paper.

    2. @Mark: I will not let your relentless apparent reasonableness deter me. There are sides, like it or not. As I noted, "my" side insists on not throwing away what we have discovered about the structure of NL and UG over the last 60 years of research. The other side simply refuses to acknowledge this. This is not an issue over which it is reasonable to be open-minded and, frankly, if there is to be trench warfare, here is a very good place to have it. I want to emphasize that I am not talking about fancy "theoretical commitments" here or, as you might put it, "old mantras and ingrained beliefs" (a locution, incidentally, which ever so slightly does suggest what side you are on: welcome to the dark side, Mark). The stuff I mention is as far from theory as the Gas Laws are from Statistical Mechanics. These are well grounded facts about the structure of language. If you are not acquainted with them, I recommend Haegeman's text as a good intro, but any will do (e.g. Radford, Carnie, Adger etc.)

      However, in the spirit of tolerance, I invite you to write a post recapping what you take the significant discoveries of evolang to be. It seems that they have nothing to do with the kinds of structures linguists talk about (which are apparently "perceived to be narrow," though focused and explicit is what I would call it), nonetheless, you believe that there are important results bearing on language that we are ignoring. Ok, tell us what they are. Note, I have been happy to reiterate what I think we have found in linguistics (island effects, binding effects, ECP effects, control effects etc.). So please enlighten us about what we are missing. A few details would be helpful and you can have, say, 2-3 posts to make clear the lay of the land. Just send them to me and I will post them under your name. I will even let you handle the comments section. So, ball's in your court.

      @Marc. Yes, I did feel that the paper could have been better. However, I thought that it was good enough. I took the first part on the frogs to be a good illustration of Lewontin's general point in the old paper: just how complicated it is to put together a useful evo argument. This is always worth reiterating. I also agree that the question of how things get mapped to the interfaces and/or the representations the procedures give rise to is an important issue. I did roll them together and you are right to pull them apart. However, IMO, linguists know relatively little about the latter. The CI interface is often invoked but very poorly understood. We know a lot about the syntactic rules and the computational system, comparatively little (well, not much at all, I think) about the CI interface. So, qua results from Generative Grammar, we have little to add here (btw, this would change if the binding theory were really an interface module, but I really don't like this idea).

      Last point: I agree that there is nothing like language. And I would love to have some non-trivial story telling me how it arose. I also agree that right now we seem to have none that is very compelling. As for your spat with L, I will stay out of it. When elephants tussle, the mice run to hide.

    3. This reply is to Marc H.

      First, my apologies that I seemingly misunderstood part of your [pl] paper. Like Norbert, I was under the impression that the discussion of the tungara frogs was intended to support the Lewontin/Chomsky point that because it is already so hard to come up with a good evo story for such a simple system, it is hopeless to try for human language. If you [sg] disagree with this, it did not shine through from the paper.

      You ask, rightly, "Interestingly, though many in this blog speak to the progress, why hasn't anyone cited any evidence on these topics?"
      Let me offer a partial answer: in case you haven't already, have a look at the posts where Norbert "engages with" the other side [one I recall is '3 psychologists walk into a bar'; there are others]. If your work were ridiculed like that [or eviscerated, as Norbert loves to call it], would you be inclined to post on this blog to explain your work? So I fear you'll have to look elsewhere for answers. You may of course not find them satisfactory, but from the fact that no one has posted them here you should not conclude there are none.

      As for evidence in general, I fear the Chomskyan side is not as strong as presented on his blog. Norbert loves to talk about 'sides' and he makes it appear there are only two. This is certainly not the case. Many of the linguistic results he tries to sell as rock-solid are questioned by fellow linguists. You will have noticed that neither Norbert nor anyone else has replied to my request to provide the support for the GG account of strong crossover that Paul Postal says is missing. I doubt this silence has anything to do with Norbert's renewed promise to never talk to me again. Here is a comment a linguist I work with [NOT PMP] made about the books Norbert recommends to Alex C.

      "Radford's earlier books are very useful as guides to arguments for constituency and phrase structure in general; he lays out every trick in the book, pretty much (though some of his tests are, we know now, less than fully secure). You can learn a lot about how to do syntax from AR's texts, at least till he started bringing minimalism into the picture. Haegeman's books are much less useful that way---pretty much straight, top-down presentations of GB. You don't learn how to do syntax from them, you learn what Chomsky was thinking in the mid 1980s and so on."

      I have gotten excellent advice from the linguists I work with - so if I were interested in learning about the latest in syntax, I'd include more than "Norbert's favourites".

  2. (BTW, that's not a reply to your whole post; it's more a reformulation of my position to clarify certain points. There are quite a few things in your post that I agree with, and even more things I disagree with. To be revisited perhaps.)

  3. One might also substitute for "mindless ecumenism" Richard Dawkins' turn of phrase "being so open-minded your brain falls out". The virtue of being very firm in the logic of a bold position yet able to entertain new ideas when they constitute genuine and clearly thought-out alternatives - and particularly when they address genuine and not superficial holes in the theory - one might call this lifelong skill "discernment", taking a cue from Saint Ignatius. I think Norbert and others are right to point to Chomsky as having done this quite well (and thus as having a good set of intuitions to follow), even if it doesn't appear so from the outside because he has chosen to do this more by polemic than debate. There will emerge even better people to follow, and there could have been better ways of following. Refusing to disown your results is good; the siege mentality is not.

    However, I personally read into this a reaction not so much to mindless ecumenism in general, but to a problem particular to the study of the mind - maybe only because it is young - there are a handful of "magic bullet" theories that are seductive because they follow our folk ideas, not because there's any evidence for them. I think that there are two intuitive theories that are really worth scrutinising and I think that's part of the point in this particular case.

    One is "language is complicated"; or even "the mind is complicated." so, there may be a long list of other properties that need to evolve other than discrete infinity. If you restrict yourself to Hauser et al versus Jackendoff and Pinker, it's "language has many necessary parts (proper to language) all of which need to evolve." This one is I think worth fighting against, even if it turns out to be right, just because it leads to dubious scientific practice of generating unwieldy theories which cannot be reasoned over.

    The second is "language is for communication." Again, this is a folk theory and it can lead us to fall back on insisting that modeling communicative/sociological faculties MUST be necessary for language evolution without properly examining the counter-intuitive alternative.

    Again, my point is NOT to lead off on the tangent of boosting/denigrating either of these ideas - although it would certainly be nice to get references to people who have tried to promote these two ideas intelligently and even-handedly, appreciating that neither should have any particular priority. Rather, I think the idea that they are worth combatting BECAUSE they are intuitive, and therefore seduce us back into our pre-scientific understanding of the world, is the fundamental clarion call of generative grammar, and I think it hasn't been fully appreciated. I see that lying behind both the original Hauser paper and this call for discernment.

  4. Following (hopefully) on to Ewan's comment, a lot of the not-yet-so-explicitly-mentioned background to the debate is an attempt to resurrect a variety of functionally-oriented explanations using a methodology more compatible with...what is used in some disciplines of psychology.

    The underlying logic is more or less explicitly oriented towards the denial of the last 60 years that Norbert mentions. Specifically, if it's the case that some kind of functional account (these days, things like working memory) for a subset of phenomena deemed "structurally-driven" can be established through psychological experimentation, then (it is claimed) the null hypothesis shifts, the full onus of proof of the necessity of mystifying structuralistic explanations (/irony) is now on generativists, and good luck with that, eh.

    These kinds of base-level functional accounts can then be connected to a whole other functional apparatus waiting in the wings (since e.g. working memory can be connected to a heck of a lot of other things). All of the "folk theories" that Ewan mentions are once again admissible, the last 60 years an unfortunate detour from what ought to have been a flourishing of "Behaviourism Plus".

    This is more or less what I have been told (in other words that I am putting less charitably) by more than one psycholinguist of a certain stripe.

    In many places, generativists are not winning this battle, partly because the temptation to believe that linguistic structure is conditioned "usefully" on function is very strong, and more problematically, because generative grammar lacks some of the surface trappings of psychological science.

    Here is a kind of hypothetical remedy: what in software engineering might be called "regression testing". Create a repository of predictions of sentence judgements, and then every so often evaluate the "performance" of the theory on these judgements. It would create a way to "score" the scientific claims of generative grammar in a manner at least superficially similar to what scientists in other fields might expect.
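    Purely as an illustration of that remedy, here is a minimal sketch of what such a judgement regression suite might look like. Everything in it is hypothetical: the example sentences, the predicts_acceptable stand-in for "the theory", and the scoring routine are invented for the example, not drawn from any existing tool.

```python
# Hypothetical sketch of a judgement "regression suite": a stored list of
# (sentence, expected acceptability) pairs and a harness that scores a theory's
# predictions against them. The data and function names are invented.

JUDGEMENTS = [
    ("Who did you see __?", True),
    ("Who did you see Mary and __?", False),   # coordinate structure island
    ("John expected himself to win.", True),
    ("Himself expected John to win.", False),  # binding (Condition A) violation
]

def predicts_acceptable(sentence: str) -> bool:
    """Stand-in for the theory under test; a real harness would call a parser
    or a hand-coded implementation of the relevant constraints."""
    raise NotImplementedError

def run_regression(theory, judgements):
    """Compare the theory's predictions with the stored judgements and report a score."""
    failures = []
    for sentence, expected in judgements:
        try:
            predicted = theory(sentence)
        except NotImplementedError:
            predicted = None          # theory makes no prediction yet
        if predicted != expected:
            failures.append((sentence, expected, predicted))
    passed = len(judgements) - len(failures)
    print(f"{passed}/{len(judgements)} judgements matched")
    for sentence, expected, predicted in failures:
        print(f"  MISMATCH: {sentence!r} expected={expected} predicted={predicted}")
    return failures

if __name__ == "__main__":
    run_regression(predicts_acceptable, JUDGEMENTS)
```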


  5. "discover" is still factive right?

    My major disagreement is with the (recent) methodology of generative syntax.
    Since I don't think that methodology is reliable, I don't think that the various proposals that are put forward ..
    the DP hypothesis, binary merge, antisymmetry, deep structure, macroparameters ... are reliable.

    It's hard to tell what things are definitely solid discoveries of generative syntax, which you have to believe or you won't get invited to the right parties, and which are just weakly supported conjectures that may or may not pan out.

    I also think, as a good Chomskyan, that "The fundamental empirical problem of linguistics is to explain how a person can acquire knowledge of language."
    If you don't have a theory that accounts for the fundamental empirical problem then your theory is likely false, and current minimalist theory does not have an account of language acquisition.

    Replies
    1. Thx Alex, I thought I got you right. BTW, maybe you could expand a bit on the methodological problems. Do you think the data are unreliable? Do you think that the generalizations offered do not in fact hold up? Do you think that the cross-linguistic data are unreliable? But thx for being so open about your doubts. BTW, what do you think of global warming?

      Second to last point: The list of things you find solid seems to me very curious. There is actually not all that much evidence in favor of the DP hypothesis. And it raises real problems wrt selection and subcategorization. There is even less for binary merge and deep structure, at least as it is conventionally understood. As for antisymmetry, that implies you think that there is also asymmetric c-command and that movement is always upwards and that there is a lot of remnant movement etc. Oddly, the GG community is still reserving judgment on antisymmetry and macroparameters, so far as I can tell.

      I suspect that we disagree about having an account for language acquisition. We have the same one we had before, but we replaced GB parameters with lexical ones. I didn't find the earlier one satisfactory, but then MP is no worse than before. What we do have is an account of what does not have to be acquired: e.g. whatever is part of the machinery of FL. Where things are much less clear is how we acquire what is not built in. FYI, I am beginning to think that Berwick's early theory of rule learning is ripe for re-evaluation on these matters. It is a non-parameter theory that did pretty well. I suspect that with some tweaking, it may do pretty well again. Bob and I have talked about this and maybe something will materialize. Here's hoping. BTW, Mark, take note. Like I said, 60 years of research and many prominent people are ready to throw this all down the toilet. I told you there were sides, like it or not. We can disguise this fact by talking pretty, and being oh so genteel. But it won't change the shape of the landscape one jot. I am partisan to Keynes's view concerning these matters: "Words ought to be a little wild, for they are assaults of thoughts on the unthinking." Time for a tut-tut and a little eye roll about now.

    2. @Alex - Tangential again, but since this post is mainly sociological - your list of points is revealing of the sociology of GG. None of the points you mention seems to me like a point that serious, reflective syntacticians are all that committed to (note e.g. Norbert's reply saying the same thing - but note also that I started independently to say the exact same thing and didn't realize till halfway through that Norbert had already said it).

      The ugly part of the culture of GG has been to take the good part of not being too open-minded - i.e., "pursue a theory seriously until you have precisified and even-handedly compared some alternatives" - and cash it out as "talk about your idea as if it were fact".

      I don't see anything subversive about pointing out that this is a bad idea. Constructively, it is rather analogous to the bad technical writing which leads people with reasonable and legitimate questions to be obnoxiously deferred, again and again, to what someone wrote in some obscure manual. In fact, if you ask the author of the software, they will often tell you that you have indeed put your finger on a serious limitation. I do not think these problems are insoluble, but I think they are real and I think they are more sociological than they might appear.

    3. So it seems like you are agreeing with me: I said that those proposals are not reliable and you agreed with me. Or have I got confused? It happens quite often.

      You say:
      "What we do have is an account of what does not have to be acquired: e.g. whatever is part of the machinery of FL. Where things are much less clear is how we acquire what is not built in."

      A theory of language acquisition is *precisely* a theory of how we acquire what is not built in, based on what is built in. We ideally want an evolutionary account of what is built in too (the topic of the paper we are discussing).

      I am all in favour of Bob B working on his rule learning model, and Charles trying to convert the P & P models to work with the much reduced models of UG -- no innate grammars but just recursion and maps to the interfaces. And I will read their work with great interest. But at the moment there are, AFAIK, no worked out proposals. Any references would be welcome (I am aware of the Yang & Roeper chapter in Boeckx's handbook).


      That last remark was directed at Norbert.

      Ewan, of course I agree. But methodology is at least partly sociological.

    5. Alex C wrote: Since I don't think that methodology is reliable, I don't think that the various proposals that are put forward ..
      the DP hypothesis, binary merge, antisymmetry, deep structure, macroparameters ... are reliable.


      Norbert wrote: There is actually not all that much evidence in favor of the DP hypothesis. And it raises real problems wrt selection and subcategorization. There is even less for binary merge and deep structure ...

      Like Ewan I think there could be an interesting sociological point here. In part perhaps it's that generative linguists "talk about your idea as if it were fact", but in particular is it possible that the "outreach work" that generative linguists do (stuff that appears in non-specialist journals, although this isn't a comment specifically on the Hauser et al article, which I haven't read) ends up being somehow drawn from the speculative, shaky stuff, more so than in other disciplines? Should we spend less time telling outsiders about merge and optimal solutions and interfaces and so on, and more time telling them about basic Condition C effects and control/raising contrasts? Perhaps previous attempts to publicize these basic sorts of things went wrong, but the solution might not necessarily be to try to publicize shakier stuff instead.

    6. Yes, what is the rock solid stuff that everyone can agree on -- Norbert, Chomsky, Steedman, Pullum, Bresnan etc etc.? Why isn't there a textbook that contains just that?

      In the way that a chemistry textbook will contain only material that all respectable chemistry professors will accept.

      Introductory textbooks on syntax seem to contain a lot of stuff that is controversial; e.g. Asudeh and Toivonen's review of the Adger and Radford textbooks.

      My hunch is that if one wrote that book it would be quite short and I would agree with all of it.

    7. Take a look at the earlier GB texts: Haegeman, Radford's early texts. The later texts are aimed at introducing students to Minimalist technology. But there are plenty of good texts that concentrate on received wisdom.

      How short would it be? Well, Haegeman is about 350 pages. Not bad for 60 years, and it does not discuss diagnostics for phrases or parts of speech etc. So combine an early Radford with Haegeman and you'd end up with about 500 pages. A pretty big chunk of change.

      Last point: believe me, the problem is not that linguists have always talked about the latest shiny syntactic theory. If that were the case we wouldn't be saddled with reading the drivel, gobs of which I've reviewed on this blog. The problem is that the sophisticates are skeptical, oh so skeptical, about the very empirical methodology of the field.

      That said, let me ask Alex C and Mark: what parts of the received wisdom do you buy? A little list would be nice.

      @Ewan: The ugly part of GG? I am sure that you don't mean to single out GG in this regard. Everyone oversells their results. This is sad, but acceptable. Contrary to you, I do think that GG is under siege. Not because it has been shown to be wrong, but because it is considered ok to act as if there is nothing there. We need some hostile noise. Now.

      @Alex: Yes, a learning theory of what's not innate would be nice. I do believe that it will rest in part on those features that are universal and hence not learned. This was true of Bob's early work on rule acquisition and I see no reason why it won't be the same now. Wexler and Culicover's work on learning relied on roughly a subjacency condition. Bob's relied on something very like the Extension condition built into the Marcus Parser that he used. So, the models I think are useful paradigms of how to proceed used UG principles quite productively. It's not the last word, but it's a pretty good first one. Same with Charles on parameter setting models, though I am not sure I like these much anymore for theoretical reasons (that I think are still open to dispute (that's for you Mark)).

    8. Yes, something like the Haegeman book was roughly what I would call the non-shaky stuff. I'd be genuinely interested (I don't mean that in the snarky sense!) to know what Alex C thinks of that book. It may be inevitable that there's slightly less uncontroversial stuff around than in chemistry, given that we've only been at it for 60 years, but I suspect there's significantly less that would be called controversial in those older books than in the minimalism-oriented textbooks.

      This of course all relates back to the issue of the relationship between GB and MP, which I like Norbert's descriptions of, and the way it's perhaps often misconstrued.

    9. @Alex: Following up on Norbert's suggestions, I'd recommend Richard Larson's Grammar as Science, which for the most part uses only context-free phrase structure rules and doesn't introduce movement or X'-syntax until the last few chapters. I built my undergrad syntax course on this book, and while I had to flesh out the feature mechanism a bit to handle some things like subject-verb agreement or the differences between mass nouns and count nouns wrt determiner selection, it was still fairly easy to cover a lot of empirical ground such as the argument/adjunct distinction or the difference between control verbs and ECM verbs. And the emphasis is on the empirical facts and how to construct syntactic tests, which might well be the true art of syntax, similar to proofs in math.
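      (Purely illustrative aside: the following is a minimal sketch of the sort of feature mechanism gestured at above, handling subject-verb agreement in a toy D-N-V fragment. It is not Larson's actual formalism; the lexicon, the categories, and the unification routine are invented for the example.)

```python
# A toy feature mechanism for subject-verb agreement. Lexical entries pair a
# category with a number feature (None = underspecified); agreement is enforced
# by a simple unification check. Everything here is hypothetical and simplified.

LEXICON = {
    "the":   ("D", None),   # 'the' is underspecified for number
    "a":     ("D", "sg"),
    "dog":   ("N", "sg"),
    "dogs":  ("N", "pl"),
    "barks": ("V", "sg"),
    "bark":  ("V", "pl"),
}

def unify(f1, f2):
    """Two number features unify if either is underspecified or they match."""
    if f1 is None:
        return f2
    if f2 is None or f1 == f2:
        return f1
    return "clash"

def sentence_ok(words):
    """Accept only D N V strings in which D, N and V agree in number."""
    if len(words) != 3 or any(w not in LEXICON for w in words):
        return False
    (c1, n1), (c2, n2), (c3, n3) = (LEXICON[w] for w in words)
    if (c1, c2, c3) != ("D", "N", "V"):
        return False
    np_number = unify(n1, n2)                 # D and N agree inside the NP
    return np_number != "clash" and unify(np_number, n3) != "clash"

for s in ["the dog barks", "the dogs bark", "the dogs barks", "a dogs bark"]:
    print(s, "->", sentence_ok(s.split()))    # True, True, False, False
```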

      Of course, even at such a basic level you have to make tons of theoretical assumptions, e.g. that coordination is a reliable constituency test, and some of the analyses you end up proposing might well be wrong (e.g. picking an analysis for negation or possessor constructions). But this is part of doing science, you start with basic assumptions and then go from there, and if you have competing proposals, you evaluate them with respect to a set of jointly agreed upon criteria.

      What binds generative grammar together is a shared understanding of what the important assumptions and evaluation criteria are. Different brands of GG add additional assumption on top of that --- CCG's and TAG's concern with weak generative capacity, Minimalism's SMT --- but the baseline presented by Larson is shared by all of them.

      As for length, the book is a little over 400 pages and covers only the simpler constructions of English. Adding topics like negation, the VP domain (auxiliaries, negation), passive, binding theory, and syntactic typology would easily push it past a 1000 pages. That's a lot more than what you find in math, where textbooks seldom go beyond 400 pages, and is still respectable for "fact-listing" fields like psychology and biology.

    10. @Norbert - I certainly do think this is special to GG. I think you may be comparing your own field to adjacent disciplines, which might not give the clearest relief; overselling is shared to whatever degree by many/most cognitive scientists ("one part scientist, one part snake oil salesman"). Deceit about certainty is absolutely not necessary in order to make ideas worthy of people's attention, however young the field. One can develop the discipline to allow for more conjectures at the base without pretending they are not. This sociological error is far graver within linguistics than elsewhere, and I think it is one of generative grammar's largest shortcomings - probably its only serious shortcoming. Believe it or not, it turns many intelligent and critical undergraduates off the field.

      I was lucky that I had the patience to stick it out and was critical enough to see through my own skepticism, but in undergraduate linguistics courses (and I mean second and third year courses, not intro courses where you're also fighting against lay intuitions) the comment that "these people are full of it"/"what on earth is the evidence for all this" was in my experience always at levels matched (and beaten) only by lit theory. This is unfortunate because if you are paying attention there is a lot we actually do know. But it is systematically obscured by the atrocious practice of conflating all levels of uncertainty at zero.

    11. I would add (to be clear) that I don't think this has affected the quality of research too much apart from maybe distracting people from thinking hard and clearly about the fundamentals - but rather just simply that many excellent and discerning linguists wind up teaching/textbook writing and engaging with outsiders in a lazy way, and that this is encouraged by the rhetorical ruts of the field but not at all attractive to those very outsiders who are the target of the charm job. You might not see it because you don't do this and have chosen to affiliate with people who don't do it either - I assume the heat of your rhetoric here is just a reflection of the fact that you think the field is under siege - but my main point was simply that I really don't think it's fair to blame Alex C for not having discerned the level of certainty held by those in the know about various claims, when those in the know are in fact surrounded by people who knowingly or not give the impression that we know a lot more than we do.

    12. If 'anecdote' is the singular of 'data' then my experience does not jibe with yours. I learned syntax initially from Lightfoot and he was very critical all the time of what we were considering. I have taken syntax courses from many others and, by and large, have not found the problems you mention. However, let it be said, that one should never oversell and that modesty is almost always a virtue.

      However, forgive me if I doubt that this is the main reason that professionals in psych, cog-neuro etc don't take this stuff seriously. There are many non-sociological reasons for this. The hard core empiricism of the practitioners is one that easily comes to mind. After all, brains just look like connection machines right? So how can this fancy stuff be in the brain? Puleeze. So, sure, best to be modest about one's achievements. But this is not the main roadblock to success. Ever meet a modest Biologist (they have a prize you know). Does it stand in their way?

    13. Well, you got good answers in class, and there is plenty of variance obviously, but pick up a random textbook and tell me if you would really like to learn from a class that went no deeper or more critical than that. It's not about being modest - there is a standard curriculum in the field which came to be by a process that didn't really respect the field's degree of uncertainty. As a result one finds all kinds of generalizations taught, with all different levels of empirical support, and it seems to have the effect that they all get argued for poorly - even the good generalizations.

      I would forgive you for thinking that communication problems with other scientists are principally their fault, in the sense that I would forgive anyone for believing something which places responsibility elsewhere. The unscientific attitudes with which your average "skeptic" approaches language are laughable and we ought to be able to stamp them out if we are so smart. But the replies we give them to convince them are usually just bad because we are used to talking to an echo chamber where we don't have to sort the wheat from the chaff. And this is very dangerous because they will quickly turn off and dismiss the field. Most human beings would do that for better or for worse - "I think I know everything" - "I said something [backwards and naive] and this person said something that's obviously not a good response" - "They keep doing this over and over" - "They are all charlatans [see point 1]". They're wrong but we better give them something better than bluster and "read the manual" to go on. The manual is bad and it needs to be fixed.

    14. Thomas, thanks for the pointer to the Larson book; I will read it. But it brings up my concern about methodology; why should we think that the standard methods of syntactic analysis will bring us to the (or a) psychologically real grammar for a language? I don't think there are any good arguments for the methodology, and so I don't necessarily believe the results produced by people using the methodology (quite apart from concerns about data, precision of theories etc.). It's quite different from how Chomsky argued for syntax in the 70s for example.

      Re the Haegeman book: I don't believe all of it, and I don't think many others do either now. So it definitely does not consist only of the rock-solid consensus. To take one trivial example: for a sentence like "the tiger escaped" it gives a standard (S (NP (D the) (N tiger)) (VP (V escaped))) phrase structure. Now, those of us who believe in the DP hypothesis (namely everyone who has published a paper in LI in the last few years) think that this structure is wrong, and probably believe the VP-internal subject hypothesis too. So there is no consensus even among generative grammarians that that is the right structure.
      Now, endocentricity is a substantive term -- so this is not just a notational difference but a substantial difference between the empirical claims of the Haegeman book and those of contemporary syntax.

      Moreover I really don't think that "Norbert, Chomsky, Steedman, Pullum, Bresnan " would all agree on the correctness of all of this book. And I don't think it is even worth talking to them to check.

    15. As for what Steedman thinks everyone agrees on, he & Jason Baldridge did say this in their recent handbook article on CCG:

      The continuing need for volumes like the present one raises an obvious question: why are there so many theories of grammar around these days? [...] in some respects they are all rather similar. Sometimes the similarities are disguised by the level of detail at which the grammar is presented—for example, Tree-Adjoining Grammar [...] and CCG can be regarded as precompiling into lexical categories some of the feature-unification that goes on during derivations in Lexical-Functional Grammar [...], Head-driven Phrase Structure Grammar [...] and other attribute-value grammars. Nevertheless, [...] all of the theories under discussion including CCG and at least some varieties of Government-Binding or Principles and Parameters grammar (GB) have essentially the same binding theory, with a lexically defined domain of locality corresponding to the tensed clause, and a command or scope relation defined at some level representing predicate argument structure, such as logical form. The mechanisms involved, even when couched in terms of transformations like “NP movement,” seem to be of rather low expressive power—essentially context-free (CF) and “base generable,” to use Brame’s (1978) term. Many phenomena involving dependencies bounded by the tensed verbal domain, such as raising, control, passivization, reflexivization, and the like, have this character. While some deep problems remain—in particular, the question of what the primitive components of linguistic categories themselves are—the theories are all in formal terms pretty much alike in their analysis of these constructions.

    16. @Alex: Now I'm curious where exactly you think the methodology goes wrong. The most basic model is to treat sentences as strings of words. Since the set is infinite but must be described by finite means, words have to be put into buckets, i.e. parts of speech. I'm sure you're still fine with that, even though one may discuss what exactly the set of parts of speech is (distribution classes VS more complex notions such as verb and noun).

      But simply grouping words in parts of speech and describing strings in terms of POS-sequences is still not enough. If the sequences are limited in length (n-grams), the formalism isn't expressive enough. Unlimited sequences are a no go. So we can either keep finite sequences and throw in a mechanism that increases expressivity (e.g. adding weights to n-grams), or move away from sequences to more complex structures, i.e. trees or graphs. Once again I'm pretty sure that you are happy to be in the tree-camp, at least for technical reasons (succinctness, compositional semantics, easier math, etc.).

      So now we have to figure out what the structures are. Since we cannot see them, we have to rely on tests. In order to construct tests, we start with simple things. For instance, we see that certain things can occur at the left edge of the sentence even though they are interpreted somewhere else. But only certain things:

      (1) Mary John likes.
      (2) The man that I met yesterday John likes.
      (3) *I yesterday John likes the man that met.


      Our first hypothesis is that only substrings can occur at the left edge, not subsequences, but that is proven wrong by (4) and (5).

      (4) *likes John the man that I met yesterday.
      (5) *the man that John likes I met yesterday.

      But since we are using trees, let's assume that only subtrees can occur at the left edge. And from this point on, it's a bottom-up process: infer what the subtrees are, develop new tests based on this information (coordination, proform-replacement), refine structure accordingly, and so on. Of course this is a tricky business, since tests can be faulty if one isn't careful (binding theory being the prime candidate), but that's a problem with the execution, not the general methodology as such.

      Yes, there are probably a few structural assumptions that were carried over from earlier theories that should be reevaluated in light of what we know nowadays, but those only affect the nitty-gritty of, say, possessor constructions. And those are also exactly the cases where generative theories diverge (as one can see in Borsley's comparative intro to GB and HPSG) and make slightly different predictions in some special cases. But that's fine-grained stuff that has very little to do with the big picture facts: there are clearly different types of verbs (ECM, control, raising) that behave very differently with respect to a number of constructions, there are restrictions on the interpretation of pronominal forms, there is something like displacement, there are subcategorization requirements, there are island effects, and so on. Those and similar facts are established via the methodology outlined above. What's the alternative?
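      (To make the fronting test concrete, here is a small sketch, assuming one plausible, simplified constituent structure for "John likes the man that I met yesterday"; nothing about the particular labels or bracketing is meant as the received analysis. The code lists the word strings the tree treats as constituents, so non-constituents like the fronted string in (3) are predicted not to topicalize, while constituency is treated as a necessary rather than sufficient condition, cf. (4).)

```python
# A hypothetical, simplified constituent structure for
# "John likes the man that I met yesterday", encoded as nested lists
# [label, child, child, ...], plus a helper that lists the word strings the
# tree treats as constituents. Labels and bracketing are illustrative only
# (e.g. the relative-clause gap is ignored).

tree = ["S",
        ["NP", "John"],
        ["VP",
         ["V", "likes"],
         ["NP",
          ["D", "the"],
          ["N", "man"],
          ["CP", "that",
           ["S", ["NP", "I"], ["VP", ["V", "met"], ["ADV", "yesterday"]]]]]]]

def words_of(node):
    """Return the words a (sub)tree spans, in order."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1:]:          # node[0] is the category label
        out.extend(words_of(child))
    return out

def constituents(node, found=None):
    """Collect the word string of every subtree, i.e. the predicted constituents."""
    if found is None:
        found = set()
    if isinstance(node, list):
        found.add(" ".join(words_of(node)))
        for child in node[1:]:
            constituents(child, found)
    return found

predicted = constituents(tree)

# The fronting test: non-constituent strings are predicted not to topicalize.
for candidate in ["the man that I met yesterday",  # a constituent, cf. (2)
                  "I yesterday",                   # not a constituent, cf. (3)
                  "likes the"]:                    # not a constituent
    print(candidate, "->", candidate in predicted)
```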

    17. If you just want to get some reasonably succinct description of the infinite set of sound/meaning pairs then that is a perfectly good way of constructing a grammar. But we know that there are infinitely many structurally different grammars that will generate any such given set. So why do we think that the one that is produced by this analytical procedure would be the right one? i.e. the unique (?) psychologically real one?
      We know mathematically that these techniques cannot be reliable, because of the problem of the indeterminacy of the inverse function from languages to grammars.
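      (A toy illustration of that indeterminacy point, of my own devising: two context-free grammars that generate exactly the same string set while assigning different constituent structures, so the strings alone cannot choose between them.)

```python
# Two toy grammars for the language (ab)+: G1 builds right-branching trees,
# G2 left-branching ones. They are string-equivalent but structurally distinct.
# The naive generator just enumerates strings up to a given derivation depth.

def generate(grammar, symbol, depth):
    """All terminal strings derivable from `symbol` in at most `depth` levels of recursion."""
    if depth == 0:
        return set()
    results = set()
    for rhs in grammar.get(symbol, []):
        partials = {""}
        for sym in rhs:
            if sym in grammar:                       # nonterminal: expand recursively
                subs = generate(grammar, sym, depth - 1)
                partials = {p + s for p in partials for s in subs}
            else:                                    # terminal: append it
                partials = {p + sym for p in partials}
        results |= partials
    return results

G1 = {"S": [["a", "b"], ["a", "b", "S"]]}   # S -> ab | ab S   (right-branching)
G2 = {"S": [["a", "b"], ["S", "a", "b"]]}   # S -> ab | S ab   (left-branching)

print(sorted(generate(G1, "S", 4)))                  # ['ab', 'abab', 'ababab', 'abababab']
print(generate(G1, "S", 8) == generate(G2, "S", 8))  # True: same strings, different trees
```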

      So inference-to-the-best-explanation (IBE) type arguments are of course fine - and standard in science - but only if they actually explain the facts. If we have a theory that fails to explain a fundamental fact -- that these things are learned -- then you can't run an IBE.
      It used to be the standard view (e.g. Chomsky and Katz 1974, "What the Linguist is Talking About") that syntactic analysis on its own could not produce the definitive answer. Rather, one had to integrate considerations of explanatory adequacy. I think now one would add evolutionary adequacy to the list. You need to proceed by integrating the available constraints -- syntactic adequacy being one of them, but also algorithm-level constraints on processing (e.g. the weak efficiency version of the SMT that we discussed a few posts ago), learnability, evolution, etc. If you have a linguistic theory that satisfies these big constraints, then an IBE argument will be convincing, just as it is for any other bit of science.

      I strongly disagree with the claim that syntactic analysis on its own will or can give us, reliably, the right grammars. And that was and should be an uncontroversial view.

      (Of course that is not to say that every part of the analysis is wrong; on the contrary. I think that some parts of the analyses are quite likely to be correct).

    18. Again my impression is that this

      (1) the standard view ... that syntactic analysis on its own could not produce the definite answer ... one had to integrate [other] considerations

      is another case where bad communication has damaged linguistics.

      Smart people would not at all disagree and have not changed what they think, but you would not know it from the level of defensiveness and paranoia in their responses - rather than simply getting down to brass tacks and giving you a summary of what they think has and hasn't been sorted out. Similarly, they might often use recalcitrance as a rhetorical tactic to make others get down to brass tacks.

      Again I have talked to many linguists who seem to sincerely believe that (1) is false, and I think it is because their sometime intellectual guides are failing them with their bluster.

    19. @Alex

      The position you adopt is what I would dub "methodological nihilism." It's a technique I learned in Philo Grad School. Insist that any account is incomplete if you don't like it. It comes very close to the position that unless you understand everything, you understand nothing. This is NOT standard scientific practice. In chemistry, lots of progress was made without knowing how valence was determined. We knew the gas laws were more or less right even though we did not know why (in fact it took quite a while to get a working theory of this). Physicists still don't know how to reconcile their two deepest theories, but this doesn't stop them from thinking each is correct in the sense that any better theory will have to take their results as boundary conditions on adequacy. That's the way things work.

      What does this mean in the current context? It means that in linguistics we can know a lot about FL/UG without yet knowing exactly how they fit into other larger questions. This is standard science methodology. What you are asking for is not. The "truth" of Kepler's laws and Galileo's did not wait for unification in Newton's. The "truth" of Newton did not require derivation from Einstein. The truth of something like island effects and binding theory and ECP and Control theory and so on does not await the specification of a learning theory for Gs.

      In fact, we can go further: whatever the right theory of learning is, we can be pretty confident that there are some areas that it will presuppose rather than "learn." The results about UG, if they are roughly true, which they are, imply that the principles expressed by these modules of the grammar are not learned but built in. The two of us have gone down this path before (and disagreed). My view (the reasonable one) is that our best theories of FL/UG help fix the scope of the learning problem. And the scope of the evo problem, if we ever get good enough MP theories. That's very important even in the absence of a detailed theory of either. So, in fact, standard ling work mightily contributes to both Plato's and Darwin's problems, even if it does not solve them.

      Let me end by adding a word of thanks to you. It has long been my view that our disagreement concerning Lenin's Question (What is to be done?) has nothing to do with the sorts of concerns that people like Ewan or Thomas (nice young things as they are) have surmised. We see the problem entirely differently. There is very little common ground. What I take to be facts to be explained, you take to be barely defensible propositions. My view of the problem is entirely different from yours. What I take to be an admissible kind of answer, you take to be beyond the pale. I have long thought this true, but I have not had the evidence at hand to prove it. Your forthright manner has made our disagreement clear. And that is a very good thing. Bright lines are useful when the disagreements are as large as this. Ecumenism in such circumstances is counter-productive.

Before Ewan or Mark jumps in, let me add that this does not prevent either of us from stealing techniques from the other on an as needs basis. But it does strongly imply that the intellectual impasse is large and that, like it or not, when working on the topics we are disagreeing about, one will have to take sides. There is no middle ground, and this is worth knowing. Thx.

    20. I really don't think "nihilism" is the right word. I am advocating pluralism -- integrating all of the different sources of information into the scientific process, and I am quite optimistic about this. I think we can figure out the nature of the language faculty using the available techniques.

      Nihilism would be the view (modifying Hauser et al.'s paper)
      "Based on the current state of evidence, I submit that the most fundamental questions about syntactic structure remain as mysterious as ever, with considerable uncertainty about the discovery of either relevant or conclusive evidence that can adjudicate among the many open hypotheses."

I think this view is wrong when it comes to syntactic structure, but I have some, limited, sympathy for the nihilism advocated by Hauser et al when it comes to the evolution of language.

      I am also advocating some scepticism about syntactic analysis as a methodology, which is one reason I advocate pluralism.
      It would be nice to hear some more arguments in favour of syntactic analysis. Maybe I am underestimating its effectiveness.

    21. @ Alex:

      Of course you would not think this 'nihilistic.' That was my point. We are so far apart on what would even count as data/argument that we have no hope of fruitfully interacting. I don't expect to convince you. This whole discussion is for the benefit of others. I have been arguing that one must choose between your position and mine. As you have made clear, this is not a choice concerning which details to accept and reject (as say would be a discussion between me and Postal, or Steedman, or Jackendoff, or Bresnan, or Chomsky, or….well everyone but maybe CB). As you made clear in your reply to Thomas, your skepticism is wholesale (bless you for saying this btw). All the methods and arguments linguists use are suspect to you. There are no results because there can be none, given the standard procedures linguists follow. This is not local skepticism, but a global kind of Cartesian Skepticism applied to this domain. IMO (not yours I am sure), this kind of global skepticism would be laughed out of court in any other domain. This kind of wholesale skepticism is what distinguishes climate science deniers and creationists. And that's what I want my fellow linguists, especially the young ones, to see.

I have argued that there is no middle ground here and that one must choose sides. You have helped me make my case, I hope. I know what side I'm on. You seem to know what side you are on. Now it's time for others to see that there is no splitting the difference between us. Our disagreement is not one that will be finessed by just making the arguments clearer (Ewan, this is for you) or keeping an open mind (this is for you Mark). Your side thinks that at bottom linguistics has contributed nothing of value because it cannot. We have no results to speak of and that's because our methods preclude grounding any conclusions at all about the structure of FL or even Gs. My view is that we have results with varying degrees of evidence in their favor, ranging from really good evidence and strong results (islands, crossover effects, ECP effects, Binding effects, predicate internal subject hypothesis, etc.) to so-so (the DP hypothesis, LCA, macro-parameters). Your view is that all of this is completely inconclusive for methodological reasons. My view is open to your methods being of some potential value. I have nothing against Bayesian modeling, nor even formal analyses like Stabler's (though I admit that the self promotion here has far outstripped the results, IMO). Your view is that mine cannot possibly advance our knowledge of the structure of FL. This is not a bridgeable divide, as I hope our readers will see.

So, will I convince you? Can't. Will you convince me? Nope. This discussion is entirely for the benefit of my younger more open minded colleagues who think the disputes are largely due to misunderstandings and foggy rhetoric. It isn't. It goes very deep. That's why there is no splitting the difference. You either reject the wholesale skepticism about our methods and accept that we've established some truths over the last 60 years, or you don't. Of course, once you've accepted that linguistic methods can and have advanced our understanding, we can argue about which arguments/data are better and which worse. We can argue about how to supplement the conclusions with other arguments/methods/data. But until you buy this, we have nothing to discuss, though I would be happy to argue with you in any public venue anytime you might be interested.

    22. @Alex

In the interest of seeing whether clarifying matters can indeed help so that I might thereby subvert the Bolshevist logic, can you start by saying what you mean by "syntactic analysis" as a methodology?

      i.e., there are a variety of pursuits that lead you from a set of data (for the sake of argument say acceptability judgments) to a narrowed down set of possible grammars for a bunch of languages, through again to a set of hypothesized underlying mechanisms (learning mechanisms with whatever constraints). It sounds like you agree with this general scheme (unless the key pillar is whether you rely exclusively on acceptability judgments) so what narrow sub-pursuit is "syntactic analysis"?

    23. @Ewan

      Here's the money quote:

Alex Clark, May 14, 2014 at 1:55 AM
      Thomas, thanks for the pointer to the Larson book; I will read it. But it brings up my concern about methodology; why should we think that the standard methods of syntactic analysis will bring us to the (or a) psychologically real grammar for a language? I don't think there are any good arguments for the methodology, and so I don't necessarily believe the results produced by people using the methodology (quite apart from concerns about data, precision of theories etc.).

    24. @Alex: I have a hard time figuring out just what your position is (what should I say, Norbert, I can be rather dense). On the one end of the spectrum we have the observation that it is unlikely that

syntactic analysis on its own will or can give us, reliably, the right grammars,

which is indeed uncontroversial. Note that I explicitly mentioned above that the generative methodology includes criteria for picking between competing proposals, and that includes "aesthetic" factors like simplicity and elegance as well as "tangible" ones such as learnability and processing. The reason the latter are not applied as stringently as syntactic tests is that their implications are a lot less clear-cut. The battle DP vs NP has little to do with processing or acquisition, as far as we can tell. (Side remark: that's not to say simplicity or elegance are clear-cut criteria, but that's irrelevant to the discussion here.)

      But your reply also has a strong undercurrent --- the one that Norbert picked up on --- that if a proposal is not provably right or the only unique answer, it is not a good answer at all. I have to assume that this is not what you are saying because it undermines your own point:

      Yes, there are infinitely many structural descriptions that generate a given string language. Even if we extend our empirical domain to string-meaning mappings, the theory is massively underspecified. But that is the case even if learnability enters the picture. I know that your recent work on learning tree languages builds on the assumption that structure can be uniquely inferred from strings, but that is a restriction of your learner, which for instance has no access to semantics. There is no guarantee that learnability would get you around underspecification, so the argument that you level against a purely syntactic methodology also works against syntax+learnability. Quite generally, you end up with an infinite regress: If efficient learnability matters, you need a parsing model, a realistic parsing model needs to be implemented at a neural level, neural computation builds on the laws of physics and chemistry, and so on, and so on. Science isn't about perfection, it's about good enough.
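
      To make the underspecification point concrete, here is a toy sketch (an illustration of the logic only, not anyone's actual analysis): two context-free grammars that generate exactly the same string language but assign different constituent structures, so string data alone cannot choose between them.

      # Toy illustration: both "grammars" generate {a^n b : n >= 1},
      # one with right-branching structure (S -> a S | a b), the other
      # with left-branching structure (S -> A b, A -> A a | a).

      def g1_tree(n):
          # Right-branching analysis: S -> a S | a b
          return ("a", "b") if n == 1 else ("a", g1_tree(n - 1))

      def g2_a(n):
          # Left-branching A constituent: A -> A a | a
          return "a" if n == 1 else (g2_a(n - 1), "a")

      def g2_tree(n):
          # S -> A b
          return (g2_a(n), "b")

      def leaves(t):
          # Flatten a tree into its terminal string
          return (t,) if isinstance(t, str) else sum((leaves(c) for c in t), ())

      for n in range(1, 6):
          assert leaves(g1_tree(n)) == leaves(g2_tree(n))  # same strings

      print(g1_tree(3))  # ('a', ('a', ('a', 'b')))  -- different structures
      print(g2_tree(3))  # ((('a', 'a'), 'a'), 'b')

      Same strings, two incompatible structural descriptions; nothing in the strings themselves favors one over the other, and that is all "underspecification" means here.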

And that brings us to the second point: why is underspecification a problem? The implicit assumption seems to be that there exists a vastly different theory of language that solves all the problems we have right now. But much of the data is astonishingly clear and hardly contentious. Nobody debates that there are island effects, the question is only where they arise and why (grammar vs. parser). Nobody debates that pronominals are restricted in their distribution, the question is what those restrictions are and how they differ between languages. The problems are independent of the theory, just like the inherent complexity of 3SAT is independent of the algorithm you use to solve it. Linguists disagree on how to slice the cake, but that's very different from saying that there is no cake to slice.

So why worry about underspecification? Even math is underspecified: any kind of interesting mathematical object can be studied from many different perspectives. That does not make any of those perspectives less valuable. In math, that's because we know how to link those different perspectives together. And we have similar results in linguistics, e.g. translation procedures from CCG to TAG, TAG to MG, and so on.

25. Wow, this thread just got very strange ... um, I should clarify that saying "I am sceptical about X" is not a polite British way of saying "X is false" but rather saying I consider X less than certain. And on the dogmatism to scepticism spectrum I am firmly off on the sceptical side, which is the right side to be on, in my opinion, if you are being a scientist as opposed to a member of a religious cult. There is no place for articles of faith in science.

      "But your reply also has a strong undercurrent --- the one that Norbert picked up on --- that if a proposal is not provably right or the only unique answer, it is not a good answer at all. I have to assume that this is not what you are saying because it undermines your own point:"

      Yes that is not quite what I am saying.
      We have several different sources of information: the linguistic data - LD - (the set of sound meaning pairs from attested languages), the fact that they are learned, the fact that they can be processed in near real time, the fact that humans have evolved, psycholinguistic data (self-paced reading, structural priming, word recognition etc etc.).

      So one view (A) is that the LD is enough on its own to determine reliably the syntactic structure of languages.
      This LD first view would then entail the following: figure out the grammars using the methods of syntactic analysis applied to the LD, then try to figure out how they can be learned, and how that might evolve.

Alternatively, one might say (View B) that actually the grammars are very much underdetermined by the LD, and so we should incorporate from the outset considerations of learnability/explanatory adequacy.

It might be that (B) is underdetermined, that is to say there are several different theories of grammar that all attain explanatory adequacy. So one could go further and say (C) actually we need to look at the actual neural activity to identify which is the cognitively real grammar.

Or one could go further (D) and adopt sceptical arguments against a computational theory of mind (e.g. Searle/Kripkensteinian rule following arguments).

      My view is that (A) is untenable. We know it is untenable because we cannot resolve DP vs NP disputes. There are multiple distinct theories on the table -- though they do share some common structures.

      My view is that (B) is the standard view -- I can dig up plenty of quotes from Chomsky that say almost exactly that. Actually let me do that: from Aspects
      "If (the linguist) wishes to achieve descriptive adequacy in his account of language structure, he must concern himself with the problem of developing an explanatory theory of the form of grammar, since this provdes one of the main tools for arriving at a descriptively adequate grammar in any particular case. In other words, choice of a grammar for a particular language L, will always be much undetermined by the data drawn from L alone.
      " and "It is not necessary to achieve descriptive adequacy before raising questions of explanatory adequacy. On the contrary, the crucial questions that have the greatest bearing on our concept of language and on descriptive practice as well, are almost always those involving explanatory adequacy with respect to particular aspects of language structure."



      (C) is not tenable either for the simple reason that we do not yet have any theories that attain explanatory adequacy. At least not by my standards.
So it might well be that in 10 years we will have several different theories that attain EA. In which case we might have to start looking to neural evidence. But we aren't there yet.

      In other words, Chomsky is right, and (B) is the correct way to go about this.
      There is no need for nihilism -- or at least we can be optimistic.

    26. @Norbert
      "Your side thinks that at bottom linguistics has contributed nothing of value because it cannot. We have no results to speak of and that's because our methods preclude grounding any conclusions at all about the structure of FL or even Gs"

This is absolutely not what I think. I certainly have reservations about how syntacticians construct grammars. But I think a lot of work in linguistics is vitally important; I think case-stacking is a really important discovery which has very significant consequences for grammar (I am reading Double Case at the moment), I think Wolof and Yoruba relative clause copying is a crucial test case, I think Swiss German was crucial. These results tell us very solid things about the structure of the grammars, and we would not know them without careful theoretically informed work.

    27. @Norbert

So Alex has implicitly confirmed that all he means by "syntactic analysis" is "attempts to figure out FL _only_ using acceptability judgments" - you could broaden "linguistic data" a bit but not much. And that therefore all he is saying is that this isn't going to tell us all we need to know - not, in his words "absolutely not", that it is of no use at all. And everyone agrees on this.

      So I am again going to present you with the supposed money quote.

      "why should we think that the standard methods of syntactic analysis will bring us to the (or a) psychologically real grammar for a language?"

      Show me where this is not simply saying the sensible thing that everybody knows, i.e. that we may start by working only with the linguistic data but that is only a start and not the end of the story.

      Do you really read this as suggesting that working from AGs has some _special_ priority on being insufficient? Alex can correct me if I am wrong but I read this sentence as one that would be just as obviously true if you inserted "computational methods" or "experimental method X" or whatever else we have access to instead of "syntactic analysis" and I understood it to be meant that way.

      The motivation that I saw behind it, quite apparent, was that it was being offered as a corrective to an interlocutor who had given the impression that he did not believe the sensible thing that we all agree on (ie the thing called (B) above) - in favor of the not so sensible thing (A) which I am almost certain you just do not believe.

      Which leads me to think that you said or suggested something stronger than what you believe in the course of defending your position. Which of course leads me to think that I am right.

However, if the intention was actually to profess the belief that syntactic analysis is _particularly_ uninformative then, indeed, I take it back and defer.

    28. Let's be clear. Given an opportunity and significant prodding here is what you are willing to sign onto "non skeptically": three string related properties of sentences. All, not surprisingly, dealing with whether Gs are MCS. So, you are willing to sign onto string properties as facts we should regard non-skeptically. Now, Thomas and I and quite a few others listed some of the following as well, that you decided NOT to mention. I assume given that they were there for the picking it's because you ARE skeptical about these. Let me list some of these and maybe you would be so kind as to tell us which you treat non-skeptically. Interestingly, IMO, they all fall into the domain of structure dependent features of Gs: Here we go. Oh, by the way, I will refer to these as 'effects.' Effects are just generalizations or "laws" by another name:
      1. Island effects (Ross)
      2. Cross Over effects (Postal)
      3. Control vs raising effects (Rosenbaum)
4. Minimal distance effects (Rosenbaum)
      5. Binding effects (Lees and Klima, Lasnik, Chomsky)
      -A-effects
      -B-effects
6. Cyclicity effects (Kayne & Pollock, McCloskey, Chomsky)
      7. C-effects: an anaphor cannot c-command its antecedent (Chomsky)
      8. CED effects (Huang)
      -Subject condition effects
      - Adjunct condition effects
      9. Fixed subject effects (Perlmutter, Bresnan)
      10. Unaccusativity Effects (Perlmutter, Postal, Burzio, Rizzi)
      11. Connectedness effects (Higgins)
      12. Obligatory control vs non obligatory control effects (Williams)
      13. Long distance anaphors as subject oriented effects (Huang)
      14. Case effects (e.g. *John to leave inspired Mary)
      15. Theta criterion effects (e.g. *Who did John kiss Mary)
      16. NPI licensing effects (Linebarger, Ladusaw)
      17. Phrasal Headedness (Lyons)
      18. Clausemate effects (Postal)
      19. Expletive-Associate Locality effects (Chomsky)
      20. Parasitic gap effects (Engdahl, Chomsky)
      21. pro drop effects (Perlmutter, Rizzi)
      22. ECP effects (Lasnik&Saito, Rizzi, Chomsky, Huang)
      That's a nice selection of things that syntacticians have discovered that are NOT facts about word strings. But it gets better. For we have even found systematic exceptions to some of these:
      23. Weakest cross over effects (Lasnik and Stowell)
      24. ATB effects (exception to CSC)
      25. Parasitic gap effects (Exception to CED)
      26. Ellipsis effects (Ross, Merchant, Lasnik exception to island effects)
      27. A movement/scrambling obviating SCO effects
I'll stop here. There are others. Of the above, some of these are less secure than others IMO. But this is not a bad list to choose from. None of these are string properties. All involve invocation of structure. Some of these have considerable support from non-judgment data (Binding, Islands, Ellipsis, Unaccusativity, CED). Note too that these are not exclusively Chomsky discoveries. Many have contributed.

So, Alex, here's your chance. Which of these are you not "skeptical" about? Which of these don't offend your methodological scruples? Here's my bet: none of them. You are skeptical of all. However, this is what GG has contributed. This is the "body of doctrine" (a term of Davy's for the results of 19th century chemistry) that GG has delivered, and that allows for deeper inquiry into the structure of FL. There is more, IMO, but there is at least this. When I strongly implied your views are extreme, it's data like this that I assumed you were dismissing on "methodological" grounds. As a group, this data is not particularly controversial. To me, and mainstream GG, these are just settled facts. I doubt they are for you. So, am I right? If not, let us know. You can just check the ones you accept.

      The reader will no doubt have noted that these are the kinds of data that linguists of the GG tradition have used to argue for properties of FL. Maybe you like these arguments, maybe you don't. But, whether they go through has little to do with the solidity of these findings. So, Alex, where do you stand?

    29. I appreciate you writing these out, Norbert; I really do think this is an impressive body of discovered generalizations!


Let us take the first, because it is first. I do not wish to call any data into question (sentences with stars, etc) largely because I mostly agree with it. I also do not wish to call into question the descriptive generalizations that Ross gave ('that sentences of form A will be given stars'); I'm perfectly happy believing that they are also mostly correct. (And this is the case with the rest of the effects you mention). So I'm in complete agreement that (1) the data summarized by these effects are correct (i.e. not the result of instrumental errors or some such) and even (2) that the intentional descriptions of these effects make clear predictions given an analysis.


      But I am skeptical of this effect in the ontological sense -- it is not obvious to me that the right theory will treat it as a unified phenomenon. We see this already in the island domain, with weak islands being suggested by some as having their explanation in the semantics, and with strong islands in the syntax. (A la jadeite and nephrite.) There are also theories of islands according to which they should be influenced by context, parsing heuristics, etc in the same way as other phenomena sensitive thereto. According to these theories, island effects are not true tout court, they are however discrete approximations of the truth.


      I don't see this as a radical position, and I don't see it as conflicting with the claim that GG has discovered a lot of stuff.


Here's another effect you didn't mention: the lexical integrity principle [LIP] (which covers the phenomena of the ban on extraction out of words, the 'fact' that parts of words don't set up discourse referents, etc). The LIP suggests that the way to view these agreed upon data is by making a particular theoretical maneuver (adopt a non-decompositional syntax). I accept that the data is mostly right (just as I do with your list of effects), but I do not believe that the suggested account of this data is right. I picked this example because I imagine you feel similarly about it (or at least respect people who do).


      Finally, these MCS phenomena Alex mentions are also structure dependent. They are all based on generalizations about a finite set of data. It is just that once we have projected the finite data into the infinite, we can prove properties about that set of strings.
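
      For readers outside the weak-generative-capacity literature, the shape of that projection step can be spelled out (a compressed, textbook-style sketch, not a claim from this exchange): finitely many judgments about cross-serial dependencies get projected to an infinite string pattern, and the MCS claim is then proved about that projected set.

      % Sketch only: the projected pattern usually cited for Swiss German
      % verb clusters (after Shieber) has the shape
      \[
        L \;=\; \{\, a^{m}\, b^{n}\, c^{m}\, d^{n} \;:\; m, n \ge 1 \,\}.
      \]
      % L is provably not context-free (pumping lemma), yet it is generated
      % by mildly context-sensitive formalisms such as TAG and MCFG; "Gs are
      % (at most) MCS" is a claim of exactly this kind, established about the
      % infinite projection rather than the finite data itself.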

30. You are right. I have been trying to stay relatively neutral concerning the ontological basis of these effects. That's a job for theory. So Subjacency is a theory of islands. And the Movement Theory of Control is a theory of OC. These are empirically less secure than the effects cited as they are more speculative. I personally think that something like subjacency is more or less correct (and I am sure (he he) that the MTC is entirely correct). However, I recognize that these are very different from the other things on the list. Unlike effects, these theories can be motivated on other grounds, e.g. explanatory adequacy, simplicity etc. But yes, what's amazing is the range of general phenomena that GG has discovered that are quite abstract and that call for (IMO) theoretical explanation.

Another point that I find interesting is that if these are roughly true they have consequences for how we must think about FL. Most of these will support pretty robust POS arguments. Of course, such arguments will not tell us what features of FL are responsible for these effects. But they tell us that their etiology belongs there in some way. In this regard, these play the same role within linguistics that things like the photoelectric effect played in early physics or the Stroop effect plays in psych or the Bernoulli effect plays in dynamics.

      One that I forgot to mention is Rizzi's Minimality Effect, another one to add. Your lexical integrity effect is also good.

      Thx for the correction re MCS cases.

    31. This is a reply to Norbert's shopping list [1-27] of supposed "laws" [or effects or generalizations or otherwise labeled important undisputed discoveries] from May 16:

Since we comment here on open-mindedness let me first distinguish between good vs. bad open-mindedness. [1] Bad: I agree with some commenters that being open-minded does not mean to ACCEPT every opinion on a topic, no matter how little supported by evidence. However, [2] Good: open-mindedness should include being INFORMED about all work that is relevant to one's opinion so one can either change one's opinion [if another view is superior] or show how one's view can overcome proposed challenges. What Norbert's list demonstrates is that he refuses to engage in [2]. He probably knows everything Chomsky and a few 'select luminaries' have ever published. But that even a name like Ray Jackendoff is entirely absent from his lists [or book recommendations - what IS wrong with "Simpler Syntax"?] should ring some alarm bells...

      Norbert loves to group people in 'sides' - so on behalf of the other sides I challenge Norbert to formulate the 1 - 27 talking points in enough detail that they actually can be evaluated. Here is a short excerpt from what Paul Postal had to say about the list: "Almost none of these cases is stated in a law-like fashion, no testable claim emerges. An exception is 7. This one has the virtue of being clear enough to be demonstrably false. Several languages, including such exotic ones as Greek and Albanian, have subject reflexives in passives anteceded by the passive ‘by’ phrase equivalent. In fact, an even more exotic language called English arguably manifests the same phenomenon in a more restricted form, as shown by:

      (1) It was himself that was praised by Otto.

      Here the obscuring factor is that the reflexive (anaphor in the Chomskyan terminology) has been extracted."

This issue has been discussed in the literature but has to date been entirely ignored by orthodox Chomskyans. So I further challenge Norbert to demonstrate that the claims made and evidence provided in the paper below are compatible with his #7.

      Postal, Paul, M. & John R. Ross (2009) "Inverse Reflexives" in William D. Lewis, Simin Karimi, Heidi Harley and Scott O. Farrar (eds.) Time and Time Again, John Benjamins, Amsterdam.

      [for those interested, I have a pdf version I can e-mail upon request]

    32. 1) "But that even a name like Ray Jackendoff is entirely absent from his lists [or book recommendations - what IS wrong with "Simpler Syntax"?] should ring some alarm bells..."

      I am not sure Norbert claimed anywhere that his list was exhaustive. So, I don't see the relevance of this point. If your point is that Norbert doesn't pay attention to what Jackendoff has said in the past, that is simply ludicrous. lol!

      2) "Paul Postal had to say about the list: "Almost none of these cases is stated in a law-like fashion, no testable claim emerges"..."

      Legions of generative syntacticians have gone around looking to test these generalisations, including Postal, so I don't understand what it means for him to say no testable claims follow. And the fact that you are blindly quoting him shows how little you have looked into what syntacticians do. (funny you are commenting about paying attention to others). lol again!

3) Yes, a lot of these generalisations have some exceptions, but so do Newton's laws, Boyle's law,… Are we to stop calling them laws? lol a third time! Perhaps Wikipedia might define laws as exceptionless, but they rarely are in the sciences. And when you do find exceptions, it isn't the case that we stop calling them laws. If you truly believe one should, then you have much bigger fish to fry, go argue with the physicists and chemistry guys about how unscientific they are being. If you want exceptionless laws, I am afraid you are not doing science anymore, you are looking for religion.

      4) If you don't think the generalisations that Norbert mentioned are quite accurate, then please try to describe the same data they are attempting to describe in another way. As many have asked before, what are your SPECIFIC claims regarding the same data? If all you wanna do is show an exception or two, then I am afraid you are not doing science (for all your sermons about proper science). So, don't quote what others have said, don't talk about platonic love (Postal), or your realistic hate (Chomsky), don't discuss personalities, just freaking analyse the same data using non-generative methods (whatever that means), come up with a better analysis and show all the generative syntacticians they are wrong.

      If you are not willing to do that and just want to keep criticising, people would be justified in ignoring you, as Norbert has already done.

    33. This is clearly too strong on my part:
      "If you want exceptionless laws, I am afraid you are not doing science anymore, you are looking for religion."

Sure, there are exceptionless laws. But, all that means is that there is no known stuff that violates them. But, it is not the case that laws with exceptions cease to be called laws suddenly (Newton's laws, Kepler's laws, Boyle's law, …). Calling them laws makes synthesizing information and understanding the subject matter easier, even if they do have exceptions, so we continue to call them so.

Adding on, this stuff about "untestability" of a law is such a crazy claim. I don't see the physicists whining about the "LAW of conservation of energy".

What the heck is energy anyway? Is this now suddenly untestable? Really Christina (and Postal), if you are really serious about your critiques of laws, then you should go fight the physicists and chemists (??, chemistry guys). They are so much bigger than the small field of generative syntax (think instant "rock star" status, if you convince everyone there).

    34. @Confused Academic: WOW what a lot of words to say one should ignore me. Just a sentence would have done. Note also your amount of ridicule and LOLs - very courageous without providing a real name...

Misinterpretation seems your specialty. Nowhere did I say that Norbert never paid attention to Jackendoff. As you say he did, IN THE PAST, as long as Ray was still a good Chomskyan. From what Ray tells me he wasn't treated with too much respect after he withdrew his support for the orthodoxy [= MP]. Further, what makes you think I am quoting Paul blindly [or that he is alone in his criticism]? Space is limited here - so I stuck to the essence.

Now if you could be so kind and address the challenge to #7 we'd all be much obliged. Keep in mind why the exceptions ARE a problem: the biological FL is supposed to account for language acquisition [especially in POS cases] - so if we find exceptions to the "laws" in the data how does the child know what the exceptions are? If s/he learns it from the data why is Chomsky's view superior to say Tomasello's or Christiansen's? Further, AFAIK no physicist or chemist claims that all human children grow [knowledge of] physical or chemical laws. So it is not clear what your analogy is supposed to establish. [Apologies that I ignore the second part of your comment, really no idea what it is meant to establish]

      I have no clue what your comment about Platonic love is supposed to mean but I assure you that I do not hate Chomsky. It IS an interesting sociological phenomenon that virtually anyone who criticizes Chomsky's work is accused of hatred. Does Norbert hate Stephen Levinson or Morten Christiansen because he criticizes them? Is it showing respect to Chomsky to encourage him when he is into his 80s to publish a book like the Science of Language? I think Chomsky would be much better off if this book had never been published...

    35. Regarding your binding example, you have to look at the full paradigm (my own judgments, which have admittedly been polluted by thinking about binding quite a lot during the last two years):

      (1) It was himself_i that was praised by Otto_i
      (2) ? It was himself_i that was praised by everybody_i
      (3) * It was himself_i that was praised by nobody_i

Failure to be bound by an antecedent with an empty denotation domain is a clear indicator of non-syntactic binding, so the reflexive in (1) is probably what's been called (rather misleadingly) a logophor, and logophors are not subject to Principle A.

That being said, the data is also entirely unproblematic under Pollard & Sag's theta-role based approach, I think. Since Otto is the agent and thus higher in the theta-hierarchy than the theme himself, it is a viable binder.

      Finally, even if the reflexive is not a logophor and you're thinking in terms of standard binding theory, i.e. Principle A, the fact that you're dealing with a passive makes this example unproblematic: a passive subject is assumed to start out as an underlying object and thus is c-commanded by the by-phrase at some point of the derivation. That's sufficient for Principle A, it need not hold at what's been called S-structure.

So no matter how you dice and slice it, I don't see why anybody would think (1) is a challenge to any version of binding theory post 1990. The only question is why English -- and many other languages -- do not like reflexives in syntactic subject positions:

      (4) * Himself was praised by Otto

    36. I think we are slightly talking past each other, which makes the disagreements between us seem more prominent than they are.

My scepticism is about the theoretical claims; namely claims about either i) the psychologically real grammars of particular languages or ii) innate elements of UG. It is not about the facts that these theoretical claims are meant to explain. So there certainly are people who reject syntactic facts tout court, or who reject the claim that there are complex hierarchically structured generative grammars that generate potentially infinite sets of strings. I am not one of those.

      So to cherry pick an island example, one can distinguish the following claims:

a) In English relative clauses are islands to extraction (a bare descriptive claim that just says something about the limited range of interpretations/unacceptability of sentences like "messily children who eat chocolate will have to wash their hands")

b) native speakers of English represent and follow a rule that says they cannot move things out of islands, and that relative clauses are islands (a theoretical claim about the psychologically real grammars of native English speakers)

      c) part of UG is that relative clauses are islands.

So I accept a) as true (though there may be exceptions/problems, I don't know of any), I am sceptical about b), and I think c) is false.

    37. @Alex:
Ok, now we are cooking with gas (I never quite understood what this meant, but it apparently means something positive). We agree that Island effects are real (and, I assume, you buy that the other wonders that I mentioned are more or less accurate). What you are skeptical about is whether these facts reveal anything about FL. As we both know, this is where the POS comes into play. I assume that you would agree that island effects are not prevalent in the PLD and so they reveal something about SOME innate feature of us, be it domain general or specific. Indeed, from what I can gather, most island effects are simply too complex to be exemplified in the data. The question then is whether they indicate something language specific or more domain general. And if domain general what the particulars are that allow the LAD to learn them given the PLD. The only theory that I know of that tries to give such an account is Pearl and Sprouse (two UMD grads, I might add). Their theory comes with quite a bit of baggage that is pretty language specific to my eye (domination statements, particular heads that are distinguished, phrase structure rules, knowledge that it is worth counting embeddings, etc.), but I agree that it is the right kind of story that would free FL of its island burdens. BTW, there is a good skeptical evaluation of their work by Colin in the Sprouse and Hornstein volume that is worth a read if you are interested in this. At any rate, they are in for domain general innate structure to handle island effects.

Do I think they are right? Not really (read Colin). But I admire their efforts. Why? Because they analyze a known effect and show how to derive it. They take a particular result and take the POS stuff seriously and provide a model of how this might be acquired given certain domain general innately specified structure. This, even if wrong, is progress. Do I HOPE they are right? Frankly, as one with rich minimalist sympathies, yes I do. It would be nice to get islands off my agenda. But, like you, here I am skeptical. My main reason for this being the close theoretical relation between islands and ECP effects. I think that these should be handled in similar ways and I don't see how they can be on their model. However, to repeat, this is all a family argument. They've made the right kind of argument, drawing on the kind of data that must be considered, taken the POS argument seriously and proposed a detailed alternative. Would that all the disputes were like this!

Ok, so do you agree that Island effects are prima facie evidence for a loaded FL and that your skepticism requires you to put up a learning model that covers the data and respects what we know about the PLD? If you buy this, then we are all playing the same game. I evaluate your proposal, you evaluate mine. I have no problem conceding that Island effects (even if not exactly learned) follow from domain general principles. I assume you have no problem IN PRINCIPLE with these arising from domain specific principles of FL. Right?

      Last point: I agree about c) above. But out of curiosity why do you think that it is false? My reasons are largely theory internal with an additional observation that islands tend to come clumped together in ways that the theory would allow. What are yours?

    38. @Thomas: I think it is really possible that the "himself" in the sentence (1) is indeed logophoric. This was my first thought when I read it.

A Google search reveals that it is possible for people to say the following:
      1) It was himself-i that was praised by Otto-i
      2) It was himself that they were after.
      3) It was himself that did it.
      4) It was himself that took down the label.
      5) It was himself that played this part
      6) It was himself that was presented in heaven in bodily resurrection.

But, if these data are to be believed, it is even more likely that "himself" is logophoric. [Disclaimer: all of these, along with the original, sound pretty questionable to me.]

      Furthermore, (7) sounds as good/bad as (1) to me.

      7) It was himself-i that was praised by Sean.

      Now, my judgments are very untrustworthy on this matter, so one could perhaps discount (7).

A side note: There are a lot of biblical references (as in 6) when I searched for "It was himself that" - makes me wonder if this is some archaic template.


      @Christina:
a) Regarding me misrepresenting you. You changed your words in your reply to me. So it is clear that you were the one who was being sloppy with words. Don't blame someone else for your mistakes.

      b) "what makes you think I am quoting Paul blindly"
      Because you have NEVER provided an argument after quoting him. As if quoting him makes your case.

      c) "Does Norbert hate Stephen Levinson or Morten Christiansen because he criticizes them?"
      If Norbert talked about nothing else except Christiansen and Levinson and about how much they lie and fudge everyone else's claims and about how unscientific they are, yes I would reasonably infer that perhaps he has some vendetta against them (hatred towards them). But, Norbert has published a LOT of work where he doesn't do such a thing; even on this blog, he has said a lot of things without invoking their names. The same can't be said for you in regards to bashing Chomsky. Even here, when the discussion was about generative syntax, you can't stop yourself from trying to beat up Chomsky "if s/he learns it from the data why is Chomsky's view superior". It is as if there is no one in generative linguistics beyond Chomsky for you. Talk about ridiculous!

      Furthermore, "If s/he learns it from the data why is Chomsky's view superior to say Tomsasello's or Christiansen's" - "Binding" is not a Chomsky view. It is a generative syntax claim. Tells everyone very clearly how much you have actually read in the field beyond Chomsky, and of course Postal (how could we forget him).

d) Furthermore, my anonymity is irrelevant to the arguments, so no point bringing it up.

      e) I asked you not to drop more names, you did. I asked you not to mention Chomsky and Postal, you did. I asked you for a counter-analysis of the relevant facts, and we are met with silence.

      Should we assume that you don't have a fully fleshed out counter proposal that does better? Again, not someone else's work or quote.

We are all discussing your example/challenge, and I am sure more will pitch in. Because ultimately, we all (generative linguists) care about the proper analyses/generalisations that are needed to account for the data (we aren't about dogma as you so wrongly think). I would like to see the same level of care from you in presenting a fully-fleshed out counter-proposal that works better in accounting for the relevant facts. Since you raised the issue, you should be willing to debate it properly. So, I will ask you another time:

      "So, don't quote what others have said, don't talk about platonic love (Postal), or your realistic hate (Chomsky), don't discuss personalities, just analyse the same data using non-generative methods (whatever that means), come up with a better analysis and show all the generative syntacticians they are wrong."

    39. @Thomas: Thanks for the comment. I'll get back to you on some of the details [as mentioned earlier, I am traveling]. You are of course aware that sentence (1) is not the only example from the Postal/Ross challenge, right?

@Confused Person: I doubt I have ever met anyone who was as successful as you at distorting what others say. I commented on your courage [or lack thereof], not the quality of your arguments, when I wrote: "Note also your amount of ridicule and LOLs - very courageous without providing a real name..." For being a coward and not having the balls to provide one's real name when one pours a load of insults over others IS quite relevant. Once you get yourself unconfused enough to remember your name maybe we can continue this conversation. In the meantime: have a good day.

40. @Christina: You spend most of your time criticising the generative approach to language cognition, by claiming that it needs a much more integrated approach respecting many more fields, yet when trying to understand the psychology of a person (namely, me), you seem to be so sure that my lack of certain pendulous appendages is the ONLY reason one might not want to present my own name. How sure you are without relevant data or experiments! The only thing relevant to this discussion is the baselessness of your claims even wrt my intentions. lol (again)! And I will keep lolling till you actually discuss relevant subject matter instead of discussing personalities and spewing invective. Cos honestly you need to be trolled; you've done enough of it yourself.

      Btw, I STILL don't see a counter analysis of the issue you raised. Even I, henceforth the unappendaged, have done more in this academic debate by bringing up relevant data in a positive and productive manner than you. So much for engaging you in your own criticism. Live and learn, I guess.

      Note: The ONLY reason I created this id "Confused Academic" is to see if you would somehow bring that into the discussion even when it was completely irrelevant to any possible argument. Clearly my expectations were borne out. Maybe, I should try out more such ids to see how low you will sink.

      You can have the last word (as if you were gracious enough to have it another way!).

      Sincerely,
      The Unappendaged (formerly, The Confused Academic)

41. I would also like to add that I think that it's very curious that Christina Behme keeps insisting that she doesn't have a personal vendetta against Chomsky when a) she has spent a lot of time fixated on him in this blog, and b) her YouTube records (which are public, see here: https://www.youtube.com/channel/UChJxTJ1u3xZYSMK30CmZ9tg) show that she has been going around leaving nasty comments on videos that are specifically focusing on Noam Chomsky.

      I mean, who has that kind of time? At some point you've got to get bored, or decide that you're spending so much time criticizing someone that you haven't got the time to put anything original forward. I mean, someone's got to be paid to be this dedicated.

42. @Norbert: Ok, now we are cooking with gas (I never quite understood what this meant, but it apparently means something positive).
Did you ever try to make sauce hollandaise without having the egg curdle? Or to consistently simmer a sauce for several hours? Damn hard with an electric stove, let alone the wood-burner contraption my grandma used. Easy as pie with a gas stove. Gas stoves are one of the greatest inventions of the last 150 years, and the fact that they are increasingly being replaced by (electric) glass top stoves is nothing less than a crime against humanity. Sic transit gloria mundi.
      ...
      ...
      ...
      Yeah, that's a major pet peeve of mine.

    43. @Norbert: Now that the dust has settled I am genuinely interested in this question:
      "why should we think that the standard methods of syntactic analysis will bring us to the (or a) psychologically real grammar for a language?"

      So if (A) is the bare descriptive claim, and (B) is the theoretical claim about grammars, and (C) is the claim that something is innate, then the standard argument goes
      (A) therefore (B) therefore (C).
I accept (A) and I often accept that (B) implies (C), but I reject (C).
      And so I reject that (A) implies (B).
      Indeed I think that we need to integrate considerations of computational efficiency, learnability, etc etc. into our construction of the theories (B). So why is this controversial? Is this controversial at all? I thought it wasn't.

Or alternatively is anyone prepared to defend the claim that the methods of syntactic analysis can (without appeal to considerations of learnability etc.) reliably pick out the psychologically real grammars?
      Are there any papers that defend this view?

    44. @Alex: The most basic argument is that the standard methods (constituency tests etc.) narrow down the class of viable grammars. Some subset thereof is the class of correct grammars.

Since even in practice one immediately encounters cases where several grammars account for a given data set, there's also a bunch of criteria that are used to rule out certain grammars. In principle, those are computability/processing and learnability. But for a long time they didn't play much of a role, because the connection between syntactic analysis and these conditions wasn't well understood. We're still struggling, but there are also non-syntactic arguments nowadays (semantics, acquisition patterns, etc.).

This brings back a question I asked you earlier but which got lost in all the unsettled dust: How could any kind of learnability result distinguish between an NP-analysis and a DP-analysis? Psycholinguists have some ideas, but I doubt that you would find them convincing.

      Anyways, since computability and learnability are too coarse at this point to help with these fine-grained issues, the criteria linguists mostly rely on are standard ones like parsimony, elegance, stating generalizations, etc. To some extent they are claimed to matter for learnability, but even if that wasn't the case they would still be important criteria for evaluating theories.

      tldr: Syntactic analysis is the best method we have at this point, and unless you can show that 1) the underspecification it gives rise to is an insurmountable challenge (meaning we can't translate our current findings into a new formalism once we know more about learnability), and 2) learnability offers a reliable way for ranking competing proposals, there really isn't much to discuss.

45. @Alex: Excuse me if I believe that you have a penchant for moms and apple pies. Sure, we need to integrate all we can in whatever way we can. Duh. However, I think that you misunderstand the argument form. The generalizations I listed, that we now all agree to be roughly correct, are ones that native speakers adhere to. We can now ask, how did they come to do this? Is it a fact about tracking the environmental data or something innate? This we can actually sometimes answer by looking at the PLD and seeing to what degree it supports inferences to certain patterns. You know the drill. If the PLD is not rich enough, we can conclude that whatever the powers native speakers have, it is not one that is, as our favorite empiricist would have us think, "a compressed representation of the regularities in the input" (a quote from a paper by Lidz and Gagliardi). So, we can know if it suggests innate structure or simple pattern matching. Now, once we've done this, we will need some theory to get the show on the road, i.e. to identify WHAT that innate something is. There are always at least two options (and mixtures thereof). It is an innate capacity special to language or it is a more general innate capacity. We even know how to go about arguing for this. We develop theories of each kind and see how far they go. Now, here's where I am pretty sure we part company. So far as I can tell, we've got plenty of domain specific accounts for island effects, binding effects, ECP effects etc. I've even presented some on this blog. But, and listen carefully here, to my knowledge we have NONE that are domain general. Actually, let me backtrack: we may have one for islands presented by Pearl and Sprouse (see my comments somewhere above in the thread). So, I've done my job and now it's time for you to do yours. You are allowed to be as skeptical as you want to be, but the price of admission to play the game and be taken seriously (and not just as expressing an opinion, sort of like the gas you get from indigestion) is to pony up a story. If you don't have one, which is fine (I don't have many), then you say so, express hopes for the future and tell people that you will participate in the discussion when you have something to offer. What you don't do is insist that your reservations be taken seriously without anything positive to add or without some serious criticism of why what's on offer cannot work.

      So that's the way we do things: we try the best we can to explain the data at hand. Personal opinions are fine, but carry no weight without good arguments and an alternative. Without these, well, who cares how skeptical you are or I am about anything. As Ecclesiastes would say: "wind, wind, all is wind."

BTW, let me add that I am not nearly as skeptical as you are concerning learnability arguments etc. for Gs given UGs. We have some nice hints about what these might look like in Yang's work, Lidz's, Crain's, Berwick's, Wexler's etc. So, even here there are things to say. Is it finished? No. Does it look plausible and scalable? To my eye, yes.

    46. Saying that syntactic analysis is unreliable but is the best we have is a perfectly reasonable stance; and that it is supplemented with non-empirical principles (like parsimony, elegance, capturing linguistically significant generalisations) because it is indeterminate otherwise is also fine. But I wonder if anyone would make some stronger claims.

      Because obviously the learnability, efficiency, evolution issues are empirical conditions, and so if there is a conflict between the results produced using the non-empirical conditions and these empirical facts, then we should reject those results. So I would expect some people here (Norbert?) to defend a stronger claim.

More specifically on the DP/NP issue, we don't really have a notion of head yet in learnability theory. That may be something that is resolved in the future but for the moment the DP/NP distinction is notational.
      Empty constituents are a problem also, so to the extent that the DP hypothesis depends on null determiners, that would point towards an NP analysis.
      (There is a technical sense in which "interesting" is the head of "very interesting").

    47. if there is a conflict between the results produced using the non-empirical conditions and these empirical facts, then we should reject those results

      Is the antecedent of this conditional true?

    48. Hi Norbert,

      So a big part of the story is in this paper in Machine Learning (here) which shows how you can learn a large class of grammars distributionally. This is incomplete in several respects, but I have other papers that fix most of those problems. It is complete in the sense that the grammars are powerful enough to represent all of the phenomena you are interested in: I think all of them.
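
      For readers who have not seen this line of work, the core distributional intuition behind it can be sketched in a few lines (a toy illustration of substitutability only, not the algorithm from the paper, which handles far richer, MCFG-style grammar classes and comes with correctness proofs):

      from collections import defaultdict

      # Toy sketch: group words by the (left, right) contexts they occur in.
      # Substitutable items end up in the same class -- the seed idea behind
      # distributional learning of grammars.

      corpus = [
          "the dog barks",
          "the cat barks",
          "the dog sleeps",
          "a cat sleeps",
      ]

      def contexts(sentences):
          # Map each word to the set of (left, right) contexts it appears in
          ctx = defaultdict(set)
          for s in sentences:
              words = s.split()
              for i, w in enumerate(words):
                  ctx[w].add((tuple(words[:i]), tuple(words[i + 1:])))
          return ctx

      def classes(ctx):
          # Crudely merge words that share at least one context
          out = []
          for w in ctx:
              for cls in out:
                  if any(ctx[v] & ctx[w] for v in cls):
                      cls.add(w)
                      break
              else:
                  out.append({w})
          return out

      print(classes(contexts(corpus)))
      # groups 'dog' with 'cat' and 'barks' with 'sleeps', since they are
      # substitutable in the same contexts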

So that's my story. You appeal to Yang, Berwick and Wexler etc. I don't know which papers you mean but these are really incompatible with each other. The Berwick and Wexler paper is about learning transformations given the deep structure trees, and Yang's book is about parameter setting in a 90s style P and P model. So could you be a bit more specific?

    49. @Alex C. I know I've asked this a million times before, but how does any of this work defuse POS arguments for the innateness of the ECP and other such principles? What's at issue is whether these constraints are learnable given the data that kids actually have.

    50. @ Alex C: Adding to Alex D's comment, the algorithm(s) that you mention (which learn(s) large classes of grammars) will also learn a huge set of grammars which are not actually seen in languages; potentially, those that have flips of the generalizations that are actually observed in languages.

      For example: Condition-A' - The antecedent is always c-commanded by the anaphor…

      So, showing a general purpose algorithm that can learn the observed patterns doesn't solve another (real) problem at hand, which is "why are these the generalizations?" Of course, one could argue that the specific observed generalizations are due to historical/domain-general cognitive factors, but if so one will have to be specific about exactly what those factors are (otherwise, it is a just-so story, much like many from the "evolution of language" literature).

    51. Yes, I think that is the right question to ask. So there just aren't a lot of examples of e.g. parasitic gaps in the input data. Probably none, really. So we have some algorithm that we know (i.e. have a proof that) can learn if it gets enough examples of everything. But it's not going to get enough examples of everything, so the conditions of the proof don't hold and so the proof shows nothing.

      So I feel the force of that argument, but once we are in the space of arguing that you can learn from 1 billion sentences but not from 1 million sentences, then it's not an impossibility argument. It's an argument that this is missing something, something vital to be sure, but not something that is impossible. There is some component that learns some sort of feature system (i.e. the feature system that reduces MCFGs to something more like MGs) and at the moment I don't have a precise story about what that component is. I don't see any reason to think it is specific to language though. It might be though. I just think that at this point we have *one* complete articulated theory of how some very rich grammars can be learned from data. What are the alternatives?

      That's a genuine non-snarky question. I know the work that Norbert plugs, but I just don't see how the pieces fit together. So I read the Hauser et al paper where the only language specific bits are recursion and maps to the interfaces -- so where's the learning story that goes with that version of UG/FLN? Nobody gives me a straight answer. They change the subject or start waffling about 3rd factor principles.


    52. @ Alex C:

      1) "but not from 1 million sentences" - to me, that sure sounds like an impossibility argument (but an impossibility argument that is quantity sensitivity).

      2) I am pretty sure you might already know of this work, but here goes: Legate and Yang (2002) tried to address this very question in their response to Pullum. [http://web.mit.edu/norvin/www/24.902/LegateYang.pdf]

    53. It's not just parasitic gaps. For the sorts of things I listed, it's most of these things. There are very few CNPs in the PLD, and there are virtually no extractions from them. There are not that many sentences with 3 levels of embedding etc. There are no fixed subject constraint violations etc. etc. etc. So, the questions that Alex and CA ask are the ones that we always return to and the answer seems to be, you feel our pain. Great.

      To get to your last question: there are models for language acquisition for particular bits. Jeff Lidz has been working on this. The standard current approach is to embed a UG into a Bayesian-style learner. This strikes me as plausible; indeed, it looks a lot like what Chomsky proposes in Aspects, IMO. Do we have a full learner? No. But then again it's not clear to me that we should expect one. Do we have a theory of physics that deduces the physics of the ball field? No. So why should things be different here? What we want is a model for learners. We have piecemeal accounts and we have some idea of what kinds of properties we want them to have. That's not bad.

      As for what's language specific: Chomsky has his own proposals. But, it is a PROPOSAL!!! The MP is a conjecture, it is not yet a theory. The aim of current research is to try to turn MP into a plausible account. Part of this will involve doing the sorts of things I've talked about before, e.g. unify the modules in GB, see if they can be factored into general kinds of operations and linguistically specific ones etc. That's the aim. The results have not yet IMO redeemed the program, but that's ok. This is why I have urged people to zero in on areas where there is more consensus. The list of "laws"/"effects" were/are an attempt to do this. Why not concentrate on seeing how THESE could be acquired given the PLD? Put in whatever you want and see what you get. I know that putting in binding theory can get you quite a way towards acquiring how reflexives and bound pronouns work. Again, Lidz has some nice stuff on this if you are interested. So, what's language specific and what domain general? Depends on your particular MP theory. However, whatever the answer, they had better, in concert with the PLD, deduce something very like the facts that the effects describe from exposure to just the PLD. See, not third factors.

      Last point: I too am working on fitting the pieces together. That's what research does. I think I have an overall story, but it won't be to everyone's taste and it won't necessarily be Chomsky's. In an upcoming post I will try to lay it out, as best I can.

    54. @Alex: Empty constituents are a problem also, so to the extent that the DP hypothesis depends on null determiners, that would point towards an NP analysis.
      You can also have DPs with unpronounced NPs in many languages (NP ellipsis instead of English-style one replacement), so the two analyses are on equal footing regarding the need for empty categories. And of course both can do without empty categories if you're willing to blow up the size of your grammar.

    55. @Thomas: yes, that's probably right. I think the notion of head is a bit problematic; if you allow empty constituents it seems vacuous. What's the current view on conjunction -- "cats and dogs"?

      @Norbert: I think we all need to remind ourselves from time to time that we are comparing incomplete and partial theories; and they explain different things. But we can only compare reasonably explicit proposals. What I find difficult is that when you are pressed you start mentioning approaches which seem completely incompatible:
      Yang style parameter setting algorithms in a P & P model
      Berwick Wexler style learning of transformations using deep trees as inputs
      Aspects style evaluation metric
      and now
      UG plus a Bayesian model. But what UG? The UG-plus-Bayes label covers everything from Chater/Perfors style work to something quite different.
      These are all really different and incompatible models, producing different grammars, using different inputs and making different claims.

      I find it quite frustrating that you challenge me to "pony up a story" but when pressed, you start saying the MP is just a conjecture and a program and not a theory.

    56. @ConfusedAcademic -- I really don't think the "You can learn too much" argument is any good. (whereas I think the POS argument is a very good type of argument)

      First of all, it is not the case that these algorithms learn everything. The classes of languages that they learn are not closed under union, and so there are languages where L is learnable but L union the reversal of L is not learnable. So that does potentially provide an answer for why we don't see questions being formed by reversal (to use one example from the literature).
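      [A toy rendering of that contrast, my own illustration rather than anything from the papers under discussion: here is what "questions formed by reversal" would look like as a bare string operation, next to a crude stand-in for auxiliary fronting. The function names and the tiny auxiliary list are invented for the example.]

```python
# Purely illustrative: an unattested "reversal" question rule vs. a crude
# stand-in for the attested auxiliary-fronting rule.

def question_by_reversal(words):
    # the unattested rule: mirror the declarative
    return list(reversed(words))

def question_by_fronting(words, auxiliaries=("is", "can", "will")):
    # a rough stand-in for the attested rule; real fronting is structure-dependent,
    # not "front the first auxiliary", but the contrast suffices for the point here
    for i, w in enumerate(words):
        if w in auxiliaries:
            return [w] + words[:i] + words[i + 1:]
    return list(words)

sentence = "the man is tall".split()
print(question_by_reversal(sentence))  # ['tall', 'is', 'man', 'the']
print(question_by_fronting(sentence))  # ['is', 'the', 'man', 'tall']
```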

      Secondly it is not the case that every language is equal. Some languages are much more complex and harder to learn than others. For example, the algorithm can learn the (string) language {a^n b^n | n > 0} and the string language {a^n b^n | n > 0 and n < 1000000}, but the grammar for the latter is a million times larger than the grammar for the former. Moreover, some languages that have small grammars are nonetheless quite hard to learn (e.g. parity functions over n-bit strings). Similarly, inserting a word at position 6 will in general cause the size of the grammar to blow up.
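      [To make the size contrast concrete, a minimal sketch of my own construction, not the learner in the paper: the unbounded language has a constant-size CFG, while the bounded variant has to count up to the bound, so its rule count grows linearly with it.]

```python
# Rule counts for a CFG generating {a^n b^n | n > 0} versus one generating
# {a^n b^n | 0 < n < N}. My own toy construction.

def unbounded_grammar():
    # S -> a S b | a b
    return [("S", ["a", "S", "b"]), ("S", ["a", "b"])]

def bounded_grammar(N):
    # S_k -> a S_{k-1} b | a b for 2 <= k <= N-1; S_1 -> a b; start symbol S_{N-1}.
    # Stopping at any depth yields a^n b^n for 1 <= n <= N-1, i.e. n < N,
    # but the grammar has to count, so the number of rules grows linearly in N.
    rules = [("S1", ["a", "b"])]
    for k in range(2, N):
        rules.append((f"S{k}", ["a", f"S{k-1}", "b"]))
        rules.append((f"S{k}", ["a", "b"]))
    return rules

print(len(unbounded_grammar()))       # 2
print(len(bounded_grammar(1000000)))  # 1999997 -- roughly a million times larger
```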

      Thirdly there are many many different explanations for why all languages have some property P; basic functional explanations, common descent, communicative efficiency etc etc. and of course just coincidence (as Piantadosi and Gibson argue). And I think many supposed universals are false or an artifact of the unreliable analytical procedures. Theories that explain more are to be preferred to theories that explain less, but I don't think that a statement "P is innate" is an explanation of anything, and I don't think that a failure to explain everything is a reason to think a theory is false. So I reject the inference in general that typological universals should be part of UG.

    57. @Alex C: (Part I)

      a) "I really don't think the "You can learn too much" argument is any good".

      I can recognize that this might not be an important aspect of linguistic data to you, but the surfeit of data is actually a problem. There are a LOT of potential generalizations in the data that are simply not made. One has to think about why that is the case. What separates the ones that are made from the ones that aren't? By just saying that such arguments are simply not as convincing, we are potentially ignoring a good source of information about the set of generalizations that we need a theory for, especially if only some kinds of generalizations are made, while others are systematically ignored. Of course, you can always say that is not interesting/useful/convincing, but I don't find that attitude particularly useful in understanding language.


      b) "The classes of languages that they learn are not closed under union and so there are languages where L is learnable but L union the reversal of L is not learnable. So that does potentially provide an answer for why we don't see questions being formed by reversal."

      Note, this is not very different from the hand-waving you complain that Norbert indulges in wrt acquisition models. Could the facts that you mention above account for such reversals? Perhaps, but equally well they might not. For example, a learner that can learn {a^n b^n | n > 0} could also learn {b^n a^n | n > 0}, where "a" =/= "b". Here, the relationship is formally identical, but substantively different.

      So, it is crucial that one show that the relevant reversals are unlearnable for the proposed (general) learning mechanisms, if you believe that the learning mechanism is the locus of the explanation for it (your answer suggests that you do think this is true for some generalizations, at least). Furthermore, what you mention is at least as vague as the things you have been complaining about. Of course, I respect that you have a difference of opinion, or that you value a different set of facts as crucial to understand, but to not acknowledge your own vagueness while complaining about someone else's is not terribly sincere.

    58. @Alex C: (Part II)

      c) "Thirdly there are many many different explanations for why all languages have some property P; basic functional explanations,
      common descent, communicative efficiency etc etc. and of course just coincidence (as Piantadosi and Gibson argue)."

      Again, to me, this is not an explanation. In fact, it is not at all clear that these different sources of explanation will be mutually compatible. We know at least from the realm of phonology/phonetics that communicative efficiency, production and perception don't always lead to compatible expectations (and in fact, all of these terms are terribly vague). So, again, as you try to argue with respect to Norbert's statement about different ways of incorporating innate knowledge ("These are all really different and incompatible models"), the ones that you mention are not necessarily in sync. In which case, just as you expect clarification from Norbert, I see it as fair to expect clarification from you as to which of these "potential factors" actually accounts for the observed substantive laws/effects…? In the absence of such a clear exposition, I am afraid you are again being as unclear as you complain others are.

      Again, it is not my point to say that to focus purely on formal universals/laws is a bad strategy. Sure go for it. But, to allege that the generative viewpoint is being vague/sloppy while you yourself are being so with (substantive) generalizations that are important to generative linguists seems a very inconsistent position to maintain.

      In fact, I can echo what Norbert (I think) has tried to communicate in the past. You can't say you are playing the game till you take both formal and substantive generalisations/laws/effects seriously. By systematically ignoring one source of information and presenting potential sources of explanations without precision/clarity, you simply cannot claim that general learning mechanisms (with perhaps little innate information) are going to work. The work might as well be about "Neanderthal language".

      I acknowledge I have not presented an argument for strong innateness (more precisely, domain specific), but that was never my intention. And as I said earlier, I can't imagine any sensible researcher wanting to hang on to strong innateness if there are potential (general) cognitive explanations. The real question is, "Are there any that work or come close to working?"


      d) "I don't think that a statement "P is innate" is an explanation of anything".

      Surely, it is not a big chunk of the explanation. But, it could be the beginning of one. Just as "barking is innate" is the beginning of any work on how dogs develop into adult barkers. Furthermore, "P is not innate" is not any more of a full explanation. What we want is a (reasonably explicit/clear) explanation of ALL the relevant facts.


      e) The debate/disagreement clearly is about what the relevant facts are that need a learning-model based explanation. And here a good test bed is the kind of stuff that Alex D mentioned earlier: Does the model/story/... make sense in view of the time-course of acquisition actually observed in children?

      I think a potentially useful exercise given the disagreements on this blog would be to try and clarify (precisely) what would count as a good argument against one's own position. Now, it might be futile to try this for general viewpoints/programmes, but it might be worth it to ask such a question about specific aspects.

    59. My dialectic was very unclear, so I am sorry for any confusion. I was trying to explain what Norbert and I take to be the fundamental problem (language acquisition), and I am definitely not trying to explain typological universals. So explanations of language acquisition should be precise and not vague.
      So I claim that some class of languages L, for concreteness, say the class of well-nested MCFLs of dimension 2 with the finite context property, is a) learnable and b) contains the natural languages. This then locates the explanation of some universals (e.g. mild context-sensitivity) in UG, and the explanation of other universals outside of UG.
      You may argue that an argument against this is that it fails to explain why, e.g. all languages have some other property P. It does fail to explain this fact, if it is a fact. I don't think that is a problem, because there are many other possible "explanations" for this (P is vacuous, P is false, P is an accident, P is a functional property, proto-Earth had P etc etc), which explanations are obviously different depending on what P is (relative clause islands being false, binary branching being vacuous, etc.), and some of which may for some P overlap (function and common descent for example). My claim is also that L, because of its structure, could explain universal properties P even if L contains languages which are not P.
      And I gave a very simple example (this is a comment on a blog!). So I am vague about the things I am not trying to explain and precise about the things I am trying to explain. But Norbert is being vague and contradictory (in my opinion) about what he is trying to explain (language acquisition) and that is what I am "sincerely" complaining about.

    60. @CA: you said "The work might as well be about "Neanderthal language". "
      This is a very interesting remark; I think you intend it as a criticism, but it does highlight a difference in attitude. So I am trying to develop a general theory about how one can learn hierarchically structured languages from strings -- so I think (tentatively) that I would expect any evolved, learned communication system that is based on hierarchically structured sequences of symbols to fall into these classes. So Neanderthals, dolphins, martians etc. Different subclasses of course, but there are general arguments that suggest that these distributional techniques are optimal in a certain technical sense. That is a point of difference, I think, from Chomsky's view, which was (I don't know if it still is) that there would be very human-specific aspects to UG. The recent evolang paper we discussed here contained none of that rhetoric though, so perhaps he has changed his mind on that.

    61. @Alex C: "I think you intend it as a criticism, it does highlight a difference in attitude".

      I actually didn't mean it as a criticism. I was just pointing out that the algorithm could in theory work for "Neanderthal Language", or any hierarchically structured language.

      But, as I followed up, this essentially highlights that there are at least two camps out there (nothing new, here). View A: the learning algorithm has nothing particularly interesting to say about the substantive aspects of the observed generalisations in human language. View B: the learning algorithm does have interesting things to say about the substantive aspects of the observed generalisations in human language.

      In fact, View B is what unifies some of the "really different and incompatible models" that Norbert mentions. At least some of them, P&P and UG + Bayes, try to grapple with the substantive aspects of the generalisations.

      You appear to have View A: "So Neanderthals, dolphins, martians etc."

      But this is not about the actual acquisition of human language then, and it is important to appreciate that, in my opinion. For example, different song-bird species have different song repertoires, and appear to have different learning abilities. Perhaps they are all finite state or even more constrained than that, and one could come up with a general learning mechanism for FS languages in general, or for that part of the subregular hierarchy, if the patterns are in fact more constrained. But, this doesn't actually address the issue at hand, which is: how/why does each of those species develop/acquire its particular songs? Sure, a general purpose mechanism might help guide our view of the acquisition process. But, it in no way (at least to me) directly addresses the issue of prior/innate knowledge, or of the specifics involved in any one species.

      Could it be that a general learning algorithm is all that a human being needs (or, for that matter, different species of songbirds)? Sure, but that has to be shown. The algorithm will have to make sense wrt the question that Alex D raised earlier (repeat):

      Does the model/story/... make sense in view of the time-course of acquisition actually observed in children?

      Do children's experiences contain the relevant distributional evidence necessary at/by the time children seem to know the relevant generalizations?

      And what I added: Does the generalization IGNORE the relevant patterns observable in the input?

      I think I am beginning to repeat myself, so I should probably stop and let others continue the discussion. I hope you won't be offended by that, if you do respond.

      Note: I ignored your comment where you replied to the charge of vagueness, because it appears you misunderstood what I meant. I was referring to your vagueness related to the substantive aspects of linguistic generalisations.

    62. @CA:
      I may be misunderstanding what you mean by "learning algorithm" but this strikes me as somewhat odd:
      "In fact, View B [= the learning algorithm does have interesting things to say about the substantive aspects of the observed generalisations in human language] is what unifies some of the "really different and incompatible models" that Norbert mentions. At least some of them, P&P and UG + Bayes, try to grapple with the substantive aspects of the generalisations."

      To the contrary, your usual Bayesian approach is rather agnostic about learning algorithms -- what is of interest is what the posterior induced by the specific biases one puts into a model (through specifying prior and likelihood) and the input looks like.
      To derive this, you will have to make use of some algorithm or other (except for extremely simple models for which you can just do the math) but the algorithm is usually not of any particular interest. The focus is really on what certain biases plus data imply, not on how a specific algorithm performs.
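      [A stripped-down illustration of that point, my own toy construction rather than any particular published model: fix a prior over candidate grammars and a per-grammar likelihood for sentences, and the posterior follows by Bayes' rule; whichever algorithm computes or approximates it is a separate choice. The grammar names and probabilities below are invented.]

```python
# Posterior over candidate grammars from a prior and a likelihood, by brute force.

def posterior(prior, likelihood, data):
    # prior: dict grammar -> P(G); likelihood: dict grammar -> (sentence -> P(s | G))
    scores = {}
    for g, p in prior.items():
        for s in data:
            p *= likelihood[g](s)
        scores[g] = p
    z = sum(scores.values())
    return {g: v / z for g, v in scores.items()}

# two toy "grammars" that differ only in how much probability they give short sentences
like = {"G1": lambda s: 0.9 if len(s) < 4 else 0.1, "G2": lambda s: 0.5}
print(posterior({"G1": 0.5, "G2": 0.5}, like, data=[["a", "b"], ["a", "b", "c"]]))
# -> G1 gets roughly 0.76 of the posterior mass on this input
```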

      Similarly, I think the interesting bit about Yang's variational learner (which I assume you mean when you mention P&P?) is not the "algorithm", which is, in a sense, just error-driven learning, but his setting everything up in terms of multiple grammars.
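      [For concreteness, a rough sketch of that multiple-grammars idea; this is my simplification, not a faithful reimplementation of Yang's model, and the `parses` predicate is a stand-in for whatever parser one assumes. It presupposes at least two competing grammars.]

```python
import random

# Keep a probability for each competing grammar, sample one per input sentence,
# reward it if it parses the sentence and punish it otherwise
# (linear reward-penalty updates).

def variational_learn(grammars, data, parses, gamma=0.01):
    p = {g: 1.0 / len(grammars) for g in grammars}
    for s in data:
        g = random.choices(list(p), weights=list(p.values()))[0]
        n = len(p)
        if parses(g, s):
            for h in p:  # reward the sampled grammar, shrink the rest
                p[h] = p[h] + gamma * (1 - p[h]) if h == g else (1 - gamma) * p[h]
        else:
            for h in p:  # punish the sampled grammar, redistribute to the competitors
                p[h] = (1 - gamma) * p[h] if h == g else gamma / (n - 1) + (1 - gamma) * p[h]
    return p
```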

      In any case, for neither "Bayes+UG" nor Yang-style P&P is it the algorithm that has anything substantive to say about the kinds of generalizations observed in human languages.

    63. @Alex C:

      I fear that you are becoming attracted to the dark side once again. Here's your quote:

      You may argue that an argument against this is that it fails to explain why, e.g. all languages have some other property P. It does fail to explain this fact, IF IT IS A FACT (my emph NH). I don't think that is a problem, because there are many other possible "explanations" for this (P is vacuous, P IS FALSE (my emph), P is an accident, P is a functional property, proto-Earth had P etc etc) which explanations are obviously different depending on what P is. (RELATIVE CLAUSE ISLANDS BEING FALSE (MY EMPH) binary branching being vacuous, etc.) and some of which may for some P overlap (function and common descent for example). My claim is also that L because of its structure could explain universal properties P even if L contains languages which are not P.

      Here's my problem. Some time back I thought that we agreed that the list of effects that GG has discovered is more or less accurate. Are you reneging on this? The quote above suggests you are. If you are, I am happy to go into my climate science denier routine. The rough accuracy of these discoveries I do not take to be up for grabs. If you do, let everyone know this so that we can all stop wasting our time.

      Given that the effects are not/should not be controversial, what is controversial is how to explain them. Why do we find that Gs respect these effects (again, I take this to be a simple descriptive fact that GG has established as more or less accurate)? One answer is that FLs are built to do so. Another answer is that they follow from principles of communication, common descent or whatever. Fine. Show us the money. My story is that they are part of FL and that's WHY all Gs appear to respect them. You don't like this story. Fine. Provide another. But provide one for THESE effects. You asked me to give you some and I did. The game is to explain THEM, or at least my game is. You can play whatever game you want. But what you have no right to do is pretend that you have an account of these phenomena or, even worse, pretend that they do not exist. The latter is climate science denial, for it rejects the work of the past 60 years of GG. So, can we agree that ONE project is to try and explain why THESE effects exist in G after G after G? And can we agree that concluding that it is because these IN SOME FORM are innate is ONE possible explanation of these facts? Inquiring minds want to know, Alex. Here are the two questions and you can answer yes/no:
      1. the effects discovered by GG are legit targets of explanation
      2. One possible account is that they or something that leads to them is innate

      Your move Alex. Come clean.

    64. @Norbert: The contentious issue for Alex, methinks, is the universality of the observed phenomena. For instance, it looks like relative clause extraction is possible in mainland Scandinavian languages. Your solution is probably to posit that the island constraint still holds and we're not dealing with movement. Alex's solution is to say there is no RC island constraint in these languages.

      The latter has some appeal in that a formalism like MGs can enforce island constraints but doesn't have to, so variation is expected. In order to make this an interesting perspective, however, one would have to show that the distribution of RC-island vs. RC-non-island across languages is related to some aspect of MGs (similar to recent attempts to relate the frequency of natural vs. unnatural morphological paradigms to feature decompositions).

    65. Addendum: The distribution could also be due to the learner, or some aspects of diachronic change. The important point is: if you assume that RCs are not universal islands, you should have a model that correctly predicts that they are islands in the majority of languages we know. Because treating it as a coincidence is never satisfying.

    66. So, once again, I agree that
      a) relative clauses in English are islands to movement
      is a true descriptive generalisation.
      However I think that
      b) relative clauses in mainland Scandinavian are islands to movement
      is false, as a descriptive generalisation, though one may be able to rescue it by some theoretical move (e.g. redefining what counts as a relative clause).
      So I don't think that RCI is correct as a typological universal.

      I gave some arguments in my earlier comment about why I am not that interested in explaining typological universals. If they are not clear, I am happy to amplify them.

    67. Crossed with Thomas -- exactly -- I happened to be at the conference in honour of Elisabet Engdahl and one of the papers was on this topic, and I also had the good fortune to speak to an expert when I was giving a talk in Cambridge last month. There is a very recent paper on this (Christensen, Ken Ramshøj & Nyvad, Anne Mette. 2014. On the nature of escapable relative islands. Nordic Journal of Linguistics 37(1), 29–45.)

    68. @CA, Alex D: I never understood the sentiment that the learning algorithm should replicate the learning trajectory of children. If a psycholinguist told a syntactician that their analysis makes the wrong processing predictions, the syntactician probably wouldn't doubt this analysis but rather the assumptions the psycholinguist makes about memory usage, serial vs. parallel parsing, etc.

      The same applies in the case of learnability. Alex's work provides a general learning algorithm, but there's many different ways of implementing this algorithm. If you wanted to code it up, you would have to make many decisions about data structures, control structures, etc. that are irrelevant for the general mechanism of the algorithm, but will have an effect on how your program behaves.

      And keep in mind that the cognitive architecture of children undergoes many changes during the acquisition process, and we have no idea about the computational implications of this --- mostly because CS only has to deal with quantitative changes while running a program (e.g. dynamically expandable clusters of computers) but not qualitative ones (switching a machine from an x86-CPU to ARM without interrupting the OS, which is simply impossible with current architectures), so nobody has studied anything like this.

      In sum, the relation between a learning algorithm and its cognitive execution is very tentative, there's a huge number of variables, and we don't even know what exactly they are.

      Sure, you can make assumptions and if you get something that mirrors reality, great! But as long as we have no principled reason to deem those assumptions plausible, that's just a nice bonus. It is not an essential requirement for a learning algorithm. In particular, a learning algorithm that mirrors the learning trajectory in some experiments you run but hasn't been proven complete and sound is worse than one that deviates from the learning trajectory but is complete and sound.

    69. It's not true that mainland Scandinavian fails to obey RC islands. Rather, it seems not to obey SOME RC islands. Curiously, as has been noted, these islands in the same places in English (which does obey them) are better than the cases that are also bad in Scandinavian. How to describe these is a very interesting project. I personally believe that something along the lines that Dave Kush has pursued is the right way to go (see his co-authored paper in the Sprouse and Hornstein volume). So, I don't buy the description.

      Second, I don't buy this, in part, because if it is accurate it would simply be unlearnable. There is no way in hell that the relevant PLD is available. I also don't buy this, because we find the very same contrasts in English.

      But let me end: is it Alex's view that ALL the stuff I cited at the outset is false? That there have been no discovered generalizations? That it's an accident that in language after language most of the islands fall into the same group, that binding works the same way, that ECP effects are as portrayed etc.? IF that is the view, then, as I have said again and again, it is akin to global warming denial. There too there are anomalies. There too things are puzzling. There too there are skeptics. And I intend to treat skepticism in my domain rather like I believe these ought to be treated.

      So let me put this bluntly again: you buy that the generalizations we have found are more or less accurate or you cannot play the GG game. Reject these in toto and there is nothing to discuss. If Alex wants to go there, let him. But, and here I am addressing the GG community: nothing he has to say is worth worrying about for no discussion can possibly be fruitful. It's like someone denying that there is recursion because of Piraha or constituency because of Warlpiri.

      I have repeatedly said that Alex's skepticism is extreme. Your reconstruction of his views (and his endorsement thereof) tells me that I was right (again).

    70. @Thomas. I didn't say that proposed learning algorithms ought necessarily to replicate the acquisition trajectory of children. I'm not sure which part of my comment you're getting that from. The question was just whether any of Alex C's work does anything to undermine POS arguments for the innateness of the ECP etc. The hurdle to overcome here is a much more basic one than replicating the trajectory of kids' acquisition of ECP effects. It's to explain how it could even in principle be possible for kids to learn the ECP on the basis of the data available to them. I'm open to the possibility that kids do quite a bit of grammar learning, but I don't yet see any work on learning algorithms which significantly reduces the force of POS arguments.

    71. @benjamin.boerschinger: You are right. I was being sloppy (or just plain ignorant. Maybe, even "confused". :)).

      I meant "model", not "algorithm". But, me and my alter ego will standby the rest of the statements.

    72. @Norbert: I know you promised not to talk to me again but maybe you can explain to your GG community how you can get from Alex' statement above to:

      "If Alex wants to go there, let him. But, and here I am addressing the GG community: nothing he has to say is worth worrying about for no discussion can possibly be fruitful. It's like someone denying that there is recursion because of Piraha or constituency because of Walpiri."

      How could anything Alex said be on the same level of confusion as someone denying there is recursion because of Piraha? So why brand Alex as such a lunatic? Why are you so defensive? Charles Yang writes in the next blog post [which has been mostly ignored] that "any scientific hypothesis, is guilty until proven innocent". http://facultyoflanguage.blogspot.de/2014/05/how-to-solve-mysteries-reaction-to-our.html

      I am unaware that MP [or hypotheses proposed under the MP umbrella] have been proven innocent yet. Are you suggesting GG proposals are not scientific hypotheses? Or will you be going after Charles next for reminding us that, when it comes to certainty that our hypotheses are right, we're all in the same boat? Unless you have some unique divine revelations that give you absolute certainty about the proposal Alex has expressed SOME scepticism about, it is rather irresponsible to tell your community: "nothing he has to say is worth worrying about for no discussion can possibly be fruitful"

    73. @Alex D: Apologies for the incorrect attribution. It's a big thread, so it's easy to lose track of who said what. I was mostly responding to CA's remark that a good test bed is the kind of stuff that Alex D mentioned earlier: Does the model/story/... make sense in view of the time-course of acquisition actually observed in children? And looking at it again, I probably misconstrued the point of that remark, too.

    74. So we have probably reached the point where the insight we can get by continuing is not worth the typing time, but (ignoring for the moment N's distortions of my views) we are arguing about several different things:
      So take RCIs.
      There are three different questions that I am interested in to varying degrees, that are getting squished together.

      L1A) How does the child learning English/Warlpiri... come to acquire a grammar that correctly generates the facts that are described by the expression "RCs are islands in English/Warlpiri"?

      TYPO) Why is it that so many (maybe all, in fact) languages are described by the expression "RCs are islands"?

      METHOD) When we construct a theory of L1A, should we start from typological generalisations like RCI, assuming that they are "innate"? I.e., should we assume that RCI is in UG?


      So (L1A) is the fundamental theoretical question of linguistics, and we just aren't going to be able to sort it out in the comments of a blog.


      But I think we could have a useful discussion about the methodological point, which is where I do sharply disagree with Norbert. Without, as I have explained several times, denying the truth of many/nearly all of the descriptive generalisations he listed.

      I think the final methodological point is a terrible and incredibly counterproductive idea.
      I sort of understand, when I am being charitable, why people thought this was a good idea 40 years ago, since there were no alternatives, and people knew nothing about genetics, or learning theory, or the theory of computation.

      And I feel the force of the POS arguments raised by Alex D, though as I wrote a book on the topic with Shalom Lappin, I am not going to try to recap the reasons why I think these arguments fail.

      But to lapse into Norbertian hyperbole, I think assuming the RCI is innate is the worst possible starting assumption. Once you make that assumption I think it is impossible to make progress. And I think that is why nearly 50 years after Aspects, there are no good answers to (L1A) on the table.

    75. @Alex:
      As Alex has ceded the last word, I will take it. Here's the lay of the land. We have some very good established more or less true generalizations (including RCIs). The aim is first, to explain them (that's what subjacency and the ECP were trying to do) then to explain how native speaker Gs respect them (that's where POS comes in and the assumptions that large parts of subjacency and ECP are either themselves innate or derive from something that is). Then, to explain how they could have arisen (Darwin's problem and MP). These efforts go on simultaneously. Where Alex and I disagree is that though he claims to buy the idea that RCIs (and my other GG facts) are well grounded generalizations (given some Scandinavian problems) his stories never actually try to explain why Gs respect them. He claims to buy the facts, but he refuses to address the consequence of their being facts. Moreover, when someone tries to address these, he objects if they involve assumptions he doesn't like (e.g. the theory of subjacency is innate). What does he do? He changes the topic. I have nothing against his topics (though they don't really interest me much personally) but I do have a lot against his little shuffle: Yes I buy the data but no I don't have to do anything that addresses it but I'm still allowed to be oh so skeptical about any approach that does. This habit of Alex's is deeply counterproductive, not to mention annoying. In fact, it suggests that he doesn't really give a damn about the facts he claims to accept for were he to care he would concede that at this moment he has nothing to say about them. His methods may one day address them, but right now nothing. That would be a refreshing pronouncement. It would even be true. But the yes I buy it, but I won't address it stance is simply squid ink obscuring the empirical landscape.

      This will be my last word on this thread as well.

  6. I am following this thread with quite a bit of eye-rolling and of course relentless apparent reasonableness. Typing this on my tablet at the airport just prior to a week of travel and conferencing, so it's unlikely I'll be able to keep up.

    Quickly then, some brief points to chew on:

    1. I believe that the history of the language sciences includes the past six decades that you keep mentioning, but started way before it — and that our understanding of language today is grounded in discoveries by von Humboldt, von der Gabelentz, Peirce, Wundt, De Saussure, Boas, Malinowski, Sapir, Zipf, Bloomfield, Harris, Haas, Hockett, Chomsky, Miller, Vygotsky, Jakobson, Lenneberg, Bolinger, Hale, Dixon, Halliday, Levinson, Jackendoff, Lakoff, and many other luminaries (I'm trying not to include too many living linguists because true impact tends to come with a time lag).*

    2. I believe that the "sides" and the "current intellectual atmosphere" you keep mentioning are ultimately sociological constructs that are created in large part by our own discourse. That is why your blog piqued my interest a while ago and why I decided to comment on this particular sociological issue (without having the intent or the time to single-handedly defend or define whatever sides you think are important or take me to belong to). Yes, sides exists: we are creating them as we speak. But we can also see their detrimental effects (I pointed out one) and we can make a conscious effort to avoid those effects.

    3. I believe that we all have intellectual blind spots, and that taking sides and adhering to 'doctrine' (your word) does little to help us get rid of them, whereas being heterodox and always sceptical of 'old mantras and ingrained beliefs' (the phrase was inclusive and self-reflective, don't worry) may do more to that effect. Like many nowadays, I refuse to be drafted into some army not of my own making and I prefer to advance our understanding of human language by working side by side. (Though not quite hand in hand with you, by the looks of it.)

    4. I believe that the situation is not quite so dire as you make it out to be, and the pervasive taking of sides obscures this. To give an example from this very thread, you are treating me as if I don't know anything about GG, whereas in fact I was taught syntax by the proud wielder of an MIT PhD (and semantics by someone with an ANU PhD, the other side for you I guess). Speaking of sides, here on the other side of the pond things look different and not quite so polarised.

    And now I'm off to board my plane.

    * Of course all of these people have also produced some rubbish ideas, but I take that as a given: scientists should have bullshit detectors aimed not at the level of schools or scholars, but at the level of ideas.

    1. @ Mark:
      First, I have no idea what you do or don't know. I know what should be beyond dispute. We have established lots of "facts on the ground." They are at different levels of specificity, ranging from language particular analyses to generalizations about kinds of rules to even some ideas as to what is language particular and what not. The last is most contentious, as it should be, for it is the most novel and currently the topic of a lot of basic research, work which is always contentious. Some of it is not, or should not be, contentious: again: that Islands affect movement rules, that construal rules take place under narrow licensing conditions which we have described, that phrases are endocentric, that rules are structure dependent etc. These are not theoretical claims, but simple observations in the same sense that the gas laws are not theoretical statements but simple observations. So, if you don't buy these (or at least most of them) then you have chosen sides no matter how reasonable you try to sound. As I noted, you strike me as someone whose sophisticated skepticism puts you on the "other" side. If you are, then we likely have little to say to one another. Just like a climate warming skeptic and a climate scientist have little to say to one another. Skepticism is cheap, real cheap, until it is backed up with argument, which, if you don't mind my noting, you do not provide. Remember my offer of a post or two. It still stands. I would love to see you make your case, o reasonable one!

      I always admire those who confuse substantive issues for sociological ones. So meta. Keeps your hands from touching the meat huh? These are not sociological disputes any more than flat earthers and climate science denial and creationism are "sociological" matters. They concern a dispute over, for lack of a better term, what the facts are. There is always room for skepticism about one or another. But this is not what we see. There is wholesale rejection and this kind of global skepticism has never been defended. It sounds nice to talk as you do, all sweet reasonableness, but IMO this is prissy BS. The sociology is not the problem. And that's why you need to decide.

      Last point: I WANT polarization. Bipartisan consensus is deadly when there is this kind of real divide. Ecumenism simply allows real differences to go unexplored. It's odd how everyone is enjoined to play nicely together. Criticism is as much a part of science as skepticism is. But forget the high-minded blather. Let's get down to details. Which of the results do you challenge? Why? Give it your best shot so we can clear this up. Stop hiding behind sweet reasonableness.

    2. For Alex' psychological reality issue, I think that the existence of generalizations that lack alternative plausible explanations is the key - in English, Greek & German for example we find the same structural possibilities for 'traditional NPs' in most positions where the traditional NP ingredients occur (some restrictions for prenominal possessors, and a tendency for less complexity in properly embedded ones), and identical for all phi-feature combinations, but different between the languages, even though they were once the same for all of them due to their descent from a common ancestor. It is hard to imagine what the explanation for something like this could be, other than changes in some internal arrangement corresponding to NP structure, that is used in multiple places in the sentence, and has a lot of autonomy from whatever is responsible for the phenomena we describe using phi-features, whose details also differ between the languages (this is basically Martin Davies' '5%' argument, using historical change rather than imaginary surgeries to make the same point).

      Where we get into trouble, I think, is mainly for the following mathematical reason: we don't know how to distinguish in our generative theories between the stuff that captures the generalizations that clearly need an explanation in terms of internal structures (rather than historical change, interactions with the environment, etc), and the (unfortunately rather large amount of) stuff that we have to make up to get our theories to be generative.

      So, as alluded to by Ewan, the literature and the textbooks are a jumble of very solid stuff that you could bet your house on, total fabrications, and everything in between (e.g. the details of VP structure), with no clear methodology for teasing them apart.

    3. @Norbert

      When you say "I want polarization" it seems you are mixing up the concomitant confidence with bravado. Obviously it would be crazy to think that linguists have a special monopoly on bravado, but digging in won't make it any better. The Fortune Cookie Principle (so called because I found it once in a fortune cookie) -

      "Strong and bitter words indicate a weak cause"

      Replies will remain in the realm of pointless ideological polarization and out of the realm of useful controversy as long as anyone detects that your cause is weak - because then they need not reply with anything better.

    4. @Ewan

      It all depends what you think the problem is. To you, the problem has been the inarticulateness of linguists. They have just not been able to make their arguments. They have overpromised, become defensive, spewed ink, muddied the waters etc. To me the problem is that some already know what the answer is. This is big in cog-neuro, where it appears that the leading practitioners KNOW, just KNOW, that the brain is a connectionist device and that the kinds of things linguists and cognitivists postulate just cannot be there. The same goes for some members of the CS community (I won't name names).

      If the problem is the first, then reform of linguistic practice is the solution. If the second, it will hamper matters to play nicely. How so? Well, first it will confuse linguists. A good deal of the problem with eschewing polarization in the second scenario is that it leads you to forget why you are doing what you are in fact doing. This has a terrible effect on one's own research. So, polarization is important for the home team. It is also important because it sharpens the points of disagreement and makes it possible to attack these in public fora. The problem with the mainstream opinion right now is that the wrong side has grabbed the mantle of being "scientific." Their positions are tendentious and they refuse to debate the actual issues, or the issues as I see them. I want to provoke them into open debate. Instead we hear plangent utterings about inference to the best explanation, methodological problems, the role of mathematics in gaining knowledge etc. I want the discussion to be about the ECP, island effects, binding theory etc. I want to make it clear that given every opportunity to make their case, the other side fails to even engage! What's their view? That we don't know anything unless we know everything. We don't have reason to think that island effects are real until the theory is perfect, answers every question and has defeated every problem. That's a position that sounds reasonable if said calmly, but it is complete and utter junk.

      Do I expect to convert the heathens? No. Those engaged in the debates you see here are not potential converts. They have their views. The aim is to make the case as strongly, provocatively and amusingly as I can to convert the up and comers. Here bright lines matter and really help clarify matters.

      Let me raise your cookie with a Keynes: "Words ought to be a little wild, for they are the assaults of thoughts on the unthinking."

      The problem, my friend, is not the delivery. It is a real dispute and all the kind words won't resolve it. Indeed, IMO, it will only give the game away. Reasonable words for reasonable positions. The opposition here is not reasonable, so we shouldn't be either. Polarization!

    5. This comment has been removed by the author.

  7. Just briefly peeking in and I want to say that I really like this thread and that the discussion so far has been illuminating in many ways. For what it's worth, I agree with Ewan that the clarion of GG (nice term) is its attraction but that overselling may be its problem; with Alex that explanatory adequacy is desirable and methodological pluralism necessary; and with Norbert that getting these disagreements out in the open is ultimately beneficial. I am also thankful for the discussion of some of the fundamental insights from GG. I second the Haegeman book (Larson is new to me).

    I understand Norbert's position better now, but I confess to a bit of puzzlement about the ease of his acts of classification, such that even a 'good Chomskyan' (AC's own words) is seen as squarely in the other camp just because of some reservations about methodology and calls for integrating multiple methods. If pushed (and I have been rather aggressively pushed) I would probably say things similar to Alex, though less from a mathematical point of view and more from a psycholinguistic one.

    Norbert has this funny way of amplifying what people say — but the effect is not that we hear more clearly, the effect is that everything is distorted. Reservations are not disownments, pluralism is not mindless ecumenism, and scepticism is not nihilism. The distortion is really not helping. In fact I think it is fair to say that the aggressive tone is sometimes amusing, but too often borders on the offensive. Take for instance the gratuitous comparisons to climate change deniers and creationists — really Norbert? Although I can see past that in this thread, I do think Ewan may be right that it invites implicatures in line with the Fortune Cookie principle.

    Finally, a question for Norbert: am I right in sensing a bit of ahistoricism in your view of "our field" and "its methodology"?

    1. @Mark: Here's a reply in two parts.

    1. Let me clarify. Alex is not a good Chomskyan, regardless of his interest in explanatory adequacy. As I've been trying to make clear, unsuccessfully I see, after 60 years modern GG has a body of results. Being a reasonable participant in linguistic discussions, including concern with explanatory adequacy, requires that you adopt a bunch/chunk of these as reasonable working hypotheses (i.e. fixed points for going forward, things to explain and build on). The wholesale skepticism that Alex evinces, and here's the quote again so that you can see I am not putting words into his mouth, indicates that he does not believe that GG has shown anything, and explains why (between **s).

      **Alex Clark, May 14, 2014 at 1:55 AM:
      ...But it brings up my concern about methodology; why should we think that the standard methods of syntactic analysis will bring us to the (or a) psychologically real grammar for a language? I don't think there are any good arguments for the methodology, and so I don't necessarily believe the results produced by people using the methodology (quite apart from concerns about data, precision of theories etc.)...**

      I know that you and Ewan don't want to believe this. After all, why believe it just because he said it? But that's his point of view. Can we agree that WERE someone to adopt this stance, this is not something to compromise on? Can we agree that this kind of wholesale skepticism puts you at odds with the Chomsky vision and, dare I say it, the kind of inquiry you, Ewan and I want to engage in? I am happy to let Alex climb down from this quote and explain what he does take as fixed. But till he does, this is what he said, and I am willing to take him at his word, unlike the two of you.

      Now, the extreme rhetoric. It is not extreme if I have been accurate in my description of Alex's views. Look, this is not a call for "integrating multiple methods," or "some reservations about methodology," or a call for "methodological pluralism." This is a rejection of what GG has found, and that is not a plea for tolerance but a call to reject the work of the last 60 years and consign it, with a few methodological qualms, to the garbage heap. I am not going there. I know what we have found. I've listed examples galore. I have not seen a single critique suggesting that these findings are bogus. But I am supposed to be nice and tolerant of views that suggest just this? Why? To sound reasonable? To keep debate about ludicrous positions open? Why would anyone think that kind of open-mindedness laudable?

    2. @ Mark:

      2. I have spent a good deal of this blog discussing methodological issues. I have highlighted work using different platforms, much of which dovetails with results discovered by grammarians (e.g. no filled gap effects into islands, online c-command restrictions, principle C in young kids, hierarchy sensitivity in fMRI etc.). Everyone (including me) wants more data, of different kinds, all pointing in the same direction. I have discussed this. I have even pointed to problems with some current theoretical and empirical discussion. I have spent tons of time outlining particular arguments using concrete examples and available research. I am happy to be critical (and have been). What I won't concede is that GG has accomplished nothing (please read that quote!) and that it is ok to concede this as if it were some kind of scientifically legit position. It's not. And the pretense that this an acceptable stance to take if one dresses one's reservations in "methodological qualms" is, IMO, disingenuous at best and…well I won't say what I think it is at worst.

      So let me repeat: multiply platforms, criticize research, use all the methods you want. That's all inside baseball as far as I am concerned. What's beyond the pale is cheap and lazy pretense to open mindedness when all that is being promoted is a universal skepticism. Put your cards on the table: what do you take as fixed and what do you reject. What are reasonable fixed points going forward. If, like me, you think that something roughly along the lines of GBish syntax is more or less correct and can be taken as such, let's keep talking. If not, we will have nothing to say to one another, which is fine, but let's not pretend otherwise.

      As for the ahistoricism: I'm not sure what you mean. IMO there is a coherence to how GG has proceeded from LSLT through to MP. IMO, the amount of actual change, even theoretical, has been modest rather than extreme. MP is the most radical proposal, but the work over the last 60 years has been, by and large, very conservative with new theory building pretty well on the results of earlier insight. The methodology, such as it is, is pretty standard stuff as far as I can see. There has always been work on GG themes using various methods: computational, psycho, grammatical, comparative, and, recently, some neuro. Some of the work is better and more insightful, some less so. But this is all pretty normal science stuff as far as I can see. So, I would not say that there is ahistoricism. I agree that this is a big tent view, abstracting from the intense dog fights that linguists love. But I would be happy to defend it. In fact, I am on record as believing that most of the "framework" issues often pointed to as indicating divergent conceptions of grammar are severely overwrought and that most of what we consider different frameworks are little more than notational variants, at least over large parts of the relevant data. So, I guess I am willing to stand by my overall characterization of "our field" (GG) and "its methodology."

    3. @ Mark:

    3. Let me add one more point, for I fear it will be misinterpreted. The results of many different people using very different frameworks have been integrated into the consensus view that I think is roughly right. Bresnan, Postal, Perlmutter, Ross, Burzio, Baker, Rizzi, Cinque, Steedman, Benveniste, Vergnaud, Lasnik, Reinhart, Borer, Levin, Sag, etc. etc. etc. There are debates, but the stuff is more or less methodologically continuous, and, IMO, despite apparent framework differences, really pretty much the same stuff. So, though I personally clock GG time using Chomsky coordinates, I don't think that the interesting work was all his. Indeed, many of the "laws of grammar" I take as fixed were not due to him, but to others.

      Note too that I am not insisting that you buy the Chomsky "theory," though I clearly do, more or less. Here I can be ecumenical. What I insist on is that you buy the "laws" and generalizations we have found, describe them as you will, with whatever framework you want. Here my views echo those of Steedman quoted by Matthew Gotham above. There exists a very wide consensus and that's what I take the body of doctrine, the results of 60 years, to consist in. Wholesale skepticism of THESE results is inadmissible, IMO. That's what I take to be akin to climate science denial (and yes, them's fighting words, and yes, I mean them to be).

    4. @Mark:
      Like you I am traveling, so just briefly a comment on Norbert's "What I insist on is that you buy the "laws" and generalizations we have found, describe them as you will, with whatever framework you want." - a while back Paul Postal asked on this blog what the laws are that had been found by GGers. It appeared from Norbert's answers that there really are none...

    5. @Norbert, with regard to the ahistoricism, what I wanted to find out was whether you think linguistics is about more than the window of six decades that you focus on here.

      So for instance Bloomfield's constituent structure, Malinowski's observations about the social functions of language, Sapir's work on language description, von der Gabelentz's observations about language as a structured system, Peirce's pragmatist semiotics, De Saussure's langue/parole distinction, Harris' discovery procedures and his transformational grammar, Zipf's observations on language as a toolbox, Wundt's cutting edge views on spoken and gestural forms of language, Greenberg's research program into language universals — all of these produced foundational insights that I think deserve pride of place as discoveries of linguistics. Yet your horizon here seems strictly limited to the past six decades, and in particular to generative grammar. As if there was nothing before it, and as if there could be nothing besides it. That puzzles me.

    6. Syntax barely existed before GG. So for me that's when things began. I do think Chomsky defined a new question, one that enthralled me. So, nope, I don't think much of what came before. I don't much care about language. I care about FL and its structure. That's what GG studies. That's what I find interesting. So yes, past 6 decades it is.

    7. I should add that I have nothing against others studying this. But modern GG of the Chomsky stripe, my kind of linguistics, started in the mid 1950s. Linguistics as the study of a mental faculty, not language. This started with Chomsky, so far as I can tell. That's why the blog is called the faculty of language.

    8. @Christina I think the thesis that syntax is mildly context sensitive would be an excellent candidate for a very general and far-reaching law - LFG and HPSG are currently not set up in such a way as to yield only languages in that class, but I doubt very much that their extra power is being used in any legitimate way, so some way to trim it off will hopefully become apparent.

      Beyond that there are large numbers of more limited observations that somebody sufficiently knowledgeable and clever might be able to find good explanations for. So for example 'agreement' (head-marking, cross-referencing, displaying features of a subconstituent of something that you are a head of) never seems to go down past a lexical head of an argument (no agreement with a possessor subconstituent of a subject NP; if it looks like that is happening, something else is going on), whereas 'concord', showing features of something you're contained in, does, and can manifest multiple levels at once (Kayardild, Lardil, etc).

      Ash Asudeh's observation that 'resumptive pronouns' are always identical in form to ordinary anaphoric pronouns would be another; a different way of putting this would be that if the relativized NP in an adnominal RC is specially marked, it must move (original observation by Ed Keenan; note that for correlative clauses as found in South Asian languages, the NP_rel's, which have a special 'j-' determiner, don't have to move).

      Lack of mathematical insight into the nature of the differences between the frameworks, especially in terms of how they capture generalizations and what their implications are for learnability, would be my candidate for the major difficulty in formulating laws.

    9. Avery, apologies that I did not see your comment way down-under here earlier. First, note that me saying [for any X discovered by generativists] "X is not a law" does not imply me thinking "X is worthless crap". My comment was merely meant to draw Mark's attention to a discussion Norbert and Paul had earlier on the topic. Further, when Norbert provides a list that is meant to inform those who do not already know about the successes of GG it could not hurt to provide a bit more detail. Again, I am in no way implying Norbert could not provide such detail, I merely quoted someone saying he did not.

      We agree on the difference between discovering laws and formulating laws in mathematically precise ways. Now my worry here concerns two things:

      [1] If, at this time, we have no precise descriptions why call discoveries 'laws'? I think GG is pretty unique re the number of laws that allegedly have been discovered. [For this reason I find the derogatory comments by Confused Academic quite puzzling: why would he think MORE original discoveries are needed - Norbert is certainly right to say the program has been very fecund.]

      [2] In his early career Chomsky attempted to provide a scientific approach to doing linguistic research and he set out to give an account for [linguistic] creativity and language acquisition. I think, largely [but not exclusively] due to his efforts linguistics is a science today. By now Chomsky has abandoned the idea that we can account for creativity [any time soon], but he and those working in his framework still claim that children need some domain specific innate endowment to overcome POS. And this is where the problem of 'exceptions to generalizations' arises: IF all human kids have some kind of innate machinery that provides the information they could not get from the input then it needs to be applicable to all cases [otherwise how could kids know the exceptions]. So there is a tension here that needs to be addressed to overcome challenges like say those posed by Ben Ambridge et al. here: http://ling.auf.net/lingbuzz/001936

      I realize of course that for many working linguists the problem of language acquisition is of little interest. So from their perspective it matters little if a generalization they propose has some exceptions, as long as the exceptions are noted and an explanation for them is offered (people like Confused Academic seem to miss that linguists can be engaged in very different projects). So I do not think you and I disagree on anything of importance.

    10. @Avery: I said no such things (a-b), nor did I insinuate anything to that effect. Christina continues her wonderful tour of the world of make-believe. Perhaps she just needed to insert some more baseless claims because she is cool like that.

      a) "why would he think MORE original discoveries are needed"

      b) "people like Confused Academic seem to miss that linguists can be engaged in very different projects"

      I am perfectly happy for people to derive these generalizations through domain-general (learning) mechanisms (like Norbert, if I understand him correctly). Who wants more innate info if we can get away with less!! Really! Sadly, these generalizations are not what people try to account for with such mechanisms, with rare exceptions, of course. People instead talk about sentences being beads on a string and other such ludicrosities!

    11. @CA: you understood Norbert correctly.

    12. @Christina: Thanks to Bayes, I don't understand why anybody thinks there have to be many, or indeed any, exceptionless 'laws'; `all' that's needed is a ranking of possible grammars such that the learner stops learning (or, probably better, slows down its learning rate in some asymptotic manner) when its grammar fits the data 'well enough'. 'All' in quotes, because finding the right ranking is clearly not trivial. My proposed 'laws' were chosen to be less involved in various kinds of theoretical complexities and uncertainties than Norbert's, and also things that I think could have been explained to people like Ken Pike, Joe Greenberg, or Otto Jespersen. Of course they are also isolated factoids that make no overarching sense.
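
      To make that concrete, here is a minimal toy sketch (entirely my own illustration, not anyone's actual learning model): a learner walks down a fixed ranking of candidate "grammars", switches to the highest-ranked one still consistent with everything seen so far, and stops updating once its current guess has fit the data "well enough" for a stretch. The three regex "grammars" and the patience parameter are invented for the example.

      import re

      # Hypothetical candidate grammars, ordered by prior preference (the "ranking").
      # The last, most general one is assumed to cover anything in the sample.
      RANKED_GRAMMARS = [
          ("only-ab-pairs", lambda s: re.fullmatch(r"(ab)+", s) is not None),
          ("a-then-b",      lambda s: re.fullmatch(r"a+b+", s) is not None),
          ("any-ab-string", lambda s: re.fullmatch(r"[ab]+", s) is not None),
      ]

      def learn(data, patience=3):
          """Keep the highest-ranked grammar consistent with all data seen so far;
          stop updating once the current guess survives `patience` inputs in a row."""
          seen, current, streak = [], 0, 0
          for s in data:
              seen.append(s)
              if RANKED_GRAMMARS[current][1](s):
                  streak += 1
                  if streak >= patience:          # fits "well enough": stop learning
                      break
              else:                               # contradicted: re-rank against all data seen
                  current = next(i for i, (_, g) in enumerate(RANKED_GRAMMARS)
                                 if all(g(t) for t in seen))
                  streak = 0
          return RANKED_GRAMMARS[current][0]

      print(learn(["ab", "aab", "aabb", "abb", "ab"]))   # -> "a-then-b"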

      Regardless of what your position is, there are people who think that GG is complete nonsense, and the fate of Freud, & also the pre-Behaviorist 'Structuralist' psychology (Titchener et al.) shows that this is something that people could reasonably spend at least some time worrying about.

      @CA: I'm not aware of having said anything about (a-b), perhaps you've mixed me up with somebody else?

    13. This comment has been removed by the author.

    14. (error in previous version)

      @Avery: It was Christina who said them. I was clearing my pseudonym of those accusations (baseless, as usual). I addressed it to you because you were the recipient of those claims by C.

      On Bayes. I like Rev. Bayes a lot, but one needs to be a little careful in trying to account for exceptions through Bayes, since there are at least two kinds of exceptions (a) principled or systematic exceptions (similar to the ones Norbert mentioned in 23-27 above) (b) probabilistic exceptions - those that are exceptions only due to the probabilistic nature of the generalisation. Bayes or any other appropriately nuanced/structured statistical learner will work well with (b), but (a) doesn't need Bayes/statistical learners. What it needs is a better understanding of the generalisations themselves.
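
      As a toy illustration of (b) (my own, with made-up numbers and hypotheses, not anyone's published model): Bayesian updating over a "rule holds with high probability" hypothesis versus a "no rule" baseline still strongly favors the generalisation even when a few exceptional forms show up in the data.

      from math import exp, log

      conforming, exceptions = 18, 2                   # toy counts of observed forms
      hyps = {"rule-holds (p=0.95)": 0.95,             # generalisation with probabilistic exceptions
              "no-rule (p=0.50)": 0.50}                # chance baseline
      # flat prior, Bernoulli likelihood over rule-conforming vs. exceptional forms
      logpost = {h: log(0.5) + conforming * log(p) + exceptions * log(1 - p)
                 for h, p in hyps.items()}
      z = sum(exp(v) for v in logpost.values())
      for h, v in logpost.items():
          print(h, round(exp(v) / z, 4))               # the noisy rule wins by roughly 1000 to 1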

  8. The kind of thing I'm thinking of is Principle C, which is surface-false in some languages, such as Old English, but seems to be favored somehow in linguistic evolution, as Elly van Gelderen described in her book on the subject; over a rather long period of time, the original purely intensive 'pro-self' combination relentlessly gained ground as a reflexive, starting from nothing, and eventually becoming obligatory in most contexts where 'principle A' is applicable. Of course, as many people have pointed out here and elsewhere, 'naked Bayes' is unlikely to explain anything, or serve as a satisfactory account of acquisition.

  9. Thanks for the comments Avery. I do not think it makes sense to engage with "Confused Appendix" until s/he decides to reveal his/her true identity. A person who hides in anonymity, distorts what others say, and, seemingly, steals phrases like "baseless, as usual" from Chomsky does not engender much trust...

    Your suggestions make a lot of sense when one looks at the problem of language acquisition from the perspective of someone who wants to give a computational/machine learning account for modern-time language learning. Such a person knows what possible target languages are, what input a child might be exposed to, and how different hypothetical grammars could be ranked to minimize the input needed to decide between them. So you are right to say that for this scenario one might not need exceptionless accounts but: "...`all' that's needed is a ranking of possible grammars such that the learner stops learning (or, probably better, slows down its learning rate in some asymptotic manner) when its grammar fits the data 'well enough'."

    But this is where the problem of language evolution raises its ugly head. Unless you're a Postalian Platonist [which I gather no one here is - no worries, Alex C., I do not want to start that debate again] you have to assume that at one point in the distant past our ancestors had no language at all. And whatever story we assume, our slightly less distant ancestors had probably a simpler language than we have now and, if language just started in one small breeding group before dispersal from Africa, just ONE language. So it seems odd that evolution would equip these ancestors with a variety of possible grammars that the learner then has to rank. At this point in the distant past one grammar would have done just fine. Further, even if through some miracle-mutation the distant ancestors would have been equipped with a large number of potential grammars [slow down for a moment to contemplate how likely that could be accomplished in a single mutation that slightly rewired the brain] - unless the language learners received input from a variety of languages one would expect that soon just the grammar for the one language spoken in that group would have remained innate. [Recall here Chomsky's argument from the Norman conquest: all the differences you speak about above arose AFTER the miracle installed the language faculty]

    1. @Christina: Alright, I shouldn't, but I'll bite. Your causal chain hinges on the assumption that specific grammars are genetically encoded, which is not what nativists claim. The principles restricting the class of viable grammars are, according to them.

      even if through some miracle-mutation the distant ancestors would have been equipped with a large number of potential grammars [slow down for a moment to contemplate how likely that could be accomplished in a single mutation that slightly rewired the brain]
      This makes the same mistake as the infinity fallacy (the assumption that there is no small evolutionary step that could take language from finite to infinite stringset). The infinity fallacy ignores how grammars are encoded: the addition of loops, a minor step, takes finite-state automata from finite string languages to infinite ones. Such simple steps also exist for tree automata or rewrite grammars.
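
      A minimal sketch of that point (my own toy, not a claim about how grammars are actually encoded): the same two-state acceptor recognizes only a finite stringset, and adding a single loop transition makes its language infinite.

      def make_acceptor(transitions, start, finals):
          """Deterministic finite-state acceptor given a (state, symbol) -> state map."""
          def accepts(s):
              state = start
              for sym in s:
                  state = transitions.get((state, sym))
                  if state is None:
                      return False
              return state in finals
          return accepts

      finite = make_acceptor({(0, "a"): 1, (1, "b"): 2}, 0, {2})               # accepts only "ab"
      looped = make_acceptor({(0, "a"): 1, (1, "b"): 2, (2, "a"): 1}, 0, {2})  # one extra loop transition
      print(finite("ab"), finite("abab"))                     # True False
      print(looped("ab"), looped("abab"), looped("ababab"))   # True True True: "ab", "abab", "ababab", ...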

      What does that have to do with the quoted passage above? Your argument also ignores how grammars are encoded. Unless your grammar is tiny, writing it down requires more bits than specifying the class of grammars. So if evolutionary parsimony is a requirement, as you assume, it argues in favor of multiple languages, not against them.

      unless the language learners received input from a variety of languages one would expect that soon just the grammar for the one language spoken in that group would have remained innate.
      Here you assume 1) that specific grammars are innately encoded, as before, and 2) that learners uniquely identify the correct language from finite input, which simply isn't the case. That's how diachronic change happens, after all.

    2. Just to amplify: if what is specified is a set of grammars, say a finite set of n grammars, then one way to represent it might be to have a list of all of the grammars explicitly represented. But that is not the only way. If the set of grammars is all MGs that can be written down in less than 10^{100} bits, then you can specify that very large class much more compactly in the way I have just done, which takes much less than the 2^(10^100) bits that it would take to write them all out one by one. Of course, often nativists are not very clear about what they mean; I myself am often led to the sort of interpretation that CB makes.
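
      A back-of-the-envelope version of the contrast (my own toy numbers, with a small bound k standing in for 10^100): enumerating every grammar writable in fewer than k bits costs astronomically more than stating the bound itself.

      k = 20                                         # toy stand-in for 10^100
      num_grammars = 2 ** k - 2                      # nonempty bitstrings shorter than k bits
      bits_to_list_all = sum(n * 2 ** n for n in range(1, k))   # write each grammar out explicitly
      bits_to_state_bound = k.bit_length()           # just encode the number k itself
      print(num_grammars, bits_to_list_all, bits_to_state_bound)
      # ~1e6 grammars, ~1.9e7 bits to enumerate them, 5 bits to specify the class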

    3. @Thomas: thank you for the education. If you now just could specify what the principles restricting the class of grammars are and how they are genetically encoded I'd be much obliged. I do not deny that one can specify very large classes of grammars as compactly as Alex says but now we have again the acquisition problem: POSA are initially very persuasive because it appears indeed that the child lacks the input needed for some of the finer details of the linguistic structures she produces. But in order to succeed in getting the difference between say
      [1] Noam is eager to please.
      and
      [2] Noam is easy to please.
      the innate information needs to be a lot more specific than what Alex proposes. So exactly WHAT is innately provided?

      Further, from your point "...learners uniquely identify the correct language from finite input, which simply isn't the case. That's how diachronic change happens, after all." I assume that you disagree with Chomsky's argument from the Norman Conquest - because you seem to say diachronic change matters - and presumably then FL has changed since the first mutation?

      Finally, can you please clarify what you mean by the infinity fallacy? I assume you claim that "there is a small evolutionary step that could take language from finite to infinite stringset". What is the ontological status of the infinite stringset on your view? Since you do not talk about encoding here I would assume the stringset must have some neurophysiological reality - is that correct?

    4. @Christina: I don't see the problem with language evolution; the original proto language was plausibly some sort of cultural discovery, lacking any genetically specified grammar, but something that could be picked up by early humans on the basis of whatever facilities they already had for communicating and learning by imitation (as social animals, they would have already had some, just like the magpies in my backyard do); Dan Everett likes to assert (whether because he really thinks it's true, or to get people to think harder about the possibilities I'm not really sure) that that's all there has ever been behind language. I think this is probably wrong, and that there probably are some specific adaptations that facilitate learning certain kinds of languages rather than others ('narrow UG', I'd like to call it, as opposed to 'broad UG', the entire cheatsheet for the acquisition of language, regardless of where it came from or what else it applies to), but I see no reason why we need to suppose that such adaptations would accumulate to the point of allowing only one grammar (not least because, as Dan points out, it is possible that they don't exist at all).

      Fully independently of whether narrow UG exists, most of the things on Norbert's list are very suggestive about the possible nature of an acquisition bias (tho some of them are a bit too theory-internal for my tastes), and that many of them have arguable exceptions doesn't affect this.

    5. Avery: we agree that it is a hard problem. We also agree that the claim 'there is nothing but [cumulative] cultural innovation to language' is probably false. Assuming that something gotta be innate one has to ask what that could be. In the scenario you sketch I just do not see why an innate mechanism restricting the class of possible grammars would be required/helpful for human learners [as opposed to hypothetical Turing Machines]: there was always a finite number of grammars in the PLD, for individual children probably a number well below 10. These grammars had differences to be sure but all of them were say mildly context sensitive. So why then would we need an innate mechanism restricting the class of grammars to mildly context sensitive if no other grammar is ever encountered by the learner? I think many of the suggestions by generative grammarians already assume what needs to be established, namely that humans have some computational mechanism that generates grammars. If you at least consider this assumption could be wrong, then the genetically built-in restriction of the class of possible grammars makes little sense ...

      If we assume there was more to language evolution than a single mutation, then we have to look at the adaptationist issues the Hauser et al. paper mentions: if something IS innate it best confers some kind of selectional advantage or at least does not cause any disadvantage. So anyone defending a version of your scenario faces the question: WHY did languages become so hugely complex that [seemingly] kids no longer can learn them from the PLD, if they started out much simpler? What IS the advantage [cultural or otherwise] of having languages that have Norbert's 1-27 [exceptionless or not] over languages that have say 28-56? As long as we look from the adaptationist perspective quite simple languages are probably all that was needed to secure a huge advantage over creatures that have no language. There are a few interesting chapters dealing with these issues in the 2009 volume "Language Complexity as an Evolving Variable" edited by Geoffrey Sampson, David Gil, and Peter Trudgill.

      Now if there is no selectional advantage of complex languages over simple languages then genetic evolution might be the wrong place to look when we want to explain why [most] languages are so complex. But advantages can be indirect and there are a few people who try to address these kinds of questions [Terry Deacon's 'Relaxed selection' proposals come to mind, but anyone who works on brain evolution/current brain research might be able to contribute].

      It seems to me that because the problem is so complex no one person/group is likely to come up with the solution on their own. And for this reason alone the vile hostility some participants of the debates express towards those who disagree with them should become a thing of the past.

      Finally, I hope Thomas will answer the questions I asked, especially re the ontological status of the infinite string sets.

    6. @Christina: If you now just could specify what the principles restricting the class of grammars are and how they are genetically encoded I'd be much obliged
      I'm afraid I don't understand why that is relevant here. My point was that the evolutionary argument against strong nativism you provided doesn't fly because one of its antecedents is inconsistent with the strong nativist position. That doesn't mean the strong nativist position is in safe waters from an evolutionary perspective, but that's a different argument.

      But if you want a full list of axioms, I suppose that the formalization of GB in Jim Rogers' Descriptive Approach to Language-Theoretic Complexity is a good start for what would need to be encoded under the assumption that all of GB is innate. As for how they are encoded, whatever method you assumed in your thought experiment for genetically encoding a grammar will do.

      I assume that you disagree with Chomsky's argument from the Norman Conquest - because you seem to say diachronic change matters - and presumably then LF has changed since the first mutation?
      I was referring to diachronic change in the absence of language contact, which is a well-established phenomenon. This shows that learning is imperfect, wherefore even a single language as input will give rise to multiple grammars in the population. Hence there is no convergence towards a single grammar.

      Finally, can you please clarify what you mean by the infinity fallacy?
      One sometimes finds the sentiment that language could not have evolved as a series of small steps because no small step can take a finite stringset to an infinite one. But speakers know grammars, not stringsets, and a small change in a grammar formalism can bring about exactly this change. In hindsight, it's not the best example for illustrating my point, since it is about the representation of languages via grammars rather than the representation of grammars themselves.

    7. Thank you for the reply Thomas. As I said in my conversation with Avery: as someone who does not subscribe [yet] to the generativist framework, I am not convinced by the argument that having innately specified limits on the classes of possible grammars is required/helpful. It may very well be within the generativist framework. But what evolutionary story do you have to convince me [or others] that the generativist framework ought to be preferred? Thanks for the literature reference, do you have by chance anything in pdf form one can access when away from university libraries?

      Now to a couple of points we disagree on or I'd like more detail:

      1. You say: "As for how they are encoded, whatever method you assumed in your thought experiment for genetically encoding a grammar will do."

      I am afraid not. Let's assume on my story only general domain components A, B, C are genetically encoded and let's further assume I have an account for how this is done. If your story requires components A, C, D, E, F to be genetically encoded and E, F are domain specific, then there is no a priori reason to assume that what works for A, B, and C will work for E and F as well. So it would seem the burden of proof is on you to provide some kind of proposal for the encoding mechanisms required by your account. ['you' here is not the personal you - I do not expect you have answers to everything but the generic 'you' for someone on 'your team'. But it needs to be a person who is familiar with the requirements of your account not someone who works on very different accounts]

      2. You say "learning is imperfect, wherefore even a single language as input will give rise to multiple grammars in the population. Hence there is no convergence towards a single grammar"

      This seems compatible with very weak nativist views, what exactly is the contribution of strong nativism then?

      3. You say: "One sometimes finds the sentiment that language could not have evolved as a series of small steps because no small step can take a finite stringset to an infinite one"

      I am not aware of anyone making such a claim, can you please provide an example from the literature?

      4. You say: "But speakers know grammars, not stringsets,"
      On Chomsky's view I know a grammar when my innate language organ is in a certain state S. Do you share this view? Depending on how you answer I'll have some additional questions but it does not make sense to ask them before I understand your view of 'knowing a grammar'.

    8. @Christina "In the scenario you sketch I just do not see why an innate mechanism restricting the class of possible grammars would be required/helpful for human learners [as opposed to hypothetical Turing Machines]: there was always a finite amount of grammars in the PLD, for individual children probably a number well below 10. " I don't understand this. The PLD contains no grammars at all, just stuff that a witness might try to imitate & extract patterns from, which GG's cognitive metaphor tends to describe (harmlessly I think, although controversially to some people) as 'drawing conclusions about what the grammar is'. Early people would presumably have some bias towards drawing some conclusions more readily than others from experiences, 'proto-broad UG', I'll call it. An example I noticed from dog obedience training is that it's relatively easy to train dogs to 'stay' in whatever posture their in, or to 'sit' from a standing position, or 'lie' from a sitting one, but considerably harder to get them to 'sit' from either a lying or a standing position (that is, it's in course 4, which I dropped out of). I doubt that anybody has a theory of why dogs generalize in this way, but it appears to be the case that they do - commands are more easily associated by them to 'actions' than to final states, and holding your present position counts as an 'action'.

      Why and how language got more complicated than what can be taught to dogs (and, very likely, shifted from visual/gestural/mime mode to vocal mode) is probably not knowable, but some kind of Broad UG (proto-Broad UG as updated by subsequent cognitive changes not specific to language) would have been there from the beginning, and Narrow UG might for example have arisen as a side effect of adaptations for producing impressive and useful performances (production is harder than comprehension, so a good performer might get various advantages, including direct reproductive ones via sexual selection (I think this still works today, to some extent)). Anybody can substitute their own favorite speculations here. & as Everett claims, Narrow UG might not have ever arisen at all. But there is still a bias that we can find things out about. Greenberg's universal 20 would btw be another interesting addition to Norbert's list.

    9. @Christina: I am not convinced by the argument that having innately specified limits on the classes of possible grammars is required/helpful.
      The learner needs to know the target class of languages. Even seemingly minor points of uncertainty can bring down learning algorithms. For instance, the class of strictly 2-local languages is learnable in the Gold paradigm, and so is the class of strictly 3-local languages, but the union of those two classes is not. Even in your scenario where there is only one language, the learner needs to know the target class, because there are infinitely many languages that it could generalize to from a finite amount of input. And since a learner learns a language by constructing a grammar for it, the prior of the learner amounts to a restriction on the class of grammars.
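
      For concreteness, here is a bare-bones strictly 2-local learner in the string-extension style (my own sketch, not anyone's published code): it memorizes the attested bigrams, boundary markers included, and generalizes to every string built only from those bigrams. The choice k=2 is exactly the target-class knowledge the learner has to bring with it.

      def factors(s, k=2):
          padded = "#" + s + "#"                     # word-boundary markers
          return {padded[i:i + k] for i in range(len(padded) - k + 1)}

      def learn_sl2(sample):
          attested = set().union(*(factors(s) for s in sample))
          return lambda s: factors(s) <= attested    # grammar: use only attested bigrams

      g = learn_sl2(["ab", "aab", "abb"])
      print(g("aabb"), g("ba"))                      # True False: generalizes, but only within SL-2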

      But what evolutionary story do you have to convince me [or others] that the generativist framework ought to be preferred?
      I'm not trying to convince you of anything except that the argument you presented above does not show that strong nativism is evolutionarily implausible. That being said, if you're asking why generative frameworks should be preferred to other alternatives for the analysis of language, why would an evolutionary argument be necessary? The frameworks tackling the kind of phenomena Norbert listed above are all generativist (GB, Minimalism, TAG, CCG, LFG, HPSG, GPSG, Postal's work, Jackendoff & Culicover, Arc-Pair grammar); there is no other game in town.

      Thanks for the literature reference, do you have by chance anything in pdf form one can access when away from university libraries?
      Here's an overview paper. You'll have to unzip it and convert it from ps to pdf.

      Let's assume on my story only general domain components A, B, C are genetically encoded
      This strikes me as incompatible with the assumption in your thought experiment that a specific grammar is innate, i.e. genetically encoded. At any rate, if you can encode a grammar, you can also encode restrictions on the class of grammars. The former requires more than the latter since the latter can be done by giving what amounts to a partial definition of a grammar, e.g. in Rogers' L2KP (see the linked paper).

      You say "learning is imperfect, wherefore even a single language as input will give rise to multiple grammars in the population. Hence there is no convergence towards a single grammar". This seems compatible with very weak nativist views, what exactly is the contribution of strong nativism then?
      Yes, of course it is. The point was not to support strong nativism but to argue against your argument against strong nativism.

      I am not aware of anyone making such a claim, can you please provide an example from the literature?
      This has come up in personal discussions with biologist friends of mine, who do not work on language evolution, so their ideas about language are often very "unmentalistic". My point was not that this is a common fallacy that utterly discredits biologists, I just thought it would be a good example for why encoding matters. But I already said above that it probably isn't all that great an example.

      On Chomsky's view I know a grammar when my innate language organ is in a certain state S. Do you share this view?
      I'm agnostic. Formally, knowing a grammar means that the learning algorithm has converged to a grammar that generates all and only the strings of the language from which the input strings were sampled. Realistically, this exact generation requirement is too strong, but for everything I care about the formal abstraction is good enough.

    10. One aspect of Alex' views that I don't understand is why he thinks that capturing linguistically significant generalizations isn't empirical - what is true about them, like all the other sources of evidence, is that they don't uniquely determine grammars. There does appear to me to be a kind of 'justificatory discontinuity', whereby it is relatively easy to motivate basic ideas about constituency, phrase types and features, and a few other points such as wh-gaps, but then, in order to build up a system that can actually relate sound to meaning in the manner envisioned in the late 1960s and maintained as the main goal ever since, a lot of stuff seems to have to be made up.

      Originally, transformations seemed to address this problem by letting us work back from overt to covert structures one small step at a time, but they had apparently insuperable problems, and people started making up different kinds of stuff to deal with that, none of these inventions being sufficiently well justified for almost everybody to accept them. But this problem is presumably one of too much ignorance, not one specific to capturing generalizations, which in fact allow quite a lot of pruning of alternatives, excluding, for example, versions of PSG without something like features (since the internal structure of NPs with different feature combinations is (almost?) always almost (completely?) identical).

    11. @Avery: the notion of " linguistically significant generalization" is quite hard to pin down. Pullum, Hurford etc have written on this. There are obviously generalizations that one can make that shouldn't be represented in the grammar, and how to draw the distinction between those and the LSGs is a subtle question. Maybe I wouldn't say that they aren't empirical, rather that the boundary between the empirical and the non-empirical is not clear in this case.
      E.g. the first part of Hurford 1977 "The significance of Linguistic Generalizations" where he says "The motto of such linguists could fairly be represented as 'Seek out all the LSG's you can and capture these in your theory.' To cite examples of this attitude would be superfluous: every linguist is familiar with the kudos of 'capturing' LSG's and the odium of 'missing' them.".

    12. @Thomas: Thank you for your patience. I think it slowly emerges that we are talking about different issues when we talk about 'the language learner'. You seem to have the formal learner in mind while I am concerned about what an actual child in the real world does. Obviously the tasks faced by those 2 learners are very different. Yours indeed needs to settle on the correct grammar while for mine that really does not seem to matter [assume 10 grammars overlap completely in the range of the PLD encountered by the kid; why would it matter to her if she 'constructs' grammar 4 or 8?]. Also it makes sense for your learner to consider grammar-construction in isolation while for my learner if grammar construction is a goal at all it is one among many - most important is probably being able to communicate with parents and other kids. As long as she accomplishes that any grammar [fragment] she might settle on should be fine.

      Also note that POS is less persuasive for real world learning: as you say, kids do not achieve complete mastery of any grammar and they do receive all kinds of linguistic and non-linguistic feedback - [even though that may be difficult to measure]. I notice often that non-nativists commit the following error of reasoning: when they have succeeded in modeling some aspect of language acquisition with a domain general model they will say: see, therefore kids do not need a domain specific mechanism either. But this does not follow at all - kids still may have a domain specific mechanism. Similarly, if you show that an abstract learner that depends on PLD alone could not succeed in grammar construction unless there are strong innate constraints, it does not follow that kids [who rely on a host of other information] could not succeed either.

      As for the evolutionary account: imagine you construct a 'perfect acquisition model' but it turns out such a model could not have evolved in humans. Then you'd know that humans must acquire language in a different way [unless you're a Platonist about language and do not claim it is part of human biology - then it truly does not matter].

      I guess for the time being we can just agree that we're interested in different things when we talk about language acquisition. Thanks for the literature files, I'll try to read them soon. One last point: Postal has said repeatedly [in print] that his work is no longer in the GG framework. I think as a matter of courtesy one should accept that Postal knows best what framework Postal's work falls under.

  10. 1. @ Alex/Thomas
    I reply here because the damn site prevented me from replying above. Excuse me.
    Let me modulate my very last comment. Say that Scandinavian languages do not obey the CNPC (which, to repeat, I don't believe is true in full generality). But let's say it is. So what? It simply makes Plato's Problem harder. So, as a research strategy here's what we do. First assume that islands have no exceptions, then see how to factor some exceptions in. Why? Well, first, it holds pretty well. Recall I have been saying that they are "roughly" accurate. Second, where it holds it is pretty clear it cannot be holding because we have evidence that it holds. So, consider islands the analogue of the ideal gas laws: in ideal languages islands hold. Ok, now we have a question. Why? They could not have been learned in those languages (English being a good example). Why? Because there is no relevant data in the PLD. So we must assume etc. etc. etc.
    In sum, the facts of Scandinavian bear on how to refine the generalization, not what we should conclude about FL regarding this generalization. Again, it's exactly like the gas laws. They held approximately, but served well enough to undergird statistical mechanics.

    Now, let's return to Scandinavian. Let's say they really never conform to the CNPC (again, I don't believe this, but hey, let's say). Could this be accommodated? Sure, there are ways of doing this even within GB. In fact, it would argue for the conclusion that islands are not primitives of UG (a conclusion we have come to long ago). What's the right analysis then? Good question. Say it is A. Now we can ask whether A is innate or learned. Well, given English it cannot be learned (recall A covers both). And off we go.

    Alex's skepticism would be worth considering if factoring in the exception that is Scandinavian made the learning problem easier. But of course, if Alex is right, the problem is yet MORE difficult, not less. So, as a research strategy it makes sense to assume that the Scandinavian facts are interesting outliers that require a somewhat more subtle description of island effects but they do not change the overall lay of the land.

    1. 2. @ Alex/Thomas
      Note that this is what happens in the sciences all the time (again the Gas Laws). So, when I argue that we should take the results of 60 years of work as given, I do not mean that we should not refine these generalizations. I am arguing that we have done more than enough to show that something very like these generalizations is correct and that assuming this is a good way of moving forward. In other words, I am recommending a research strategy based on what we have discovered. From what I gather from Alex, his view is that the fact that the generalizations have counter-examples shows us that there is nothing there to explain and so we can ignore these facts. His learning theory need not accommodate them because they are false. This way, IMO, sterility lies. You want to go that way, I cannot stop you. But really, if you take that road, you are blocking research. We can discuss how to fix the CNPC, but really not much will change conceptually. Of course I could be wrong and if Alex wants to show me that his learning theories will work fine if we complexify the island descriptions (i.e. he can show how we acquire English and Scandinavian islands using the available PLD but his story cannot explain English alone) well, I would sing his praises to the stars. But, we all know that Alex cannot do this, and is not really interested in doing this. He is interested in stopping discussion along these lines in any way he can (right Alex?).

      So, yes Scandinavian is interesting. I doubt the data as described, but I may be wrong. But, whatever the correct description, it will only make the problem harder, so I suggest for theoretical purposes we ignore it for now. Sort of like ignoring the fact that we are not ideal speaker hearers, that we attain multiple grammars and that learning is not instantaneous. All fine idealizations. Add CNPC to the list.

      Last point: same goes for the other 29 laws.
