Tuesday, October 16, 2012

How to Play the Game

Imagine the following not uncommon scenario: Theory T claims that explaining the particulars of phenomenon P requires assumptions A1…An. Someone, S, doesn’t like one or all of these assumptions for a variety of reasons (never discount the causal efficacy of dyspepsia) and decides that T is wrong. Here’s the question: what is S obliged to do? Answer: It depends.

There is no moral, religious, legal, or social obligation that S do anything at all. You can think anything you want, and say anything you feel like saying (as my daughter used to say: “Nobody is the boss of me!”). But, if you want to play the explanation game, the “science” game, then you are obliged to do more, a lot more. You are obliged to explain why you think the assumptions are faulty and (usually, though there are some exceptions) you are obliged to offer an (at least sketchy) non-trivial, non-question-begging account of P. S cannot simply note that s/he thinks that T is really really wrong, or that T is unappealing and makes her/him feel ill, or that s/he wishes T were wrong for some unspecified, no doubt humanitarian, reason. Doing this is just deciding not to play the game. Sadly, many critics of Generative Grammar have decided that they don’t like the theory and passionately articulate their dissent, but do not follow the rules. To repeat: nobody needs to play, but unless you follow the rules nobody should take your views (prejudices?) particularly seriously. Adherence to the rules of the game is the price of admission to the discussion.

You’ve probably guessed why I mention this. A lot of people want their views to be taken seriously despite not playing by the rules. For some odd reason they think they should be exempt because it’s just so clear that they are right and generative grammarians (especially Chomsky and his intoxicated minions) are wrong. But, though this might sustain a warm glow of amour propre, it does not admit you to the game. Lest you think that I have descended into caricature, consider a recent short paper by David Adger - Constructions and grammatical explanation – that does the heavy lifting of exposing how far from serious certain well-known forms of construction grammar are. Those interested in the agnotology of science will enjoy this spirited, well-aimed take-down. Here’s a teaser quote to whet your appetite:

…CxG [Construction Grammar, NH] proponents have to provide a theory of how learning takes place so as to give rise to a constructional hierarchy, but even book length studies on this, such as Tomasello (2003), provide no theory beyond analogy combined with vague pragmatic principles.

Let me leave you with a simple piece of advice that has served me well: when you hear the word ‘analogy’ reach for your wallet.


  1. Couldn't one make much the same criticism though of work in the Minimalist Program? In the recent handbook on Minimalism, there is a paper by Yang and Roeper defending what is in essence a P & P model of acquisition which seems superficially at least to be incompatible with the modern lightweight conception of UG. The slack seems to be taken up by '3rd party principles': and that is a term (like 'analogy') that makes *me* reach for my wallet.

    Thanks for starting this blog, by the way -- lots of interesting and thought provoking posts!

  2. I have not read the Yang/Roeper piece so I cannot comment on it directly. However, I agree that there have been times when third factor principles are alluded to that are unlikely to carry much explanatory burden. But I think that in one sense the parallel here is a bit unfair to the Minimalists. We have known that 'analogy' means nothing for a very long time. The main problem with third factor concerns is not that they necessarily mean nothing but that they are very underdetermined. I find Chomsky's allusions quite suggestive, the last being that there are very general, obvious useful rules of thumb, e.g. search can be costly, boundless look ahead is bad, recalculations are to be avoided. These are not yet principles, but they hint at some that we might make more precise as time goes on. Am I just saying this to be "nice" to my side? Maybe. But they have informed my work so I am sympathetic. If after a decade or so we are no closer to getting anything concrete, then time to rethink. The 'analogy'-meme has been around for a long long time (centuries?). Time to dump it and recognize it for the BS it is.

    1. 'Analogy', you are right, is not an explanation, but just a type of explanation, and it needs to be fleshed out in some concrete way to have explanatory power. I think that *can* be done in some cases, as has been done by for example Walter Daelemans and Antal van den Bosch with memory-based learning for e.g. stress assignment and so on. So I think, if you stretch analogy to include instance based learning, then it is really not all BS. However it does crucially need that extra step, I completely agree, or it is just hand-waving; and Construction grammar has been criticised from the other side of the fence as well for its lack of precision on what I think is the key point.

      But then, parameter setting is also not an explanation, but a type of explanation, and in order to give it some content you have to specify a) what the parameters are and b) how they are set; and there hasn't been much progress on either. Maybe I am being overly pessimistic about b) given Sakas, Yang and so on.

      So is there really a coherent model of learning in "mainstream generative grammar"? At the moment the situation seems, to an outsider like myself, a bit murky.

  3. Agreed about parameter setting being a type of explanation. Chomsky's main reason for replacing evaluation metrics with parameter setting models is his belief that the former are unfeasible (cannot be made to work). Whether they are is above my pay grade, frankly. However, even if parameter setting is the right approach (at least there are a finite number of these), the problem of setting these successfully and incrementally when they are not independent of one another (the typical case, as Dresher, Fodor etc. have shown to my satisfaction) is very problematic (btw: I say this in my 2009 book). Fodor and Sakas have considered some options for getting around this which include a rejuvenated evaluation metric. Dresher proposes an intrinsic ordering of parameters with a subset of information triggering its indelible value. Interesting work has been done exploring both alternatives (Lisa Pearl's thesis explored a version of Dresher's idea). So yes, there are models I know of, they are imperfect and there are problems. It just seems to me like everyday science.

    Last point: P&P models have generated a lot of really interesting cross linguistic research in the absence of any final settlement of the above mentioned problems. I take this as proof that the idea is fecund and usable even if the overall conception has real problems. Whenever I have seen "analogical" accounts mooted, it has served more to block investigation than to promote it. If this assessment is correct, then the real problem with resort to "analogy" is not that it is wrong but that it is sterile, a far worse vice.

    1. You claim “the P&P models have generated a lot of really interesting cross linguistic research” and “take this as proof that the idea is fecund and usable even if the overall conception has real problems”. Translated into plain English this seems to mean: as long as the idea produces a lot of ‘linguistic work’ it does not really matter whether it’s true or false.

      As a BIOlinguist you are of course familiar with similarly ingenious arguments in ecological debates. It is widely accepted that we ought to protect ecosystems from harmful human influence. One simplistic way to evaluate ecosystems is to measure their biomass production: the more biomass a given ecosystem produces the better. Sadly it turns out that such a simplistic view is problematic. In the 1980s eutrophication led to drastically reduced oxygen availability for benthic North-Sea fauna. Mobile species that needed oxygen got out of there, sessile species died. However, some species that are adapted to anoxic and hypoxic environments (e.g., Phoronis architecta) greatly prospered and increased absolute benthic biomass significantly. This illustrated to naïve politicians that ecosystem productivity is a poor indicator for ecosystem health, and they accepted what biologists had said all along: biodiversity is a much better ‘measuring stick’.

      Similarly in linguistics we should not measure the value of a research program by its ability to produce massive amounts of literature and suffocate other research programs, but by its ability to produce literature of high quality and to interact productively with a wide variety of research programs. It seems that in spite of undeniable ‘fecundity’ the UG community fails rather miserably on criteria that indicate a healthy scientific research program.

    2. The problem that P&P models have is primarily with the parameter side. The principles seem pretty good empirically, at least to a first approximation. How parameters are set IS a problem, one that has been known for a while. There are some ideas out there (e.g. Fodor and Sakas, Dresher and Kaye, Robin Clark, Yang to name some of the more prominent). I personally think that it's time to start investigating other approaches, but even though the problem in my view is still outstanding, we have learned something interesting (e.g. how we might impose independence on the parameter space and allow learning). It appears to me that you don't know much about the actual research (do you?). If you did you would see that underneath a lot of the rhetoric there is a fair amount of agreement about what the generalizations amount to. There are some differences at the margins, but the overall lay of the land is pretty similar across a wide array of different "frameworks." Actually, for precisely this reason I have always considered the "frameworks" more or less the same.

      At any rate, what do we know? That the GB (i.e. HPSG, LFG, GPSG) theories have converged on roughly similar descriptions for a core set of interesting phenomena. Something like locality principles for movement (two kinds actually, A and A'), locality conditions on binding, Cross over effects of various sorts, island effects, case marking and phrase structure generalizations etc. Not bad for about 50 years. We have learned all of this and it is roughly right.

      What next? Well, to try and explain why these properties and not others, essentially Chomsky's program. I reject your "plain English" translation of my estimates, but given that you don't actually work on the problems I'm not sure what to say to you. Skepticism is easy from the cheap seats. Btw, I am pretty sure that we would also disagree about what constitutes a healthy research program, but as you are not part of mine (indeed are you part of anyone's?) it doesn't really pay to convert you. I am ready to let others play with whatever ideas they want to play with. I am, however, tired of hearing of the demise of mine from people who seem to know little about it except that they don't like Chomsky and his evil ways.

    3. Based on what do you conclude that i know little about your work or Chomsky's? From my cheap seat it sure looked like Chomsky was abandoning binding conditions in 1995 when he proposed minimalism as “a theory of language that takes linguistic expressions to be nothing other than the formal object that satisfies the interface conditions in the optimal way” (p. 171).

      So i would assume the binding theory you talk about has survived the 1995 changes? Can you please provide specifics [publications are fine, page references even better]. I also assume your version of the P&P approach is not subject to the criticism expressed by Fritz Newmeyer (2005), Possible and Probable Languages: A Generative Perspective on Linguistic Typology? Maybe you can let me know how you overcome Newmeyer's points - they sounded fairly convincing to me.

  4. Thanks for pointing out the Adger piece here, hadn't seen it.

  5. In this blog the sun of your humour becomes almost blinding. You inform us science is a game played by rules (sadly you never tell us what these rules are), and that – surprise – non-Chomskyans do not play by these rules:

    “A lot of people want their views to be taken seriously despite not playing by the rules. For some odd reason they think they should be exempt because it’s just so clear that they are right and generative grammarians (especially Chomsky and his intoxicated minions) are wrong”.

    I would love to get the ‘intoxicated minions’ quote from you: who said this and where? But I digress. Your claim about refusal to play by the rules caught me completely off guard. For a moment I even thought you were talking about Chomsky because this is what he wrote about how he conducts his “science”:

    “You just see that some ideas simply look right, and then you sort of put aside the data that refute them and think, somebody else will take care of it” (Chomsky, 2009, p. 36).

    From my days as biologist I do not recall that ‘seeing that ideas simply look right’ or ‘sort of putting aside data that refute these ideas’ are rules of playing the science game. So actually it is Chomsky who openly proclaims that he does not want to be bogged down by pedantic rules. Instead he confidently relies on his abductive instinct, telling him that “conceptually it has to be like this” (ibid., p. 40), that his view is “close to true … so close to true that you think it’s really true ... overwhelmingly true” (ibid., p. 393). In case you’re skeptical that I have gotten Chomsky’s attitude towards science right, don’t take my word for it. One of your esteemed colleagues arrives at the same conclusion:

    “Can [Chomsky’s] goal be pursued by ‘normal scientific procedure’? Let us remember we are talking about someone who tries to reinvent the field every time he sits down to write. Why should we expect Chomsky to follow normal scientific practice...?” (Fiengo, 2006, p. 471).

    Apparently, you have an answer for Fiengo: “…nobody needs to play [the science game], but unless you follow the rules nobody should take your views (prejudices?) particularly seriously. Adherence to the rules of the game is the price of admission to the discussion”. Bingo – I could not have said this any better. Obviously, the implication of your proposal is that nobody doing science should take Chomsky particularly seriously.

  6. Scare quotes, they always get me and other embers of the "intoxicated minions" in trouble! Not much for tongue in cheek are you?

    Blinding, huh. I think you're pulling my leg, right? Funny.

    As you might guess, I don't think Chomsky reinvents the game every couple of years. I think that there is a pretty consistent line of thought that has persisted over the last 60 years of research: though the explanations have deepened, the earlier data and proffered generalizations have pretty much stayed the same. You can see what I think (if interested) by reading chapter 1 of my 'A theory of syntax.' The main point is that the results of earlier inquiry have been largely retained in later work, though they have been taken as "effective" rather than "fundamental."

    I believe that Fiengo is simply incorrect, as are you, if you believe that this correctly describes Chomsky's practice. Btw, your conclusion in the last paragraph should be EITHER one should not take Fiengo's description as accurate, OR should not take Chomsky as scientifically serious. I know where you stand and you know where I do. Suffice it to say, we are not on the same side.

  7. Last reply: 'embers' should be 'members.'

  8. i am not about to split hairs over obvious typos. Also i am glad you noticed "your conclusion in the last paragraph should be EITHER one should not take Fiengo's description as accurate, OR should not take Chomsky as scientifically serious". Indeed it should have been either/or given that you disagree with Fiengo. If you apply the same diligence to your own arguments we might just end up on the same side occasionally.

    But you have piqued my interest: you disagree with Fiengo who made this claim to defend Chomsky against Pieter Seuren's shall we say less than favourable evaluation of the Minimalist Program. I take a wild guess here but think you also disagree with Seuren? Maybe even with Peter Culicover?

    "The advent of minimalism in the mainstream of syntactic theorizing highlights an interesting shift in scientific values. At least from the Aspects theory through Principles and Parameters theory it has often been remarked that the syntax of natural language has some surprising, or at least abstract, non-obvious properties. One example is the transformational cycle, another is rule ordering, another is the ECP, and so on. Such properties are not predictable on the basis of ‘common sense’, and do not appear in any way to be logically necessary. The fact that they appear to be true of natural language thus tells us something, albeit indirectly, about the architecture of the language faculty in the human mind/brain... With the M[inimalist] P[rogram] we see a shift to a deep skepticism about formal devices of traditional syntactic theory that are not in some sense reducible to ‘virtual conceptual necessity’. Such a perspective thus explicitly rules out precisely the major theoretical achievements of the past. All of them.” (Culicover, 1999:137-138)

    Now you claim there was no such shift - so is Culicover wrong here?

  9. If you are interested in my views they are elaborated at length in things I have written, e.g. chapter 1 of 'A theory of syntax.' In a nutshell, I take the minimalist program as aiming to deduce roughly the generalizations of GB from more principled assumptions, not unlike what Chomsky did with Islands via subjacency. This leaves the empirical basis of minimalism more or less that of GB, though I also think there have been some non-trivial empirical discoveries. On this way of looking at things, minimalism added an additional question to the menu of those that syntacticians addressed (I discuss this in the last post). This does not repudiate the work in GB, though the newly added question suggests rethinking some earlier results. Like I said, I elaborate this view at length in several places if you are interested.

    So do I think Culicover is wrong? Yes, in part. The skepticism minimalism evinces (or should evince; people differ, this is a reconstruction) concerns not the rough empirical validity of the generalizations of traditional generative grammar, but their status as fundamental. Physicists have a good set of terms for this. They distinguish effective from fundamental theories. GB and its ilk are effective on this view. The aim of minimalism is to develop a fundamental theory to deduce the generalizations of GB.

    It is reasonable to ask how well this is going. In my view, quite well, though it is hard to do and people will disagree as they do in other theoretical fields. So Culicover is right that minimalists deny that the earlier theoretical achievements are fundamental, but he is wrong if he thinks that minimalists deny or need deny that the results are important theoretical achievements. I don't intend to draw a comparison wrt the richness of the two domains, but Newton's theory of gravitation was important without being fundamental, as were the ideal gas laws, the laws of thermodynamics, Bohr's atom, etc. They were important achievements, as was GB and its cousins GPSG, LFG, HPSG etc. However if Chomsky is right in his minimalist musings, then none of these are fundamental. They are way stations to a much more principled theory.

    My turn for a question: Why do you take these people to be right and Chomsky to be wrong? Because you don't like him or because you have evaluated these arguments and decided that Culicover etc had the better of the debate? Which of Chomsky's arguments in particular did you not like? Which of Culicover's did you find compelling? Do you think it's dumb to add a new question to the research agenda? Do you think that the "best" theories never get reduced to more fundamental ones? It's your turn to pique my interest.

  10. Why do i take Chomsky to be wrong? fair question. To be honest i do not take him to be wrong because i do not know what his position IS. [and after reading 'The science of Language' i have doubts he does] I first got interested in Chomsky's work because what he said about language acquisition made a lot more sense than what some philosophers had to say. So for me it was very disappointing to learn over the years that the only thing that had really **improved** in Chomsky's work on acquisition was the nastiness of rhetoric and the degree to which he distorts the work of others. As you say in one of your posts, the POSA arguments of 2005 are not really any different from those of the 1960s. Now in the 1950s/60s there was not much evidence available suggesting that acquisition might not quite be as difficult and stimulus not quite as impoverished as Chomsky assumed - so back then i imagine his was quite an ingenious insight. But by now empiricists have done some amazing work and it seems at least possible that there might be an alternative to Chomsky's style of innatism. I say 'might be' because [1] none of the work done by empiricists to date could disprove Chomsky's hypothesis [only brain research has the potential to and we're a safe distance away from that happening] and [2] the actual success of empiricists so far is fairly modest - so 'on fence sitting' seems the most sensible position at the moment. However, I find it fairly frustrating that so much work [especially by philosophers but also in modelling and partly in developmental psychology] is done with either the intent to prove Chomsky wrong or to vindicate him - i think there is a huge risk to overlook what is really important when the focus is so skewed.

    Now to answer the other part: why do i take 'the others' to be right? Again that is too strong. I take their work seriously first of all because it focusses on language/language acquisition, not on bee communication or ant locomotion or nematode-neurons or comet-trajectories or quantum mechanics or... All these are interesting fields of inquiry but at the end of the day they will not inform us about what is SPECIAL about language - to find that out we have to look at language, at how actual kids actually learn it [not abstract away from it]. For me one of the take home messages of the work of generative semanticists was that we need not just account for how the child learns the difference between 'John is eager/easy to please' but also all the *crazy* data these guys focussed on. And to me the question that needs to be answered first and foremost is how an innate mechanism could have evolved that could accomplish complete acquisition and result in 'steady state'. Minimalism seems to move away from answering that question.

    So let me ask you a provocative question; would it be possible to convince you that the kind of innatism you defend is wrong? Would there be any X such that if someone discovers X you'd be willing to admit the view you currently hold is wrong or do you believe in the kind of innatism you defend no matter what empirical evidence turns up?

    1. In part it would not be possible. I could be convinced that what is special about humans re language is not domain specific or even species specific. That I could imagine, though nothing ever proposed comes close to being halfway compelling as it never discusses the actual data of interest (see the post on BPOC's paper on POS). But, I could see how such an argument might go. Take something interesting, e.g. binding theory, and show how to derive its properties using the proposed domain general operations. Until someone does something like this there is nothing to discuss. As Chomsky rightly says, everyone is a nativist in that they need to assume some bias in the learning function. Again, this is a truism and is not up for grabs. The only question is the nature of the bias. I happen to think that the evidence right now is that the bias is very linguistically specific. You may not. Ok, show me. Derive some parts of the GB principles on "general" grounds and I will bite. But until you do this, why insist that you know that the process is primarily environmentally driven? Read the Berwick et al paper and take the case they describe and show us how to get constrained homophony from a general inductive learner based on PLD. Go ahead. When you've done that, we can talk as there will be something to talk about.

      Let me end with a question: how much linguistics do you know? What don't you like about binding theory, or bounding theory, or case theory, or X'-theory, or...This is the guts of the empirical results to date. What's your beef? That there are some problems and even anomalies? Big deal. There always are and always will be. It sounds to me like you are not in a position to judge how well the field has gone. Am I wrong? Your comments are very generic. And it indicates to me that you don't really know what people like me take to be the basic results. IF that is so, I can see how you might find it hard to judge. I do and so I don't.

      In judging a program of research it helps a lot to know the details. I can imagine many ways of arguing against the specific proposals Chomsky and colleagues have made. To date, most of the debate has been in house. There has been very little of substance delivered by the environmentalists and their reliance on empiricist principles of induction and general learning. Prove me wrong: reanalyze the data. Nobody else has. Be the first on your block.

    2. "take the case they describe and show us how to get constrained homophony from a general inductive learner based on PLD. Go ahead. When you've done that, we can talk as there will be something to talk about."

      Maybe you could start the ball rolling by demonstrating how a *domain specific learner* starting from the PLD gets these principles (working in the MP) ? I.e. what is the solution that we should compare candidate empiricist learners with?

      If you could be explicit about the inputs you assume that would be helpful.

    3. First this was one example. But as a service here it is:
      What does the kid need to learn to master the anaphoric system? Take an LGB style theory: Principles A, B and C. What's part of FL/UG on this view: The notion antecedent, binding, binding domain, and the principles. What does the kid need to do to master these: s/he needs to identify which lexical forms are the anaphors, which the pronouns. That's it. How does the kid do this: well, sentences like 'John likes himself' help. As does the absence of 'John likes him' with John understood as bearing a pair of theta roles. So if we assume that the kid can "parse" a scene theta role wise (I assume this which is why theta roles are taken to have epistemological priority in the technical sense) and can parse the sentence morpheme wise (again I assume this has been mastered, though I don't know how the speech stream is segmented) then the job is to pair up the thematic meaning (there is of course more to meaning than this) with the morphemes. The kid notes (the PLD has information to this effect) that whenever 'himself' appears the antecedent has two theta roles. He is looking for this because he is looking to find the anaphors given the properties of FL/UG. He is looking to distinguish reflexives from pronouns given the structure of UG and finds that 'John likes him' never attributes two theta roles to John. Once the kid knows which 'word' goes into which category (i.e. reflexive=A, pronoun=B, everything else=C) he is finished. Done.

      So the learning problem so construed comes down to a problem of categorizing morphemes into given categories. I have not consulted CHILDES but I am pretty sure that such sentences exist and that kids can identify them. There is a lot of kid work on principle C and it seems that this is manifested VERY early and kids never screw it up. For B and C the results have been hazier but I buy the recent stuff by Jeff Lidz, Colin Phillips and students showing that B is NOT actually screwed up when things are controlled for.

      At any rate, that's the story. On this account, the kid need not use the data to learn that "John believes himself is tall" is not a good version of "John believes himself to be tall," that "John believes Mary to like himself" does not like 'John' as antecedent, that "John's book praised himself" does not like 'John' as antecedent but that in "John's book praised him" 'him' can take 'John' as antecedent, etc etc etc. These facts follow NOT from the environmental input but from the categorization into the A, B, C baskets.

      This is just another POS argument. UG kicks in not to acquire the categorization (though knowing what to look for helps) but in generalizing beyond the simple cases to the complex, e.g. the complementary distribution of bound pronouns and reflexives, the locality conditions on binding etc. So UG describes the inductive bias.

      The end.

    4. Just clarifying -- the syntactic and lexical categories are also presumably all innate? And are we assuming some sort of X-bar schema?

      Is the input just flat -- a sequence of morphemes, or has it been formed into some sort of phrase marker at this point?

      And no parameter setting taking place anywhere?

    5. Categories may or may not be. Need enough to define binding domains, which requires some notion of a clause (forget the picture noun cases for now) and if we go LGB some idea of government or accessible subject. X' is irrelevant, though branching is not as we need to define 'binding' which is part of FL/UG. I also abstract from the parameter setting issues, for the main way binding differs is in the addition of an extra kind of anaphor (viz. long distance reflexives). How they fit in, we can abstract away from for now, though I think it is an interesting question. So, let's just take the simplest case. It seems that this occurs ubiquitously cross linguistically (English, Chinese, Kannada, Arabic etc) and for now this is enough. So let's tentatively dump the bells and whistles and see what can be done. Of course, before I get accused of "ignoring the data," we want to extend the account to include these complications, but, to my knowledge, the core of the binding theory has proven to be part of the extensions proposed.

    6. This comment has been removed by the author.

    7. (I deleted a previous comment)

      On reflection I don't really understand what 'parsing a scene into theta roles' is. Theta roles are syntactic -- so how can the child do this without already knowing the language? Are the theta roles innate in this model?
      And don't you need a notion of c-command to define binding so surely the child must already have got some notion of hierarchical structure?

      Is there a paper on this I could read?

    8. Yes you did. And this got me into trouble with Christina who thinks that I ducked your answer. Scene parsing into theta roles means representing the scene as an event with participants who are agents and patients a.o. roles. So when I watch Bill kiss Mary I can parse the scene as one involving a kissing, an agent who is Bill and a patient who is Mary. This representation is non-linguistic (not part of FL). Given this and maybe UTAH (most likely part of UG) we can provide a sentence with a meaning, i.e. assign theta roles to DPs. That's what we need. The rest then turns into a categorization problem; which DPs are reflexives, which pronouns, which other.

      By the way, this way of looking at matters is pretty standard. It is the GB story. Matters would not change except in detail if you liked some other theory of binding, e.g. Williams's, Pollard & Sag's, or Reinhart and Reuland's. What the innate mechanisms are might change, but they would be treated in the same way as outlined here. Note that in a GB theory this makes sense given the modular nature of the system. In more recent Minimalist accounts (e.g. Idsardi and Lidz, Hornstein, Kayne, Zwart) the theory is less modular but the effects cover those in binding theory, though they make reflexivization and pronominalization parasitic on chains. As the net result is to derive, more or less, A and B of the Binding Theory, the story is less modular, but again learning is essentially word categorization. The rest is given by UG.

  11. I would have asked a question similar to Alex's - so I am interested in your answer.

    I probably know as much about linguistics as you do about biology, but nevertheless I have asked people who know a lot more than me and will get back to you on binding theory etc.

    Meanwhile, let's cover a philosophical/methodological point. You defend Chomsky's way of doing science which, among other things, advocates setting aside data that refute one's theory [Chomsky, 2002, 2009, 2012]. So why are empiricists not allowed to 'set aside' the data you want me to account for? Here is what Chomsky advocates:

    "Take the Norman Conquest. The Norman Conquest had a huge effect on what became English. But it clearly had nothing to do with the evolution of language - which was finished long before the Norman Conquest. So if you want to study distinctive properties of language - what really makes it different from the digestive system ... you’re going to abstract away from the Norman Conquest. But that means abstracting away from the whole mass of data that interests the linguist who wants to work on a particular language." (Chomsky 2012, p. 84)

    You seem to say we are not allowed to abstract away from the whole mass of data that interests the linguist - some of these data are extremely important. Is there a theory independent way to determine WHICH data have this privileged position?

  12. People can abstract away from whatever they want to abstract away from. However, what one should abstract away from depends on the question one wants to address. If your question is how the LAD uses PLD to converge on a G, then abstracting away from the Norman Conquest seems reasonable. Indeed, idealizing to an ideal speaker-hearer in a homogeneous speech community and assuming that learning is instantaneous is reasonable. How does one know? You shall judge me by my works! There is no other way of evaluating an idealization.

    Second point: everyone knows that the idealization is in some sense "false." We do not instantaneously acquire a grammar, there is no ideal speaker-hearer and no homogeneous speech community; indeed, there is no such thing as English! That said, making these assumptions for the purposes at hand does not, we think, distort the problem, though it does make it easier. Non-homogeneity of the speech community means having to first segregate the PLD into separate streams, a non-ideal speaker-hearer has memory issues, dynamic learning suggests that order of data presentation makes a difference, etc. These points have been made and I at least am comfortable with the idealization. If you are not, make others and see where you get. Find a question and answer it.

    What I object to is not abstracting away from some points rather than others. What I object to is making idealizations that abstract away from the question of interest and then dumping on work that doesn't. For MY questions much of the inductive learning literature begs the question. Maybe there is another question. Fine, but I have to judge whether the simplifications are apposite for MY question. They are not most of the time, and what is irksome is that I am constantly being told otherwise. That's why I attack these approaches. I don't care if people want to study something else. Good luck. What I don't like is being told that I have to study what I want to study in THEIR way. So far, their way has added zero value, and I say so to warn those who want to address the same questions as I do that there is no utility in looking at this stuff.

    Which data? None. The proof is in the pudding. Again, it seems you are not in a position to judge the pudding. Too bad.

  13. Oh yes: I don't make pronouncements about the state of biology given my tyro status. I make very lead-footed analogies. I say nothing about the state of the art. This is a big difference between us.

  14. This comment has been removed by the author.

    1. I think we've milked this for all it's worth, so I am going to cut off discussion after giving myself the last word (it is my blog after all).

      The problem is not much changed if you adopt Pollard & Sag's theory, so go ahead and adopt it and then derive their version of the binding theory using your general learning algorithms. Fine with me. I chose GB because it is pretty standard, and Pollard & Sag's account covers at least the standard GB data. You want a more complex theory to deduce, be my guest. If you can derive their account, I will listen. Till then...

      As for questions we cannot currently answer: OF COURSE! But you knew this. There are tons of questions we cannot answer. The question is whether some piece of work helps us answer those unanswered questions. As regards the general learning literature, I believe that the answer is no. Some work (Berwick's thesis, de Marcken, Charles Yang, Wexler) has been very insightful, but mainly because it addresses the problem Chomsky posed rather than some other problem. I have nothing against addressing other problems, but then don't be surprised if I think that this is irrelevant to my interests. To repeat, most of the general learning results are simply irrelevant and have added nothing of value to the POS problem, as Berwick et al. demonstrate.

      As for controlling what others look at: Let 'em look. I don't control what people read or think about. But, if they want my opinion, there it is. They are free to discard it and I am free to offer it.