Tuesday, November 8, 2016

More on model organisms

A few posts back (here), I made the point that biologists, unashamedly, establish biological processes based on a relatively small number of model organisms. This kind of argument, I noted, is regarded very suspiciously within linguistics, where many (indeed, I would guess, most) practitioners consider it scientifically irresponsible to infer properties of UG based on the inspection of a handful of language-particular Gs. I recently ran across a nice discussion of model organisms and their role in fundamental biology (see here). The most recent Nobel in bio was awarded for work on autophagy based on studying this process in brewer's yeast (here). The conclusions concerning the fundamental mechanisms are taken to be obviously relevant to biology in general despite their being based entirely on what takes place in yeast, a unicellular organism. In other words, what's good for yeast is good for humans, plants, and insects, despite their rather obvious differences.

The assumptions behind the logic of model organisms and the biological inferences it licenses are embodied in the featured quote by Jacques Monod that heads the linked-to paper: "Anything found to be true of E.coli must also be true of elephants." Read this again. And again. Now translate this into the linguistic analogue (anything true of English must be true of Swahili). Does it sound natural to you? I would bet not. Within linguistics it is, I think, widely assumed that the only legit way to establish universals is bottom up, by generalizing over the properties of Gs. This is why linguists are so defensive when someone remarks that GGers only study English. Of course this is false, and that is worth pointing out. But the reflexive riposte (i.e. that GGers DO study TONS of non-Indo-European languages and have FOR A VERY LONG TIME) is evidence that GGers believe that were the criticism accurate then this would be scientifically shameful.

It is curious (and important) that biologists don't feel the same way. In fact, the opposite. But so far as I can tell, the logic is the same: the fundamental principles of G organization will not and should not differ across various Gs. Why not? Because these are rooted in biologically general properties that guide human linguistic facility. We should expect conclusions drawn about UG in English to generalize to all other Gs. The only serious impediment is being misled by the surface idiosyncrasies of English (or whatever other G one is studying). And that's where PoS arguments come in. They are useful precisely because they attempt to establish UG principles by abstracting away from the surface noise that PLD can induce.

Does this make such arguments infallible? No. Does this mean that cross-G study of typologically diverse Gs is of marginal utility? No. What it means for me is that the reflexive suspicion that contemporary GGers have of PoS arguments, and of reasoning from the properties of a small set of Gs to general properties of FL/UG, is methodologically misplaced. Or, more accurately, if such reasoning is suspect in linguistics then it is also dubious in biology. And those who believe this should contact the Nobel committee before it errs again in supporting scientific malpractice.

19 comments:

  1. A big difference here is that when we look at a model organism, we get to apply centuries' worth of physics, chemistry and thermodynamics to it, whereas for a language, the only thing we have that is on the same level as the grand principles behind those sciences is the no-telepathy constraint. Furthermore, with organisms, we're pretty good at tracking everything that goes into and comes out of them, whereas, for language, we have no access to what is really going into the child's LAD via their perceptual systems, which are different from ours. And until recently, we couldn't get very much of the output, either.

    Looking at lots of languages probably can't really make up for this deficiency, but is surely better than nothing.

    And that's not all ... a lot of normal scientific data comes in the form of nice curves that you can write equations for, but in syntax we get a baffling distribution of stars that never include all of the data points you want in order to test whatever theory lies behind your analysis. The effect is that the patterns in the data as they come to the theorist are far less compelling.

    Replies
    1. I am not sure I agree with this. The state of biology was also pretty fraught, but the idea was that there was a uniformity of mechanism across the fundamentals. This could have been false, but it wasn't. In fact, part of the revolution in bio was to displace naturalists and replace them with molecular/cellular types. Departments were destroyed over this, with the cellular/molecular types winning. I am also far less skeptical than you are about the fundamental facts in linguistics. I don't find the stars that bewildering. I think we have real "effects." I have enumerated these in the past and stick by my view that these are pretty widely valid.

    2. What era are you talking about? I'm thinking of roughly the last 100 years or so, with plenty of chemistry, physics and thermodynamics established, and also evolution (the single organism that you're focussing on can't be a total one-off).

      Evolution is an example of a crucial ingredient which cannot even in principle be established by looking at a single organism.

  2. You treat an individual language as a model system. Another approach is to treat a recurring phenomenon across languages as a model system, e.g., unbounded movement, or agreement, or evidentials.

    Replies
    1. @Colin –

      Obviously I'm much closer to the approach you lay out here than to the one Norbert discusses, but I'll give Norbert this: when you're looking at a single speech community & language, then you can be sort-of-reasonably sure that you're looking at a uniform phenomenon. Well, modulo some important caveats about interspeaker variation, as well as the possibility of single speakers commanding multiple variants of their language and switching between them based on sociolinguistic factors. I don't mean to downplay the significance of these caveats; but at least there's a shot that you're looking at something uniform.

      But when someone says "let's look at agreement across all languages," well, personally I think the onus is on that person to even establish that the phenomenon is uniform to begin with. Let me make this less abstract: suppose you're looking for languages that show verbal agreement with the transitive subject. You open up WALS, and find that Basque is listed as such a language (http://wals.info/valuesets/102A-bsq). Well, it turns out that this is simply false; Basque has no agreement with transitive subjects, only obligatory clitic doubling of transitive subjects (Arregi & Nevins 2008, 2012; Preminger 2009).

      It would be more surprising to me (not unheard of or impossible; but surprising) if a given speech community had two populations of speakers, one of which had obligatory clitic doubling of transitive subjects and no agreement, and the other of which had agreement with transitive subjects but no clitic doubling.

      tl;dr – the very idea of "phenomenon X across different languages" is fraught, and invites a kind of 'surfacism' that is pernicious. It demands at least as much care and caution as the "single language as model organism" approach.

      (NB: None of this is news to Colin, of course. I'm just writing a response that was prompted by his comment.)

    2. One additional point to Omer's: one cannot tell what a phenomenon is by looking at the surface facts. What's a long distance dependency? What's agreement? Only detailed G investigation can say. There are no "constructions" that are visible.

    3. Another thing that's worth noting is that the extent of "inter-speaker variation" seems to be exaggerated. After working through numerous cases of language variation at the individual level, Bill Labov speaks of "the enigma of uniformity" (see e.g., the third volume of his Principles of Linguistic Change): the speakers within a linguistic community are generally remarkably similar in their abstract knowledge of language, as seen in variation and change. Enigma, I suppose, because everyone's linguistic and social experience is transparently different.

    4. But not always completely, as the literature on the 'new passive/impersonal' in Icelandic seems to show

    5. Avery: Of course ;-) See the paper here (http://repository.upenn.edu/pwpl/vol19/iss2/11/)

    6. I am quite intrigued also by the Korean data -- "V-raising and grammar competition in Korean: Evidence from negation and quantifier scope". (here).

    7. And here: http://www.pnas.org/content/113/4/942.abstract

  3. There is obviously a lot you can learn from a single organism, a single language, a single phenomenon, and I think sometimes this approach is absolutely necessary in science. In doing this I think it's important to establish a promissory note that we want our theory to be adequate for human languages generally.

    Sociologically, right now in academia there is a very dominant perspective that inclusion and diversity are essential values. This is for very good reasons and is absolutely correct. I think the academic culture often extends the social value of diversity to a scientific value as well, which leads to a distaste for model languages, which is unfortunate.

  4. "Anything found to be true of E.coli must also be true of elephants." Read this again. And again. Now translate this into the linguistic analogue (anything true of English must be true of Swahili)

    Both statements are false, at least as stated. For example, E. Coli reproduces by cellular division. Elephants do not. For another example, E. Coli can cause serious, occasionally fatal, illness in humans. Elephants do not.

    Similarly, Swahili has a bunch of noun classes, English does not. Swahili has agglutinative morphology, English does not. Swahili is a Bantu language, English is not. And so on.

    In both cases, you have to make certain assumptions for the statements to be true. I'm not a biologist, so I can't say exactly how to restrict the range of phenomena in question so that anything true of E. Coli is true of Elephants. But with respect to language, I would have expected you to say something along the lines of "Anything found to be true of the cognitive systems that enable children to acquire English is true of the cognitive systems that enable children to acquire Swahili."

    Replies
    1. You put the right words into my mouth. Yes, the issue is not languages but FL/UG.

    2. While I'm entirely sold on the usefulness of the analogy of "model organism" to linguistics (especially of the GG variety), I think it is important to keep in mind that this is a somewhat idealized view of biology.

      From my admittedly tenuous grasp on what happens in the vast world of biology, the prevalence of usage of "model organisms" varies quite a lot from subfield to subfield, and even in those fields that do rely on them extensively, your mileage may vary quite a bit.

      My impression is that model organisms really shine in fields like molecular and cellular biology, but that they get progressively less successful/useful and less prevalent the higher you go on the complexity chain. There are plenty of reasons to believe that what is true of E.Coli will be true of elephants in terms of the most basic molecular and cellular processes (at that level we are basically investigating simple physical processes), but as soon as you start talking about other model organisms in areas like models of diseases or animal models for drug/treatment testing the rate of success (and insight) falls rather abruptly.

      More importantly, in other important areas, like evolutionary theory, it seems that it is the exact opposite approach that leads to insight: i.e., you have to have a large catalogue of descriptive data about a ton of different organisms, their historical development and their environment (present and past) to even start finding the right sort of cases that can be used to test interesting hypotheses about evolutionary processes.

      All of this is to say that while I personally am convinced of how useful the model organism view of linguistics can be, I cannot fault those who are not so convinced that this is the only or best way to approach things.

    3. I wonder if a good comparison for linguistics might be the pre-Darwinian comparative anatomists ... according to my foggy recollections, they looked at a lot of organisms, but did so in terms of their well-developed theory of mechanics (if it's big, its legs have to be big if it's going to walk, if its bones are solid, it can't have been a flier, etc etc). The Icelandic New Passive paper by Charles and others that he linked to above might be a beginning example of, or a further step in the development of, that kind of style of work. Enough more of the same kind of thing might make an impression on Everett, Levinson and the two Evanses, but I don't think there are yet enough of these kinds of notes to constitute a tune that they would think they have to listen to.

  5. The quote is not "Everything true of elephants is true of E.Coli". Model organisms are normally simple -- but languages are typically assumed to be all of the same complexity. (I have my doubts about the validity of this assumption). So it doesn't seem clear that studying English gives any advantage, whereas E. Coli are a bit more convenient than elephants.

  6. "Anything found to be true of E.coli must also be true of elephants." Read this again. And again. Now translate this into the linguistic analogue

    (Haven't posted here before--I'm a neuroscientist who found this blog on my way somewhere else.)

    As the discussion you linked points out, model species aren't chosen in a particularly systematic way. They don't necessarily occupy critical places on the phylogenetic tree, and in fact they've often been bred so much that they're pretty much out there on a long limb of their own. At best, we choose them as a matter of convenience: Drosophila reproduces quickly, Aplysia is a simple organism with a nervous system you can get your head around, Arabidopsis is easy to grow, etc. You *could* study the same processes in turtles, dolphins, and sequoias, but you'd never get anywhere during the typical human lifespan.

    On top of that, focusing on a model organism eliminates one source of variability (or so we like to think). For instance, if you're studying cell reproduction in brewer's yeast and I'm studying it in Candida, we don't know how to interpret conflicting results: Is one of us wrong, or do the two organisms work in different ways?

    Once we have a mechanism pretty well nailed down in a model organism (which is a heck of a feat, hence the Nobel prize), we can start seeing if other organisms work in the same way. Sometimes they do, of course. Sometimes we find things like C3 and C4 photosynthesis, extremophiles, anaerobic and aerobic respiration, etc. These are the very cases that expand our understanding of how life works, but you don't know they exist until you go and look.

    The same seems to be true for languages--if you stick to English, you won't find out much about ergativity, agglutination, etc. If instead you go and look at Euskara, Navajo, or whatever, you might find examples that tell you that the underlying principles aren't what you thought. My impression is that these investigations led people to start edging away from P&P towards the more general (vaguer?) concept of Merge; it just got too hard to nail down any P&Ps that really seemed universally applicable, because somebody kept finding an exception.

    Replies
    1. I am not sure I would disagree. But there is a difference, I believe. It seems that many biologists buy into the Monod quote I referred to and hold that the basic mechanisms stay constant across large swaths of biology. It is hard to do things in two entirely different ways, and this is considered a kind of strong rule of thumb: if A works like this here then it is likely to work this way there. Work on the eye, long taken to be indicative of entirely different biological bases in different species, is a recent vindication of this view of things. Linguists don't really believe that basic G mechanisms are constant across Gs. How do I know? Because they are loath to accept that results on one G can be indicative of universal principles/operations. This is the assumption that I would like to dislodge.
