Thursday, April 9, 2015

Yet more things to read

Here are some papers I’ve come across lately that you might also find interesting.

1. “Is evolvability evolvable” (here) is by Massimo Pigliucci. Pigliucci has a background in biology (he was a practicing professional evolutionary biologist for many years) but is currently in the phil dept at CUNY (see here). He also manages a blog called Scientia Salon (I subscribe) that deals with larger philo questions of interest to the general public and is often quite amusing. At any rate, the paper above is an interesting rundown of the current state of the Modern Synthesis (MS) (the standard theory in evolution) and, in particular, of how biologists are trying to supplement it with what is effectively a theory of variation: where does variation come from? Is it actually random? How does it link to development? The discussion makes all sorts of nice distinctions, like the one between variation and variability. The former depends on “the standing genetic variation,” the latter on the potential variation that is as yet unrealized in the population (a distinction that those familiar with the G vs UG distinction should find congenial). This extension of MS is all the rage now, and it will transform how we conceive of evolution, if Pigliucci is right. As he puts it:

The heritability end of the spectrum sits squarely within the Modern Synthesis. However, the end of the continuum that deals with major transitions is squarely in the territory that should be covered by the EES (Extended Evolutionary Synthesis-NH). This is not because the new ideas are incompatible with the Modern Synthesis (arguably, nothing in the EES is), but because they introduce new processes that enlarge the scope of the original synthesis and cannot be reasonably subsumed by it without resorting to anachronistic post facto reinterpretations of what that effort was historically about.

In other words, the EES provides new mechanisms to account for evolutionary change. The MS’s main mechanism was natural selection. The EES wants to expand the range of explanatory processes. How? Well, by constraining the range of possible variation. In other words, natural selection operates over a restricted space of options that the theory of variation aims to explicate. This should all sound very familiar.

Let me add one point: I suspect that to the degree that the theory of variability constrains evolutionary options, to that degree natural selection qua mechanism will seem less and less important. It won’t go away, as the options are never unique. But, as in the debate over acquisition, the interesting action may begin to shift from Natural Selection to Variability (learning in a UG context). These are points that people like Chomsky and Fodor have been making for a long time (and have been ridiculed for making them).[1] It is interesting to see that sometimes logic suffices to see which way things ought to go.

2. It seems that the Gallistel-King conjecture is getting traction in the popular science press. We discussed this paper before (here). Chris Dyer sent me this link to a SciAm piece on the topic. It seems that intra-neuron computations are capturing the popular science imagination (see here). The discussion in this article is not that informative, but I think it indicates that the Gallistel-King conjecture has legs. If so, we might be witnessing a truly magnificent intellectual event: an understanding of how things are from considerations of how they must be. This is theoretical speculation at its best. Very exciting.

3. Here is more on my hobby-rodent: mouse songs. Bill Idsardi sent me this little article (along with sound files). It seems that male mice are real crooners with at least two types of songs at their paw-tips. Interestingly, it seems that the female doesn’t do much singing, though it appears that she can. She just doesn’t. Were female mice Pirahã, we would conclude from their reluctance to sing that they couldn’t. In other words, we could conclude that mice can’t sing qua mice. We would be wrong, but hey, no reason to ever go beyond the data, right? At least these mouse biologists have not confused capacity and behavior.

4. This paper is relevant to our discussion of the receptivity of the general (scientific) public to our kinds of results. This work got big play, including on NPR apparently. What it shows is that kids know a ton and that knowing a ton is what makes it possible for them to know anything else. I don’t recall any papers getting big play where it is argued that kids know nothing and learn it all. That is the default assumption, perhaps, which is why it is not “news.” So these kinds of results are not only news, they are eagerly taken up. I would add that child development is not, so far as I know, a general high school subject. Yet this doesn’t prevent the GP from lapping this stuff up. So, though I agree that getting linguistics into high schools would be a fine thing (and a pretty cheap and efficient way to teach the scientific method, as Wayne O’Neil has argued for quite a while), there is plenty of room for improvement publicity-wise in getting our views out there in the public domain.

5. As many may know, Dennis Ott and Angel Gallego are editing a volume for the 50th anniversary of Aspects. I’ve just read Paul Pietroski’s contribution (here) and I cannot recommend it highly enough. It concentrates on elucidating Chomsky’s related conceptions of descriptive and explanatory adequacy and outlines how these notions are related to questions of language acquisition. The position he comes down on is quite closely related to the one discussed in an early paper by Fodor (discussed here).

To me the most interesting feature of the paper was the way Paul relates these discussions to Goodman’s problem of induction. He notes that Chomsky noted that the central issue is finding the right “vocabulary that makes it possible to construct a certain range of grammars.” Note, the focus is not on the weak or even the strong generative capacity of Gs, but the vocabulary that they are written in. The conclusion: “linguists who want a descriptively adequate theory presumably need to aim for the “higher goal” of characterizing the vocabulary that children use to formulate grammars” (p. 7). Why the emphasis on basic vocabulary? Because the basic vocabulary determines the natural projectable predicates. This is what Goodman showed in 1954 and it is what makes the basic vocabulary the main event. Indeed, absent a specification of the main predicates induction just cannot work as we think it should. Paul rehearses these points in a very accessible manner and shows how central they are to the explanatory enterprise. As he puts it “kids project gruesomely” (read the paper to understand this Goodmanism) and the linguist’s problem is to find the predicates that support such projections. Terrific little paper. Bodes well for the whole volume.

[1] Indeed, Pigliucci (here) wrote a very critical review of Fodor and Piattelli-Palmarini’s book and of its main criticism, which to my mind amounted to observing that the MS needed a theory of variation. It is not clear whether he disagreed with their point or whether he thought that the field had already internalized it. If the latter, then whether this is “news” seems less relevant than whether it is true. In this piece Pigliucci seems to agree that it would be a very important addition to MS.


  1. I thought Paul P's piece was a very good read. One question about this section:
    "The general point is that for any given language,
    a descriptively adequate grammar assigns
    exactly n structural descriptions to a string of words
    —or more precisely, to a string of formatives—
    that can be understood in exactly n ways"

    So what is the argument for this: why not more than n (or fewer than n if you do underspecified semantic representations)?
    It seems like it is hard to satisfy this if you want to get e.g. RNR facts right.
    It seems to rule out CG approaches by stipulation. Any thoughts?

    (I copied from the pdf to be sure of getting the quote right ...)

  2. Hi Alex,
    Thanks. I don't think this is as tendentious as it might initially sound, so long as we take grammars to be procedures that (generate expressions that somehow) connect meanings with pronunciations in constrained ways.

    If a single structural description can support more than one meaning, then even if a grammar assigns exactly one structural description to ‘The duck is eager to eat’, that still doesn’t explain why the corresponding pronunciation doesn’t go with two meanings, including that of ‘The duck is eager for us to eat it’. With regard to this case, one might say that if the one structural description links ‘The duck’ (only) to the subject position of ‘eat’, then ‘The duck’ cannot also be understood in association with the object position of ‘eat’. But that would be an extra stipulation. (If a single structural description can support more than one meaning, why doesn’t this one?) In general, capturing the boundlessly many absent-reading facts would require a lot of stipulations. Suppose the structural description corresponding to the adverbial reading of the prepositional phrase in ‘Al saw a man with a telescope’ is roughly: [Al [saw a man] [with a telescope]]. Why doesn’t this way of structuring the lexical items support the following meaning: Al is such that he both saw a man and possesses a telescope?

    We can write a semantic theory according to which the structure only supports the following meaning: Al is such that he saw a man by using a telescope. But if a single structure can support multiple meanings, why doesn’t [Al [saw a man] [with a telescope]] support the meanings it doesn’t support? For any particular case, one can perhaps craft a construction-specific constraint. But not only does this raise questions about the vocabulary in terms of which the constraints are formulated, the point is pervasive. If [[no dog] [chased [every cat]]] can support more than one meaning, how come it doesn’t support an “inverted scope” construal? Indeed, why doesn’t [Romeo [loved Juliet]] also have a Juliet-loved-Romeo reading?

    To be sure, one can formulate grammars that permit slight relaxations of a one-meaning-per-structure principle, allowing for homophony that is neither lexical nor structural, but only in specific cases where this is explicitly licensed by a semantic principle that does not overgenerate. But often, such grammars can be recoded (even if initially formulated in CG terms) by positing a little more structural homophony. Such positing might seem ad hoc if one brackets the (boundlessly many) absent-reading facts. But I think the simplest (and perhaps only) systematic explanation of such facts assumes that kids assume a one-meaning-per-structure principle. And there seem to be many confirming instances of the one-meaning-per-structure generalization. That said, I don’t doubt that many particular facts are hard to square with any such principle; providing descriptively adequate structural descriptions is very hard, even allowing for abstract descriptions. But I think this is a principle we relax at our peril. I don’t think it follows that no CG approach can be explanatorily adequate. But it has been wondered how well such approaches do when it comes to explaining absent-reading facts.
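
To make the bracketing talk concrete, here is a minimal Python sketch that enumerates the structural descriptions a toy phrase-structure grammar assigns to ‘Al saw a man with a telescope’. The grammar, the category labels, and the function name `parses` are illustrative assumptions, not anything proposed in the paper or in this thread. The only point is that the two attachment sites for the prepositional phrase come out as two distinct bracketings, one per reading.

```python
from functools import lru_cache

# Hypothetical toy grammar: binary rules plus a small lexicon.
BINARY = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],   # VP -> VP PP is the adverbial attachment
    "NP": [("Det", "N"), ("NP", "PP")],  # NP -> NP PP is the adnominal attachment
    "PP": [("P", "NP")],
}
LEXICON = {
    "Al": ["NP"], "saw": ["V"], "a": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}

def parses(words, symbol="S"):
    """Return every bracketing of `words` rooted in `symbol`, as strings."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(sym, i, j):
        trees = []
        # A single word can be covered by a lexical category.
        if j - i == 1 and sym in LEXICON.get(words[i], []):
            trees.append(f"[{sym} {words[i]}]")
        # Otherwise try every binary rule and every split point.
        for left, right in BINARY.get(sym, []):
            for k in range(i + 1, j):
                for lt in build(left, i, k):
                    for rt in build(right, k, j):
                        trees.append(f"[{sym} {lt} {rt}]")
        return tuple(trees)

    return list(build(symbol, 0, len(words)))

if __name__ == "__main__":
    for tree in parses("Al saw a man with a telescope".split()):
        print(tree)
```

On this toy grammar the string gets two bracketings and no more. Nothing here says which meaning goes with which bracketing; that pairing is exactly what a one-meaning-per-structure principle would have to supply.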

  3. So I am not sure that this is the right way to think about it, but one can draw a distinction between the problem and the likely properties of the solution.
    So for example suppose we have two grammars (or sets of grammars) A and B which both generate the same set of sound-meaning pairs. One of them (A) assigns to each sentence with n distinct readings exactly n distinct structural descriptions, each of which generates exactly one sound-meaning pair. The other (B) assigns m distinct SDs (where m >> n), so that each sound-meaning pair is generated by many distinct SDs.
    (Technical side note: there is a beautiful recent paper by Kanazawa and Salvati that shows exactly this property for CFGs and Lambek grammars under certain restrictions: see here).
    So there are many different reasons to choose between A and B (learnability, succinctness, processing efficiency, ...), but all else being equal, is there some reason to prefer A to B just on the grounds that B does not hold to the one SD per meaning constraint?

    I think this way round is a more interesting problem than the one meaning per SD constraint.
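
The A versus B contrast can be sketched with two toy context-free grammars that generate the same strings but differ in how many derivations they assign. This is a minimal sketch under loud assumptions: the grammars, the single terminal `x`, and the function name `count_derivations` are all made up for illustration, and every derivation of a given string is simply assumed to yield the same meaning.

```python
from functools import lru_cache

def count_derivations(binary, lexical, words, start="S"):
    """Count derivation trees for `words` (CKY-style dynamic programming).

    `binary` maps a nonterminal to a list of (left, right) pairs;
    `lexical` maps a nonterminal to the set of words it can rewrite to.
    """
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        total = 0
        if j - i == 1 and words[i] in lexical.get(sym, ()):
            total += 1
        for left, right in binary.get(sym, ()):
            for k in range(i + 1, j):
                total += count(left, i, k) * count(right, k, j)
        return total

    return count(start, 0, len(words))

# Grammar A (hypothetical): right-branching only, one derivation per string.
A_BINARY = {"S": [("X", "S")]}
A_LEXICAL = {"S": {"x"}, "X": {"x"}}

# Grammar B (hypothetical): S -> S S, many derivations per string.
B_BINARY = {"S": [("S", "S")]}
B_LEXICAL = {"S": {"x"}}

if __name__ == "__main__":
    for n in range(1, 7):
        s = ["x"] * n
        print(n,
              count_derivations(A_BINARY, A_LEXICAL, s),
              count_derivations(B_BINARY, B_LEXICAL, s))
```

Both grammars accept every nonempty string of `x`s, so they are indistinguishable at the level of the string set. The question in the text is whether the difference in derivation counts is, all else being equal, a reason to prefer one over the other.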

    The second question is whether there are technical arguments for thinking that all of the reasonable answers are likely to be of type A rather than type B. My take is that to deal with certain phenomena it seems likely that one cannot have good descriptions of type A. Example:
    1) John gave Mary a biscuit
    2) John gave Mary a biscuit and David a slice of cake.
    I don't know how this is dealt with in MGs, but it seems hard to deal with Ex 2 without ending up with two SDs for Ex 1.

    I think the technical result I mentioned above is sufficient to show that the simple examples you gave can be dealt with in a system with spurious ambiguity. But overgeneration in modern CG systems (eg Baldridge style CCG) is maybe adequate.

    1. @Alex: Why would it be hard to generate ex 2 without using multiple SDs for ex 1? One feasible account is VP coordination with ATB head movement, and that doesn't require changing anything about the structure of 1 (copies are in round brackets):

      John T [vP (John) v-give [VP [VP [DP Mary] [V' (give) [DP a biscuit] ] ] and [VP [DP David] [V' (give) [DP a slice of cake] ] ] ] ]

      If you don't like ATB-movement, the second conjunct can be treated as a full TP with empty subject and empty head (the distribution of which is suitably restricted by an MSO-definable constraint). That would be similar to a deletion analysis.

      That said, I agree that a less strict bound on the number of SDs makes for a more strategic starting point. In Paul's statement, it is hard for me to discern what is meant by

      - identical SDs (perfect identity? isomorphism? structural identity with non-isomorphic labeling differences? something even more relaxed?)
      - different meanings (different logical formulas? different truth values in some arbitrary model? different truth values in some restricted class of models? If so, restricted by world knowledge alone or also discourse?).

      I'd prefer a less specific conjecture as a starting point. E.g., is the number of SDs for a given sentence linearly bounded by its length? Probably not, because the number of conjunction/disjunction ambiguities blows up with every additional coordination. Similar problems arise with QR, assuming that a sentence with n quantifiers can indeed have up to 2^n distinct meanings that are structurally encoded in the SD. So then the question is whether a linear bound holds if we ignore those cases, e.g. by a finely tuned version of Paul's distinct meanings clause. I have no idea what the answer will be in that case, but it's an interesting problem.
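
The coordination blow-up can be illustrated with a quick count. This is a minimal sketch, assuming (purely for illustration) that coordination is strictly binary and that every binary bracketing of n conjuncts counts as a distinct SD; under that assumption the count is the Catalan number C(n-1). The function names `binary_bracketings` and `first_exceeding` are made up here.

```python
from math import comb

def binary_bracketings(n):
    """Number of binary bracketings of n conjuncts (Catalan number C_{n-1})."""
    if n < 1:
        raise ValueError("need at least one conjunct")
    m = n - 1
    return comb(2 * m, m) // (m + 1)

def first_exceeding(c):
    """Smallest n at which the bracketing count exceeds the linear bound c * n."""
    n = 1
    while binary_bracketings(n) <= c * n:
        n += 1
    return n

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, binary_bracketings(n))
    print(first_exceeding(1), first_exceeding(10))
```

Since a coordination of n conjuncts is about 2n - 1 words long, a bound that is linear in the number of conjuncts is also linear in sentence length. Whatever constant is chosen, the count overtakes it after a handful of conjuncts, which is why these cases have to be set aside before a linear bound is even a candidate.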

    2. Maybe that's not the best example then.
      Perhaps something simpler like "Peter likes Norbert and Noam and Jeff".
      I guess in that case it depends on the details of the identity conditions on meanings.

    3. A simple illustration might be the classic analysis of strict/sloppy ambiguities. On this analysis, "John loves his mother and Bill does too" has two[*] interpretations because there are two different structures for "John loves his mother" (one where 'John' binds 'his' as a variable and one where 'John' and 'his' are coreferential). But then if there are two structures for "John loves his mother", we have two distinct structural descriptions for something which appears to be understood only in one way. (One might deny that the distinction between variable binding and coreference is encoded in structural descriptions at all, but I think that would create other problems for the hypothesis that there's a one-to-one mapping between SDs and ways of understanding.)

      Lots of possible problems with this example, of course. For one thing, depending on how we count "ways of understanding", it might be claimed that we actually can understand "John loves his mother" in two ways. And it may very well be that the classic analysis of strict/sloppy ambiguities is wrong to link the perceptible ambiguity of the second conjunct to a hidden ambiguity in the first. But I think the example would illustrate the logic of the situation.

      [*] Leaving aside interpretations where the first pronoun doesn't pick out John.

    4. Hi Alex,
      I’m not so worried if a grammar generates many SDs for one meaning. If human grammars are like that, I have some rethinking to do in other parts of my life. But I don’t think this affects the import of the point I was (perhaps over-)making in the passage you cited. There, I think what really mattered is that a single SD not support two meanings. And if that is granted, then it may be that any remaining nonterminological differences are small. Do we allow for lots of intralinguistic synonymies, even though kids generally try to avoid homophonous synonymies? Or do we start chopping meanings more finely (a la Cresswell)?

      I’m certainly not assuming that meanings are sets of possible worlds, or that logically equivalent claims are synonymous. On any such view, the mapping from SDs to meanings is very plausibly many-to-one. (If the meaning of a sentence is its truth value, or the meaning of a name is its referent, the mapping from SDs to meanings is boundlessly-many-to-one.) It’s notoriously hard to come up with an independently motivated way of counting meanings. But I guess one question is whether we should (i) try to find a way of counting them that is plausibly fine-grained, but not so fine-grained that a grammar cannot generate (interestingly many) distinct but synonymous SDs, or (ii) say that meanings are individuated by SDs, at least to a first approximation, and use that as a wedge to help figure out what meanings are.